Package org.snpeff.fileIterator
Class FastaFileIterator
- java.lang.Object
  - org.snpeff.fileIterator.FileIterator<java.lang.String>
    - org.snpeff.fileIterator.FastaFileIterator

All Implemented Interfaces:
java.lang.Iterable<java.lang.String>, java.util.Iterator<java.lang.String>
public class FastaFileIterator extends FileIterator<java.lang.String>
Opens a FASTA file and iterates over all FASTA sequences in the file.
Author: pcingola
-
Field Summary
Fields (Modifier and Type, Field):
- static char[] TRANSCRIPT_ID_SEPARATORS
- static java.lang.String TRANSCRIPT_ID_SEPARATORS_REGEX
-
Constructor Summary
Constructors:
- FastaFileIterator(java.lang.String fastaFileName)
-
Method Summary
Methods (Modifier and Type, Method: Description):
- java.util.List<java.lang.String> fastaHeader2Ids(): Try to parse IDs from a FASTA header.
- java.lang.String getHeader(): Current sequence header.
- java.lang.String getName(): Sequence name (first 'word'). It extracts the characters after the leading '>' and before the first space, then removes a leading 'chr', 'chr:', etc.
- java.lang.String getTranscriptId(): Get transcript name from FASTA header (ENSEMBL protein files). Format example: '>ENSP00000356130 pep:known chromosome:GRCh37:1:205111633:205180694:-1 gene:ENSG00000133059 transcript:ENST00000367162'
- protected java.lang.String readNext(): Read a sequence from the file.
Methods inherited from class org.snpeff.fileIterator.FileIterator
close, countNewLineChars, getFilePointer, getLine, getLineNum, guessNewLineChars, hasNext, hasSeek, init, isDebug, iterator, load, next, readLine, ready, remove, seek, setAutoClose, setDebug, setVerbose, toString
-
Method Detail
-
fastaHeader2Ids
public java.util.List<java.lang.String> fastaHeader2Ids()
Try to parse IDs from a FASTA header.
-
getHeader
public java.lang.String getHeader()
Current sequence header
-
getName
public java.lang.String getName()
Sequence name (first 'word'). It extracts the characters after the leading '>' and before the first space, then removes a leading 'chr', 'chr:', etc.
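The extraction described above can be illustrated with a minimal standalone sketch. It is not the SnpEff implementation: the class FastaNameSketch and the method sequenceName are hypothetical names, the prefix match is assumed to be case-insensitive, and only the 'chr' and 'chr:' prefixes are handled, since the "etc." in the description is not specified.

```java
import java.util.Locale;

// Hypothetical standalone sketch; not the SnpEff source code.
class FastaNameSketch {

    /** Returns the first word of a FASTA header, without '>' and without a leading "chr" or "chr:". */
    static String sequenceName(String header) {
        // Drop the leading '>' if present, then any surrounding whitespace
        String h = header.startsWith(">") ? header.substring(1) : header;
        h = h.trim();

        // Keep everything up to the first whitespace character
        int end = 0;
        while (end < h.length() && !Character.isWhitespace(h.charAt(end))) end++;
        String name = h.substring(0, end);

        // Assumption: the prefix is matched case-insensitively
        String lower = name.toLowerCase(Locale.ROOT);
        if (lower.startsWith("chr:")) return name.substring(4);
        if (lower.startsWith("chr")) return name.substring(3);
        return name;
    }

    public static void main(String[] args) {
        System.out.println(sequenceName(">chr1 dna:chromosome"));       // prints "1"
        System.out.println(sequenceName(">ENSP00000356130 pep:known")); // prints "ENSP00000356130"
    }
}
```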
-
getTranscriptId
public java.lang.String getTranscriptId()
Get transcript name from FASTA header (ENSEMBL protein files). Format example: '>ENSP00000356130 pep:known chromosome:GRCh37:1:205111633:205180694:-1 gene:ENSG00000133059 transcript:ENST00000367162'
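One way to pull the transcript ID out of a header in the format shown above is to scan the whitespace-separated fields for the one starting with 'transcript:'. The sketch below is an illustration under that assumption only. TranscriptIdSketch and transcriptId are hypothetical names, and returning null when no such field exists is a choice made here, not documented behavior of getTranscriptId().

```java
// Hypothetical standalone sketch; not the SnpEff source code.
class TranscriptIdSketch {

    private static final String KEY = "transcript:";

    /** Returns the value of the "transcript:" field in an ENSEMBL-style header, or null if absent. */
    static String transcriptId(String header) {
        // Header fields are separated by whitespace; each has the form key:value
        for (String field : header.trim().split("\\s+")) {
            if (field.startsWith(KEY)) return field.substring(KEY.length());
        }
        return null; // Assumption: absence is signaled with null
    }

    public static void main(String[] args) {
        String header = ">ENSP00000356130 pep:known chromosome:GRCh37:1:205111633:205180694:-1 "
                + "gene:ENSG00000133059 transcript:ENST00000367162";
        System.out.println(transcriptId(header)); // prints "ENST00000367162"
    }
}
```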
-
readNext
protected java.lang.String readNext()
Read a sequence from the file.
Specified by: readNext in class FileIterator<java.lang.String>
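For readers unfamiliar with the file format, the following sketch shows the general idea of reading FASTA records: a line starting with '>' opens a new record, and the lines that follow are concatenated into its sequence. It is a generic illustration, not the SnpEff readNext() code. FastaReadSketch and parse are hypothetical names, and the sketch works on an in-memory string rather than a file to stay self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical standalone sketch of FASTA record reading; not the SnpEff source code.
class FastaReadSketch {

    /** Splits FASTA text into records; each record is {header line including '>', sequence}. */
    static List<String[]> parse(String fastaText) {
        List<String[]> records = new ArrayList<>();
        String header = null;
        StringBuilder seq = new StringBuilder();

        for (String raw : fastaText.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.isEmpty()) continue; // skip blank lines

            if (line.startsWith(">")) {
                // A new header closes the previous record, if there was one
                if (header != null) records.add(new String[] {header, seq.toString()});
                header = line;
                seq.setLength(0);
            } else if (header != null) {
                // Sequence lines are joined without line breaks
                seq.append(line);
            }
        }
        // Flush the last record
        if (header != null) records.add(new String[] {header, seq.toString()});
        return records;
    }

    public static void main(String[] args) {
        for (String[] rec : parse(">seq1 first\nACGT\nTTGA\n>seq2\nGG\n")) {
            System.out.println(rec[0] + " -> " + rec[1]);
        }
    }
}
```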