API
seqtools.utils
seqtools.utils.fileOpen(fname, mode='rt', encoding='latin-1')
Open a file, including gzip files.
Parameters: - fname – The filename to open. Gzip files are recognized by a name ending in ‘.gz’
- mode – File mode for opening [default ‘rt’]
Returns: An open file handle
Note: Python's gzip module is very slow, so this function uses subprocess and command-line gzip instead.
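The approach in the note above can be sketched roughly as follows. This is a minimal illustration, not the seqtools implementation: the name `open_maybe_gzip` is hypothetical, only text reading is handled, and it assumes a `gzip` executable may be on the PATH (falling back to Python's gzip module if it is not).

```python
import gzip
import io
import shutil
import subprocess


def open_maybe_gzip(fname, encoding="latin-1"):
    """Hypothetical stand-in for fileOpen: open a plain or gzipped file for text reading."""
    if not fname.endswith(".gz"):
        return open(fname, "rt", encoding=encoding)
    if shutil.which("gzip") is None:
        # No command-line gzip available: fall back to the standard library.
        return gzip.open(fname, "rt", encoding=encoding)
    # Let the external gzip process decompress to a pipe and wrap the pipe as text.
    # Note: this sketch does not reap the child process after the handle is closed.
    proc = subprocess.Popen(["gzip", "-dc", fname], stdout=subprocess.PIPE)
    return io.TextIOWrapper(proc.stdout, encoding=encoding)
```

The returned object can be iterated line by line like any other text file handle.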
seqtools.utils.revcomp(sequence)
Reverse complement a string.
Parameters: sequence – The DNA string (all caps)
Note: this function includes a caching mechanism, so watch memory usage!
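For illustration, a cached reverse complement could look like the sketch below. It is not the seqtools code: `revcomp_cached` is a hypothetical name, and the bounded `lru_cache` is one assumed way to keep the cache from growing without limit.

```python
from functools import lru_cache

# Translation table for uppercase DNA; N is kept as N.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


@lru_cache(maxsize=10000)  # cap on the number of cached sequences (arbitrary choice)
def revcomp_cached(sequence):
    """Return the reverse complement of an uppercase DNA string."""
    return sequence.translate(_COMPLEMENT)[::-1]
```

Calling `revcomp_cached.cache_clear()` empties the cache if memory becomes a concern.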
seqtools.utils.sortVcfBySequence(vcf, seqnames, seqmap=None)
Sort a tabixed VCF file by sequence names.
>>> import vcf
>>> from seqtools.utils import sortVcfBySequence
>>> v = vcf.Reader(filename='vcf.gz')
>>> w = vcf.Writer(open('sorted.vcf', 'w'), v)
>>> faifile = open('ucsc.hg19.fasta.fai')
>>> # .fai index files are tab-delimited; the first column is the sequence name
>>> seqnames = [x.split('\t')[0] for x in faifile]
>>> for rec in sortVcfBySequence(v, seqnames):
...     w.write_record(rec)
>>> w.close()
>>> v.close()
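The ordering idea can be shown on plain tuples, without a VCF library. This is only an in-memory sketch under assumptions: the helper names are hypothetical, records are `(chrom, pos)` tuples, and the real function works on a tabixed file rather than a list.

```python
def seqnames_from_fai(lines):
    """Collect sequence names (first tab-delimited column) from .fai index lines."""
    return [line.split("\t")[0] for line in lines if line.strip()]


def sort_by_sequence(records, seqnames):
    """Order (chrom, pos) tuples by the position of chrom in seqnames, then by pos."""
    order = {name: i for i, name in enumerate(seqnames)}
    return sorted(records, key=lambda rec: (order[rec[0]], rec[1]))
```

Records whose sequence name is missing from `seqnames` raise a `KeyError` in this sketch.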
seqtools.fastq
class seqtools.fastq.FastqRecord(header, sequence, line3, quality)
Very simple fastq class containing header, sequence, line3, and quality as strings.
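A record of this shape can be illustrated with a namedtuple and a four-lines-at-a-time reader. Both `SimpleFastq` and `read_fastq` are hypothetical names used for this sketch; they are not part of seqtools.

```python
from collections import namedtuple

# Four string fields, mirroring the constructor arguments described above.
SimpleFastq = namedtuple("SimpleFastq", ["header", "sequence", "line3", "quality"])


def read_fastq(lines):
    """Yield one SimpleFastq per complete group of four lines."""
    stripped = [line.rstrip("\n") for line in lines]
    # A trailing incomplete group (fewer than four lines) is silently dropped.
    for i in range(0, len(stripped) - 3, 4):
        yield SimpleFastq(*stripped[i:i + 4])
```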
seqtools.varscan
Parse VCF output from VarScan and fix the ALT column to adhere to the VCF specification.
seqtools.varscan.fixLine(line)
Fix a VarScan VCF line.
Prints the output to stdout. Fixes the ALT column and also fixes the FREQ field to be a floating point value, easier for filtering.
Parameters: line – a pre-split and stripped varscan line
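The FREQ conversion might look something like the sketch below. It rests on assumptions: `freq_to_float` is a hypothetical helper, VarScan is assumed to write FREQ as a percent string such as `12.5%`, and the result is kept as a percentage rather than a fraction. The ALT fix is not shown.

```python
def freq_to_float(format_field, sample_field):
    """Rewrite the FREQ value of one sample column as a plain float string."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    if "FREQ" in keys:
        i = keys.index("FREQ")
        # Drop the trailing percent sign so the value parses as a number.
        values[i] = str(float(values[i].rstrip("%")))
    return ":".join(values)
```

A numeric FREQ lets downstream tools filter with simple comparisons instead of string handling.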
seqtools.varscan.fixVarscanVcfFile(iterable)
Takes an iterator over a VarScan VCF file and returns an iterator over fixed VCF lines, including the header.
Parameters: iterable – any iterable of the VCF lines
Returns: An iterator over fixed VCF lines
Usage is like so:
>>> from seqtools.varscan import fixVarscanVcfFile
>>> varscan = fixVarscanVcfFile(open('filename.vcf', 'r'))
>>> for line in varscan:
...     print(line)
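The header-plus-records pattern described above can be sketched as a generator. This is an assumed structure, not the seqtools source: `fix_lines` and its `fix_record` callback are hypothetical, and the callback stands in for whatever per-record fix is applied.

```python
def fix_lines(iterable, fix_record):
    """Yield header lines unchanged and data lines after applying fix_record."""
    for line in iterable:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Header and meta-information lines pass through untouched.
            yield line
        else:
            # Data lines are split on tabs, fixed, and joined again.
            yield "\t".join(fix_record(line.split("\t")))
```

Because it is a generator, lines are processed lazily and large files never need to fit in memory.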
seqtools.vcf
seqtools.demultiplexer
seqtools.strucvar
seqtools.strucvar.crest.crestLineToBedLines(crestline, extrastring=None)
Takes a line from a CREST file and turns it into two BED lines.
Parameters: - crestline – a single string representing the CREST output
- extrastring – a single string to concatenate to the bed output, useful for including sample information, etc.
Returns: a string containing the two bed lines
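As a rough illustration, one BED line per breakpoint could be produced as below. Everything about the layout is an assumption made for this sketch: the function name `crest_to_bed` is hypothetical, the CREST line is assumed to be tab-delimited with left chromosome and position in columns 1 and 2, right chromosome and position in columns 5 and 6, and the SV type in column 9, and the choice of BED name field is illustrative only.

```python
def crest_to_bed(crestline, extrastring=None):
    """Turn one CREST line into two single-base BED lines, one per breakpoint."""
    fields = crestline.rstrip("\n").split("\t")
    left_chr, left_pos = fields[0], int(fields[1])
    right_chr, right_pos = fields[4], int(fields[5])
    name = fields[8]  # assumed SV type column
    if extrastring is not None:
        name = name + "|" + extrastring
    bed = []
    for chrom, pos in ((left_chr, left_pos), (right_chr, right_pos)):
        # BED intervals are 0-based and half-open, so a 1-based position p becomes [p-1, p).
        bed.append("\t".join([chrom, str(pos - 1), str(pos), name]))
    return "\n".join(bed)
```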