dynast.preprocessing.coverage
Module Contents
Functions
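| read_coverage(coverage_path) | Read coverage CSV as a dictionary. |
| calculate_coverage_contig(counter, lock, bam_path, contig, indices, ...) | Calculate coverage for a specific contig. This function is designed to be called as a separate process. |
| calculate_coverage(bam_path, conversions, coverage_path, ...) | Calculate coverage of each genomic position per barcode. |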
Attributes
- dynast.preprocessing.coverage.COVERAGE_PARSER
- dynast.preprocessing.coverage.read_coverage(coverage_path)
Read coverage CSV as a dictionary.
- Parameters
coverage_path (str) – path to coverage CSV
- Returns
coverage as a nested dictionary
- Return type
dict
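To make "coverage as a nested dictionary" concrete, here is a minimal sketch of parsing such a CSV with the standard library. The column names (`barcode`, `contig`, `genome_i`, `coverage`), the nesting layout, and the helper name `read_coverage_sketch` are assumptions for illustration; they are not taken from this page, and the real function may organize its result differently.

```python
import csv
import io
from collections import defaultdict


def read_coverage_sketch(handle):
    """Parse coverage rows into {barcode: {(contig, position): coverage}}.

    Hypothetical helper: column names and nesting are assumed, not
    taken from dynast itself.
    """
    coverage = defaultdict(dict)
    for row in csv.DictReader(handle):
        key = (row["contig"], int(row["genome_i"]))
        coverage[row["barcode"]][key] = int(row["coverage"])
    return dict(coverage)


# In-memory stand-in for a coverage CSV file.
example = io.StringIO(
    "barcode,contig,genome_i,coverage\n"
    "AAAC,chr1,100,3\n"
    "AAAC,chr1,101,2\n"
    "TTTG,chr2,50,1\n"
)
nested = read_coverage_sketch(example)
```

Keying the outer level by barcode keeps per-cell lookups cheap, which is one plausible reason to prefer a nested dictionary over a flat table.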
- dynast.preprocessing.coverage.calculate_coverage_contig(counter, lock, bam_path, contig, indices, alignments=None, umi_tag=None, barcode_tag=None, gene_tag='GX', barcodes=None, temp_dir=None, update_every=50000, velocity=True)
Calculate coverage for a specific contig. This function is designed to be called as a separate process.
- Parameters
counter (multiprocessing.Value) – counter that keeps track of how many reads have been processed
lock (multiprocessing.Lock) – semaphore for the counter so that multiple processes do not modify it at the same time
bam_path (str) – path to alignment BAM file
contig (str) – only reads that map to this contig will be processed
indices (list) – genomic positions to consider
alignments (set, optional) – set of (read_id, alignment_index) tuples to process. All alignments are processed if this option is not provided.
umi_tag (str, optional) – BAM tag that encodes the UMI. If not provided, NA is output in the umi column. Defaults to None
barcode_tag (str, optional) – BAM tag that encodes the cell barcode. If not provided, NA is output in the barcode column. Defaults to None
gene_tag (str, optional) – BAM tag that encodes gene assignment, defaults to GX
barcodes (list, optional) – list of barcodes to be considered. All barcodes are considered if not provided, defaults to None
temp_dir (str, optional) – path to temporary directory, defaults to None
update_every (int, optional) – update the counter every this many reads, defaults to 50000
velocity (bool, optional) – whether or not velocities were assigned, defaults to True
- Returns
coverage
- Return type
dict
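The `counter`, `lock`, and `update_every` parameters describe a common progress-reporting pattern: each worker counts reads locally and adds them to the shared counter only in batches, so the lock is taken rarely. The following is a minimal sketch of that pattern only. The function name `process_reads_sketch` and the `reads` argument are hypothetical, and this does not reproduce how dynast reads the BAM file.

```python
import multiprocessing


def process_reads_sketch(counter, lock, reads, update_every=50000):
    """Iterate over reads, flushing progress to a shared counter in batches.

    Hypothetical helper illustrating the counter/lock pattern only.
    """
    pending = 0
    for _ in reads:
        # ... per-read work would happen here ...
        pending += 1
        if pending == update_every:
            # Hold the lock only for the brief shared update.
            with lock:
                counter.value += pending
            pending = 0
    # Flush whatever remains after the final partial batch.
    if pending:
        with lock:
            counter.value += pending


counter = multiprocessing.Value("I", 0)  # unsigned int shared between processes
lock = multiprocessing.Lock()
process_reads_sketch(counter, lock, range(120), update_every=50)
```

A larger `update_every` means less lock contention between workers at the cost of coarser progress updates.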
- dynast.preprocessing.coverage.calculate_coverage(bam_path, conversions, coverage_path, alignments=None, umi_tag=None, barcode_tag=None, gene_tag='GX', barcodes=None, temp_dir=None, velocity=True)
Calculate coverage of each genomic position per barcode.
- Parameters
bam_path (str) – path to alignment BAM file
conversions (dict) – dictionary with contigs as keys and sets of genomic positions as values, indicating the positions where conversions were observed
coverage_path (str) – path to write coverage CSV
alignments (set, optional) – set of (read_id, alignment_index) tuples to process. All alignments are processed if this option is not provided.
umi_tag (str, optional) – BAM tag that encodes the UMI. If not provided, NA is output in the umi column. Defaults to None
barcode_tag (str, optional) – BAM tag that encodes the cell barcode. If not provided, NA is output in the barcode column. Defaults to None
gene_tag (str, optional) – BAM tag that encodes gene assignment, defaults to GX
barcodes (list, optional) – list of barcodes to be considered. All barcodes are considered if not provided, defaults to None
temp_dir (str, optional) – path to temporary directory, defaults to None
velocity (bool, optional) – whether or not velocities were assigned, defaults to True
- Returns
coverage CSV path
- Return type
str
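As an illustration of the `conversions` argument and of what "coverage of each genomic position per barcode" means, the sketch below counts how many reads from each barcode overlap each position of interest. It is a simplified stand-in, not dynast's implementation: reads are modeled as hypothetical `(barcode, contig, start, end)` tuples with half-open intervals instead of BAM alignments, and `coverage_sketch` is an invented name.

```python
from collections import Counter


def coverage_sketch(reads, conversions, barcodes=None):
    """Count, per barcode, how many reads cover each position of interest.

    Hypothetical helper: reads are (barcode, contig, start, end) tuples
    covering the half-open interval [start, end).
    """
    coverage = {}
    for barcode, contig, start, end in reads:
        # Skip barcodes outside the requested set, if one was given.
        if barcodes is not None and barcode not in barcodes:
            continue
        for position in conversions.get(contig, ()):
            if start <= position < end:
                coverage.setdefault(barcode, Counter())[(contig, position)] += 1
    return coverage


# Contigs as keys, sets of genomic positions as values.
conversions = {"chr1": {100, 105}, "chr2": {7}}
reads = [
    ("AAAC", "chr1", 98, 103),
    ("AAAC", "chr1", 100, 110),
    ("TTTG", "chr2", 0, 5),
    ("TTTG", "chr1", 104, 106),
]
result = coverage_sketch(reads, conversions)
filtered = coverage_sketch(reads, conversions, barcodes={"AAAC"})
```

Restricting the count to positions listed in `conversions` avoids tracking coverage for the whole genome when only sites with observed conversions matter.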