`dynast.estimation.p_e`

Module Contents

Functions

`read_p_e`(p_e_path, group_by=None)	Read p_e CSV as a dictionary, with group_by columns as keys.
`estimate_p_e_control`(df_counts, p_e_path, conversions=frozenset([('TC', )]))	Estimate background mutation rate of unlabeled RNA for a control sample by simply calculating the average mutation rate.
`estimate_p_e`(df_counts, p_e_path, conversions=frozenset([('TC', )]), group_by=None)	Estimate background mutation rate of unlabeled RNA by calculating the average mutation rate of all three nucleotides other than conversion[0].
`estimate_p_e_nasc`(df_rates, p_e_path, group_by=None)	Estimate background mutation rate of unlabeled RNA by calculating the average CT and GA mutation rates.

dynast.estimation.p_e.read_p_e(p_e_path, group_by=None)

Read p_e CSV as a dictionary, with group_by columns as keys.

Parameters

p_e_path (str) – path to CSV containing p_e values
group_by (list, optional) – columns to group by, defaults to None

Returns

dictionary with group_by columns as keys (tuple if multiple)

Return type

dictionary
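
The shape of the returned dictionary can be illustrated with a small standard-library sketch. This is not dynast's implementation: the helper `read_p_e_sketch`, the column names `barcode` and `GX`, and the assumption that the CSV stores the estimate in a column named `p_e` are all hypothetical. The `group_by=None` case is left out because the documentation above does not describe its result in detail.

```python
import csv
import io


def read_p_e_sketch(csv_file, group_by):
    """Map each combination of group_by values to its p_e estimate.

    Assumes (hypothetically) one row per group and a 'p_e' column.
    """
    p_e = {}
    for row in csv.DictReader(csv_file):
        values = tuple(row[column] for column in group_by)
        # One group_by column gives a scalar key, several give a tuple.
        key = values[0] if len(values) == 1 else values
        p_e[key] = float(row["p_e"])
    return p_e


# Hypothetical CSV content, held in memory instead of on disk.
example_csv = "barcode,GX,p_e\nAAAC,gene1,0.001\nAAAC,gene2,0.002\n"

by_pair = read_p_e_sketch(io.StringIO(example_csv), ["barcode", "GX"])
by_gene = read_p_e_sketch(io.StringIO(example_csv), ["GX"])
```

With two `group_by` columns the keys are tuples such as `("AAAC", "gene1")`; with one column they are plain values such as `"gene1"`.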

dynast.estimation.p_e.estimate_p_e_control(df_counts, p_e_path, conversions=frozenset([('TC',)]))

Estimate background mutation rate of unlabeled RNA for a control sample by simply calculating the average mutation rate.

Parameters

df_counts (pandas.DataFrame) – Pandas dataframe containing number of each conversion and nucleotide content of each read
p_e_path (str) – path to output CSV containing p_e estimates
conversions (list, optional) – conversion(s) in question, defaults to frozenset([('TC',)])

Returns

path to output CSV containing p_e estimates

Return type

str
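
The arithmetic behind an "average mutation rate" can be sketched as follows. Plain dictionaries stand in for rows of `df_counts` so the example runs without pandas. The column names `TC` (number of T-to-C conversions in a read) and `T` (number of T bases in a read) are assumptions, as is the choice to pool counts across reads before dividing; dynast may aggregate differently.

```python
def p_e_control_sketch(rows, conversion="TC"):
    """Pooled conversion rate: converted bases / convertible bases."""
    base = conversion[0]  # 'T' for a 'TC' conversion
    converted = sum(row[conversion] for row in rows)
    total = sum(row[base] for row in rows)
    # Guard against an empty or base-free input.
    return converted / total if total else 0.0


# Hypothetical per-read counts from a control sample.
reads = [
    {"TC": 1, "T": 40},
    {"TC": 0, "T": 35},
    {"TC": 1, "T": 25},
]
p_e = p_e_control_sketch(reads)  # 2 conversions over 100 T bases
```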

dynast.estimation.p_e.estimate_p_e(df_counts, p_e_path, conversions=frozenset([('TC',)]), group_by=None)

Estimate background mutation rate of unlabeled RNA by calculating the average mutation rate of all three nucleotides other than conversion[0].

Parameters

df_counts (pandas.DataFrame) – Pandas dataframe containing number of each conversion and nucleotide content of each read
p_e_path (str) – path to output CSV containing p_e estimates
conversions (list, optional) – conversion(s) in question, defaults to frozenset([('TC',)])
group_by (list, optional) – columns to group by, defaults to None

Returns

path to output CSV containing p_e estimates

Return type

str
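
One way to read this description is sketched below, under stated assumptions. It takes `conversion[0]` to mean the base being converted (T for `TC`), treats every conversion that starts from one of the other three bases as background, and pools counts within each group. The column names (`A`, `C`, `G`, `AG`, `CT`, `barcode`, and so on) and the helper `p_e_sketch` are hypothetical, and plain dictionaries stand in for rows of `df_counts`.

```python
from collections import defaultdict

BASES = "ACGT"


def p_e_sketch(rows, conversion="TC", group_by=None):
    """Background rate from the three bases other than conversion[0]."""
    other_bases = [b for b in BASES if b != conversion[0]]
    # Every conversion starting from one of the other bases, e.g. 'AG'.
    background = [f + t for f in other_bases for t in BASES if f != t]
    converted = defaultdict(int)
    total = defaultdict(int)
    for row in rows:
        # An empty tuple is the single key when no grouping is requested.
        key = tuple(row[c] for c in group_by) if group_by else ()
        converted[key] += sum(row.get(c, 0) for c in background)
        total[key] += sum(row.get(b, 0) for b in other_bases)
    return {
        key: (converted[key] / total[key] if total[key] else 0.0)
        for key in total
    }


# Hypothetical per-read counts; missing conversion columns count as 0.
reads = [
    {"barcode": "AAAC", "A": 30, "C": 20, "G": 25, "T": 25, "AG": 1, "TC": 3},
    {"barcode": "AAAC", "A": 25, "C": 25, "G": 25, "T": 25, "CT": 1},
    {"barcode": "GGGT", "A": 20, "C": 15, "G": 15, "T": 50, "GA": 1, "TC": 5},
]
by_barcode = p_e_sketch(reads, group_by=["barcode"])
overall = p_e_sketch(reads)
```

The `TC` counts never enter the estimate: in a labeled sample they mix background errors with real labeling events, which is why the description restricts the calculation to the other nucleotides.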

dynast.estimation.p_e.estimate_p_e_nasc(df_rates, p_e_path, group_by=None)

Estimate background mutation rate of unlabeled RNA by calculating the average CT and GA mutation rates. This function imitates the procedure implemented in the NASC-seq pipeline (DOI: 10.1038/s41467-019-11028-9).

Parameters

df_rates (pandas.DataFrame) – Pandas dataframe containing number of each conversion and nucleotide content of each read
p_e_path (str) – path to output CSV containing p_e estimates
group_by (list, optional) – columns to group by, defaults to None

Returns

path to output CSV containing p_e estimates

Return type

str
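
The averaging step can be sketched in a few lines. This is an illustration only, not the NASC-seq or dynast code: it assumes each row of `df_rates` carries precomputed rates in columns named `CT` and `GA`, averages each column across rows, and then averages the two results. Grouping by `group_by` is omitted for brevity, and plain dictionaries stand in for dataframe rows.

```python
from statistics import mean


def p_e_nasc_sketch(rates):
    """Average of the mean CT rate and the mean GA rate."""
    ct_rate = mean(row["CT"] for row in rates)
    ga_rate = mean(row["GA"] for row in rates)
    return mean([ct_rate, ga_rate])


# Hypothetical per-cell mutation rates.
rates = [
    {"CT": 0.001, "GA": 0.003},
    {"CT": 0.003, "GA": 0.001},
]
p_e = p_e_nasc_sketch(rates)  # both column means are 0.002
```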
