`dynast.preprocessing.aggregation`

Module Contents

`read_rates`(rates_path)	Read mutation rates CSV as a pandas dataframe.
`read_aggregates`(aggregates_path)	Read aggregates CSV as a pandas dataframe.
`merge_aggregates`(*dfs)	Merge multiple aggregate dataframes into one.
`calculate_mutation_rates`(df_counts, rates_path, group_by=None)	Calculate mutation rate for each pair of bases.
`aggregate_counts`(df_counts, aggregates_path, conversions=frozenset([('TC',)]))	Aggregate conversion counts for each pair of bases.

dynast.preprocessing.aggregation.read_rates(rates_path)

Read mutation rates CSV as a pandas dataframe.

dynast.preprocessing.aggregation.read_aggregates(aggregates_path)

Read aggregates CSV as a pandas dataframe.

dynast.preprocessing.aggregation.merge_aggregates(*dfs)

Merge multiple aggregate dataframes into one.

Parameters

*dfs – dataframes to merge

Returns

merged dataframe

Return type

pandas.DataFrame
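
A minimal sketch of what merging aggregate dataframes could look like, assuming each dataframe shares key columns and carries its own conversion count column. The key columns (`barcode`, `GX`), the count columns (`TC`, `GA`), and the helper name `merge_aggregates_sketch` are illustrative assumptions, not dynast's actual schema or implementation.

```python
import functools

import pandas as pd


def merge_aggregates_sketch(*dfs, keys=("barcode", "GX")):
    """Outer-merge dataframes on shared key columns (hypothetical helper)."""
    merged = functools.reduce(
        lambda left, right: pd.merge(left, right, on=list(keys), how="outer"),
        dfs,
    )
    # Rows missing from one input get NaN counts after an outer merge;
    # treat those as zero counts. Key columns are left untouched.
    value_cols = [c for c in merged.columns if c not in keys]
    merged[value_cols] = merged[value_cols].fillna(0)
    return merged


# Toy inputs with assumed column names.
df_tc = pd.DataFrame({"barcode": ["AAA", "AAA"], "GX": ["g1", "g2"], "TC": [3, 1]})
df_ga = pd.DataFrame({"barcode": ["AAA", "CCC"], "GX": ["g1", "g1"], "GA": [2, 5]})
merged = merge_aggregates_sketch(df_tc, df_ga)
```

An outer merge is used here so that a barcode and gene pair present in only one input is still kept in the result.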

dynast.preprocessing.aggregation.calculate_mutation_rates(df_counts, rates_path, group_by=None)

Calculate mutation rate for each pair of bases.

Parameters

df_counts (pandas.DataFrame) – counts dataframe, with complemented reverse strand bases
rates_path (str) – path to write rates CSV
group_by (list) – column(s) to group calculations by, defaults to None, which combines all rows

Returns

path to rates CSV

Return type

str
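
To illustrate the idea of a per-pair mutation rate and the effect of `group_by`, the sketch below divides each reference-to-observed pair count by the total count for that reference base. It assumes a wide table with one column per base pair (for example `TT` and `TC`); that layout, the `barcode` column, and the helper name `mutation_rates_sketch` are assumptions made for illustration. Unlike the documented function, which writes a CSV to `rates_path` and returns the path, this sketch simply returns the dataframe.

```python
import pandas as pd

BASES = "ACGT"


def mutation_rates_sketch(df, group_by=None):
    """Rate of each ref->obs pair = pair count / total count of the ref base."""
    # Keep only the pair columns that are actually present (assumed naming).
    pair_cols = [r + o for r in BASES for o in BASES if r + o in df.columns]
    if group_by:
        totals = df.groupby(group_by)[pair_cols].sum()
    else:
        # No grouping: combine all rows into a single row of totals.
        totals = df[pair_cols].sum().to_frame().T
    rates = totals.astype(float)
    for ref in BASES:
        cols = [c for c in pair_cols if c[0] == ref]
        if cols:
            # Normalize by the total number of observations of this ref base.
            rates[cols] = totals[cols].div(totals[cols].sum(axis=1), axis=0)
    return rates


# Toy counts with assumed column names.
df = pd.DataFrame({"barcode": ["a", "a", "b"], "TT": [90, 90, 50], "TC": [10, 10, 50]})
rates_all = mutation_rates_sketch(df)
rates_by_barcode = mutation_rates_sketch(df, group_by=["barcode"])
```

With `group_by=None` every row contributes to one overall rate per pair; passing `group_by=["barcode"]` yields one row of rates per barcode.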

dynast.preprocessing.aggregation.aggregate_counts(df_counts, aggregates_path, conversions=frozenset([('TC',)]))

Aggregate conversion counts for each pair of bases.

Parameters

df_counts (pandas.DataFrame) – counts dataframe, with complemented reverse strand bases
aggregates_path (str) – path to write aggregate CSV
conversions (list, optional) – conversion(s) in question, defaults to frozenset([('TC',)])

Returns

path to aggregate CSV that was written

Return type

str
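
The following sketch shows one way the aggregate-then-write pattern described above might look: the `conversions` groups are flattened into column names, counts are summed per key, and the result is written to CSV with the path returned. The key columns (`barcode`, `GX`), the exact aggregation, and the helper name `aggregate_counts_sketch` are assumptions for illustration and may differ from what dynast actually does.

```python
import os
import tempfile

import pandas as pd


def aggregate_counts_sketch(
    df, out_path, conversions=frozenset([("TC",)]), keys=("barcode", "GX")
):
    """Sum conversion counts per key and write them to CSV (hypothetical)."""
    # Flatten the groups of conversions into a sorted list of column names.
    cols = sorted({conv for group in conversions for conv in group})
    agg = df.groupby(list(keys), as_index=False)[cols].sum()
    agg.to_csv(out_path, index=False)
    return out_path


# Toy counts with assumed column names.
df = pd.DataFrame(
    {"barcode": ["AAA", "AAA", "CCC"], "GX": ["g1", "g1", "g1"], "TC": [1, 3, 2]}
)
with tempfile.TemporaryDirectory() as tmp:
    path = aggregate_counts_sketch(df, os.path.join(tmp, "aggregates.csv"))
    # Reading the file back is a plain pandas.read_csv call.
    result = pd.read_csv(path)
```

Returning the output path mirrors the documented return value, so a caller can pass it straight to a reader.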