dynast.preprocessing.aggregation
Module Contents
Functions
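read_rates(rates_path) | Read mutation rates CSV as a pandas dataframe.
read_aggregates(aggregates_path) | Read aggregates CSV as a pandas dataframe.
merge_aggregates(*dfs) | Merge multiple aggregate dataframes into one.
calculate_mutation_rates(df_counts, rates_path, group_by=None) | Calculate mutation rate for each pair of bases.
aggregate_counts(df_counts, aggregates_path, conversions=frozenset([('TC',)])) | Aggregate conversion counts for each pair of bases.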
- dynast.preprocessing.aggregation.read_rates(rates_path)
Read mutation rates CSV as a pandas dataframe.
- Parameters
rates_path (str) – path to rates CSV
- Returns
rates dataframe
- Return type
pandas.DataFrame
- dynast.preprocessing.aggregation.read_aggregates(aggregates_path)
Read aggregates CSV as a pandas dataframe.
- Parameters
aggregates_path (str) – path to aggregates CSV
- Returns
aggregates dataframe
- Return type
pandas.DataFrame
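Both readers are documented as returning a pandas.DataFrame loaded from a CSV. The following is a minimal sketch of that pattern using plain pandas, not the library's own implementation. The helper name read_csv_sketch and the column names (barcode, AC, TC) are hypothetical, and the real files may have a different layout.

```python
import io

import pandas as pd

# Hypothetical rates CSV contents; the real column layout may differ.
csv_text = "barcode,AC,TC\ncell1,0.001,0.02\ncell2,0.002,0.03\n"


def read_csv_sketch(path_or_buffer):
    """Load a CSV into a DataFrame (illustrative stand-in for the readers above)."""
    return pd.read_csv(path_or_buffer)


# An in-memory buffer stands in for a path on disk.
df_rates = read_csv_sketch(io.StringIO(csv_text))
```

Passing a real file path instead of the in-memory buffer works the same way.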
- dynast.preprocessing.aggregation.merge_aggregates(*dfs)
Merge multiple aggregate dataframes into one.
- Parameters
*dfs – dataframes to merge
- Returns
merged dataframe
- Return type
pandas.DataFrame
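The documentation does not say how rows from different dataframes are combined. One plausible reading, sketched below under that assumption, is to stack the inputs and sum a count column over shared key columns. The function merge_aggregates_sketch and the column names (gene, conversion, count) are hypothetical and are not taken from the library.

```python
import pandas as pd


def merge_aggregates_sketch(*dfs, key_columns=("gene", "conversion")):
    """Stack aggregate dataframes and sum 'count' over identical keys (assumed behavior)."""
    combined = pd.concat(dfs, ignore_index=True)
    return combined.groupby(list(key_columns), as_index=False)["count"].sum()


# Two small, made-up aggregate tables that share one key (g1, 1).
df_a = pd.DataFrame({"gene": ["g1", "g2"], "conversion": [1, 0], "count": [3, 5]})
df_b = pd.DataFrame({"gene": ["g1", "g3"], "conversion": [1, 2], "count": [4, 1]})

merged = merge_aggregates_sketch(df_a, df_b)
```

In this sketch the shared key ends up in a single row whose count is the sum of both inputs, while keys unique to one input pass through unchanged.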
- dynast.preprocessing.aggregation.calculate_mutation_rates(df_counts, rates_path, group_by=None)
Calculate mutation rate for each pair of bases.
- Parameters
df_counts (pandas.DataFrame) – counts dataframe, with complemented reverse strand bases
rates_path (str) – path to write rates CSV
group_by (list, optional) – column(s) to group calculations by; defaults to None, which combines all rows
- Returns
path to rates CSV
- Return type
str
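To illustrate the effect of group_by, here is a minimal sketch that assumes the counts dataframe has one column per base (A, C, G, T) and one per conversion (AC, AG, ..., TG), and that a rate is the summed conversion count divided by the summed count of the original base. That layout, the formula, and the name mutation_rates_sketch are assumptions for illustration. Unlike the documented function, the sketch returns the dataframe instead of writing a CSV.

```python
import pandas as pd

BASES = ["A", "C", "G", "T"]
CONVERSIONS = [a + b for a in BASES for b in BASES if a != b]


def mutation_rates_sketch(df_counts, group_by=None):
    """Rate of each conversion = sum(conversion) / sum(original base), per group."""
    columns = BASES + CONVERSIONS
    if group_by is None:
        # Combine all rows into a single row of totals.
        sums = df_counts[columns].sum().to_frame().T
        rates = pd.DataFrame(index=sums.index)
    else:
        sums = df_counts.groupby(group_by)[columns].sum().reset_index()
        rates = sums[group_by].copy()
    for conv in CONVERSIONS:
        # conv[0] is the original base, e.g. "T" for "TC".
        rates[conv] = sums[conv] / sums[conv[0]]
    return rates


# Made-up counts: only T and TC are non-zero; other columns are filled with 0.
df_counts = pd.DataFrame(
    {"barcode": ["c1", "c1", "c2"], "T": [10, 10, 20], "TC": [1, 3, 2]}
).reindex(columns=["barcode"] + BASES + CONVERSIONS, fill_value=0)

rates_all = mutation_rates_sketch(df_counts)
rates_by_cell = mutation_rates_sketch(df_counts, group_by=["barcode"])
```

With group_by=None the sketch yields a single row of rates over all rows; with group_by=["barcode"] it yields one row per barcode. Conversions whose original base has a zero total come out as NaN here.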
- dynast.preprocessing.aggregation.aggregate_counts(df_counts, aggregates_path, conversions=frozenset([('TC',)]))
Aggregate conversion counts for each pair of bases.
- Parameters
df_counts (pandas.DataFrame) – counts dataframe, with complemented reverse strand bases
aggregates_path (str) – path to write aggregate CSV
conversions (list, optional) – conversion(s) to aggregate, given as a collection of tuples of conversion names; defaults to frozenset([('TC',)])
- Returns
path to aggregate CSV that was written
- Return type
str
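The shape of the default, frozenset([('TC',)]), suggests that each tuple is a group of conversions tallied together. The sketch below follows that assumption: for each group it sums the conversion columns and the matching base columns per row, then counts how many rows share each (key, conversion, base) combination. The key columns (barcode, gene), the output column names, and aggregate_counts_sketch itself are hypothetical. The sketch returns dataframes in a dict instead of writing a CSV.

```python
import pandas as pd


def aggregate_counts_sketch(
    df_counts,
    conversions=frozenset([("TC",)]),
    key_columns=("barcode", "gene"),
):
    """Count rows per (keys, conversion total, base total) for each conversion group."""
    results = {}
    for group in sorted(conversions):
        # Original bases of the conversions in this group, e.g. ["T"] for ("TC",).
        bases = sorted({conv[0] for conv in group})
        tallied = df_counts[list(key_columns)].copy()
        tallied["conversion"] = df_counts[list(group)].sum(axis=1)
        tallied["base"] = df_counts[bases].sum(axis=1)
        results[group] = (
            tallied.groupby(list(key_columns) + ["conversion", "base"])
            .size()
            .reset_index(name="count")
        )
    return results


# Made-up per-read counts: two reads in cell c1 are identical after tallying.
df_counts = pd.DataFrame(
    {
        "barcode": ["c1", "c1", "c1", "c2"],
        "gene": ["g1", "g1", "g1", "g1"],
        "T": [10, 10, 8, 10],
        "TC": [1, 1, 0, 1],
    }
)

aggregates = aggregate_counts_sketch(df_counts)
df_tc = aggregates[("TC",)]
```

Collapsing identical rows into a single row with a count keeps the aggregate table much smaller than the per-read counts, which is one likely reason to aggregate before downstream estimation.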