dynast.benchmarking.simulation

Module Contents

Functions

generate_sequence(k, seed=None)

Generate a random genome sequence of length k.

simulate_reads(sequence, p_e, p_c, pi, l=100, n=100, seed=None)

Simulate n reads of length l from a sequence.

initializer(model)

estimate(df_counts, p_e, p_c, pi, estimate_p_e=False, estimate_p_c=False, estimate_pi=True, model=None, nasc=False)

_simulate(p_e, p_c, pi, sequence=None, k=10000, l=100, n=100, estimate_p_e=False, estimate_p_c=False, estimate_pi=True, seed=None, model=None, nasc=False)

simulate(p_e, p_c, pi, sequence=None, k=10000, l=100, n=100, n_runs=16, n_threads=8, estimate_p_e=False, estimate_p_c=False, estimate_pi=True, model=None, nasc=False)

simulate_batch(p_e, p_c, pi, l, n, estimate_p_e, estimate_p_c, estimate_pi, n_runs, n_threads, model, nasc=False)

Helper function to run simulations in batches.

plot_estimations(X, Y, n_runs, means, truth, ax=None, box=True, tick_decimals=1, title=None, xlabel=None, ylabel=None)

Attributes

__model

_pi_model

dynast.benchmarking.simulation.generate_sequence(k, seed=None)

Generate a random genome sequence of length k.

Parameters
  • k (int) – length of the sequence

  • seed (int, optional) – random seed, defaults to None

Returns

a random sequence

Return type

str
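As an illustration of what "a random sequence of length k" with an optional seed can mean, here is a minimal sketch using only the standard library. It is not the dynast implementation: the name generate_sequence_sketch and the choice of a uniform draw over the alphabet ACGT are assumptions made for this example.

```python
import random


def generate_sequence_sketch(k, seed=None):
    """Return a random string of length k over the alphabet ACGT.

    Hypothetical stand-in for illustration; bases are assumed to be
    drawn uniformly and independently.
    """
    # A local generator keeps the global random state untouched.
    rng = random.Random(seed)
    return "".join(rng.choices("ACGT", k=k))


# The same seed yields the same sequence, which makes runs reproducible.
seq_a = generate_sequence_sketch(20, seed=0)
seq_b = generate_sequence_sketch(20, seed=0)
```

Passing an integer seed makes repeated calls return identical sequences, while seed=None gives a fresh sequence on each call.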

dynast.benchmarking.simulation.simulate_reads(sequence, p_e, p_c, pi, l=100, n=100, seed=None)

Simulate n reads of length l from a sequence.

Parameters
  • sequence (str) – sequence to generate the reads from

  • p_e (float) – background specific mutation rate, i.e. the rate at which one specific base mutates to another specific base (e.g. T>C, A>G, …)

  • p_c (float) – T>C mutation rate in labeled RNA

  • pi (float) – fraction of labeled RNA

  • l (int, optional) – length of each read, defaults to 100

  • n (int, optional) – number of reads to simulate, defaults to 100

  • seed (int, optional) – random seed, defaults to None

Returns

a dataframe with one row per read, whose columns hold the number of conversions and the base content of that read

Return type

pandas.DataFrame
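The roles of p_e, p_c and pi can be made concrete with a simplified sketch. It assumes a two-component model in which each read is labeled with probability pi, and every T in a read converts to C with probability p_c if the read is labeled and p_e otherwise. This is an assumption for illustration, not the dynast implementation: the function name, the dictionary keys, and the restriction to T>C conversions are all hypothetical, and the sketch returns a list of dictionaries rather than a pandas.DataFrame.

```python
import random


def simulate_reads_sketch(sequence, p_e, p_c, pi, l=100, n=100, seed=None):
    """Simulate n reads of length l and count T>C conversions per read.

    Hypothetical simplification: only T>C conversions are tracked, and
    the sequence is assumed to be at least l bases long.
    """
    rng = random.Random(seed)
    reads = []
    for _ in range(n):
        # Pick a uniformly random window of length l from the sequence.
        start = rng.randrange(len(sequence) - l + 1)
        read = sequence[start:start + l]
        # A read comes from labeled RNA with probability pi.
        labeled = rng.random() < pi
        rate = p_c if labeled else p_e
        # Each T converts independently at the rate for this read.
        n_t = read.count("T")
        n_tc = sum(rng.random() < rate for _ in range(n_t))
        reads.append({"labeled": labeled, "T": n_t, "TC": n_tc})
    return reads


# Build a small random sequence and simulate 50 reads of length 30 from it.
demo_rng = random.Random(1)
demo_sequence = "".join(demo_rng.choices("ACGT", k=500))
demo_reads = simulate_reads_sketch(
    demo_sequence, p_e=0.001, p_c=0.05, pi=0.3, l=30, n=50, seed=2
)
```

Each entry corresponds to one row of the table described above: a read, its T content, and its number of T>C conversions.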

dynast.benchmarking.simulation.__model

dynast.benchmarking.simulation._pi_model

dynast.benchmarking.simulation.initializer(model)

dynast.benchmarking.simulation.estimate(df_counts, p_e, p_c, pi, estimate_p_e=False, estimate_p_c=False, estimate_pi=True, model=None, nasc=False)

dynast.benchmarking.simulation._simulate(p_e, p_c, pi, sequence=None, k=10000, l=100, n=100, estimate_p_e=False, estimate_p_c=False, estimate_pi=True, seed=None, model=None, nasc=False)

dynast.benchmarking.simulation.simulate(p_e, p_c, pi, sequence=None, k=10000, l=100, n=100, n_runs=16, n_threads=8, estimate_p_e=False, estimate_p_c=False, estimate_pi=True, model=None, nasc=False)

dynast.benchmarking.simulation.simulate_batch(p_e, p_c, pi, l, n, estimate_p_e, estimate_p_c, estimate_pi, n_runs, n_threads, model, nasc=False)

Helper function to run simulations in batches.

dynast.benchmarking.simulation.plot_estimations(X, Y, n_runs, means, truth, ax=None, box=True, tick_decimals=1, title=None, xlabel=None, ylabel=None)