dynast.estimation.pi
Module Contents
Functions
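read_pi(pi_path, group_by=None) | Read pi CSV as a dictionary.
initializer(model) | Multiprocessing initializer.
beta_mean(alpha, beta) | Calculate the mean of a beta distribution.
beta_mode(alpha, beta) | Calculate the mode of a beta distribution.
guess_beta_parameters(guess, strength=5) | Given a guess of the mean of a beta distribution, calculate beta distribution parameters.
fit_stan_mcmc(values, p_e, p_c, ...) | Run MCMC to estimate the fraction of labeled RNA.
estimate_pi(df_aggregates, p_e, p_c, pi_path, ...) | Estimate the fraction of labeled RNA.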
Attributes
- dynast.estimation.pi.read_pi(pi_path, group_by=None)
Read pi CSV as a dictionary.
- Parameters
pi_path (str) – path to CSV containing pi values
group_by (list, optional) – columns that were used to group estimation, defaults to None
- Returns
dictionary with barcodes and genes as keys
- Return type
dictionary
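To illustrate the kind of dictionary described above, here is a minimal sketch that reads a small CSV into a dictionary keyed by (barcode, gene). The column names `barcode`, `GX`, and `pi`, and the helper name `read_pi_sketch`, are assumptions made for this example only. The real file layout and the effect of group_by are not specified in this reference.

```python
import csv
import io


def read_pi_sketch(handle, key_columns=("barcode", "GX"), value_column="pi"):
    """Read pi values from an open CSV handle into a dictionary.

    Hypothetical helper: the column names are assumptions, not the
    documented layout of the file that read_pi expects.
    """
    pis = {}
    for row in csv.DictReader(handle):
        # Build the key from the grouping columns, e.g. (barcode, gene).
        key = tuple(row[column] for column in key_columns)
        pis[key] = float(row[value_column])
    return pis


# A tiny in-memory CSV stands in for a file on disk.
example_csv = io.StringIO("barcode,GX,pi\nAAAC,GENE1,0.25\nAAAC,GENE2,0.5\n")
pis = read_pi_sketch(example_csv)
```

Passing different key_columns in this sketch mimics estimation that was grouped by other columns.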
- dynast.estimation.pi._model
- dynast.estimation.pi.initializer(model)
Multiprocessing initializer. https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor
This initializer performs a one-time expensive initialization for each process.
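The initializer pattern can be sketched as follows: a function passed as `initializer` runs once in each worker and stores an expensive object in a module-level variable, which tasks then reuse. The names `init_worker`, `task`, and `_shared` are hypothetical, and a plain dictionary stands in for a compiled model. This sketch uses `ThreadPoolExecutor`; `ProcessPoolExecutor` accepts the same `initializer` and `initargs` arguments.

```python
from concurrent.futures import ThreadPoolExecutor

# Module-level slot that each worker fills once (stand-in for a model).
_shared = None


def init_worker(obj):
    """Store the expensive object once per worker."""
    global _shared
    _shared = obj


def task(x):
    """Use the object set up by the initializer instead of rebuilding it."""
    return _shared["scale"] * x


with ThreadPoolExecutor(
    max_workers=2,
    initializer=init_worker,
    initargs=({"scale": 10},),
) as executor:
    results = list(executor.map(task, [1, 2, 3]))
```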
- dynast.estimation.pi.beta_mean(alpha, beta)
Calculate the mean of a beta distribution. https://en.wikipedia.org/wiki/Beta_distribution
- Parameters
alpha (float) – first parameter of the beta distribution
beta (float) – second parameter of the beta distribution
- Returns
mean of the beta distribution
- Return type
float
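For reference, the mean of a Beta(alpha, beta) distribution is alpha / (alpha + beta). A minimal sketch of that formula follows; the name `beta_mean_sketch` is hypothetical, and this is not claimed to be the library's own code.

```python
def beta_mean_sketch(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution: alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


mean = beta_mean_sketch(2.0, 6.0)
```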
- dynast.estimation.pi.beta_mode(alpha, beta)
Calculate the mode of a beta distribution. https://en.wikipedia.org/wiki/Beta_distribution
When the distribution is bimodal (both alpha and beta are less than 1), this function returns nan.
- Parameters
alpha (float) – first parameter of the beta distribution
beta (float) – second parameter of the beta distribution
- Returns
mode of the beta distribution
- Return type
float
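As a rough sketch of the behavior described above: for alpha > 1 and beta > 1 the mode of a beta distribution is (alpha - 1) / (alpha + beta - 2). The hypothetical `beta_mode_sketch` below returns nan in every other case, which covers the bimodal case but is a simplification; how the actual function treats boundary cases such as alpha = 1 is not stated here.

```python
import math


def beta_mode_sketch(alpha, beta):
    """Mode of a Beta(alpha, beta) distribution for alpha, beta > 1.

    Returns nan otherwise (simplifying assumption of this sketch).
    """
    if alpha > 1 and beta > 1:
        return (alpha - 1) / (alpha + beta - 2)
    return float("nan")


unimodal = beta_mode_sketch(3.0, 2.0)
bimodal = beta_mode_sketch(0.5, 0.5)
```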
- dynast.estimation.pi.guess_beta_parameters(guess, strength=5)
Given a guess of the mean of a beta distribution, calculate beta distribution parameters such that the distribution is skewed by some strength toward the guess.
- Parameters
guess (float) – guess of the mean of the beta distribution
strength (int) – strength of the skew, defaults to 5
- Returns
beta distribution parameters (alpha, beta)
- Return type
(float, float)
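One simple way to turn a guessed mean into beta parameters is to split a total "strength" between alpha and beta in proportion to the guess, so that the resulting mean equals the guess. The sketch below uses that rule as an assumption; the exact rule used by guess_beta_parameters is not given in this reference, and the name `guess_beta_parameters_sketch` is hypothetical.

```python
def guess_beta_parameters_sketch(guess, strength=5):
    """Return (alpha, beta) with alpha + beta == strength and mean == guess.

    Assumed parameterization for illustration only.
    """
    alpha = strength * guess
    beta = strength * (1 - guess)
    return alpha, beta


alpha, beta = guess_beta_parameters_sketch(0.2, strength=5)
implied_mean = alpha / (alpha + beta)
```

Under this rule a larger strength concentrates the distribution more tightly around the guess.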
- dynast.estimation.pi.fit_stan_mcmc(values, p_e, p_c, guess=0.5, model=None, n_chains=1, n_warmup=1000, n_iters=1000, seed=None)
Run MCMC to estimate the fraction of labeled RNA.
- Parameters
values (numpy.ndarray) – array of three columns encoding a sparse array in (row, column, value) format, zero-indexed, where
row: number of conversions
column: nucleotide content
value: number of reads
p_e (float) – average mutation rate in unlabeled RNA
p_c (float) – average mutation rate in labeled RNA
guess (float, optional) – guess for the fraction of labeled RNA, defaults to 0.5
model (pystan.StanModel, optional) – pyStan model to run MCMC with, defaults to None. If not provided, the function will try to use the _model global variable
n_chains (int, optional) – number of MCMC chains, defaults to 1
n_warmup (int, optional) – number of warmup iterations, defaults to 1000
n_iters (int, optional) – number of MCMC iterations, excluding any warmups, defaults to 1000
seed (int, optional) – random seed used for MCMC, defaults to None
- Returns
(guess, alpha, beta, pi)
- Return type
(float, float, float, float)
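The (row, column, value) encoding expected for values can be sketched with plain Python lists. Here a small dense table, indexed by number of conversions (rows) and nucleotide content (columns), is converted into zero-indexed triplets, keeping only non-zero read counts. The helper name `to_triplets` and the example counts are made up for illustration.

```python
def to_triplets(dense):
    """Convert a dense table of read counts into (row, column, value) triplets.

    row: number of conversions, column: nucleotide content,
    value: number of reads. Zero entries are omitted (sparse encoding).
    """
    triplets = []
    for row, counts in enumerate(dense):
        for column, value in enumerate(counts):
            if value > 0:
                triplets.append((row, column, value))
    return triplets


# dense[i][j] = number of reads with i conversions and nucleotide content j
dense = [
    [0, 4, 7],
    [0, 0, 2],
]
values = to_triplets(dense)
```

Wrapping the result in `numpy.array(values)` would give a three-column array of the shape described above.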
- dynast.estimation.pi.estimate_pi(df_aggregates, p_e, p_c, pi_path, group_by=None, p_group_by=None, n_threads=8, threshold=16, seed=None, nasc=False, model=None)
Estimate the fraction of labeled RNA.
- Parameters
df_aggregates (pandas.DataFrame) – Pandas dataframe containing aggregate values
p_e (float) – average mutation rate in unlabeled RNA
p_c (float) – average mutation rate in labeled RNA
pi_path (str) – path to write pi estimates
group_by (list, optional) – columns that were used to group cells, defaults to None
p_group_by (list, optional) – columns that p_e/p_c estimation was grouped by, defaults to None
n_threads (int, optional) – number of threads, defaults to 8
threshold (int, optional) – any conversion-content pairs with fewer than this many reads will not be processed, defaults to 16
seed (int, optional) – random seed, defaults to None
nasc (bool, optional) – flag to change behavior to match the NASC-seq pipeline: the mode of the estimated beta distribution is used as pi. Defaults to False
model (pystan.StanModel, optional) – pyStan model to run MCMC with, defaults to None. If not provided, the function will try to compile the model manually
- Returns
path to pi output
- Return type
str
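The threshold parameter can be read as a filter on (conversions, content, reads) triplets: pairs with fewer reads than the threshold are dropped before estimation. The sketch below shows that reading with a hypothetical helper, `filter_by_threshold`, on made-up counts; it is not the library's implementation.

```python
def filter_by_threshold(triplets, threshold=16):
    """Keep only conversion-content pairs with at least `threshold` reads."""
    return [
        (conversions, content, reads)
        for conversions, content, reads in triplets
        if reads >= threshold
    ]


triplets = [(0, 10, 40), (1, 10, 15), (2, 12, 16)]
kept = filter_by_threshold(triplets, threshold=16)
```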