Recovers expression using the SAVER method.

saver(x, do.fast = TRUE, ncores = 1, size.factor = NULL,
  npred = NULL, pred.cells = NULL, pred.genes = NULL,
  pred.genes.only = FALSE, null.model = FALSE, mu = NULL,
  estimates.only = FALSE)

Arguments

x

An expression count matrix. The rows correspond to genes and the columns correspond to cells. Can be sparse.

do.fast

If TRUE, approximates the prediction step. Default is TRUE.

ncores

Number of cores to use. Default is 1.

size.factor

Vector of cell size normalization factors. If x is already normalized or normalization is not desired, use size.factor = 1. Default uses mean library size normalization.

npred

Number of genes for regression prediction. Selects the top npred genes in terms of mean expression for regression prediction. Default is all genes.

pred.cells

Indices of cells on which to perform regression prediction. Default is all cells.

pred.genes

Indices of specific genes on which to perform regression prediction. Overrides npred. Default is all genes.

pred.genes.only

Return expression levels of only pred.genes. Default is FALSE (returns expression levels of all genes).

null.model

Whether to use mean gene expression as the prediction. Default is FALSE.

mu

Matrix of prior means. Default is NULL.

estimates.only

Only return SAVER estimates. Default is FALSE.
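
The default size.factor behaviour and the npred ranking described in the arguments above can be illustrated with a small base R sketch. It does not call saver(). The toy matrix is made up, and the two formulas (library size divided by mean library size, and ranking genes by mean normalized expression) are assumptions about what "mean library size normalization" and "top npred genes in terms of mean expression" mean, not code taken from the package.

```r
# Toy count matrix: 4 genes (rows) x 3 cells (columns); values are made up.
x <- matrix(c(10, 0, 3, 7,
               4, 1, 0, 5,
              20, 2, 6, 12),
            nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

# Assumed form of mean library size normalization:
# each cell's total count divided by the mean total count across cells.
lib.size <- colSums(x)
size.factor <- lib.size / mean(lib.size)

# Normalized expression: divide each column (cell) by its size factor.
x.norm <- sweep(x, 2, size.factor, "/")

# Assumed form of the npred selection: indices of the top npred genes
# ranked by mean normalized expression.
npred <- 2
top.genes <- order(rowMeans(x.norm), decreasing = TRUE)[seq_len(npred)]
rownames(x)[top.genes]
```

Under this assumed definition the size factors average to 1, so the normalized values stay on the scale of a typical cell's counts. Passing size.factor = 1 corresponds to skipping the division entirely.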

Value

If `estimates.only = TRUE`, then a matrix of SAVER estimates.

If `estimates.only = FALSE`, a list with the following components:

estimate

Recovered (normalized) expression.

se

Standard error of estimates.

info

Information about dataset.

The info element is a list with the following components:

size.factor

Size factor used for normalization.

maxcor

Maximum absolute correlation for each gene. Set to 2 if not calculated.

lambda.max

Smallest value of lambda which gives the null model.

lambda.min

Value of lambda from which the prediction model is used.

sd.cv

Difference, in number of standard deviations of deviance, between the model with the lowest cross-validation error and the null model.

pred.time

Time taken to generate predictions.

var.time

Time taken to estimate variance.

cutoff

Maximum absolute correlation cutoff used to determine if a gene should be predicted.

lambda.coefs

Coefficients for estimating lambda with lowest cross-validation error.

total.time

Total time for SAVER estimation.

Details

The SAVER method starts by estimating the prior mean and variance of the true expression level for each gene and cell. The prior mean is obtained from the predictions of a LASSO Poisson regression fitted for each gene using the glmnet package. The variance is then estimated by maximum likelihood, assuming a constant variance, constant Fano factor, or constant coefficient of variation structure for each gene. Finally, the posterior distribution is calculated and the posterior mean is reported as the SAVER estimate.
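
The final step, combining a prior mean and variance with an observed count, can be sketched for a single gene in a single cell. This is a minimal illustration, not the package's implementation. It assumes a Poisson count model with a gamma prior on the true expression, and the function name posterior_gamma and all input values are hypothetical. The returned se is the posterior standard deviation, named by analogy with the se component of the output; that correspondence is also an assumption.

```r
# Sketch of the posterior step for one gene in one cell.
# Assumption: y ~ Poisson(sf * lambda), lambda ~ Gamma(shape = alpha, rate = beta),
# where the prior has mean mu and variance v.
posterior_gamma <- function(y, sf, mu, v) {
  alpha <- mu^2 / v          # prior shape implied by mean mu and variance v
  beta  <- mu / v            # prior rate implied by mean mu and variance v
  post.shape <- y + alpha    # conjugate update of the shape
  post.rate  <- sf + beta    # conjugate update of the rate
  list(estimate = post.shape / post.rate,   # posterior mean
       se = sqrt(post.shape) / post.rate)   # posterior standard deviation
}

# Hypothetical inputs: observed count 3, size factor 1,
# prior mean 5, prior variance 2.5.
res <- posterior_gamma(y = 3, sf = 1, mu = 5, v = 2.5)
res$estimate
res$se
```

With these inputs the posterior mean falls between the normalized observed count (3) and the prior mean (5). A smaller prior variance pulls the estimate closer to the prior mean, and a larger one leaves it closer to the observed count.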

Examples

data("linnarsson")

# NOT RUN {
system.time(linnarsson_saver <- saver(linnarsson, ncores = 12))
# }

# predictions for top 5 highly expressed genes
# NOT RUN {
saver2 <- saver(linnarsson, npred = 5)
# }

# predictions for certain genes
# NOT RUN {
genes <- c("Thy1", "Mbp", "Stim2", "Psmc6", "Rps19")
genes.ind <- which(rownames(linnarsson) %in% genes)
saver3 <- saver(linnarsson, pred.genes = genes.ind)
# }

# return only certain genes
# NOT RUN {
saver4 <- saver(linnarsson, pred.genes = genes.ind, pred.genes.only = TRUE)
# }