This function projects a query SingleCellExperiment object onto the SIR (sliced inverse regression) space of a reference SingleCellExperiment object. The SVD is computed on the per-cell-type conditional means of the reference data, and the query data is projected onto the resulting reference components.
projectSIR(
  query_data,
  reference_data,
  query_cell_type_col,
  ref_cell_type_col,
  cell_types = NULL,
  multiple_cond_means = TRUE,
  assay_name = "logcounts",
  cumulative_variance_threshold = 0.7,
  n_neighbor = 1
)
query_data: A SingleCellExperiment object containing a numeric expression matrix for the query cells.
reference_data: A SingleCellExperiment object containing a numeric expression matrix for the reference cells.
query_cell_type_col: A character string specifying the column in the colData of query_data that identifies the cell types.
ref_cell_type_col: A character string specifying the column in the colData of reference_data that identifies the cell types.
cell_types: A character vector of cell types for which to compute conditional means in the reference data. Defaults to NULL.
multiple_cond_means: A logical value indicating whether to compute multiple conditional means per cell type (through PCA and clustering). Defaults to TRUE.
assay_name: A character string specifying the name of the assay on which to perform computations. Defaults to "logcounts".
cumulative_variance_threshold: A numeric value between 0 and 1 specifying the variance threshold for PCA when computing multiple conditional means. Defaults to 0.7.
n_neighbor: An integer specifying the number of nearest neighbors used for clustering when computing multiple conditional means. Defaults to 1.
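As an illustration of how a cumulative variance threshold such as 0.7 is commonly applied in PCA, the following base R sketch keeps the smallest number of principal components whose cumulative proportion of variance reaches the threshold. This is an assumption about the general technique, not a description of the function's internals; the names expr, cum_var, and n_pcs are hypothetical.

```r
set.seed(42)

# Toy matrix: 200 cells in rows, 10 features in columns (hypothetical data).
expr <- matrix(rnorm(200 * 10), nrow = 200)

# PCA on centered data.
pca <- prcomp(expr, center = TRUE, scale. = FALSE)

# Proportion of variance per component and its running total.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_var <- cumsum(var_explained)

# Keep the first component count whose cumulative variance reaches the threshold.
threshold <- 0.7
n_pcs <- which(cum_var >= threshold)[1]

# Scores restricted to the retained components.
scores <- pca$x[, seq_len(n_pcs), drop = FALSE]
```

A higher threshold retains more components, so any clustering performed on the scores would see more of the within-cell-type variation.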
A list containing:
A matrix of the conditional means computed for the reference data.
The rotation matrix obtained from the SVD of the conditional means.
A data.frame containing the SIR projections for both the reference and query datasets.
The percentage of variance explained by each component of the SIR projection.
The genes used for the projection (SVD) must be present in both the reference and query datasets. The function first computes conditional means for each cell type in the reference data, then performs SVD on these conditional means to obtain the rotation matrix used for projecting both the reference and query datasets. The query data is centered and scaled based on the reference data.
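The steps described above can be sketched in base R on toy matrices. This is a minimal illustration under stated assumptions, not the package's implementation: it uses a single conditional mean per cell type (the multiple_cond_means = FALSE case), assumes gene-wise centering and scaling with the reference mean and standard deviation, and all object names (ref_expr, query_expr, cond_means, rotation, proj) are hypothetical.

```r
set.seed(1)

# Toy data: genes in rows, cells in columns, with the same genes in both datasets.
genes <- paste0("gene", 1:20)
ref_expr <- matrix(rnorm(20 * 60), nrow = 20,
                   dimnames = list(genes, paste0("ref", 1:60)))
ref_types <- rep(c("A", "B", "C"), each = 20)
ref_expr[1:5, ref_types == "B"] <- ref_expr[1:5, ref_types == "B"] + 3
ref_expr[6:10, ref_types == "C"] <- ref_expr[6:10, ref_types == "C"] + 3
query_expr <- matrix(rnorm(20 * 15), nrow = 20,
                     dimnames = list(genes, paste0("qry", 1:15)))

# Center and scale both datasets with the reference gene-wise statistics.
ref_mean <- rowMeans(ref_expr)
ref_sd <- apply(ref_expr, 1, sd)
ref_scaled <- (ref_expr - ref_mean) / ref_sd
query_scaled <- (query_expr - ref_mean) / ref_sd

# Conditional means: one row per reference cell type, one column per gene.
cond_means <- t(sapply(
  split(seq_along(ref_types), ref_types),
  function(idx) rowMeans(ref_scaled[, idx, drop = FALSE])
))

# SVD of the conditional means; the right singular vectors form the rotation matrix.
sv <- svd(cond_means)
rotation <- sv$v
dimnames(rotation) <- list(genes, paste0("SIR", seq_len(ncol(rotation))))
percent_var <- 100 * sv$d^2 / sum(sv$d^2)

# Project reference and query cells with the same rotation matrix.
ref_proj <- t(ref_scaled) %*% rotation
query_proj <- t(query_scaled) %*% rotation
proj <- data.frame(
  rbind(ref_proj, query_proj),
  dataset = rep(c("Reference", "Query"), c(ncol(ref_expr), ncol(query_expr)))
)
```

Because the rotation matrix is derived only from the reference conditional means, the query cells never influence the components; they are only placed in the space that the reference defines.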