This function generates a ggplot2 boxplot visualization of principal components (PCs) for different cell types across two datasets (query and reference).

boxplotPCA(
  query_data,
  reference_data,
  query_cell_type_col,
  ref_cell_type_col,
  cell_types = NULL,
  pc_subset = 1:5,
  assay_name = "logcounts"
)

Arguments

query_data

A SingleCellExperiment object containing a numeric expression matrix for the query cells.

reference_data

A SingleCellExperiment object containing a numeric expression matrix for the reference cells.

query_cell_type_col

The column name in the colData of query_data that identifies the cell types.

ref_cell_type_col

The column name in the colData of reference_data that identifies the cell types.

cell_types

A character vector specifying the cell types to include in the plot. If NULL, all cell types are included.

pc_subset

A numeric vector specifying which principal components to include in the plot. Default is 1:5 (PC1 to PC5).

assay_name

Name of the assay on which to perform computations. Default is "logcounts".

Value

A ggplot object representing the boxplots of specified principal components for the given cell types and datasets.

Details

boxplotPCA visualizes how cell types are distributed along principal components in two datasets. It calls projectPCA internally to project the query dataset onto the principal components obtained from the reference dataset, then reshapes the output into a long format suitable for ggplot2. The result is drawn as boxplots grouped by cell type and dataset (query and reference), so the distribution of each selected principal component can be compared across cell types and between the two datasets.
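
The projection and reshaping steps described above can be illustrated with a minimal base R sketch. This is not the package's implementation: it uses stats::prcomp and predict in place of projectPCA, and the simulated matrices and the names ref_mat, query_mat, ref_types, query_types, and pc_long are hypothetical, chosen only for illustration.

```r
# Illustrative only: simulated matrices stand in for the "logcounts" assays
# (genes in rows, cells in columns). All object names are hypothetical.
set.seed(1)
n_genes <- 50
genes <- paste0("gene", seq_len(n_genes))
ref_mat <- matrix(rnorm(n_genes * 60), nrow = n_genes,
                  dimnames = list(genes, paste0("ref", 1:60)))
query_mat <- matrix(rnorm(n_genes * 40), nrow = n_genes,
                    dimnames = list(genes, paste0("query", 1:40)))
ref_types <- rep(c("CD4", "CD8", "Myeloid"), each = 20)
query_types <- rep(c("CD4", "CD8"), each = 20)

# PCA on the reference only, with cells as observations.
ref_pca <- prcomp(t(ref_mat), center = TRUE, scale. = FALSE)

# Project the query cells using the reference centering and rotation.
query_scores <- predict(ref_pca, newdata = t(query_mat))

# Keep the requested PCs and stack reference and query scores.
pc_subset <- 1:3
scores <- rbind(ref_pca$x[, pc_subset], query_scores[, pc_subset])

# Long format: one row per cell per PC (as.vector unrolls column by column).
pc_long <- data.frame(
  dataset   = rep(rep(c("Reference", "Query"),
                      times = c(ncol(ref_mat), ncol(query_mat))),
                  times = length(pc_subset)),
  cell_type = rep(c(ref_types, query_types), times = length(pc_subset)),
  PC        = rep(colnames(scores), each = nrow(scores)),
  value     = as.vector(scores)
)

# Boxplots grouped by cell type and dataset, one panel per PC
# (only built if ggplot2 is installed).
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(pc_long,
                       ggplot2::aes(x = cell_type, y = value, fill = dataset)) +
    ggplot2::geom_boxplot() +
    ggplot2::facet_wrap(~ PC, scales = "free_y")
}
```

Fitting the PCA on the reference alone and applying its centering and rotation to the query keeps both datasets in the same coordinate system, which is what makes the per-PC boxplots comparable.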

Examples

# Load data
data("reference_data")
data("query_data")

# Plot the PC data
pc_plot <- boxplotPCA(query_data = query_data,
                      reference_data = reference_data,
                      cell_types = c("CD4", "CD8", "B_and_plasma", "Myeloid"),
                      query_cell_type_col = "SingleR_annotation",
                      ref_cell_type_col = "expert_annotation",
                      pc_subset = 1:6)
pc_plot