anomaly_output <- detectAnomaly(
  query_data = dss9_data,
  reference_data = healthy_data,
  query_cell_type_col = "tier2_merged",
  ref_cell_type_col = "tier2_merged",
  cell_types = "Fibroblast",
  pc_subset = 1:3,
  n_tree = 500,
  anomaly_threshold = 0.5,
  assay_name = "logcounts",
  max_cells_query = NULL,
  max_cells_ref = NULL
)

# Extract anomalous cell barcodes
anomalous_barcodes <- names(anomaly_output[["Fibroblast"]][["query_anomaly"]][
  anomaly_output[["Fibroblast"]][["query_anomaly"]] == TRUE
])

MERFISH Colitis Analysis
Overview
This page demonstrates the application of scDiagnostics to spatially resolved MERFISH data from a DSS-induced colitis time course, comparing imaging-based single-cell measurements from Day 9 (peak inflammation) to the healthy baseline at Day 0. We combine projection into a reduced-dimensional reference space, anomaly detection, neighborhood enrichment analysis, and cell-cell interaction analysis to illustrate how scDiagnostics can facilitate the detection and characterization of a disease-associated cell state that would be missed by standard reference-based annotation tools.
Dataset Context
- Healthy Reference: 27,140 cells from healthy colon (day 0)
- DSS9 Query: 29,040 cells from inflamed colon (day 9, peak inflammation)
- Technology: MERFISH (943-gene targeted panel)
- Focus: Fibroblast lineage and extracellular matrix (ECM) homeostasis
Reproducibility Across Annotation Methods
Key feature: All results below are reproducible with any annotation method.
Results shown use the author-provided annotations (tier2_merged), but the same analyses can be run by setting the query_cell_type_col parameter to any of the following annotation columns:
- "azimuth_celltype_l1_merged": Azimuth weighted k-NN mapping
- "singler_annotations_merged": SingleR correlation-based annotations
- "celltypist_predicted_labels_merged": CellTypist machine learning predictions
- "scvi_prediction_merged": scArches deep learning transfer
We consistently observe the same patterns across all four methods:
- High anomaly rates in inflamed fibroblasts (35-40%)
- ECM gene suppression in anomalous cells
- Spatial enrichment near inflammatory sites
- Enhanced immune signaling in anomalous populations
This consistency across independent annotation methods confirms that spatial and transcriptomic patterns reflect genuine inflammatory remodeling, not annotation-method artifacts.
Analysis Components
1. Spatial Cell Type and Fibroblast State Maps (Figure A)
Script: R/merfish/Spatial_Plots.R
Three complementary spatial visualizations of the DSS9 inflamed colon tissue:
A) Cell Type Distribution
- All 15 cell types mapped to their spatial coordinates
- Major anatomical organization visible: epithelium (top), lamina propria (middle), muscle (bottom)
- Inflamed states (reds/purples) preferentially in upper tissue layers
- Healthy immune cells (orange, yellow) scattered throughout
B) Fibroblast Inflammation State
- Binary: Light pink (Fibroblast) vs. dark red (Inflamed Fibroblast)
- Inflamed fibroblasts form contiguous spatial domain, not scattered
- Preferential positioning beneath epithelium and within damaged regions
- Clear spatial clustering indicates coordinated inflammatory response
C) ECM Homeostasis Score (Continuous)
- Gradient: Dark purple (low ECM) to yellow (high ECM)
- Healthy tissue: predominantly high ECM scores (yellow-orange)
- Inflamed regions: dramatic ECM suppression (purple)
- ECM loss strongest precisely where inflamed fibroblasts accumulate
- Spatial pattern confirms molecular dysregulation co-localizes with inflammatory cells
Key observations:
- Inflammation forms a spatial domain, not a random distribution
- ECM suppression is a localized, tissue-level phenomenon
- Spatial architecture is preserved despite disease (epithelium, immune infiltrate, and stromal cells are still organized)
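The page does not state how the ECM homeostasis score is computed. As a minimal sketch, one common approach is to z-score each gene across cells and average over a gene set. Everything below is an assumption made for illustration: the gene set, the simulated expression matrix, and the helper name `ecm_score` are hypothetical and are not taken from R/merfish/Spatial_Plots.R.

```r
set.seed(7)

# Illustrative ECM gene set (assumption; the actual panel genes may differ).
ecm_genes <- c("Col1a1", "Col1a2", "Col3a1", "Fn1")
genes <- c(ecm_genes, "Actb", "Gapdh")

# Simulated log-expression matrix: genes in rows, cells in columns.
logcounts <- matrix(
  rnorm(length(genes) * 50, mean = 2),
  nrow = length(genes),
  dimnames = list(genes, paste0("cell", 1:50))
)

# Lower the ECM genes in the last 20 cells to mimic a low-ECM state.
logcounts[ecm_genes, 31:50] <- logcounts[ecm_genes, 31:50] - 2

# Hypothetical helper: mean z-score over the gene set, one value per cell.
ecm_score <- function(expr, gene_set) {
  present <- intersect(gene_set, rownames(expr))
  # scale() works column-wise, so transpose to z-score each gene across cells.
  z <- t(scale(t(expr[present, , drop = FALSE])))
  colMeans(z)
}

score <- ecm_score(logcounts, ecm_genes)
```

A continuous score of this kind can then be mapped to a color gradient at each cell's spatial coordinates, as in panel C.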
2. PCA with ECM Score and Anomaly Mapping (Figure B)
Scripts: R/merfish/PCA_ECM_Plot.R and R/merfish/PCA_Anomaly_Plot.R
Left panels: PCA pairs plot projecting fibroblasts into healthy reference PC space with ECM score overlay:
- Diagonal (density): Reference cells (blue) tightly clustered; Query cells (purple) show broader, shifted distribution
- Lower scatterplots: Query and reference cells colored by ECM score
- Reference cells: predominantly high ECM (yellow)
- Query cells: predominantly low ECM (purple), especially those distant in PC space
- Reference cells (hollow circles) occupy PC origin; Query cells (filled circles) scattered outward
- Strong inverse relationship: greater PC distance = lower ECM score
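The projection described above can be sketched in base R. This is a minimal illustration on simulated data, not the code in R/merfish/PCA_ECM_Plot.R: the matrices `ref_expr` and `query_expr` and the size of the shift are assumptions. PCA is fitted on the reference cells only, and query cells are projected with the reference centering and loadings via `predict()`.

```r
set.seed(1)
n_genes <- 20

# Simulated expression: cells in rows, genes in columns (not the MERFISH panel).
ref_expr <- matrix(
  rnorm(200 * n_genes), nrow = 200,
  dimnames = list(NULL, paste0("gene", seq_len(n_genes)))
)
query_expr <- matrix(
  rnorm(100 * n_genes), nrow = 100,
  dimnames = list(NULL, colnames(ref_expr))
)

# Shift five genes in the query to mimic a diverged cell state.
query_expr[, 1:5] <- query_expr[, 1:5] + 2

# Fit PCA on the reference only, so the axes describe healthy variation.
ref_pca <- prcomp(ref_expr, center = TRUE, scale. = FALSE)

# Project query cells using the reference centering and rotation.
query_pcs <- predict(ref_pca, newdata = query_expr)[, 1:3]
ref_pcs <- ref_pca$x[, 1:3]

# prcomp() centers the reference, so its centroid is the PC origin;
# the Euclidean norm in PC1-PC3 is each cell's distance from that centroid.
ref_dist <- sqrt(rowSums(ref_pcs^2))
query_dist <- sqrt(rowSums(query_pcs^2))
```

Comparing `query_dist` against a per-cell score (for example with `cor()`) is one way to quantify the inverse relationship between PC distance and ECM score.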
Right panel: Spatial map showing anomaly status overlay:
- Black dots: Non-anomalous fibroblasts (structurally similar to reference)
- Red dots: Anomalous fibroblasts (flagged by isolation forest)
- Anomalies concentrated in epithelial-adjacent region
- Non-anomalous cells more widely distributed through tissue
- Spatial clustering suggests anomalies are genuine inflammatory response, not random noise
Performance metrics:
- Sensitivity (detecting true anomalies): 78%
- Specificity (not flagging healthy): 96%
- Overall accuracy: 88%
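For reference, metrics of this kind can be derived from a confusion table of anomaly flags against a reference labeling. The sketch below assumes the reference is a binary label such as the author-provided Inflamed Fibroblast annotation; the page does not state which labels were used, and the ten toy values and the helper `classification_metrics` are hypothetical.

```r
# Toy labels for ten fibroblasts (TRUE = inflamed in the reference labeling).
truth <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
# Toy anomaly flags for the same cells (TRUE = flagged as anomalous).
flagged <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)

# Hypothetical helper computing the three reported metrics.
classification_metrics <- function(truth, flagged) {
  tp <- sum(truth & flagged)    # inflamed and flagged
  tn <- sum(!truth & !flagged)  # healthy and not flagged
  fp <- sum(!truth & flagged)   # healthy but flagged
  fn <- sum(truth & !flagged)   # inflamed but missed
  c(
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    accuracy = (tp + tn) / length(truth)
  )
}

metrics <- classification_metrics(truth, flagged)
```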
Parameters:
- query_data, reference_data: Spatial experiment objects
- query_cell_type_col, ref_cell_type_col: Column names (e.g., "tier2_merged", "azimuth_celltype_l1_merged")
- cell_types: Specific cell type to analyze
- pc_subset: Principal components to use (1:3 captures ~80% of variance)
- n_tree: Number of isolation forest trees (500 balances accuracy and speed)
- anomaly_threshold: Anomaly score cutoff (0.5 here; cells scoring above it are flagged as anomalous)
Output: Boolean vector of anomalous cells; spatial coordinates preserved for spatial analysis
Flexibility note: You can change query_cell_type_col to any annotation system in the colData:
- "tier2_merged" (shown here, ground truth)
- "azimuth_celltype_l1_merged"
- "singler_annotations_merged"
- "celltypist_predicted_labels_merged"
- "scvi_prediction_merged"
We observe consistent anomaly detection patterns across methods, confirming results reflect genuine inflammatory remodeling.
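To compare anomaly rates across annotation columns, the results of one run per column can be summarized side by side. The sketch below does not call the package: `make_toy_output` is a hypothetical stand-in that only mimics the nested list shape used in the extraction code on this page (cell type, then "query_anomaly", then a named logical vector), and the 38% rate is simulated. In a real run, each list element would presumably come from a separate detectAnomaly call with query_cell_type_col set to that column.

```r
set.seed(11)

annotation_cols <- c(
  "tier2_merged",
  "azimuth_celltype_l1_merged",
  "singler_annotations_merged",
  "celltypist_predicted_labels_merged",
  "scvi_prediction_merged"
)

# Hypothetical stand-in for one result object (simulated flags only).
make_toy_output <- function(n_cells, rate) {
  flags <- runif(n_cells) < rate
  names(flags) <- paste0("cell_", seq_len(n_cells))
  list(Fibroblast = list(query_anomaly = flags))
}

# One simulated result per annotation column.
outputs <- lapply(
  setNames(annotation_cols, annotation_cols),
  function(col) make_toy_output(500, 0.38)
)

# Fraction of cells of one type flagged as anomalous.
anomaly_rate <- function(output, cell_type = "Fibroblast") {
  mean(output[[cell_type]][["query_anomaly"]])
}

rates <- vapply(outputs, anomaly_rate, numeric(1))

# Barcodes of flagged cells for one annotation column.
barcodes <- names(which(outputs[["tier2_merged"]][["Fibroblast"]][["query_anomaly"]]))
```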
3. Spatial Enrichment Analysis (Figure C)
Script: R/merfish/Spatial_Enrichment_Plot.R
Tests whether anomalous fibroblasts preferentially locate near inflammatory signals by measuring their distance to three target populations, shown as three panels of distance-based enrichment:
Left - Neutrophils (Acute Infiltrate):
- Strong enrichment: Anomalous fibroblasts significantly enriched within 0-25 μm of neutrophils (red bar)
- Enrichment drops sharply at 25-50 μm
- Non-anomalous fibroblasts show minimal preference (green bars)
- Wilcoxon p < 2.2e-16 (highly significant)
Center - Inflamed Epithelium (Damage Signal):
- Striking enrichment: Anomalous fibroblasts massively enriched immediately adjacent to damaged epithelium (0-25 μm)
- Non-anomalous cells depleted in this region
- Difference diminishes beyond 50 μm
- Wilcoxon p = 3.4e-04
Right - Stem/Crypt Base (Regeneration):
- Moderate enrichment: Anomalous fibroblasts enriched at 100-150 μm (deeper tissue)
- Suggests involvement in regenerative signaling, not just inflammation
- Wilcoxon p = 1.4e-08
Interpretation: Anomalous fibroblasts are not randomly distributed but spatially organized in functional microenvironments. They preferentially locate at damage sites (acute) and regenerative zones (tissue repair).
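A distance-based enrichment test of this kind can be sketched in base R. The example assumes the analysis uses each fibroblast's distance to its nearest target cell, bins those distances, and compares the two groups with a Wilcoxon rank-sum test; the page does not spell out these details. The coordinates are simulated, and the helper `nearest_distance` and the bin edges are illustrative choices, not taken from R/merfish/Spatial_Enrichment_Plot.R.

```r
set.seed(42)

# Simulated coordinates in micrometres (not the MERFISH data).
targets <- cbind(x = runif(30, 0, 500), y = runif(30, 0, 500))
fibro <- cbind(x = runif(200, 0, 500), y = runif(200, 0, 500))
is_anomalous <- rep(c(TRUE, FALSE), each = 100)

# Place the simulated anomalous cells within a few micrometres of random targets.
near <- sample(nrow(targets), 100, replace = TRUE)
fibro[is_anomalous, ] <- targets[near, ] + matrix(rnorm(200, sd = 5), ncol = 2)

# Hypothetical helper: distance from each cell to its nearest target.
nearest_distance <- function(cells, targets) {
  dx <- outer(cells[, 1], targets[, 1], "-")
  dy <- outer(cells[, 2], targets[, 2], "-")
  sqrt(apply(dx^2 + dy^2, 1, min))
}
d <- nearest_distance(fibro, targets)

# Bin distances and compute the fraction of each group per bin.
breaks <- c(0, 25, 50, 100, 150, Inf)
bins <- cut(d, breaks = breaks, include.lowest = TRUE, right = FALSE)
group <- ifelse(is_anomalous, "anomalous", "non_anomalous")
bin_fraction <- prop.table(table(group = group, bin = bins), margin = 1)

# Rank-sum test on nearest-target distances between the two groups.
test <- wilcox.test(d[is_anomalous], d[!is_anomalous])
```

For tissue sections with many thousands of cells, the dense cell-by-target distance matrix built by `outer()` can become large; a nearest-neighbor search structure would be a reasonable substitute.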
4. Ligand-Receptor Interaction Networks (Figure D)
Script: R/merfish/Network_Plot.R
Maps functional fibroblast-immune interactions via six curated ligand-receptor pairs, compared across three networks:
Left - All Fibroblasts (Average Signal):
- Baseline ligand-receptor engagement
- Moderate connection thickness reflects average expression
- Shows global fibroblast functionality
Center - Non-Anomalous Fibroblasts:
- Reduced signal intensity (thinner lines) compared to average
- Minimal engagement with inflammatory actors (Neutrophils, Inflamed Epithelium)
- Strong connection to stem/crypt base (regeneration)
- Represents “resting” fibroblast state
Right - Anomalous Fibroblasts (Key Finding):
- Dramatically enhanced signaling (thick red lines)
- IL1B-IL1R1 highly activated (IL1β: pro-inflammatory priming)
- TGFB1-TGFBR2 strongly engaged (TGFβ: tissue remodeling)
- CXCL5-CXCR2 upregulated (neutrophil recruitment)
- Rspo1-Lgr5 maintained (stem cell support during repair)
- Wnt5a-Fzd5 active (non-canonical Wnt signaling)
Biological interpretation: Anomalous fibroblasts show a functional switch from a resting state to a pro-inflammatory, pro-regenerative phenotype. They actively signal to immune cells, epithelium, and stem cells, consistent with tissue remodeling in chronic inflammation.
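The page does not describe how edge weights in these networks are computed. One simple scoring scheme, shown here purely as an assumption, is the product of mean ligand expression in the sender population and mean receptor expression in the receiver population. The expression values are simulated, the sender and receiver roles are chosen for illustration, and the helper `lr_score` is hypothetical; R/merfish/Network_Plot.R may use a different method.

```r
set.seed(3)

# Simulated cell groups and expression for one ligand-receptor pair.
groups <- rep(
  c("anomalous_fibroblast", "non_anomalous_fibroblast", "neutrophil"),
  each = 40
)
expr <- matrix(
  rexp(2 * length(groups), rate = 2), nrow = 2,
  dimnames = list(c("Cxcl5", "Cxcr2"), NULL)
)

# Raise the ligand in anomalous fibroblasts and the receptor in neutrophils.
anom <- groups == "anomalous_fibroblast"
neut <- groups == "neutrophil"
expr["Cxcl5", anom] <- expr["Cxcl5", anom] + 2
expr["Cxcr2", neut] <- expr["Cxcr2", neut] + 1.5

# Hypothetical helper: product of mean ligand (sender) and mean receptor (receiver).
lr_score <- function(expr, groups, ligand, receptor, sender, receiver) {
  mean(expr[ligand, groups == sender]) * mean(expr[receptor, groups == receiver])
}

score_anomalous <- lr_score(
  expr, groups, "Cxcl5", "Cxcr2",
  sender = "anomalous_fibroblast", receiver = "neutrophil"
)
score_non_anomalous <- lr_score(
  expr, groups, "Cxcl5", "Cxcr2",
  sender = "non_anomalous_fibroblast", receiver = "neutrophil"
)
```

Scores computed this way for each pair and each fibroblast subset could be mapped to line thickness when drawing the three networks.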
Integrated Interpretation
Multi-level evidence for disease-associated fibroblast state:
- Spatial organization (Fig A) — Forms coherent domain, not random
- Transcriptomic divergence (Fig B) — Clear PCA separation + ECM suppression
- Positional logic (Fig C) — Enriched at epithelial damage and immune infiltrate
- Functional role (Fig D) — Enhanced inflammatory and regenerative signaling
Why these are likely not misclassified cells:
- Spatial clustering indicates coordinated response to local signals
- ECM suppression is an established signal of adaptation to chronic inflammation (reduced structural protein production)
- Proximity to damaged tissue suggests active participation in wound response
- Inferred cell-cell interaction network shows increased interaction with inflammatory milieu
Figures Generated
- spatial_plots.png: Three-panel spatial visualization (Figure A)
- pca_ecm.png: PCA projection with ECM score overlay (Figure B, left)
- pca_anomaly.png: Spatial map of anomaly status (Figure B, right)
- spatial_enrichment_analysis.png: Distance enrichment bar plots (Figure C)
- network_analysis.png: Ligand-receptor interaction networks (Figure D)