MERFISH Colitis Analysis

Overview

This page demonstrates the application of scDiagnostics to spatially resolved MERFISH data from a DSS-induced colitis time course, comparing imaging-based single-cell measurements from Day 9 (peak inflammation) to the healthy baseline at Day 0. We combine projection into reduced-dimensional space, anomaly detection, neighborhood enrichment analysis, and cell-cell interaction analysis to illustrate how scDiagnostics can facilitate the detection and characterization of a disease-associated cell state that would be missed by standard reference-based annotation tools.

Dataset Context

  • Healthy Reference: 27,140 cells from healthy colon (day 0)
  • DSS9 Query: 29,040 cells from inflamed colon (day 9, peak inflammation)
  • Technology: MERFISH (943-gene targeted panel)
  • Focus: Fibroblast lineage and extracellular matrix (ECM) homeostasis

Reproducibility Across Annotation Methods

Key feature: All results below can be reproduced with any of the annotation methods listed here.

Results shown use the author-provided annotations (tier2_merged), but the same analyses can be run by setting the query_cell_type_col parameter to any of the following annotation columns:

  • "azimuth_celltype_l1_merged" — Azimuth weighted k-NN mapping
  • "singler_annotations_merged" — SingleR correlation-based annotations
  • "celltypist_predicted_labels_merged" — CellTypist machine learning predictions
  • "scvi_prediction_merged" — scArches deep learning transfer

We consistently observe the same patterns across all four methods:

  • High anomaly rates in inflamed fibroblasts (35-40%)
  • ECM gene suppression in anomalous cells
  • Spatial enrichment near inflammatory sites
  • Enhanced immune signaling in anomalous populations

This consistency across independent annotation methods confirms that spatial and transcriptomic patterns reflect genuine inflammatory remodeling, not annotation-method artifacts.

Analysis Components

1. Spatial Cell Type and Fibroblast State Maps (Figure A)

Script: R/merfish/Spatial_Plots.R

Three complementary spatial visualizations of the DSS9 inflamed colon tissue:

A) Cell Type Distribution

  • All 15 cell types mapped to their spatial coordinates
  • Major anatomical organization visible: epithelium (top), lamina propria (middle), muscle (bottom)
  • Inflamed states (reds/purples) preferentially in upper tissue layers
  • Healthy immune cells (orange, yellow) scattered throughout

B) Fibroblast Inflammation State

  • Binary: Light pink (Fibroblast) vs. dark red (Inflamed Fibroblast)
  • Inflamed fibroblasts form contiguous spatial domain, not scattered
  • Preferential positioning beneath epithelium and within damaged regions
  • Clear spatial clustering indicates coordinated inflammatory response

C) ECM Homeostasis Score (Continuous)

  • Gradient: Dark purple (low ECM) to yellow (high ECM)
  • Healthy tissue: predominantly high ECM scores (yellow-orange)
  • Inflamed regions: dramatic ECM suppression (purple)
  • ECM loss strongest precisely where inflamed fibroblasts accumulate
  • Spatial pattern confirms molecular dysregulation co-localizes with inflammatory cells

Key observations:

  • Inflammation forms a spatial domain rather than a random distribution
  • ECM suppression is a localized, tissue-level phenomenon
  • Spatial architecture is preserved despite disease (epithelium, immune infiltrate, and stromal cells remain organized)
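The ECM homeostasis score in panel C is a per-cell summary of ECM gene expression. The gene set and the scoring formula are not stated on this page, so the sketch below assumes one simple definition: the mean of z-scored log-expression over a hypothetical ECM gene set (the gene names are placeholders), computed on simulated data.

```r
set.seed(1)

# Simulated log-expression matrix (genes x cells); values are illustrative only.
ecm_genes <- c("Col1a1", "Col3a1", "Eln", "Fbn1")  # hypothetical gene set
n_cells <- 200
state <- rep(c("Fibroblast", "Inflamed Fibroblast"), each = n_cells / 2)
logcounts_mat <- matrix(
    rnorm(length(ecm_genes) * n_cells, mean = 2, sd = 0.5),
    nrow = length(ecm_genes),
    dimnames = list(ecm_genes, paste0("cell", seq_len(n_cells)))
)

# Lower ECM expression in the simulated inflamed cells.
inflamed <- state == "Inflamed Fibroblast"
logcounts_mat[, inflamed] <- logcounts_mat[, inflamed] - 1

# Z-score each gene across cells, then average over genes for each cell.
z <- t(scale(t(logcounts_mat)))
ecm_score <- colMeans(z)

# Mean score per simulated state.
score_by_state <- tapply(ecm_score, state, mean)
print(round(score_by_state, 2))
```

Z-scoring before averaging keeps a single highly expressed gene from dominating the score; other definitions (for example, module scores against control gene sets) would serve the same purpose.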

2. PCA with ECM Score and Anomaly Mapping (Figure B)

Scripts: R/merfish/PCA_ECM_Plot.R and R/merfish/PCA_Anomaly_Plot.R

Left panels: PCA pairs plot projecting fibroblasts into healthy reference PC space with ECM score overlay:

  • Diagonal (density): Reference cells (blue) tightly clustered; Query cells (purple) show broader, shifted distribution
  • Lower scatterplots: Query and reference cells colored by ECM score
    • Reference cells: predominantly high ECM (yellow)
    • Query cells: predominantly low ECM (purple), especially those distant in PC space
  • Reference cells (hollow circles) occupy PC origin; Query cells (filled circles) scattered outward
  • Strong inverse relationship: greater PC distance = lower ECM score
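The projection described above can be sketched in base R: fit PCA on the reference cells only, project the query cells with the same centering and rotation, and relate each query cell's distance from the reference centroid to its ECM score. This is a minimal illustration on simulated data, not the code in the plotting scripts; the simulation, the helper simulate_cells, and the use of a mean over five "ECM" genes as the score are all assumptions.

```r
set.seed(2)

n_ref <- 150
n_query <- 150
n_genes <- 20
gene_names <- paste0("gene", seq_len(n_genes))
ecm_idx <- 1:5  # hypothetical ECM genes

# Hypothetical helper: cells x genes matrix whose ECM genes follow a
# per-cell activity level (the vector is recycled so row i gets activity[i]).
simulate_cells <- function(activity) {
    n <- length(activity)
    mat <- matrix(rnorm(n * n_genes, sd = 0.5), nrow = n,
                  dimnames = list(NULL, gene_names))
    mat[, ecm_idx] <- mat[, ecm_idx] + activity
    mat
}

ref_mat <- simulate_cells(rnorm(n_ref))               # healthy variation
query_mat <- simulate_cells(-runif(n_query, 0, 6))    # graded ECM loss

# Fit PCA on the reference only, then project the query into that space.
ref_pca <- prcomp(ref_mat, center = TRUE, scale. = FALSE)
query_pcs <- predict(ref_pca, newdata = query_mat)[, 1:3]

# The reference centroid is the origin because the PCA is centered on it.
pc_dist <- sqrt(rowSums(query_pcs^2))

# Simple ECM score for the simulated query cells.
ecm_score <- rowMeans(query_mat[, ecm_idx])

# Rank correlation between PC distance and ECM score.
rho <- cor(pc_dist, ecm_score, method = "spearman")
print(round(rho, 2))
```

A negative rank correlation here corresponds to the inverse relationship in the pairs plot: cells farther from the reference in PC space have lower ECM scores.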

Right panel: Spatial map showing anomaly status overlay:

  • Black dots: Non-anomalous fibroblasts (structurally similar to reference)
  • Red dots: Anomalous fibroblasts (flagged by isolation forest)
  • Anomalies concentrated in epithelial-adjacent region
  • Non-anomalous cells more widely distributed through tissue
  • Spatial clustering suggests anomalies are genuine inflammatory response, not random noise

Performance metrics:

  • Sensitivity (detecting true anomalies): 78%
  • Specificity (not flagging healthy): 96%
  • Overall accuracy: 88%
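Metrics of this kind come from a two-by-two comparison of anomaly flags against a reference label. The sketch below assumes the reference label is whether a fibroblast carries the "Inflamed Fibroblast" annotation, which is not stated explicitly on this page, and uses ten made-up cells to show the arithmetic.

```r
# Hypothetical labels for 10 fibroblasts (illustrative only).
is_inflamed  <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
is_anomalous <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE)

# Confusion-matrix counts.
tp <- sum(is_anomalous & is_inflamed)    # flagged and inflamed
fn <- sum(!is_anomalous & is_inflamed)   # missed inflamed cells
tn <- sum(!is_anomalous & !is_inflamed)  # correctly left unflagged
fp <- sum(is_anomalous & !is_inflamed)   # flagged but not inflamed

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
accuracy <- (tp + tn) / length(is_inflamed)

print(c(sensitivity = sensitivity, specificity = specificity, accuracy = accuracy))
```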

anomaly_output <- detectAnomaly(
    query_data = dss9_data,
    reference_data = healthy_data,
    query_cell_type_col = "tier2_merged",
    ref_cell_type_col = "tier2_merged",
    cell_types = "Fibroblast",
    pc_subset = 1:3,
    n_tree = 500,
    anomaly_threshold = 0.5,
    assay_name = "logcounts",
    max_cells_query = NULL,
    max_cells_ref = NULL
)

# Extract barcodes of query fibroblasts flagged as anomalous
# (query_anomaly is a logical vector named by cell barcode)
query_anomaly <- anomaly_output[["Fibroblast"]][["query_anomaly"]]
anomalous_barcodes <- names(query_anomaly)[which(query_anomaly)]

Parameters:

  • query_data, reference_data: Spatial experiment objects
  • query_cell_type_col, ref_cell_type_col: Column names (e.g., "tier2_merged", "azimuth_celltype_l1_merged")
  • cell_types: Specific cell type to analyze
  • pc_subset: Principal components (1:3 captures ~80% variance)
  • n_tree: Number of isolation forest trees (500 balances accuracy and speed)
  • anomaly_threshold: Score cutoff above which a cell is flagged (0.5 is the conventional neutral point of the isolation-forest score)

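Output: A list with one entry per cell type; its query_anomaly element is a logical vector, named by cell barcode, that flags anomalous cells. Because barcodes are preserved, the flags can be joined back to spatial coordinates for spatial analysis.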
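For orientation on the score cutoff, the standard isolation-forest anomaly score is shown below. This assumes detectAnomaly uses the usual formulation; the page itself does not spell it out. Here h(x) is the path length needed to isolate cell x in a tree, E[h(x)] is its average over all trees, n is the sub-sample size used to build each tree, and H(i) is the i-th harmonic number.

```latex
s(x, n) = 2^{-\frac{E[h(x)]}{c(n)}},
\qquad
c(n) = 2H(n-1) - \frac{2(n-1)}{n},
\qquad
H(i) \approx \ln(i) + 0.5772
```

When E[h(x)] equals the expected path length c(n), the score is exactly 0.5. Cells that are isolated in fewer splits than expected score above 0.5 and approach 1, which is why 0.5 is a natural cutoff.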

Flexibility note: You can change query_cell_type_col to any annotation system in the colData:

  • "tier2_merged" (shown here, ground truth)
  • "azimuth_celltype_l1_merged"
  • "singler_annotations_merged"
  • "celltypist_predicted_labels_merged"
  • "scvi_prediction_merged"

We observe consistent anomaly detection patterns across methods, confirming results reflect genuine inflammatory remodeling.
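One way to run this comparison is to loop over the annotation columns and record the fraction of query fibroblasts flagged under each. The sketch below reuses the detectAnomaly arguments shown earlier on this page and assumes that every annotation column labels the population as "Fibroblast" and that the reference keeps its tier2_merged labels; neither assumption is confirmed here. The call is skipped unless scDiagnostics and both data objects are available in the session.

```r
annotation_cols <- c(
    "tier2_merged",
    "azimuth_celltype_l1_merged",
    "singler_annotations_merged",
    "celltypist_predicted_labels_merged",
    "scvi_prediction_merged"
)

# Run only when the package and both objects exist in the current session.
can_run <- requireNamespace("scDiagnostics", quietly = TRUE) &&
    exists("dss9_data") && exists("healthy_data")

# Fraction of query fibroblasts flagged as anomalous, per annotation column.
anomaly_rates <- setNames(rep(NA_real_, length(annotation_cols)), annotation_cols)

if (can_run) {
    for (col in annotation_cols) {
        out <- scDiagnostics::detectAnomaly(
            query_data = dss9_data,
            reference_data = healthy_data,
            query_cell_type_col = col,
            ref_cell_type_col = "tier2_merged",  # assumption: reference labels unchanged
            cell_types = "Fibroblast",           # assumption: same label in every column
            pc_subset = 1:3,
            n_tree = 500,
            anomaly_threshold = 0.5,
            assay_name = "logcounts"
        )
        anomaly_rates[col] <- mean(out[["Fibroblast"]][["query_anomaly"]], na.rm = TRUE)
    }
}

print(anomaly_rates)
```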

3. Spatial Enrichment Analysis (Figure C)

Script: R/merfish/Spatial_Enrichment_Plot.R

Tests whether anomalous fibroblasts preferentially locate near inflammatory signals by measuring their distance to three target populations. Each of the three panels shows distance-based enrichment for one target:

Left - Neutrophils (Acute Infiltrate):

  • Strong enrichment: Anomalous fibroblasts significantly enriched within 0-25 μm of neutrophils (red bar)
  • Enrichment drops sharply at 25-50 μm
  • Non-anomalous fibroblasts show minimal preference (green bars)
  • Wilcoxon p < 2.2e-16 (highly significant)

Center - Inflamed Epithelium (Damage Signal):

  • Striking enrichment: Anomalous fibroblasts massively enriched immediately adjacent to damaged epithelium (0-25 μm)
  • Non-anomalous cells depleted in this region
  • Difference diminishes beyond 50 μm
  • Wilcoxon p = 3.4e-04

Right - Stem/Crypt Base (Regeneration):

  • Moderate enrichment: Anomalous fibroblasts enriched at 100-150 μm (deeper tissue)
  • Suggests involvement in regenerative signaling, not just inflammation
  • Wilcoxon p = 1.4e-08

Interpretation: Anomalous fibroblasts are not randomly distributed but spatially organized in functional microenvironments. They preferentially locate at damage sites (acute) and regenerative zones (tissue repair).
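A distance-based comparison of this kind can be sketched as follows: compute each fibroblast's distance to its nearest target cell, tabulate the fraction of each group per distance bin, and compare the two distance distributions with a Wilcoxon rank-sum test. The coordinates, bin edges, and group sizes below are simulated and hypothetical; the enrichment statistic used in Spatial_Enrichment_Plot.R may be defined differently.

```r
set.seed(3)

# Simulated coordinates in micrometres (illustrative only).
targets <- cbind(x = runif(40, 0, 500), y = runif(40, 0, 500))  # e.g. neutrophils
n_fib <- 300
is_anomalous <- rep(c(TRUE, FALSE), each = n_fib / 2)

# Anomalous fibroblasts are placed near a randomly chosen target cell;
# non-anomalous fibroblasts are placed uniformly across the field.
near <- targets[sample(nrow(targets), n_fib / 2, replace = TRUE), ] +
    matrix(rnorm(n_fib, sd = 10), ncol = 2)
far <- cbind(runif(n_fib / 2, 0, 500), runif(n_fib / 2, 0, 500))
fib_xy <- rbind(near, far)

# Distance from each fibroblast to its nearest target cell.
nearest_dist <- apply(fib_xy, 1, function(p) {
    min(sqrt((targets[, 1] - p[1])^2 + (targets[, 2] - p[2])^2))
})

# Fraction of each group falling in each distance bin (rows sum to 1).
bins <- cut(nearest_dist, breaks = c(0, 25, 50, 100, 150, Inf),
            include.lowest = TRUE)
bin_fractions <- prop.table(table(is_anomalous, bins), margin = 1)
print(round(bin_fractions, 2))

# Two-sided Wilcoxon rank-sum test on the nearest-target distances.
wt <- wilcox.test(nearest_dist[is_anomalous], nearest_dist[!is_anomalous])
print(wt$p.value)
```

The brute-force nearest-neighbor search is adequate for a few thousand cells; for a full tissue section a k-d tree based search would scale better.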

4. Ligand-Receptor Interaction Networks (Figure D)

Script: R/merfish/Network_Plot.R

Maps functional fibroblast-immune interactions via six curated ligand-receptor pairs, compared across three networks:

Left - All Fibroblasts (Average Signal):

  • Baseline ligand-receptor engagement
  • Moderate connection thickness reflects average expression
  • Shows global fibroblast functionality

Center - Non-Anomalous Fibroblasts:

  • Reduced signal intensity (thinner lines) compared to average
  • Minimal engagement with inflammatory actors (Neutrophils, Inflamed Epithelium)
  • Strong connection to stem/crypt base (regeneration)
  • Represents “resting” fibroblast state

Right - Anomalous Fibroblasts (Key Finding):

  • Dramatically enhanced signaling (thick red lines)
  • IL1B-IL1R1 highly activated (IL1β: pro-inflammatory priming)
  • TGFB1-TGFBR2 strongly engaged (TGFβ: tissue remodeling)
  • CXCL5-CXCR2 upregulated (neutrophil recruitment)
  • Rspo1-Lgr5 maintained (stem cell support during repair)
  • Wnt5a-Fzd5 active (non-canonical Wnt signaling)

Biological interpretation: Anomalous fibroblasts show a functional switch from a resting state to a pro-inflammatory, pro-regenerative phenotype. They actively signal to immune cells, epithelium, and stem cells, consistent with tissue remodeling in chronic inflammation.
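The edge weights in such a network can be illustrated with a common, simple interaction score: mean ligand expression in the sender group multiplied by mean receptor expression in the receiver group. The scoring used in Network_Plot.R is not described on this page, so the function lr_score, the group labels, and the simulated expression values below are assumptions for illustration only.

```r
set.seed(4)

# Simulated expression (genes x cells) for one ligand-receptor pair.
genes <- c("CXCL5", "CXCR2")
groups <- rep(c("Anomalous Fibroblast", "Non-anomalous Fibroblast", "Neutrophil"),
              each = 50)
expr <- matrix(rexp(length(genes) * length(groups), rate = 2),
               nrow = length(genes), dimnames = list(genes, NULL))

# Simulated up-regulation: ligand in anomalous fibroblasts, receptor in neutrophils.
anom <- groups == "Anomalous Fibroblast"
neut <- groups == "Neutrophil"
expr["CXCL5", anom] <- expr["CXCL5", anom] + 2
expr["CXCR2", neut] <- expr["CXCR2", neut] + 2

# Hypothetical score: mean ligand in sender times mean receptor in receiver.
lr_score <- function(expr, groups, ligand, receptor, sender, receiver) {
    mean(expr[ligand, groups == sender]) * mean(expr[receptor, groups == receiver])
}

score_anom <- lr_score(expr, groups, "CXCL5", "CXCR2",
                       "Anomalous Fibroblast", "Neutrophil")
score_non <- lr_score(expr, groups, "CXCL5", "CXCR2",
                      "Non-anomalous Fibroblast", "Neutrophil")

print(round(c(anomalous = score_anom, non_anomalous = score_non), 2))
```

Comparing the same pair across sender groups, as done here, mirrors the three-panel layout: the receiver term is shared, so differences in line thickness reflect the sender's ligand expression.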

Integrated Interpretation

Multi-level evidence for disease-associated fibroblast state:

  1. Spatial organization (Fig A) — Forms coherent domain, not random
  2. Transcriptomic divergence (Fig B) — Clear PCA separation + ECM suppression
  3. Positional logic (Fig C) — Enriched at epithelial damage and immune infiltrate
  4. Functional role (Fig D) — Enhanced inflammatory and regenerative signaling

Why these are likely not misclassified cells:

  • Spatial clustering indicates coordinated response to local signals
  • ECM suppression is an established signal of adaptation to chronic inflammation (reduced structural protein production)
  • Proximity to damaged tissue suggests active participation in wound response
  • Inferred cell-cell interaction network shows increased interaction with inflammatory milieu

Figures Generated

  • spatial_plots.png — Three-panel spatial visualization (Figure A)
  • pca_ecm.png — PCA projection with ECM score overlay (Figure B, left)
  • pca_anomaly.png — Spatial map of anomaly status (Figure B, right)
  • spatial_enrichment_analysis.png — Distance enrichment bar plots (Figure C)
  • network_analysis.png — Ligand-receptor interaction networks (Figure D)