gene Expression Analysis Resource

Citation

gEAR: Gene Expression Analysis Resource portal for community-driven, multi-omic data exploration.
Orvis J, et al. Nat Methods. 2021 Jun 25.
doi: 10.1038/s41592-021-01200-9
PMID: 34172972

NeMO Analytics: A Curated Compendium of Transcriptomic Data for the Exploration of Neocortical Development

Shreyash Sonthalia, Brian Herb, Ricky S. Adkins, Joshua Orvis, Guangyan Li, Xiangyu Liao, Qingjie Yu, Xoel Mato Blanco, Alex Casella, Jinrui Liu, Genevieve Stein-O’Brien, Brian Caffo, Ronna Hertzano, Anup Mahurkar, Jin-Chong Xu, Jesse Gillis, Jonathan Werner, Shaojie Ma, Suel-Kee Kim, Nicola Micali, Nenad Sestan, Pasko Rakic, Gabriel Santpere, Seth A. Ament, Carlo Colantuoni

To enable researchers to more fully harness the collective discovery potential of multi-omic data in the public domain, we have assembled gene-level transcriptomic data from ~200 studies of neocortical development and in vitro models. Applying joint matrix decomposition to mouse, macaque and human data, we define transcriptome dynamics that are conserved across neocortical neurogenesis and identify a program that emerges in ventricular progenitors, is later expressed in neurogenic outer, or basal, radial glia of primates, but is limited to gliogenic precursors in the rodent. Decomposition of adult human neocortical data identified layer-specific signatures in excitatory neurons, enabling the charting of their developmental emergence and protracted maturation, which is in stark contrast to the early peaking expression of layer-defining transcription factors. Interrogation of data from cerebral organoids demonstrated that while broad elements of in vivo development are recapitulated in vitro, many layer-specific transcriptomic programs in neuronal maturation are absent. We invite cell biologists without coding expertise to use NeMO Analytics in their research and to fuel it with their own emerging data.

Joint decomposition of scRNA-seq data in the NeMO Analytics neocortical development data collection

Schematic of NeMO Analytics data resources employed in conjunction with joint decomposition and transfer learning approaches. It outlines the specific analyses in this report and describes a general approach that we invite others to adopt. Analysis-ready datasets and detailed sample metadata can be downloaded from NeMO Analytics and analyzed offline. Elements learned from joint decomposition can then be uploaded to NeMO Analytics to explore their dynamics across the broad data collection. In this flow, the offline analysis could be any exploratory technique applied to multi-omics data matrices that produces gene signatures in the form of simple lists or quantitative loadings, e.g. PCA, clustering, or differential expression analysis.
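As a rough illustration of the offline step, the sketch below derives both forms of gene signature mentioned above (quantitative loadings and a simple gene list) from a small expression table. It is a minimal sketch under stated assumptions, not the analysis used in this report: the function name `mean_difference_signature`, the gene symbols, the group labels, and all values are hypothetical, and a plain difference of group means stands in for a real differential expression method.

```python
from statistics import mean


def mean_difference_signature(expr, groups, group_a, group_b, top_n=2):
    """Contrast group_a against group_b (hypothetical helper, toy method).

    expr   : dict mapping gene symbol -> list of expression values, one per sample
    groups : list of group labels, one per sample, in the same order as the values
    Returns (loadings, gene_list): a dict of per-gene mean differences and the
    top_n genes with the largest positive difference.
    """
    idx_a = [i for i, g in enumerate(groups) if g == group_a]
    idx_b = [i for i, g in enumerate(groups) if g == group_b]
    if not idx_a or not idx_b:
        raise ValueError("both groups need at least one sample")

    # Quantitative loadings: mean expression in group_a minus mean in group_b.
    loadings = {
        gene: mean(values[i] for i in idx_a) - mean(values[i] for i in idx_b)
        for gene, values in expr.items()
    }
    # Simple list: genes ranked by loading, highest first.
    ranked = sorted(loadings, key=loadings.get, reverse=True)
    return loadings, ranked[:top_n]


# Made-up values for illustration only.
expr = {
    "GeneA": [5.0, 6.0, 1.0, 2.0],
    "GeneB": [1.0, 1.0, 4.0, 6.0],
    "GeneC": [3.0, 3.0, 3.0, 3.0],
}
groups = ["progenitor", "progenitor", "neuron", "neuron"]

loadings, gene_list = mean_difference_signature(
    expr, groups, "progenitor", "neuron", top_n=1
)
print(loadings)
print(gene_list)
```

Either output could serve as a signature: `gene_list` as a simple list, `loadings` as quantitative weights per gene.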

Exploring data collections and gene signatures with projection analysis

The central notion behind our approach is to:

  1. Create a comprehensive compendium of transcriptomic datasets focused on neocortical development (see "Transcriptomic data collections" below).
  2. Apply diverse matrix decomposition approaches (SJD) to define robust transcriptomic dynamics that are shared across carefully selected datasets (see "Gene signatures" below).
  3. Dissect the broad transcriptomic data collections (1) at the molecular level by projecting them into the gene signatures (2) (see the table of "Projection links" below).
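The projection in step 3 can be pictured as scoring every sample in a dataset against each signature. The sketch below is a minimal illustration under stated assumptions: it uses a plain weighted sum over the genes shared between the data and the signatures, which may differ from the projection method NeMO Analytics actually applies. The function name `project`, the gene symbols, and all numbers are hypothetical.

```python
def project(expr, signatures):
    """Score samples against gene signatures with a weighted sum (toy method).

    expr       : dict gene -> list of expression values, one per sample
    signatures : dict gene -> list of loadings, one per pattern
    Returns (scores, shared): scores[k][j] is the score of sample j in
    pattern k; shared is the sorted list of genes used.
    """
    # Only genes present in both the dataset and the signatures contribute.
    shared = sorted(set(expr) & set(signatures))
    if not shared:
        raise ValueError("no genes shared between data and signatures")

    n_samples = len(expr[shared[0]])
    n_patterns = len(signatures[shared[0]])
    scores = [[0.0] * n_samples for _ in range(n_patterns)]

    for gene in shared:
        for k, weight in enumerate(signatures[gene]):
            for j, value in enumerate(expr[gene]):
                scores[k][j] += weight * value
    return scores, shared


# Made-up loadings for two patterns and made-up expression for two samples.
signatures = {
    "GeneA": [1.0, 0.0],
    "GeneB": [0.0, 2.0],
    "GeneD": [1.0, 1.0],  # absent from the dataset, so it is ignored
}
expr = {
    "GeneA": [2.0, 0.0],
    "GeneB": [0.0, 3.0],
    "GeneC": [9.0, 9.0],  # absent from the signatures, so it is ignored
}

scores, shared = project(expr, signatures)
print(shared)
print(scores)
```

Matching genes by name before scoring is the design choice that lets signatures learned in one dataset be applied to any other dataset in the collection, provided the two share gene identifiers.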

1. Transcriptomic data collections (profiles) focused on the neocortex in NeMO Analytics

Figure 1: scRNA-seq in mouse, macaque and human mid-gestational neocortex

Figure 2: Bulk, single-cell and spatial transcriptomic data in mouse, macaque and human mid-gestational neocortex

NeocortexDevoHsInVivo: in vivo data from the developing human neocortex

NeocortexDevoHsInVitro: cerebral organoids and neural differentiation in human stem cell models of the neocortex

NeocortexDevoMmInVivo: in vivo data from the developing mouse neocortex

NeocortexDevoNHPInVivo: in vivo data from the developing macaque, marmoset, and chimpanzee neocortex

NeocortexEvoDevo: selected subset of in vivo datasets from mouse, macaque and human developing neocortex

AdultNeoctxLayers: in vivo data from the mature human neocortex and spatial transcriptomics across species

MammalianEmbryo: mammalian embryo development and the emergence of the neural lineage and telencephalon

2. Gene signatures derived from joint matrix decomposition of neocortical data

MammCtxDev.jNMF.p7: Seven patterns defining conserved transcriptomic dynamics across scRNA-seq from mouse, macaque, and human neocortical neurogenesis (Figures 1 & 2, Sonthalia et al, where p5=RG/cycling, p4=IPC, p7=nascent neuron, p2=maturing neuron).

MammCtxDev.jNMF.p40: Forty patterns defining conserved transcriptomic dynamics across scRNA-seq from mouse, macaque, and human neocortical neurogenesis (Figure 3, Sonthalia et al, where p27=oRG).

HsCtxLayer.jNMF.p20: Twenty patterns defining shared transcriptomic signatures in snRNA-seq from adult human neocortical neurons from layer-specific microdissections across 8 neocortical regions and 5 donors (Figures 4 & 5, Sonthalia et al, where p1=subplate/L6b, p9=L6CT, p15=L6IT, p13=L5/6NP, p6=L5IT, p19=L4 primate-specific, p4=L4 conserved, p17=L2/3, p7=L2).

CellCycleGenes: Two groups of genes: those expressed specifically in the S phase or the G2/M phase of the cell cycle (from Seurat).

CtxDiseaseGeneLists: Many neocortical disease risk gene lists (from Mato Blanco et al. 2024).

3. Projection links to explore gene signatures across the NeMO Analytics data collections