Citation

gEAR: Gene Expression Analysis Resource portal for community-driven, multi-omic data exploration.
Orvis J, et al. Nat Methods. 2021 Jun 25.
doi: 10.1038/s41592-021-01200-9
PMID: 34172972

A Curated Compendium of Transcriptomic Data for the Exploration of Neocortical Development

Shreyash Sonthalia, Ricky S. Adkins, Joshua Orvis, Guangyan Li, Xoel Mato Blanco, Alex Casella, Jinrui Liu, Genevieve Stein-O’Brien, Brian Caffo, Ronna Hertzano, Anup Mahurkar, Jesse Gillis, Jonathan Werner, Shaojie Ma, Nicola Micali, Nenad Sestan, Pasko Rakic, Gabriel Santpere, Seth A. Ament, Carlo Colantuoni

Vast quantities of multi-omic data have been produced to characterize the development and diversity of cell types in the cerebral cortex of humans and other mammals. To more fully harness the collective discovery potential of these data, we have assembled gene-level transcriptomic data from 188 published studies of neocortical development, including the transcriptomes of >33 million single cells, extensive spatial transcriptomic experiments and RNA sequencing of sorted cells and bulk tissues. Applying joint matrix decomposition to mouse, macaque and human data in this collection, we defined transcriptome dynamics that are conserved across mammalian neurogenesis and that elucidate the evolution of outer, or basal, radial glial cells. Decomposition of adult human neocortical data identified layer-specific signatures in mature neurons and, in combination with transfer learning methods in NeMO Analytics, enabled the charting of their early developmental emergence and protracted maturation across years of postnatal life. Interrogation of data from cerebral organoids demonstrated that while many molecular elements of in vivo development are recapitulated in vitro, specific transcriptomic programs in neuronal maturation are absent. We invite computational biologists and cell biologists without coding expertise to use NeMO Analytics and to fuel it with emerging data.

Joint decomposition of scRNA-seq data in the NeMO Analytics neocortical development data collection

Schematic of NeMO Analytics data resources employed in conjunction with joint decomposition and transfer learning approaches. This is an outline of the specific analyses in this report as well as a description of a general approach we invite others to adopt. Analysis-ready datasets and detailed sample metadata can be downloaded from NeMO Analytics and analyzed offline. Elements learned from joint decomposition can then be uploaded to NeMO Analytics to explore their dynamics across the broad data collection. In this flow, the offline analysis could be any exploratory technique applied to multi-omics data matrices that produces gene signatures in the form of simple lists or quantitative loadings, e.g. PCA, clustering, or differential expression analysis.
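As a rough illustration of the offline step, the sketch below factors two toy expression matrices that share the same genes into one common matrix of gene loadings plus dataset-specific sample weights. It uses plain NumPy multiplicative updates and is an assumption-laden stand-in, not the SJD implementation; the function name joint_nmf, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def joint_nmf(matrices, n_patterns, n_iter=200, seed=0, eps=1e-9):
    """Factor each genes x samples matrix X_k as W @ H_k with one shared W.

    Illustrative multiplicative-update sketch only (hypothetical helper,
    not the SJD implementation).
    """
    rng = np.random.default_rng(seed)
    n_genes = matrices[0].shape[0]
    W = rng.random((n_genes, n_patterns)) + eps          # shared gene loadings
    Hs = [rng.random((n_patterns, X.shape[1])) + eps for X in matrices]
    for _ in range(n_iter):
        # Update each dataset's sample weights with W held fixed.
        Hs = [H * (W.T @ X) / (W.T @ W @ H + eps) for X, H in zip(matrices, Hs)]
        # Update the shared loadings using all datasets at once.
        numer = sum(X @ H.T for X, H in zip(matrices, Hs))
        denom = W @ sum(H @ H.T for H in Hs) + eps
        W = W * numer / denom
    return W, Hs

# Toy data: two "datasets" measured on the same 6 genes (rows).
rng = np.random.default_rng(1)
true_W = rng.random((6, 2))
X_a = true_W @ rng.random((2, 5))   # 5 samples
X_b = true_W @ rng.random((2, 8))   # 8 samples

W, (H_a, H_b) = joint_nmf([X_a, X_b], n_patterns=2)
rel_err = np.linalg.norm(X_a - W @ H_a) / np.linalg.norm(X_a)
```

The columns of W play the role of the quantitative loadings mentioned above: one weight per gene per pattern, which is the kind of object that could then be uploaded for projection.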

Exploring data collections and gene signatures with projection analysis

The central notion behind our approach is to:

  1. Create a comprehensive compendium of transcriptomic datasets focused on neocortical development ("Transcriptomic data collections" below).
  2. Apply diverse matrix decomposition approaches (SJD) to define robust transcriptomic dynamics that are shared across carefully selected datasets ("Gene signatures" below).
  3. Dissect the broad transcriptomic data collections (1) at the molecular level by projecting them into the gene signatures (2) (table of "Projection links" below).
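The projection in step 3 can be sketched as a least-squares fit: given signature loadings (genes x patterns) and a target dataset (genes x samples), estimate how strongly each pattern is used in each sample, using only the genes the two share. This is a minimal sketch under that assumption and may differ from what NeMO Analytics runs internally; the function name project, the loading values, and the expression values are hypothetical.

```python
import numpy as np

def project(expr, expr_genes, loadings, loading_genes):
    """Estimate pattern weights per sample by least squares.

    expr:     genes x samples matrix of the target dataset
    loadings: genes x patterns matrix of signature loadings
    Only genes present in both inputs are used (hypothetical helper).
    """
    expr_index = {g: i for i, g in enumerate(expr_genes)}
    load_index = {g: i for i, g in enumerate(loading_genes)}
    shared = [g for g in loading_genes if g in expr_index]
    X = expr[[expr_index[g] for g in shared], :]
    W = loadings[[load_index[g] for g in shared], :]
    # Solve W @ P ~= X for P (patterns x samples).
    P, *_ = np.linalg.lstsq(W, X, rcond=None)
    return P, shared

# Toy signature with two patterns; loading values are made up.
loading_genes = ["SOX2", "EOMES", "NEUROD2", "MKI67"]
loadings = np.array([[1.0, 0.0],
                     [0.2, 1.0],
                     [0.0, 0.8],
                     [0.9, 0.1]])

# Toy target dataset: same genes in a different order plus one extra gene.
true_P = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 3.0]])
full = loadings @ true_P
expr_genes = ["NEUROD2", "GAPDH", "SOX2", "MKI67", "EOMES"]
expr = np.vstack([full[2], np.full(3, 5.0), full[0], full[3], full[1]])

P, shared = project(expr, expr_genes, loadings, loading_genes)
```

Each row of P traces one pattern across the samples of the target dataset, which is what makes it possible to follow a signature learned in one dataset through many others.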

1. Transcriptomic data collections (profiles) focused on the neocortex in NeMO Analytics

Figure 1: scRNA-seq in mouse, macaque and human mid-gestational neocortex

Figure 2: Bulk, single-cell and spatial transcriptomic data in mouse, macaque and human mid-gestational neocortex

NeocortexDevoHsInVivo: in vivo data from the developing human neocortex

NeocortexDevoHsInVitro: cerebral organoids and neural differentiation in human stem cell models of the neocortex

NeocortexDevoMmInVivo: in vivo data from the developing mouse neocortex

NeocortexDevoNHPInVivo: in vivo data from the developing macaque, marmoset, and chimpanzee neocortex

NeocortexEvoDevo: selected subset of in vivo datasets from mouse, macaque and human developing neocortex

AdultNeoctxLayers: in vivo data from the mature human neocortex and spatial transcriptomics across species

MammalianEmbryo: mammalian embryo development and the emergence of the neural lineage and telencephalon

2. Gene signatures derived from joint matrix decomposition of neocortical data

MammCtxDev.jNMF.p7: Seven patterns defining conserved transcriptomic dynamics across scRNA-seq from mouse, macaque, and human neocortical neurogenesis (Figures 1 & 2, Sonthalia et al., where p5=RG/cycling, p4=IPC, p7=nascent neuron, p2=maturing neuron).

MammCtxDev.jNMF.p40: Forty patterns defining conserved transcriptomic dynamics across scRNA-seq from mouse, macaque, and human neocortical neurogenesis (Figure 3, Sonthalia et al., where p27=oRG).

HsCtxLayer.jNMF.p20: Twenty patterns defining shared transcriptomic signatures in snRNA-seq from adult human neocortical neurons from layer-specific microdissections across 8 neocortical regions and 5 donors (Figures 4 & 5, Sonthalia et al., where p1=subplate/L6b, p9=L6CT, p15=L6IT, p13=L5/6NP, p6=L5IT, p19=L4 primate-specific, p4=L4 conserved, p17=L2/3, p7=L2).

CellCycleGenes: Two groups of genes: those expressed specifically in the S phase or the G2/M phase of the cell cycle (from Seurat).

CtxDiseaseGeneLists: Many neocortical disease risk gene lists (from Mato Blanco et al. 2024).
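Unweighted gene lists such as the two collections above carry no loadings, so one simple way to use them as signatures is to average the expression of the listed genes in each sample. The sketch below assumes exactly that and nothing more; it is not how Seurat or NeMO Analytics score cells, and the function name score_gene_lists, the abbreviated gene sets, and the expression values are hypothetical.

```python
import numpy as np

def score_gene_lists(expr, genes, gene_lists):
    """Return {list_name: per-sample mean expression of the listed genes}.

    Genes missing from the dataset are skipped (hypothetical helper).
    """
    index = {g: i for i, g in enumerate(genes)}
    scores = {}
    for name, members in gene_lists.items():
        rows = [index[g] for g in members if g in index]
        if not rows:
            continue  # no overlap between this list and the dataset
        scores[name] = expr[rows, :].mean(axis=0)
    return scores

# Toy dataset: 5 genes (rows) x 3 samples (columns), made-up values.
genes = ["PCNA", "MCM5", "TOP2A", "CDK1", "GAPDH"]
expr = np.array([
    [5.0, 1.0, 0.0],
    [3.0, 1.0, 0.0],
    [0.0, 4.0, 0.0],
    [0.0, 6.0, 1.0],
    [9.0, 9.0, 9.0],
])

# Tiny illustrative gene sets, not the full lists; MKI67 is absent from the
# toy dataset and is therefore ignored.
gene_lists = {"S": ["PCNA", "MCM5"], "G2M": ["TOP2A", "CDK1", "MKI67"]}
scores = score_gene_lists(expr, genes, gene_lists)
```

Treating a list as a signature with equal weight on every member keeps it comparable with the loading-based patterns, at the cost of ignoring differences in baseline expression between genes.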

3. Projection links to explore gene signatures across the NeMO Analytics data collections