DropViz

Exploring the Mouse Brain through Single Cell Expression Profiles

The mammalian brain is composed of a vast (and unknown) number of specialized cell types whose complex interactions underlie behavior. To better appreciate these cellular specializations — and the genes that cells of diverse types use to perform their jobs — we used Drop-seq to analyze 690,000 individual cells from nine different regions of the adult mouse brain. We identified 1.45 billion RNA transcripts among these cells, then used the cells' patterns of RNA expression to classify them into transcriptionally distinct groups of cells (clusters and subclusters). Most of these clusters and subclusters correspond to discrete cell types; some to dynamic cell states; and many, we expect, to aspects of cell biology that remain to be discovered.

We created this web-based data resource so that the data and analyses could benefit other scientists' work, while we work on finalizing analyses and a manuscript. Detailed methods will be available in that manuscript, and the underlying data sets will be available with the publication.

Explore Cell Types

The expression profile of each cell was grouped into clusters and subclusters through a semi-automated process, and the results were projected onto 2-D t-SNE plots to visually approximate the expression relationships among individual cells. Expression levels of query genes can be overlaid on any subset of the data.

Example: The relative expression profile of Sepp1 among sub-types of oligodendrocytes in frontal cortex and hippocampus

Compare and Discover Genes

The relative pairwise gene expression between cell types, within a region or across different regions, can be quickly computed for tens of thousands of genes and ranked to identify distinguishing marker genes. Additional user-selected genes can be added to the comparison set.

Example: The set of genes overexpressed at least five-fold in oligodendrocytes versus polydendrocytes within the entopeduncular nucleus.
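The kind of comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the DropViz implementation: the function name `rank_markers`, the normalization to transcripts per 100,000, the pseudocount, and the toy gene names and counts are all hypothetical.

```python
from typing import Dict, List, Tuple


def rank_markers(
    target: Dict[str, int],
    comparison: Dict[str, int],
    min_fold: float = 5.0,
    pseudocount: float = 1.0,
) -> List[Tuple[str, float]]:
    """Rank genes by fold ratio of normalized expression (target / comparison).

    `target` and `comparison` map gene symbol -> aggregate transcript count.
    The per-100,000 normalization and the pseudocount are illustrative
    assumptions, not the method used by DropViz.
    """
    t_total = sum(target.values())
    c_total = sum(comparison.values())
    ranked = []
    for gene in set(target) | set(comparison):
        # Scale each count by the total for its group so groups of
        # different sizes are comparable.
        t_norm = target.get(gene, 0) / t_total * 1e5
        c_norm = comparison.get(gene, 0) / c_total * 1e5
        # The pseudocount avoids division by zero for absent genes.
        fold = (t_norm + pseudocount) / (c_norm + pseudocount)
        if fold >= min_fold:
            ranked.append((gene, fold))
    # Highest fold ratio first.
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy counts, made up for illustration only.
group_a = {"GeneA": 900, "GeneB": 50, "GeneC": 50}
group_b = {"GeneA": 100, "GeneB": 450, "GeneC": 450}
markers = rank_markers(group_a, group_b)
```

With these toy numbers only `GeneA` clears the five-fold threshold; lowering `min_fold` would admit more genes into the ranked list.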

Search By Gene

Enter one or more genes of interest to immediately see the relative gene expression across 262 cell types.

Example: The twenty cell types with the highest expression of S100a6 across all brain regions.

We hope you find this data resource useful. Please write us with experiences and suggestions. We would love stories about how people are using it in their work.


Drop-seq

Drop-seq is a technology we developed for highly parallel analysis of RNA expression in thousands of individual cells. Drop-seq works by encapsulating individual cells into vast numbers of nanoliter-sized droplets, together with DNA-barcoded beads that uniquely identify the droplets. We described Drop-seq in a paper in 2015, and subsequent commercial technologies have been built on this approach. We have made Drop-seq open-source, and hundreds of labs have adopted it; detailed protocols and software are available on the Drop-seq web site.

Summary

The mammalian brain is composed of a large, unknown number of specialized cell types whose complex interactions drive behavior. Unbiased, large-scale gene expression profiling in individual cells offers an opportunity to systematically characterize cellular specialization within the nervous system. We used Drop-seq, which enables unbiased assessment of gene expression from thousands of individual cells, to generate profiles from 690,000 cells in nine tissues of the adult mouse brain: frontal cortex (156,000), posterior cortex (99,000), striatum (77,000), cerebellum (26,000), hippocampus (113,000), substantia nigra/ventral tegmental area (44,000), globus pallidus externus/nucleus basalis (66,000), thalamus (89,000), and entopeduncular nucleus (19,000). To identify distinct patterns of co-expression, we performed independent component analysis (ICA) on each of the nine individual tissue datasets. ICA identified both biologically derived signals (including those corresponding to canonical cell type distinctions and those with spatially graded patterns) and technical signals (including those that were highly replicate-dependent, defined cell "doublets", or were correlated with tissue processing). We retained 601 non-technical independent components (ICs) for use in graph-based clustering, partitioning cells into 565 distinct groups. These populations encompassed all known major cell classes in the nervous system, including neurons, astrocytes, microglia, oligodendrocytes and polydendrocytes, endothelial cells, and cells of the choroid plexus. Our analysis uncovers numerous undescribed cell types and states across the nervous system.

Citation

Saunders, A.*, Macosko, E.*, Wysoker, A., Goldman, M., Krienen, F., Bien, E., Baum, M., Wang, S., Goeva, A., Nemesh, J., Kamitaki, N., Brumbaugh, S., Kulp, D. and McCarroll, S. A Single-Cell Atlas of Cell Types, States, and Other Transcriptional Patterns from Nine Regions of the Adult Mouse Brain. Submitted.


Funding and Support

This research was performed in the McCarroll and Macosko Labs at Harvard Medical School and the Broad Institute's Stanley Center for Psychiatric Research.


Parameters

Configuration Panels

Query

The configuration panel allows for filtering, comparisons and display adjustments. The Query panel accepts entry of one or more gene symbols. If genes are entered, the plots on the right display expression of those genes. The panel provides auto-complete for a large set of commonly used genes, but other named genes can also be entered.

In addition to gene entry, the Query panel allows filtering of the dataset to focus on a subset of the regions, classes, clusters or subclusters. For example, you can limit your search to only "Hippocampus" or to only "Neurons" or to a specific named cluster or subcluster. Choose the t-SNE display on the right to easily visualize what data matches your filtering criteria.

Clusters

The Clusters panel accepts entry of "target" and "comparison" clusters or subclusters (depending on the current plot display). If the meta-group option is enabled, then more than one cluster or subcluster may be combined as the target or comparison. A meta-group sums the expression levels across the chosen clusters or subclusters. Once a target and a comparison are selected, a table of the genes that are differentially expressed between the two groups is displayed at the bottom right. Multiple parameters are provided to modify the classification criteria. Choose the Scatter panel on the right to visualize how parameter changes affect classification.
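As a rough illustration of the meta-group idea, the sketch below sums per-gene counts across several named clusters. The function name `make_meta_group`, the dictionary layout, and the toy cluster and gene names are assumptions made for the example; they do not reflect how DropViz stores its data.

```python
from collections import Counter
from typing import Dict, Iterable


def make_meta_group(
    profiles: Dict[str, Dict[str, int]], members: Iterable[str]
) -> Dict[str, int]:
    """Sum per-gene counts across the named clusters or subclusters.

    `profiles` maps a (hypothetical) cluster name to its gene -> count table.
    """
    combined: Counter = Counter()
    for name in members:
        # Counter.update adds counts gene by gene.
        combined.update(profiles[name])
    return dict(combined)


# Toy profiles, made up for illustration only.
profiles = {
    "cluster_1": {"GeneA": 10, "GeneB": 0},
    "cluster_2": {"GeneA": 5, "GeneB": 7},
    "cluster_3": {"GeneA": 1, "GeneC": 2},
}
target = make_meta_group(profiles, ["cluster_1", "cluster_2"])
```

The combined profile can then be compared against another cluster or meta-group exactly as a single cluster would be.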

Display

The Display panel provides controls for adjusting some of the plotting parameters. Only parameters that are relevant to the plot currently displayed on the right are available to the user.

Compare Clusters

Compare Subclusters

Differential Expression Criteria

Level Plot Settings

Heat Map Settings

Gene Search Settings

Cell Expression Settings

t-SNE Plot Settings

Scatter Plot Settings

Labels

(Common names are interpretive best guesses)

Cluster Levels

The plot displays the relative expression of the selected gene(s) among clusters. The error bars represent binomially distributed sampling noise given the number of cells in each cluster; they do not represent heterogeneity in expression among cells within the cluster. Only clusters filtered in the Query panel are displayed. The plot is limited to two entered genes. If more than two genes are chosen, then the display switches to a heatmap-like table.
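One way to picture binomially distributed sampling noise is to treat each of n observations in a cluster as an independent draw that belongs to the query gene with probability p, and to ask how precisely p can be estimated from k successes. The sketch below computes a Wilson score interval for p. The choice of the Wilson interval, the function name, and the use of transcript counts as the binomial trials are assumptions for illustration; the text above does not specify how DropViz computes its error bars.

```python
import math
from typing import Tuple


def wilson_interval(k: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion k / n.

    k: observations attributed to the gene (assumed: transcripts).
    n: total observations in the cluster.
    z: normal quantile; 1.96 gives an approximate 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = k / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


# Same underlying proportion, different amounts of data (toy numbers).
small = wilson_interval(5, 10_000)
large = wilson_interval(500, 1_000_000)
```

The interval from the larger sample is much narrower even though both estimate the same proportion, which is the sense in which such error bars reflect sampling depth rather than cell-to-cell heterogeneity.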

Cluster Levels (Heatmap Table)

The table displays the relative expression of the selected gene(s) in all clusters using a shaded dot. Darker dots represent higher gene expression. Only clusters filtered in the Query panel are displayed. Mouse over dots to display numeric values or choose the download icon to retrieve the full table data in R. The table is displayed when more than two genes are chosen.

Help for t-SNE plot of clusters in global region space.

Each point represents a single cell. Each cell is associated with a gene expression vector. This high-dimensional data is reduced using a set of automated independent components and projected onto two dimensions using t-SNE ('global space', i.e. representing all cells from a brain region/tissue). The cluster classifications are derived from Louvain clustering of the ICs.

Points are generally sub-sampled to improve display and speed rendering. Sampling can be controlled using the Display panel on the left. 'Bag' plots summarize the distribution of all points, much as a box plot does in one dimension. The darker region represents 50% of cells. The lighter region represents all points except outliers.

Clusters are highlighted in different colors based on the filtering choices in the Query section in the left panel. Labels and other display features can be customized in the Display panel.

If a row is selected in the differentially expressed genes table below, or one or more genes are entered by name in the Query panel to the left, then the selected genes' expression levels are displayed in black (one row of t-SNE plots per gene), and the mean expression level for that gene in the subcluster is displayed by a color gradient or transparency, depending on settings.

Cluster scatter plot

Each point represents a gene, positioned by its mean log-normalized transcript count among all cells in the target cluster and in the comparison cluster (or region). Large points meet the fold ratio and transcript amount criteria in the Clusters panel. Points are shaded according to their significance. Selected rows in the table of differentially expressed genes are displayed in green or red depending on whether they pass the criteria.

Filtered Table of Clusters

The table displays all matching clusters based on the filtering parameters set in the Query panel on the left. Other plots and display outputs are limited to the clusters listed here.


Differentially expressed genes in clusters

Select a 'target' cluster in the left Clusters panel to display those genes that are over-expressed in that cluster with respect to the remaining cells in the region or a chosen comparison cluster. Adjust filter criteria using the Clusters panel.

Any manually added genes are always displayed in the table, colored green if the expression criteria are met and red otherwise.

One or more rows can be selected to display gene expression in the t-SNE and scatter plots.

Subcluster Levels

The plot displays the relative expression of the selected gene(s) among subclusters. The error bars represent binomially distributed sampling noise given the number of cells in each subcluster; they do not represent heterogeneity in expression among cells within the subcluster. Only subclusters filtered in the Query panel are displayed. The plot is limited to two entered genes. If more than two genes are chosen, then the display switches to a heatmap-like table.

Subcluster Levels (Heatmap Table)

The table displays the relative expression of the selected gene(s) in all subclusters using a shaded dot. Darker dots represent higher gene expression. Only subclusters filtered in the Query panel are displayed. Mouse over dots to display numeric values or choose the download icon to retrieve the full table data in R. The table is displayed when more than two genes are chosen.

Help for t-SNE plot of subclusters in global region space.

This plot is like the t-SNE plot of clusters in global region space, but displays the subcluster labels for each cell.

Help for t-SNE plot of subclusters in local cluster space.

Each point represents a single cell. Each cell is associated with a gene expression vector. This high-dimensional data within a cluster is reduced using a set of curated independent components and projected onto two dimensions using t-SNE ('local cluster space'). The subcluster classifications are derived from Louvain clustering using a subset of the ICs.

Subcluster regions are highlighted in different colors based on the filtering choices in the Query section in the left panel. All points in the corresponding cluster are displayed with points outside of the chosen subcluster(s) shown in gray and all points in the chosen subclusters displayed in color. (There is no subsampling for subcluster displays.)

If a row is selected in the differentially expressed genes table below, or one or more genes are entered by name in the Query panel to the left, then the selected genes' expression levels are displayed in black (one row of t-SNE plots per gene), and the mean expression level for that gene in the subcluster is displayed by a color gradient or transparency, depending on settings.

Labels and other display features can be customized in the Display panel.

Subcluster scatter plot

Each point represents a gene, positioned by its mean log-normalized transcript count among all cells in the target subcluster and in the comparison subcluster (or region). Large points meet the fold ratio and transcript amount criteria in the Clusters panel. Points are shaded according to their significance. Selected rows in the table of differentially expressed genes are displayed in green or red depending on whether they pass the criteria.

Filtered Table of Subclusters

The table displays all matching subclusters based on the filtering parameters set in the Query panel on the left. Other plots and display outputs are limited to the subclusters listed here.


Differentially expressed genes in subclusters

Select a 'target' subcluster in the left Clusters panel to display those genes that are over-expressed in that subcluster with respect to the remaining cells in the region or a chosen comparison subcluster. Adjust filter criteria using the Clusters panel.

Any manually added genes are always displayed in the table, colored green if the expression criteria are met and red otherwise.

One or more rows can be selected to display gene expression in the t-SNE and scatter plots, above.


DropViz Team


  • Arpiar Saunders

    Postdoctoral Fellow

    Psychiatric illness and neural circuits



  • Evan Macosko

    Assistant Professor, Broad Institute

    Lab website



  • Steve McCarroll

    Professor, Harvard

    Lab website



  • Laura Bortolin

    Research Associate

    Laura_Bortolin@hms.harvard.edu

    Single-cell analysis technology



  • Fenna Krienen

    Postdoctoral Fellow

    fenna_krienen@hms.harvard.edu

    Primate brain transcriptomics, development, evolution



  • James Nemesh

    Computational Biologist

    Data Janitor



  • Matthew Baum

    Graduate student, Neurobiology

    Complement inhibitors in psychiatric illness



  • Elizabeth Bien

    Scientific Staff

    Single-cell analysis technology



  • Melissa Goldman

    Scientific Staff

    Single-cell analysis technology



  • Sara Brumbaugh

    App Designer



  • Nolan Kamitaki

    Computational Biologist

    Math, genomics, and biology



  • Alec Wysoker

    Software Engineer

    Single-cell computational analysis



  • David Kulp

    Data Scientist

    DropViz app developer

Data Downloads


Metacells

Gene expression profiles of the 565 transcriptionally distinct cell populations identified across nine regions in the adult mouse brain. Each column is a "metacell." There is one metacell for every subcluster, which contains the aggregate UMI counts for all the single-cells that belong to that subcluster.
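The aggregation that produces a metacell can be sketched as follows: UMI counts from every cell assigned to a subcluster are summed gene by gene. The function name `build_metacells`, the nested-dictionary layout, and the toy cell, gene, and subcluster names are assumptions for the example, not a description of the pipeline used to build the downloadable files.

```python
from collections import defaultdict
from typing import Dict


def build_metacells(
    cell_counts: Dict[str, Dict[str, int]],
    cell_to_subcluster: Dict[str, str],
) -> Dict[str, Dict[str, int]]:
    """Sum per-cell UMI counts into one aggregate profile per subcluster.

    cell_counts: cell barcode -> {gene: UMI count} (hypothetical layout).
    cell_to_subcluster: cell barcode -> subcluster label.
    """
    metacells: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for cell, counts in cell_counts.items():
        subcluster = cell_to_subcluster[cell]
        for gene, umis in counts.items():
            metacells[subcluster][gene] += umis
    # Convert nested defaultdicts to plain dicts for a clean return value.
    return {name: dict(genes) for name, genes in metacells.items()}


# Toy data, made up for illustration only.
cells = {
    "cell1": {"GeneA": 2, "GeneB": 1},
    "cell2": {"GeneA": 3},
    "cell3": {"GeneB": 4},
}
labels = {"cell1": "sub_1", "cell2": "sub_1", "cell3": "sub_2"}
metacells = build_metacells(cells, labels)
```

Each resulting entry corresponds to one column of a gene-by-metacell table.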

CSV: metacells.BrainCellAtlas_Saunders_version_2018.04.01.csv (40M)

R Data: metacells.BrainCellAtlas_Saunders_version_2018.04.01.rds (15M)
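For readers who prefer Python to R, a gene-by-metacell CSV can be read with the standard library alone. The sketch below assumes a layout in which the first column holds gene symbols and each remaining column is one metacell; that layout, the function name `read_metacells`, and the inline sample are assumptions, so check the downloaded file's header before relying on it.

```python
import csv
import io
from typing import Dict, TextIO


def read_metacells(handle: TextIO) -> Dict[str, Dict[str, int]]:
    """Return {metacell: {gene: count}} from a gene-by-metacell CSV.

    Assumed layout: header row of metacell names, first column of gene
    symbols. The first header cell (the gene-column label) is ignored.
    """
    reader = csv.reader(handle)
    header = next(reader)
    names = header[1:]
    metacells: Dict[str, Dict[str, int]] = {name: {} for name in names}
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, values = row[0], row[1:]
        for name, value in zip(names, values):
            # int(float(...)) tolerates values written as "5" or "5.0".
            metacells[name][gene] = int(float(value))
    return metacells


# Inline sample standing in for a real file (toy values).
sample = io.StringIO("gene,sub_1,sub_2\nGeneA,5,0\nGeneB,1,4\n")
table = read_metacells(sample)
```

To read a downloaded file instead, open it with `open(path, newline="")` and pass the handle to `read_metacells`.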

Annotations

Annotation file for the 565 atlas cell populations. Provides the tissue of origin, cell class, formal markers, formal full name and common name (anatomical best guess) for each metacell.

CSV: annotation.BrainCellAtlas_Saunders_version_2018.04.01.csv (54K)

Excel: annotation.BrainCellAtlas_Saunders_version_2018.04.01.xlsx (83K)

R Data: annotation.BrainCellAtlas_Saunders_version_2018.04.01.rds (12K)

Single Cell Suspension Protocol from Acute Adult Brain

Saunders_scBrainSuspensionProtocol_v1_180419.pdf

Video Tutorials


Feedback

We welcome any comments, bug reports, and feature requests. Please send all feedback to mouse.dropviz@gmail.com.