single_cell_data

`cell_metadata.csv.gz`

CSV file containing the metadata associated with each cell, including cell_id, cell area, transcript counts and protein content if available. This can be used to build single-cell analysis objects (Seurat/Scanpy) and perform downstream analyses. For more information, see data import.

name	type	description
`label`	str	Unique cell-id derived from segmentation mask
`cell_x/y`	float	x/y coordinate of the cell centroid
`nuclei_area`	int	Area of the nuclear segmentation
`nuclei_expanded_area`	int	Area of the expanded nuclear segmentation
`total_counts`	int	Total transcript counts
`log1p_total_counts`	float	Log number of total transcripts
`n_genes_by_counts`	int	Number of unique genes detected
`log1p_n_genes_by_counts`	float	Log number of unique genes detected
`*stain_intensity_mean`	float	Mean intensity for nuclear and cytoplasmic stains
`<protein>_intensity_mean`	float	Mean intensity for a given protein

`cell_by_transcript.csv.gz`

CSV file containing the cell x transcript matrix. Each entry in the table is the counts for a given transcript in a given cell.

`clustering_umap.csv.gz`

CSV file containing pre-computed cluster annotations and UMAP coordinates for each cell.

`dgex.csv.gz`

CSV file containing differential gene expression (DGEx) results for the pre-computed clusters from the clustering_umap table.

column	type	description
`names`	str	Gene symbol
`scores`	float	Z-score from Wilcoxon rank-sum test
`logfoldchanges`	float	LogFoldChange for the given cluster compared to all other clusters combined
`pvals`	float	P-value from Wilcoxon rank-sum test
`pvals_adj`	float	Adjusted P-value
`pct_nz_group`	float	Percentage of non-zero values in the given cluster.
`pct_nz_reference`	float	Percentage of non-zero values in all cells outside the given cluster.
`group`	int	Leiden cluster identity
`leiden_res`	str	Leiden clustering resolution that this entry is derived from (1 per gene per cluster)

`feature_matrix.h5`

H5 file containing the cell by transcript matrix with cell and transcript metadata. This file can be loaded into Python and used in scanpy or a number of other pipelines. See data import. The h5.obs table is equivalent to cell_metadata

`cell_by_protein.csv.gz`

CSV file containing cell x protein intensity matrix. Each entry in the table is the average protein intensity for a given protein in a cell.

`rna_protein_singlecell_correlation.csv`

CSV file containing a square correlation matrix between selected transcript counts and protein intensity means, computed per cell. Useful for quickly checking concordance between RNA and protein markers (e.g. CD4 counts vs CD4_intensity_mean).

`protein_singlecell_correlation.csv`

CSV file containing a protein-by-protein correlation matrix of per-cell intensity means, allowing assessment of protein marker co-expression.