This repository was archived by the owner on Oct 14, 2025. It is now read-only.

Commit 9f3c57e

update README with metadata description
1 parent 0d485ac commit 9f3c57e

2 files changed: +105 -0 lines changed

README.Rmd

Lines changed: 37 additions & 0 deletions
@@ -3,6 +3,10 @@ title: "CuratedAtlasQueryR"
 output: github_document
 ---
 
+`CuratedAtlasQuery` is a query interface that allows the programmatic exploration and retrieval of the harmonised, curated and reannotated CELLxGENE single-cell human cell atlas. Data can be retrieved at cell, sample, or dataset levels based on filtering criteria.
+
+# Query interface
+
 ```{r, include = FALSE}
 # Note: knit this to the repo readme file using:
 # rmarkdown::render("vignettes/readme.Rmd", output_format = "github_document", output_dir = getwd())
@@ -167,3 +171,36 @@ get_metadata() |>
 knitr::include_graphics("inst/NCAM1_figure.png")
 ```
 
+# Cell metadata
+
+Dataset-specific columns (definitions available at cellxgene.cziscience.com)
+
+`cell_count`, `collection_id`, `created_at.x`, `created_at.y`, `dataset_deployments`, `dataset_id`, `file_id`, `filename`, `filetype`, `is_primary_data.y`, `is_valid`, `linked_genesets`, `mean_genes_per_cell`, `name`, `published`, `published_at`, `revised_at`, `revision`, `s3_uri`, `schema_version`, `tombstone`, `updated_at.x`, `updated_at.y`, `user_submitted`, `x_normalization`
+
+Sample-specific columns (definitions available at cellxgene.cziscience.com)
+
+`.sample`, `.sample_name`, `age_days`, `assay`, `assay_ontology_term_id`, `development_stage`, `development_stage_ontology_term_id`, `ethnicity`, `ethnicity_ontology_term_id`, `experiment___`, `organism`, `organism_ontology_term_id`, `sample_placeholder`, `sex`, `sex_ontology_term_id`, `tissue`, `tissue_harmonised`, `tissue_ontology_term_id`, `disease`, `disease_ontology_term_id`, `is_primary_data.x`
+
+Cell-specific columns (definitions available at cellxgene.cziscience.com)
+
+`.cell`, `cell_type`, `cell_type_ontology_term_idm`, `cell_type_harmonised`, `confidence_class`, `cell_annotation_azimuth_l2`, `cell_annotation_blueprint_singler`
+
+Through harmonisation and curation we introduced custom columns not present in the original CELLxGENE metadata:
+
+- `tissue_harmonised`: a coarser tissue name for better filtering
+- `age_days`: the number of days corresponding to the age
+- `cell_type_harmonised`: the consensus cell identity call (for immune cells), derived from the original annotation and three novel annotations produced with Seurat Azimuth and SingleR
+- `confidence_class`: an ordinal class of how confident `cell_type_harmonised` is. 1 is complete consensus, 2 is three out of four, and so on.
+- `cell_annotation_azimuth_l2`: Azimuth cell annotation
+- `cell_annotation_blueprint_singler`: SingleR cell annotation using Blueprint reference
+- `cell_annotation_blueprint_monaco`: SingleR cell annotation using Monaco reference
+- `sample_id_db`: Sample subdivision for internal use
+- `file_id_db`: File subdivision for internal use
+- `.sample`: Sample ID
+- `.sample_name`: How samples were defined
+
+# RNA abundance
+
+The `raw` assay includes RNA abundance on the positive real scale (not transformed with non-linear functions, e.g. log or sqrt). The original CELLxGENE data include a mix of scales and transformations, specified in the `x_normalization` column.
+
+The `cpm` assay includes counts per million.
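
Note on the columns and assays described in this diff: the following is a minimal base-R sketch of how the custom metadata columns and the counts-per-million scale might be used together. It does not call the package. The `metadata` data frame and the `raw` matrix are hypothetical toy stand-ins, and the values in them are invented for illustration. The CPM step uses the standard definition (each cell scaled to sum to one million); that the package's `cpm` assay is computed exactly this way is an assumption.

```r
# Toy stand-in for the cell-level metadata table (hypothetical values;
# only the column names come from the README text in this diff).
metadata <- data.frame(
  .cell = c("cell_1", "cell_2", "cell_3", "cell_4"),
  tissue_harmonised = c("blood", "blood", "lung", "blood"),
  cell_type_harmonised = c("cd4 t", "b naive", "cd4 t", "nk"),
  confidence_class = c(1, 3, 1, 2),
  stringsAsFactors = FALSE
)

# Keep blood cells whose harmonised label is backed by at least three of
# the four annotations (confidence_class 1 or 2, per the README wording).
keep <- metadata$tissue_harmonised == "blood" & metadata$confidence_class <= 2
selected <- metadata[keep, ]

# Toy raw counts: genes in rows, cells in columns (hypothetical values).
# matrix() fills column by column, so each row of numbers below is one cell.
raw <- matrix(
  c(10,  0, 90,
     5,  5,  0,
     1,  1,  2,
    20, 60, 20),
  nrow = 3,
  dimnames = list(c("gene_a", "gene_b", "gene_c"), metadata$.cell)
)

# Restrict the counts to the selected cells.
raw_selected <- raw[, selected$.cell, drop = FALSE]

# Counts per million: divide each column by its total, then scale to 1e6.
cpm <- sweep(raw_selected, 2, colSums(raw_selected), "/") * 1e6

print(selected[, c(".cell", "cell_type_harmonised", "confidence_class")])
print(cpm)
```

Filtering on `confidence_class` before normalising keeps the example small; on the real atlas the same idea would apply to the table returned by `get_metadata()`, though the exact verbs depend on how that table is stored.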

README.md

Lines changed: 68 additions & 0 deletions
@@ -1,6 +1,13 @@
 CuratedAtlasQueryR
 ================
 
+`CuratedAtlasQuery` is a query interface that allows the programmatic
+exploration and retrieval of the harmonised, curated and reannotated
+CELLxGENE single-cell human cell atlas. Data can be retrieved at cell,
+sample, or dataset levels based on filtering criteria.
+
+# Query interface
+
 <img src="inst/logo.png" width="120px" height="139px" />
 
 ## Load the package
@@ -233,3 +240,64 @@ get_metadata() |>
 ```
 
 <img src="inst/NCAM1_figure.png" width="629" />
+
+# Cell metadata
+
+Dataset-specific columns (definitions available at
+cellxgene.cziscience.com)
+
+`cell_count`, `collection_id`, `created_at.x`, `created_at.y`,
+`dataset_deployments`, `dataset_id`, `file_id`, `filename`, `filetype`,
+`is_primary_data.y`, `is_valid`, `linked_genesets`,
+`mean_genes_per_cell`, `name`, `published`, `published_at`,
+`revised_at`, `revision`, `s3_uri`, `schema_version`, `tombstone`,
+`updated_at.x`, `updated_at.y`, `user_submitted`, `x_normalization`
+
+Sample-specific columns (definitions available at
+cellxgene.cziscience.com)
+
+`.sample`, `.sample_name`, `age_days`, `assay`,
+`assay_ontology_term_id`, `development_stage`,
+`development_stage_ontology_term_id`, `ethnicity`,
+`ethnicity_ontology_term_id`, `experiment___`, `organism`,
+`organism_ontology_term_id`, `sample_placeholder`, `sex`,
+`sex_ontology_term_id`, `tissue`, `tissue_harmonised`,
+`tissue_ontology_term_id`, `disease`, `disease_ontology_term_id`,
+`is_primary_data.x`
+
+Cell-specific columns (definitions available at
+cellxgene.cziscience.com)
+
+`.cell`, `cell_type`, `cell_type_ontology_term_idm`,
+`cell_type_harmonised`, `confidence_class`,
+`cell_annotation_azimuth_l2`, `cell_annotation_blueprint_singler`
+
+Through harmonisation and curation we introduced custom columns not
+present in the original CELLxGENE metadata:
+
+- `tissue_harmonised`: a coarser tissue name for better filtering
+- `age_days`: the number of days corresponding to the age
+- `cell_type_harmonised`: the consensus cell identity call (for immune
+  cells), derived from the original annotation and three novel
+  annotations produced with Seurat Azimuth and SingleR
+- `confidence_class`: an ordinal class of how confident
+  `cell_type_harmonised` is. 1 is complete consensus, 2 is three out of
+  four, and so on.
+- `cell_annotation_azimuth_l2`: Azimuth cell annotation
+- `cell_annotation_blueprint_singler`: SingleR cell annotation using
+  Blueprint reference
+- `cell_annotation_blueprint_monaco`: SingleR cell annotation using
+  Monaco reference
+- `sample_id_db`: Sample subdivision for internal use
+- `file_id_db`: File subdivision for internal use
+- `.sample`: Sample ID
+- `.sample_name`: How samples were defined
+
+# RNA abundance
+
+The `raw` assay includes RNA abundance on the positive real scale (not
+transformed with non-linear functions, e.g. log or sqrt). The original
+CELLxGENE data include a mix of scales and transformations, specified in
+the `x_normalization` column.
+
+The `cpm` assay includes counts per million.
