@@ -36,6 +36,27 @@ library(CuratedAtlasQueryR)
 ``` r
 metadata <- get_metadata()
 metadata
+#> # Source: table</vast/scratch/users/milton.m/cache/R/CuratedAtlasQueryR/metadata.0.2.3.parquet> [?? x 56]
+#> # Database: DuckDB 0.7.1 [unknown@Linux 3.10.0-1160.88.1.el7.x86_64:R 4.2.1/:memory:]
+#> cell_ sample_ cell_…¹ cell_…² confi…³ cell_…⁴ cell_…⁵ cell_…⁶ sampl…⁷ _samp…⁸
+#> <chr> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr>
+#> 1 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> 2 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> 3 AAAC… 689e2f… lumina… lumina… 1 <NA> <NA> <NA> 930938… D17PrP…
+#> 4 AAAC… 689e2f… lumina… lumina… 1 <NA> <NA> <NA> 930938… D17PrP…
+#> 5 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> 6 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> 7 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> 8 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> 9 AAAC… 689e2f… lumina… lumina… 1 <NA> <NA> <NA> 930938… D17PrP…
+#> 10 AAAC… 689e2f… basal … basal_… 1 <NA> <NA> <NA> f297c7… D17PrP…
+#> # … with more rows, 46 more variables: assay <chr>,
+#> # assay_ontology_term_id <chr>, file_id_db <chr>,
+#> # cell_type_ontology_term_id <chr>, development_stage <chr>,
+#> # development_stage_ontology_term_id <chr>, disease <chr>,
+#> # disease_ontology_term_id <chr>, ethnicity <chr>,
+#> # ethnicity_ontology_term_id <chr>, experiment___ <chr>, file_id <chr>,
+#> # is_primary_data_x <chr>, organism <chr>, organism_ontology_term_id <chr>, …
 ```

 The `metadata` variable can then be re-used for all subsequent queries.
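
As a minimal sketch of that reuse, assuming only that `metadata` behaves like a dplyr table: the toy tibble below is a hypothetical in-memory stand-in (the real object is a lazy DuckDB table with 56 columns), and its values are made up for illustration rather than taken from the atlas. The point is that each query starts from the same object instead of calling `get_metadata()` again.

```r
library(dplyr)

# Hypothetical stand-in for `metadata`; the column names mirror ones shown in
# the README output, but the values are illustrative only.
metadata_toy <- tibble::tibble(
  cell_  = c("cell_1", "cell_2", "cell_3", "cell_4"),
  tissue = c("kidney blood vessel", "lung", "lung", "kidney blood vessel"),
  assay  = c("10x 3' v3", "10x 3' v3", "Smart-seq2", "Smart-seq2")
)

# Query 1: number of cells per tissue.
cells_per_tissue <- metadata_toy |>
  count(tissue)

# Query 2: a filtered subset, starting again from the same object.
kidney_cells <- metadata_toy |>
  filter(tissue == "kidney blood vessel")
```

With the real lazy table the same verbs should apply, with `dplyr::collect()` bringing a result into memory when needed.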
@@ -93,8 +114,8 @@ single_cell_counts
 #> assays(1): counts
 #> rownames(36229): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
 #> rowData names(0):
-#> colnames(1571): ACAGCCGGTCCGTTAA_F02526_1 GGGAATGAGCCCAGCT_F02526_1 ... TACAACGTCAGCATTG_SC84_1
-#> CATTCGCTCAATACCG_F02526_1
+#> colnames(1571): ACAGCCGGTCCGTTAA_F02526_1 GGGAATGAGCCCAGCT_F02526_1 ...
+#> TACAACGTCAGCATTG_SC84_1 CATTCGCTCAATACCG_F02526_1
 #> colData names(56): sample_ cell_type ... updated_at_y original_cell_id
 #> reducedDimNames(0):
 #> mainExpName: NULL
@@ -129,8 +150,8 @@ single_cell_counts
 #> assays(1): cpm
 #> rownames(36229): A1BG A1BG-AS1 ... ZZEF1 ZZZ3
 #> rowData names(0):
-#> colnames(1571): ACAGCCGGTCCGTTAA_F02526_1 GGGAATGAGCCCAGCT_F02526_1 ... TACAACGTCAGCATTG_SC84_1
-#> CATTCGCTCAATACCG_F02526_1
+#> colnames(1571): ACAGCCGGTCCGTTAA_F02526_1 GGGAATGAGCCCAGCT_F02526_1 ...
+#> TACAACGTCAGCATTG_SC84_1 CATTCGCTCAATACCG_F02526_1
 #> colData names(56): sample_ cell_type ... updated_at_y original_cell_id
 #> reducedDimNames(0):
 #> mainExpName: NULL
@@ -162,8 +183,8 @@ single_cell_counts
 #> assays(1): cpm
 #> rownames(1): PUM1
 #> rowData names(0):
-#> colnames(1571): ACAGCCGGTCCGTTAA_F02526_1 GGGAATGAGCCCAGCT_F02526_1 ... TACAACGTCAGCATTG_SC84_1
-#> CATTCGCTCAATACCG_F02526_1
+#> colnames(1571): ACAGCCGGTCCGTTAA_F02526_1 GGGAATGAGCCCAGCT_F02526_1 ...
+#> TACAACGTCAGCATTG_SC84_1 CATTCGCTCAATACCG_F02526_1
 #> colData names(56): sample_ cell_type ... updated_at_y original_cell_id
 #> reducedDimNames(0):
 #> mainExpName: NULL
@@ -287,18 +308,61 @@ data frame.
 harmonised <- get_metadata() |> dplyr::filter(tissue == "kidney blood vessel")
 unharmonised <- get_unharmonised_metadata(harmonised)
 unharmonised
+#> # A tibble: 4 × 2
+#> file_id unharmonised
+#> <chr> <list>
+#> 1 63523aa3-0d04-4fc6-ac59-5cadd3e73a14 <tbl_dck_[,17]>
+#> 2 8fee7b82-178b-4c04-bf23-04689415690d <tbl_dck_[,12]>
+#> 3 dc9d8cdd-29ee-4c44-830c-6559cb3d0af6 <tbl_dck_[,14]>
+#> 4 f7e94dbb-8638-4616-aaf9-16e2212c369f <tbl_dck_[,14]>
 ```

 Notice that the columns differ between each dataset’s data frame:

 ``` r
-dplyr::pull(unharmonised, unharmonised) |> head(2)
+dplyr::pull(unharmonised) |> head(2)
+#> [[1]]
+#> # Source: SQL [?? x 17]
+#> # Database: DuckDB 0.7.1 [unknown@Linux 3.10.0-1160.88.1.el7.x86_64:R 4.2.1/:memory:]
+#> cell_ file_id donor…¹ donor…² libra…³ mappe…⁴ sampl…⁵ suspe…⁶ suspe…⁷ autho…⁸
+#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
+#> 1 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 2 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 3 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 4 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 5 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 6 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 7 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 8 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> 9 4602… 63523a… 19 mon… 463181… 671785… GENCOD… 125234… cell c7485e… CD4 T …
+#> 10 4602… 63523a… 27 mon… a8536b… 5ddaea… GENCOD… 61bf84… cell d8a44f… Pelvic…
+#> # … with more rows, 7 more variables: cell_state <chr>,
+#> # reported_diseases <chr>, Short_Sample <chr>, Project <chr>,
+#> # Experiment <chr>, compartment <chr>, broad_celltype <chr>, and abbreviated
+#> # variable names ¹donor_age, ²donor_uuid, ³library_uuid,
+#> # ⁴mapped_reference_annotation, ⁵sample_uuid, ⁶suspension_type,
+#> # ⁷suspension_uuid, ⁸author_cell_type
+#>
+#> [[2]]
+#> # Source: SQL [?? x 12]
+#> # Database: DuckDB 0.7.1 [unknown@Linux 3.10.0-1160.88.1.el7.x86_64:R 4.2.1/:memory:]
+#> cell_ file_id orig.…¹ nCoun…² nFeat…³ seura…⁴ Project donor…⁵ compa…⁶ broad…⁷
+#> <chr> <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <chr> <chr>
+#> 1 1069 8fee7b… 4602ST… 16082 3997 25 Experi… Wilms3 non_PT Pelvic…
+#> 2 1214 8fee7b… 4602ST… 1037 606 25 Experi… Wilms3 non_PT Pelvic…
+#> 3 2583 8fee7b… 4602ST… 3028 1361 25 Experi… Wilms3 non_PT Pelvic…
+#> 4 2655 8fee7b… 4602ST… 1605 859 25 Experi… Wilms3 non_PT Pelvic…
+#> 5 3609 8fee7b… 4602ST… 1144 682 25 Experi… Wilms3 non_PT Pelvic…
+#> 6 3624 8fee7b… 4602ST… 1874 963 25 Experi… Wilms3 non_PT Pelvic…
+#> 7 3946 8fee7b… 4602ST… 1296 755 25 Experi… Wilms3 non_PT Pelvic…
+#> 8 5163 8fee7b… 4602ST… 11417 3255 25 Experi… Wilms3 non_PT Pelvic…
+#> 9 5446 8fee7b… 4602ST… 1769 946 19 Experi… Wilms2 lympho… CD4 T …
+#> 10 6275 8fee7b… 4602ST… 3750 1559 25 Experi… Wilms3 non_PT Pelvic…
+#> # … with more rows, 2 more variables: author_cell_type <chr>, Sample <chr>, and
+#> # abbreviated variable names ¹orig.ident, ²nCount_RNA, ³nFeature_RNA,
+#> # ⁴seurat_clusters, ⁵donor_id, ⁶compartment, ⁷broad_celltype
 ```
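
One way to make the column differences explicit is to compare the column names of each dataset's data frame. The sketch below is illustrative only: `unharmonised_toy` is a hypothetical stand-in with made-up values and in-memory tibbles, whereas the real list-column holds lazy DuckDB tables, so the same calls are assumed rather than confirmed to behave identically there.

```r
# Hypothetical stand-in for `unharmonised`: one row per file, with a
# list-column of per-dataset data frames (toy values, not atlas data).
unharmonised_toy <- tibble::tibble(
  file_id = c("file_a", "file_b"),
  unharmonised = list(
    tibble::tibble(cell_ = "c1", file_id = "file_a", donor_age = "27 months"),
    tibble::tibble(cell_ = "c2", file_id = "file_b", nCount_RNA = 16082)
  )
)

# Column names of each dataset's data frame, keyed by file_id.
column_sets <- lapply(unharmonised_toy$unharmonised, colnames)
names(column_sets) <- unharmonised_toy$file_id

# Columns present in every dataset, and those specific to each dataset.
shared_columns <- Reduce(intersect, column_sets)
dataset_specific <- lapply(column_sets, setdiff, shared_columns)
```

Here `shared_columns` holds the names common to all datasets, and `dataset_specific` lists, per `file_id`, the columns that only some datasets provide.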

-\[\[1\]\]
-
-\[\[2\]\]
-
 # Cell metadata

 Dataset-specific columns (definitions available at