This repository was archived by the owner on Oct 14, 2025. It is now read-only.

Commit 940dba1

Import dplyr in example
1 parent 45032f4 commit 940dba1

2 files changed: 126 additions, 145 deletions

R/query.R

Lines changed: 1 addition & 0 deletions
@@ -277,6 +277,7 @@ get_seurat <- function(...) {
 #' will not work.
 #' @export
 #' @examples
+#' library(dplyr)
 #' filtered_metadata <- get_metadata() |>
 #' filter(
 #' ethnicity == "African" &
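
Review note: the example pipes into `filter()`, and when dplyr is not attached that name resolves to `stats::filter()`, which is a time-series filter and not a data-frame verb. The sketch below illustrates the masking with base R and dplyr only. It assumes a fresh R session with just the default packages attached, and it does not call any function from this package.

```r
# Assumption: fresh session, so `filter` is still the one from stats.
before <- environmentName(environment(filter))

# Attaching dplyr masks stats::filter with the data-frame verb.
suppressPackageStartupMessages(library(dplyr))
after <- environmentName(environment(filter))

# Show which namespace owned `filter` before and after attaching dplyr.
print(c(before = before, after = after))
```

Under that assumption, `before` should name the stats namespace and `after` the dplyr namespace, which is why the roxygen example needs the explicit `library(dplyr)` call.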

README.md

Lines changed: 125 additions & 145 deletions
@@ -1,4 +1,4 @@
-HCAquery
+readme
 ================
 
 Load the package
@@ -16,182 +16,162 @@ Load the metadata
 
 ``` r
 get_metadata()
+#> # Source: table<metadata> [?? x 56]
+#> # Database: sqlite 3.40.0 [/vast/scratch/users/milton.m/cache/hca_harmonised/metadata.sqlite]
+#> .cell sampl…¹ .sample .samp…² assay assay…³ file_…⁴ cell_…⁵ cell_…⁶ devel…⁷ devel…⁸ disease disea…⁹ ethni…˟ ethni…˟ file_id is_pr…˟ organ…˟ organ…˟ sampl…˟ sex sex_o…˟ tissue
+#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
+#> 1 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 2 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 3 AAACCT… 02eb2e… 5f20d7… D17PrP… 10x … EFO:00… 30f754… lumina… CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 4 AAACCT… 02eb2e… 5f20d7… D17PrP… 10x … EFO:00… 30f754… lumina… CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 5 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 6 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 7 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 8 AAACGG… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 9 AAACGG… 02eb2e… 5f20d7… D17PrP… 10x … EFO:00… 30f754… lumina… CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> 10 AAACGG… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea… HsapDv… normal PATO:0… Europe… HANCES… 00d626… FALSE Homo s… NCBITa… <NA> male PATO:0… perip…
+#> # … with more rows, 33 more variables: tissue_ontology_term_id <chr>, tissue_harmonised <chr>, age_days <dbl>, dataset_id <chr>, collection_id <chr>, cell_count <int>,
+#> # dataset_deployments <chr>, is_primary_data.y <chr>, is_valid <int>, linked_genesets <int>, mean_genes_per_cell <dbl>, name <chr>, published <int>, revision <int>,
+#> # schema_version <chr>, tombstone <int>, x_normalization <chr>, created_at.x <dbl>, published_at <dbl>, revised_at <dbl>, updated_at.x <dbl>, filename <chr>, filetype <chr>,
+#> # s3_uri <chr>, user_submitted <int>, created_at.y <dbl>, updated_at.y <dbl>, cell_type_harmonised <chr>, confidence_class <dbl>, cell_annotation_azimuth_l2 <chr>,
+#> # cell_annotation_blueprint_singler <chr>, n_cell_type_in_tissue <int>, n_tissue_in_cell_type <int>, and abbreviated variable names ¹sample_id_db, ².sample_name,
+#> # ³assay_ontology_term_id, ⁴file_id_db, ⁵cell_type, ⁶cell_type_ontology_term_id, ⁷development_stage, ⁸development_stage_ontology_term_id, ⁹disease_ontology_term_id, ˟ethnicity,
+#> # ˟ethnicity_ontology_term_id, ˟is_primary_data.x, ˟organism, ˟organism_ontology_term_id, ˟sample_placeholder, ˟sex_ontology_term_id
 ```
 
-## # Source: table<metadata> [?? x 56]
-## # Database: sqlite 3.39.3 [/vast/projects/RCP/human_cell_atlas/metadata.sqlite]
-## .cell sampl…¹ .sample .samp…² assay assay…³ file_…⁴ cell_…⁵ cell_…⁶ devel…⁷
-## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
-## 1 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## 2 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## 3 AAACCT… 02eb2e… 5f20d7… D17PrP… 10x … EFO:00… 30f754… lumina… CL:000… 31-yea…
-## 4 AAACCT… 02eb2e… 5f20d7… D17PrP… 10x … EFO:00… 30f754… lumina… CL:000… 31-yea…
-## 5 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## 6 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## 7 AAACCT… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## 8 AAACGG… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## 9 AAACGG… 02eb2e… 5f20d7… D17PrP… 10x … EFO:00… 30f754… lumina… CL:000… 31-yea…
-## 10 AAACGG… 8a0fe0… 5f20d7… D17PrP… 10x … EFO:00… 1e334b… basal … CL:000… 31-yea…
-## # … with more rows, 46 more variables:
-## # development_stage_ontology_term_id <chr>, disease <chr>,
-## # disease_ontology_term_id <chr>, ethnicity <chr>,
-## # ethnicity_ontology_term_id <chr>, file_id <chr>, is_primary_data.x <chr>,
-## # organism <chr>, organism_ontology_term_id <chr>, sample_placeholder <chr>,
-## # sex <chr>, sex_ontology_term_id <chr>, tissue <chr>,
-## # tissue_ontology_term_id <chr>, tissue_harmonised <chr>, age_days <dbl>, …
-
 Explore the HCA content
 
 ``` r
-get_metadata() |>
-distinct(tissue, file_id) |>
-count(tissue) |>
-arrange(desc(n))
+get_metadata() |>
+distinct(tissue, file_id) |>
+count(tissue) |>
+arrange(desc(n))
+#> # Source: SQL [?? x 2]
+#> # Database: sqlite 3.40.0 [/vast/scratch/users/milton.m/cache/hca_harmonised/metadata.sqlite]
+#> # Ordered by: desc(n)
+#> tissue n
+#> <chr> <int>
+#> 1 blood 47
+#> 2 heart left ventricle 46
+#> 3 cortex of kidney 31
+#> 4 renal medulla 29
+#> 5 lung 27
+#> 6 middle temporal gyrus 24
+#> 7 liver 24
+#> 8 kidney 19
+#> 9 intestine 18
+#> 10 thymus 17
+#> # … with more rows
 ```
 
-## # Source: SQL [?? x 2]
-## # Database: sqlite 3.39.3 [/vast/projects/RCP/human_cell_atlas/metadata.sqlite]
-## # Ordered by: desc(n)
-## tissue n
-## <chr> <int>
-## 1 blood 47
-## 2 heart left ventricle 46
-## 3 cortex of kidney 31
-## 4 renal medulla 29
-## 5 lung 27
-## 6 middle temporal gyrus 24
-## 7 liver 24
-## 8 kidney 19
-## 9 intestine 18
-## 10 thymus 17
-## # … with more rows
-
 Query raw counts
 
 ``` r
-sce =
-get_metadata() |>
-filter(
-ethnicity == "African" &
-assay %LIKE% "%10x%" &
-tissue == "lung parenchyma" &
-cell_type %LIKE% "%CD4%"
-) |>
-
-get_SingleCellExperiment()
-```
-
-## Reading 1 files.
-
-## .
+sce <-
+get_metadata() |>
+filter(
+ethnicity == "African" &
+assay %LIKE% "%10x%" &
+tissue == "lung parenchyma" &
+cell_type %LIKE% "%CD4%"
+) |>
+get_SingleCellExperiment()
+#> ℹ Realising metadata.
+#> ℹ Synchronising files
+#> ℹ Attaching metadata.
+#> ℹ Compiling Single Cell Experiment.
 
-``` r
 sce
+#> class: SingleCellExperiment
+#> dim: 60661 1571
+#> metadata(0):
+#> assays(2): counts cpm
+#> rownames(60661): TSPAN6 TNMD ... RP11-175I6.6 PRSS43P
+#> rowData names(0):
+#> colnames(1571): ACAGCCGGTCCGTTAA_F02526 GGGAATGAGCCCAGCT_F02526 ... TACAACGTCAGCATTG_SC84 CATTCGCTCAATACCG_F02526
+#> colData names(55): sample_id_db .sample ... n_cell_type_in_tissue n_tissue_in_cell_type
+#> reducedDimNames(0):
+#> mainExpName: NULL
+#> altExpNames(0):
 ```
 
-## class: SingleCellExperiment
-## dim: 60661 1571
-## metadata(0):
-## assays(1): counts
-## rownames(60661): TSPAN6 TNMD ... RP11-175I6.6 PRSS43P
-## rowData names(0):
-## colnames(1571): ACAGCCGGTCCGTTAA_F02526 GGGAATGAGCCCAGCT_F02526 ...
-## TACAACGTCAGCATTG_SC84 CATTCGCTCAATACCG_F02526
-## colData names(55): sample_id_db .sample ... n_cell_type_in_tissue
-## n_tissue_in_cell_type
-## reducedDimNames(0):
-## mainExpName: NULL
-## altExpNames(0):
-
 Query counts scaled per million. This is helpful if just few genes are
 of interest
 
 ``` r
-sce =
-get_metadata() |>
-filter(
-ethnicity == "African" &
-assay %LIKE% "%10x%" &
-tissue == "lung parenchyma" &
-cell_type %LIKE% "%CD4%"
-) |>
-
-get_SingleCellExperiment(assay = "counts_per_million")
-```
-
-## Reading 1 files.
-
-## .
+sce <-
+get_metadata() |>
+filter(
+ethnicity == "African" &
+assay %LIKE% "%10x%" &
+tissue == "lung parenchyma" &
+cell_type %LIKE% "%CD4%"
+) |>
+get_SingleCellExperiment(assays = "cpm")
+#> ℹ Realising metadata.
+#> ℹ Synchronising files
+#> ℹ Attaching metadata.
+#> ℹ Compiling Single Cell Experiment.
 
-``` r
 sce
+#> class: SingleCellExperiment
+#> dim: 60661 1571
+#> metadata(0):
+#> assays(1): cpm
+#> rownames(60661): TSPAN6 TNMD ... RP11-175I6.6 PRSS43P
+#> rowData names(0):
+#> colnames(1571): ACAGCCGGTCCGTTAA_F02526 GGGAATGAGCCCAGCT_F02526 ... TACAACGTCAGCATTG_SC84 CATTCGCTCAATACCG_F02526
+#> colData names(55): sample_id_db .sample ... n_cell_type_in_tissue n_tissue_in_cell_type
+#> reducedDimNames(0):
+#> mainExpName: NULL
+#> altExpNames(0):
 ```
 
-## class: SingleCellExperiment
-## dim: 60661 1571
-## metadata(0):
-## assays(1): counts_per_million
-## rownames(60661): TSPAN6 TNMD ... RP11-175I6.6 PRSS43P
-## rowData names(0):
-## colnames(1571): ACAGCCGGTCCGTTAA_F02526 GGGAATGAGCCCAGCT_F02526 ...
-## TACAACGTCAGCATTG_SC84 CATTCGCTCAATACCG_F02526
-## colData names(55): sample_id_db .sample ... n_cell_type_in_tissue
-## n_tissue_in_cell_type
-## reducedDimNames(0):
-## mainExpName: NULL
-## altExpNames(0):
-
 Extract only a subset of genes:
 
 ``` r
-get_metadata() |>
+get_metadata() |>
 filter(
-ethnicity == "African" &
-assay %LIKE% "%10x%" &
-tissue == "lung parenchyma" &
+ethnicity == "African" &
+assay %LIKE% "%10x%" &
+tissue == "lung parenchyma" &
 cell_type %LIKE% "%CD4%"
-) |>
-get_SingleCellExperiment(genes = "PUM1")
+) |>
+get_SingleCellExperiment(features = "PUM1")
+#> ℹ Realising metadata.
+#> ℹ Synchronising files
+#> ℹ Attaching metadata.
+#> ℹ Compiling Single Cell Experiment.
+#> class: SingleCellExperiment
+#> dim: 1 1571
+#> metadata(0):
+#> assays(2): counts cpm
+#> rownames(1): PUM1
+#> rowData names(0):
+#> colnames(1571): ACAGCCGGTCCGTTAA_F02526 GGGAATGAGCCCAGCT_F02526 ... TACAACGTCAGCATTG_SC84 CATTCGCTCAATACCG_F02526
+#> colData names(55): sample_id_db .sample ... n_cell_type_in_tissue n_tissue_in_cell_type
+#> reducedDimNames(0):
+#> mainExpName: NULL
+#> altExpNames(0):
 ```
 
-## Reading 1 files.
-
-## .
-
-## class: SingleCellExperiment
-## dim: 1 1571
-## metadata(0):
-## assays(1): counts
-## rownames(1): PUM1
-## rowData names(0):
-## colnames(1571): ACAGCCGGTCCGTTAA_F02526 GGGAATGAGCCCAGCT_F02526 ...
-## TACAACGTCAGCATTG_SC84 CATTCGCTCAATACCG_F02526
-## colData names(55): sample_id_db .sample ... n_cell_type_in_tissue
-## n_tissue_in_cell_type
-## reducedDimNames(0):
-## mainExpName: NULL
-## altExpNames(0):
-
 Extract the counts as a Seurat object:
 
 ``` r
-get_metadata() |>
+get_metadata() |>
 filter(
-ethnicity == "African" &
-assay %LIKE% "%10x%" &
-tissue == "lung parenchyma" &
+ethnicity == "African" &
+assay %LIKE% "%10x%" &
+tissue == "lung parenchyma" &
 cell_type %LIKE% "%CD4%"
-) |>
-get_seurat()
+) |>
+get_seurat()
+#> ℹ Realising metadata.
+#> ℹ Synchronising files
+#> ℹ Attaching metadata.
+#> ℹ Compiling Single Cell Experiment.
+#> Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
+#> An object of class Seurat
+#> 60661 features across 1571 samples within 1 assay
+#> Active assay: originalexp (60661 features, 0 variable features)
 ```
-
-## Reading 1 files.
-
-## .
-
-## Warning: Feature names cannot have underscores ('_'), replacing with dashes
-## ('-')
-
-## An object of class Seurat
-## 60661 features across 1571 samples within 1 assay
-## Active assay: originalexp (60661 features, 0 variable features)
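
Review note: the `filter()` calls in the README rely on `%LIKE%`, which is not an R operator. On a database-backed table, dbplyr passes unknown infix operators through to SQL, so the pattern match runs in SQLite. The sketch below shows this on a small in-memory table. The table and its values are invented for illustration, the real metadata database is not touched, and the dbplyr and RSQLite packages are assumed to be installed.

```r
suppressPackageStartupMessages({
  library(dplyr)
  library(dbplyr)
})

# Hypothetical stand-in for the metadata table (values made up).
toy_metadata <- memdb_frame(
  cell_type = c("CD4-positive T cell", "basal cell", "CD4-positive helper T cell"),
  assay = c("10x v3", "Smart-seq2", "10x v2")
)

# %LIKE% is unknown to R, so dbplyr translates it to SQL LIKE.
query <- toy_metadata |>
  filter(assay %LIKE% "%10x%" & cell_type %LIKE% "%CD4%")

# Inspect the generated SQL, then pull the matching rows into R.
show_query(query)
result <- collect(query)
print(result)
```

The same expression would fail on an ordinary in-memory data frame, because no R function named `%LIKE%` exists there; it only works while the table is still lazy.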
