This repository was archived by the owner on Oct 14, 2025. It is now read-only.

Commit af560a4

R CMD check fixes
1 parent 54e59c4 commit af560a4

6 files changed: 69 additions and 45 deletions

R/unharmonised.R

Lines changed: 11 additions & 5 deletions
@@ -4,9 +4,13 @@
 #' make sense for these to live in the main metadata table. This function is a
 #' utility that allows easy fetching of this data if necessary.
 #'
-#' @param dataset_ids A character vector, where each entry is a dataset ID
+#' @param dataset_id A character vector, where each entry is a dataset ID
 #' obtained from the `$file_id` column of the table returned from
 #' [get_metadata()]
+#' @param cells An optional character vector of cell IDs. If provided, only
+#' metadata for those cells will be returned.
+#' @param conn An optional DuckDB connection object. If provided, it will re-use
+#' the existing connection instead of opening a new one.
 #' @param remote_url Optional character vector of length 1. An HTTP URL pointing
 #' to the root URL under which all the unharmonised dataset files are located.
 #' @param cache_directory Optional character vector of length 1. A file path on
@@ -17,12 +21,13 @@
 #' @importFrom DBI dbConnect
 #' @importFrom duckdb duckdb
 #' @importFrom dplyr tbl filter
+#' @importFrom rlang .data
 #' @return A named list, where each name is a dataset file ID, and each value is
 #' a "lazy data frame", ie a `tbl`.
 #' @examples
 #' dataset = "838ea006-2369-4e2c-b426-b2a744a2b02b"
 #' harmonised_meta = get_metadata() |> dplyr::filter(file_id == dataset) |> dplyr::collect()
-#' unharmonised_meta = get_unharmonised_metadata_list(dataset)
+#' unharmonised_meta = get_unharmonised_dataset(dataset)
 #' unharmonised_tbl = dplyr::collect(unharmonised_meta[[dataset]])
 #' dplyr::left_join(harmonised_meta, unharmonised_tbl, by=c("file_id", "cell_"))
 get_unharmonised_dataset = function(
@@ -41,7 +46,7 @@ get_unharmonised_dataset = function(
         progress(type = "down", con = stderr())
     )
     tbl(conn, local_path) |>
-        filter(cell_ %in% cells)
+        filter(.data$cell_ %in% cells)
 }

 #' Returns unharmonised metadata for a metadata query
@@ -54,16 +59,17 @@ get_unharmonised_dataset = function(
 #' * `unharmonised`: a nested tibble, with one row per cell in the input `metadata`, containing unharmonised metadata
 #' @export
 #' @importFrom dplyr group_by summarise filter collect
+#' @importFrom rlang .data
 #' @examples
 #' harmonised <- get_metadata() |> dplyr::filter(tissue == "kidney blood vessel")
 #' unharmonised <- get_unharmonised_metadata(harmonised)
 get_unharmonised_metadata = function(metadata, ...){
     args = list(...)
     metadata |>
         collect() |>
-        group_by(file_id) |>
+        group_by(.data$file_id) |>
         summarise(
-            unharmonised = list(dataset_id=file_id[[1]], cells=cell_, conn=dbplyr::remote_con(metadata)) |>
+            unharmonised = list(dataset_id=.data$file_id[[1]], cells=.data$cell_, conn=dbplyr::remote_con(metadata)) |>
                 c(args) |>
                 do.call(get_unharmonised_dataset, args=_) |>
                 list()

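The `.data$` substitutions in R/unharmonised.R look like the usual remedy for the `R CMD check` NOTE "no visible binding for global variable", which bare column names inside dplyr verbs tend to trigger in package code. A minimal sketch of the pattern, assuming dplyr, rlang and tibble are installed; the table `toy` and the helper `keep_cells()` are hypothetical and not part of this package:

```r
library(dplyr)
library(rlang)  # exports the .data pronoun

# Hypothetical toy table standing in for a metadata table
toy <- tibble::tibble(
  file_id = c("a", "a", "b"),
  cell_   = c("c1", "c2", "c3")
)

# Hypothetical helper: .data$cell_ names the column explicitly, so static
# code analysis does not read `cell_` as an undefined global variable
keep_cells <- function(df, cells) {
  df |> filter(.data$cell_ %in% cells)
}

kept <- keep_cells(toy, c("c1", "c3"))
```

In a package, the pronoun is typically made available with `@importFrom rlang .data`, which is what the roxygen lines added in this file do.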
README.Rmd

Lines changed: 9 additions & 13 deletions
@@ -279,26 +279,22 @@ knitr::include_graphics("man/figures/HLA_A_tissue_plot.png")

 Various metadata fields are *not* common between datasets, so it does not
 make sense for these to live in the main metadata table. However, we can
-obtain it using the `get_unharmonised_metadata()` function.
-
-Note how this table has additional columns that are not in the normal metadata:
+obtain it using the `get_unharmonised_metadata()` function. This function
+returns a data frame with one row per dataset, including the `unharmonised`
+column which contains unharmnised metadata as a nested data frame.

 ```{r}
-dataset = "838ea006-2369-4e2c-b426-b2a744a2b02b"
-unharmonised_meta = get_unharmonised_metadata(dataset)
-unharmonised_tbl = dplyr::collect(unharmonised_meta[[dataset]])
-unharmonised_tbl
+harmonised <- get_metadata() |> dplyr::filter(tissue == "kidney blood vessel")
+unharmonised <- get_unharmonised_metadata(harmonised)
+unharmonised
 ```

-If we have metadata from the normal metadata table that is from a single dataset,
-we can even join this additional metadata into one big data frame:
+Notice that the columns differ between each dataset's data frame:
+
 ```{r}
-harmonised_meta = get_metadata() |> dplyr::filter(file_id == dataset) |> dplyr::collect()
-dplyr::left_join(harmonised_meta, unharmonised_tbl, by=c("file_id", "cell_"))
+dplyr::pull(unharmonised, unharmonised) |> head(2)
 ```

-
-
 # Cell metadata

 Dataset-specific columns (definitions available at cellxgene.cziscience.com)

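The new README text describes a data frame with one row per dataset and a nested `unharmonised` column whose tables differ in their columns. The shape can be sketched with a toy list-column; all IDs, column names and values below are invented for illustration, and the real entries appear to be lazy tables (per the roxygen `@return` text), so a `dplyr::collect()` may be needed before inspecting them:

```r
library(dplyr)

# Hypothetical per-dataset tables; each one carries different columns
nested <- tibble::tibble(
  file_id = c("ds1", "ds2"),
  unharmonised = list(
    tibble::tibble(cell_ = c("c1", "c2"), donor_age = c(30, 41)),
    tibble::tibble(cell_ = "c3", batch = "B7")
  )
)

# Column names of each nested table, one list entry per dataset
cols_per_dataset <- lapply(pull(nested, unharmonised), colnames)

# Pick out the nested table that belongs to a single dataset ID
ds2_tbl <- nested |>
  filter(file_id == "ds2") |>
  pull(unharmonised) |>
  (\(tables) tables[[1]])()
```

Because each nested table keeps its own schema, nothing forces the datasets to share columns, which is the reason given for keeping this metadata out of the main table.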
man/get_unharmonised_dataset.Rd

Lines changed: 11 additions & 5 deletions
Some generated files are not rendered by default.

man/get_unharmonised_metadata.Rd

Lines changed: 7 additions & 0 deletions
Some generated files are not rendered by default.

tests/testthat/test-query.R

Lines changed: 22 additions & 11 deletions
@@ -157,18 +157,29 @@ test_that("get_SingleCellExperiment() assigns the right cell ID to each cell", {
     )
 })

-test_that("get_unharmonised_metadata works with one ID", {
+test_that("get_unharmonised_dataset works with one ID", {
     dataset_id = "838ea006-2369-4e2c-b426-b2a744a2b02b"
-    unharmonised_meta = get_unharmonised_metadata(dataset_id)
-    unharmonised_tbl = unharmonised_meta[[dataset_id]]
-
-    expect_type(unharmonised_meta, "list")
-    expect_s3_class(unharmonised_tbl, "tbl")
+    unharmonised_meta = get_unharmonised_dataset(dataset_id)
+
+    expect_s3_class(unharmonised_meta, "tbl")
 })

-test_that("get_unharmonised_metadata works with multiple IDs", {
-    dataset_ids = c("838ea006-2369-4e2c-b426-b2a744a2b02b", "83b9cb97-9ee4-404d-8cdf-ccede8235356")
-    unharmonised_meta = get_unharmonised_metadata(dataset_ids)
+test_that("get_unharmonised_metadata() returns the appropriate data", {
+    harmonised <- get_metadata() |> dplyr::filter(tissue == "kidney blood vessel")
+    unharmonised <- get_unharmonised_metadata(harmonised)

-    expect_equal(names(unharmonised_meta), dataset_ids)
-})
+    unharmonised |> is.data.frame() |> expect_true()
+    expect_setequal(colnames(unharmonised), c("file_id", "unharmonised"))
+
+    # The number of cells in both harmonised and unharmonised should be the same
+    expect_equal(
+        dplyr::collect(harmonised) |> nrow(),
+        unharmonised$unharmonised |> purrr::map_int(function(df) dplyr::tally(df) |> dplyr::pull(n)) |> sum()
+    )
+
+    # The number of datasets in both harmonised and unharmonised should be the same
+    expect_equal(
+        harmonised |> dplyr::group_by(file_id) |> dplyr::n_groups(),
+        nrow(unharmonised)
+    )
+})

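The new test checks two invariants: the nested tables together hold as many cells as the harmonised query, and there is one nested row per dataset. The same checks can be sketched on in-memory toy data without a database connection; the table below is invented, and base `vapply()` stands in for the `purrr::map_int()` call used in the test:

```r
library(dplyr)

# Hypothetical harmonised table: three cells spread over two datasets
harmonised_toy <- tibble::tibble(
  file_id = c("ds1", "ds1", "ds2"),
  cell_   = c("c1", "c2", "c3")
)

# One row per dataset, with that dataset's cells held in a list-column
nested_toy <- harmonised_toy |>
  group_by(file_id) |>
  summarise(unharmonised = list(data.frame(cell_ = cell_)))

# Invariant 1: summing the nested row counts recovers the number of cells
cells_total <- sum(vapply(nested_toy$unharmonised, nrow, integer(1)))

# Invariant 2: the nested table has one row per distinct dataset
datasets_total <- n_distinct(harmonised_toy$file_id)
```

With lazy database tables, `nrow()` generally does not return a row count, which is presumably why the test counts rows through `dplyr::tally()` instead.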
vignettes/Introduction.Rmd

Lines changed: 9 additions & 11 deletions
@@ -297,22 +297,20 @@ knitr::include_graphics("../man/figures/HLA_A_tissue_plot.png")

 Various metadata fields are *not* common between datasets, so it does not
 make sense for these to live in the main metadata table. However, we can
-obtain it using the `get_unharmonised_metadata()` function.
-
-Note how this table has additional columns that are not in the normal metadata:
+obtain it using the `get_unharmonised_metadata()` function. This function
+returns a data frame with one row per dataset, including the `unharmonised`
+column which contains unharmnised metadata as a nested data frame.

 ```{r}
-dataset = "838ea006-2369-4e2c-b426-b2a744a2b02b"
-unharmonised_meta = get_unharmonised_metadata(dataset)
-unharmonised_tbl = dplyr::collect(unharmonised_meta[[dataset]])
-unharmonised_tbl
+harmonised <- get_metadata() |> dplyr::filter(tissue == "kidney blood vessel")
+unharmonised <- get_unharmonised_metadata(harmonised)
+unharmonised
 ```

-If we have metadata from the normal metadata table that is from a single dataset,
-we can even join this additional metadata into one big data frame:
+Notice that the columns differ between each dataset's data frame:
+
 ```{r}
-harmonised_meta = get_metadata() |> dplyr::filter(file_id == dataset) |> dplyr::collect()
-dplyr::left_join(harmonised_meta, unharmonised_tbl, by=c("file_id", "cell_"))
+dplyr::pull(unharmonised, unharmonised) |> head(2)
 ```

 # Cell metadata
