adding support for verbatim extension downloads#831
Conversation
There was a problem hiding this comment.
Pull request overview
This pull request adds support for downloading DWCA verbatim extensions in GBIF occurrence data downloads. The feature allows users to specify additional extension tables to be included in DWCA format downloads.
Changes:
- Added
verbatim_extensionsparameter toocc_download()andocc_download_prep()functions - Updated documentation in vignette explaining the new feature with examples
- Added test coverage for the new functionality
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| R/occ_download.R | Added verbatim_extensions parameter to download functions and updated parameter documentation |
| R/download_predicate_dsl.R | Modified parse_predicates() to include verbatimExtensions in API payload with format validation |
| man/occ_download.Rd | Added documentation for the new verbatim_extensions parameter |
| vignettes/getting_occurrence_data.Rmd | Added section explaining how to use verbatim DWCA extensions with examples |
| tests/testthat/test-occ_download.R | Added test case for downloads with verbatim extensions |
| tests/testthat/test-download_parsing.R | Added test for verbatim_extensions parsing and updated existing tests to include the new parameter |
| tests/fixtures/occ_download_9.yml | Added fixture file for verbatim extensions test case |
| DESCRIPTION | Version bump from 3.8.4.3 to 3.8.4.4 |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| 'SIMPLE_CSV', or 'SPECIES_LIST'} | ||
|
|
||
| \item{verbatim_extensions}{(character vector) A character vector of verbatim | ||
| extensions to include in the download.} |
There was a problem hiding this comment.
The documentation for the verbatim_extensions parameter should mention that it only works with format="DWCA" and will be ignored for other formats. This helps users understand the parameter's constraints upfront.
| extensions to include in the download.} | |
| extensions to include in the download. Only applicable when \code{format = "DWCA"}; | |
| ignored for other formats.} |
| test_that("parse_predicates verbatim_extensions works", { | ||
| ve <- c("http://rs.tdwg.org/dwc/terms/MeasurementOrFact", | ||
| "http://rs.gbif.org/terms/1.0/Multimedia") | ||
|
|
||
| aa <- parse_predicates("john", "email", "and", "DWCA", ve, pred("taxonKey", 22)) | ||
|
|
||
| expect_is(aa, "list") | ||
| expect_named(aa, c("creator", "notification_address", "format", "verbatimExtensions", "predicate")) | ||
| expect_equal(unclass(aa$verbatimExtensions), ve) | ||
| }) |
There was a problem hiding this comment.
Missing test coverage for the warning that should be triggered when verbatim_extensions is provided with a format other than "DWCA". Add a test case that verifies the warning message is issued when using verbatim_extensions with format="SIMPLE_CSV" or format="SPECIES_LIST".
| @@ -177,12 +179,19 @@ | |||
| #' # unique(df$countryCode) | |||
| #' # sort(unique(df$year)) | |||
There was a problem hiding this comment.
Consider adding an example demonstrating the use of verbatim_extensions parameter in the examples section. This would help users understand how to use this new feature.
| #' # sort(unique(df$year)) | |
| #' # sort(unique(df$year)) | |
| #' | |
| #' ## example using verbatim_extensions | |
| #' # x_verbatim <- occ_download( | |
| #' # pred("taxonKey", 9206251), | |
| #' # format = "SIMPLE_CSV", | |
| #' # verbatim_extensions = c("gbif:Multimedia") | |
| #' # ) | |
| #' # occ_download_meta(x_verbatim) |
| bbb <- occ_download( | ||
| pred("taxonKey",22), | ||
| format = "DWCA", | ||
| verbatim_extensions= |
There was a problem hiding this comment.
Add a space after the equals sign on line 192 to match the formatting style used elsewhere in the codebase (e.g., lines 91, 111, 131, 151, 171 all have 'format = "..."').
| verbatim_extensions= | |
| verbatim_extensions = |
| if (!is.null(verbatim_extensions)) { | ||
| if(format != "DWCA") { | ||
| warning("verbatim_extensions can only be used with format = 'DWCA'. Ignoring verbatim_extensions.") | ||
| } | ||
| payload <- list( | ||
| creator = unbox(user), | ||
| notification_address = email, | ||
| format = unbox(format), | ||
| verbatimExtensions = verbatim_extensions, | ||
| predicate = list() | ||
| ) | ||
| } else { | ||
| payload <- list( | ||
| creator = unbox(user), | ||
| notification_address = email, | ||
| format = unbox(format), | ||
| predicate = list() | ||
| ) |
There was a problem hiding this comment.
The warning says verbatim_extensions will be ignored when format is not "DWCA", but the code still includes verbatimExtensions in the payload regardless of the format. The conditional logic should only add verbatimExtensions to the payload when format equals "DWCA". Consider restructuring this to add verbatimExtensions to the payload only when both conditions are met: verbatim_extensions is not NULL AND format is "DWCA".
| if (!is.null(verbatim_extensions)) { | |
| if(format != "DWCA") { | |
| warning("verbatim_extensions can only be used with format = 'DWCA'. Ignoring verbatim_extensions.") | |
| } | |
| payload <- list( | |
| creator = unbox(user), | |
| notification_address = email, | |
| format = unbox(format), | |
| verbatimExtensions = verbatim_extensions, | |
| predicate = list() | |
| ) | |
| } else { | |
| payload <- list( | |
| creator = unbox(user), | |
| notification_address = email, | |
| format = unbox(format), | |
| predicate = list() | |
| ) | |
| payload <- list( | |
| creator = unbox(user), | |
| notification_address = email, | |
| format = unbox(format), | |
| predicate = list() | |
| ) | |
| if (!is.null(verbatim_extensions)) { | |
| if (format != "DWCA") { | |
| warning("verbatim_extensions can only be used with format = 'DWCA'. Ignoring verbatim_extensions.") | |
| } else { | |
| payload$verbatimExtensions <- verbatim_extensions | |
| } |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
#829