Issues filtering using bcdc_query_geodata() #362

@a-potapova

I am unable to download a filtered subset of data with the bcdc_query_geodata() function. I first hit the problem with the Freshwater Atlas Stream Network data (https://catalogue.data.gov.bc.ca/dataset/92344413-8035-4c08-b996-65a9b3f62fca), but I also cannot filter the Local and Regional Greenspaces data used in the vignette on querying spatial data (https://catalogue.data.gov.bc.ca/dataset/6a2fea1b-0cc4-4fc2-8017-eaf755d516da).

Filtering with comparison operators (==, >) produces the following error message:

Error: Error in glue_sql2(sql_current_con(), "{.val x} {f} {.val y}") :
could not find function "glue_sql2"

Filtering with the %in% operator works correctly, for both the FWA stream network data and the datasets used in the vignette. The same query written with == fails, even on the greenspaces data from the vignette. Calling bcdata::filter() explicitly does not solve the issue.

#Mapping test
library(bcdata)
library(sf)

#This does not work:
streams <- bcdc_query_geodata("92344413-8035-4c08-b996-65a9b3f62fca") %>%
  filter(GNIS_NAME == "Puntledge River") %>%
  collect()

#This works: 
streams <- bcdc_query_geodata("92344413-8035-4c08-b996-65a9b3f62fca") %>%
  filter(GNIS_NAME %in% c("Puntledge River")) %>%
  collect()

#This does not work:
streams <- bcdc_query_geodata("92344413-8035-4c08-b996-65a9b3f62fca") %>%
  filter(STREAM_ORDER > 5) %>%
  collect()

#This works:
streams <- bcdc_query_geodata("92344413-8035-4c08-b996-65a9b3f62fca") %>%
  filter(STREAM_ORDER %in% c(5)) %>%
  collect()

#Examples from the vignette: 
#This works: 
bcdc_query_geodata("78ec5279-4534-49a1-97e8-9d315936f08b") %>%
  filter(SCHOOL_DISTRICT_NAME %in% c("Greater Victoria", "Prince George","Kamloops/Thompson"))

#This does not work: 
bcdc_query_geodata("6a2fea1b-0cc4-4fc2-8017-eaf755d516da") %>%
  filter(PARK_PRIMARY_USE == "Park")

My package and R version:

packageVersion("bcdata")
[1] ‘0.4.1’

R version 4.3.0
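
Since sql_current_con() in the error message is a function exported by dbplyr, the versions of bcdata's dependencies may also be relevant. Below is a minimal sketch for collecting them. The package list is only my guess at what matters, and checking the dbplyr namespace for glue_sql2 is an assumption about where that function was expected to live, not a confirmed diagnosis:

```r
# Packages that might be involved; this list is an assumption.
pkgs <- c("bcdata", "dbplyr", "dplyr", "glue")

# TRUE for each package that can be loaded from the current library paths.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Version strings for the packages that are installed.
versions <- vapply(
  pkgs[installed],
  function(p) as.character(packageVersion(p)),
  character(1)
)
print(versions)

# Check whether the function named in the error exists inside dbplyr itself
# (inherits = FALSE restricts the lookup to that namespace).
if (installed[["dbplyr"]]) {
  has_glue_sql2 <- exists("glue_sql2", envir = asNamespace("dbplyr"), inherits = FALSE)
  cat("dbplyr defines glue_sql2:", has_glue_sql2, "\n")
}
```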

Any insights would be appreciated, as the stream network data is too big to download all at once. My main need is to filter by stream order. I can filter with a list of integers (e.g. STREAM_ORDER %in% c(5,6,7)), but only because I am familiar enough with the data to know which values I need. Subsetting with comparison operators would be preferable, especially if this issue also affects non-integer numeric fields, whose values can't easily be listed.
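
For reference, the integer workaround can be sketched as a small helper. The name orders_above() and the upper bound of 9 are hypothetical; the bound is a placeholder, not a value taken from the dataset's metadata:

```r
# Hypothetical helper: emulate `field > lower` for an integer field by
# listing every integer from lower + 1 up to an assumed ceiling `upper`.
orders_above <- function(lower, upper) {
  stopifnot(lower == as.integer(lower), upper == as.integer(upper))
  if (upper <= lower) {
    return(integer(0))
  }
  seq.int(as.integer(lower) + 1L, as.integer(upper))
}

# Values standing in for STREAM_ORDER > 5, assuming no order exceeds 9.
wanted <- orders_above(5, 9)
print(wanted)
```

The resulting vector would then be used on the right-hand side of %in% in filter(), in place of the hand-typed c(5,6,7). This only works for integer fields with a known ceiling, which is why it is no substitute for a working > comparison.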

Thanks!
