Working with RPostgres against a postgres server with the pg_duckdb extension running in a container (image: pgduckdb/pgduckdb:17-v1.0.0), I can successfully make a connection and count the number of records in a parquet file:
> library(RPostgres); library(DBI);
> con <- dbConnect(
drv = RPostgres::Postgres(),
host = "<redacted>",
user = "<redacted>",
password = "<redacted>",
dbname = "<redacted>"
)
> "SELECT count(*) FROM read_parquet('s3://us-prd-motherduck-open-datasets/netflix/netflix_daily_top_10.parquet')" |>
+ dbGetQuery(conn = con)
count
1 7100
When issuing another query to get the data in the file (this query works using pgsql) I get:
> "SELECT * FROM read_parquet('s3://us-prd-motherduck-open-datasets/netflix/netflix_daily_top_10.parquet')" |>
+ dbGetQuery(conn = con)
Error:
! Multiple queries must use the same column names.
Hide Traceback
▆
1. ├─DBI::dbGetQuery(...)
2. └─DBI::dbGetQuery(...)
3. └─DBI (local) .local(conn, statement, ...)
4. ├─DBI::dbSendQuery(conn, statement, ...)
5. └─RPostgres::dbSendQuery(conn, statement, ...)
6. └─RPostgres (local) .local(conn, statement, ...)
7. ├─methods::new(...)
8. │ ├─methods::initialize(value, ...)
9. │ └─methods::initialize(value, ...)
10. └─RPostgres:::result_create(conn@ptr, statement, immediate)
I am not sure if this is an issue with the pg_duckdb extension for Postgres or whether I haven't understood how to use it properly, but since the same query works in pgsql CLI I was thinking maybe it could be related to RPostgres.