You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I created a local duck db database under /var/lib/postgresql/duck/nmv.db docker container volume mounted on a NVME SSD to test performance for analytical workload over a table with 2 billions of records.
I created the view nmv.weather_obs_grid as
SELECT duckdb.raw_query(' ATTACH ''/var/lib/postgresql/duckdb/nmv.db'' AS nmv; CREATE VIEW nmv.weather_obs_grid AS SELECT * FROM read_parquet(''/var/lib/postgresql/parquet/cgms23eur/weather_obs_grid/partitions/**/*.parquet'',
hive_partitioning=true)
');
where the parquet files have been created using pg_parquet extension:
COPY (SELECT idgrid,day,temperature_max,temperature_min,temperature_avg,vapourpressure,windspeed,precipitation,et0,radiation,rh,synchronized_bridge FROM cache_eur_cgms23.weather_obs_grid_2025) TO '/var/lib/postgresql/parquet/cgms23eur/weather_obs_grid/partitions/year=2025/data.parquet' (format 'parquet',compression 'gzip');
now if I want to query the view nmv.weather_obs_grid duckdb.raw_query works fine, duckdb.query simply not:
SELECT * from duckdb.raw_query('ATTACH ''/var/lib/postgresql/duck/nmv.db'' AS nmv;; SELECT idgrid,day,temperature_avg FROM nmv.weather_obs_grid limit 10;'); IS OK
while if I try:
SELECT * from duckdb.raw_query('ATTACH ''/var/lib/postgresql/duck/nmv.db'' AS nmv;');
and
SELECT * from duckdb.query('SELECT idgrid,day,temperature_avg FROM nmv.weather_obs_grid limit 10;');
ERROR: (PGDuckDB/Duckdb_ExecCustomScan_Cpp) Invalid Input Error: Attempting to execute an unsuccessful or closed pending query result
so the only way to get data from the local duck database is to capture and parse the output text of the raw query. This is not so efficient to get a table format using internal structure with C++ function duckdb.query but is a workaround.
First install plsh extension and create the following function:
CREATE OR REPLACE FUNCTION sysadmin."CaptureDuckdbRawQueryOutput"(
db text,
query text)
RETURNS SETOF text
LANGUAGE 'plsh'
COST 100
VOLATILE PARALLEL UNSAFE
ROWS 1000
WITH raw_output
AS (SELECT j AS row_text
FROM sysadmin."CaptureDuckdbRawQueryOutput"('/var/lib/postgresql/duck/nmv.db', 'SELECT idgrid,day,temperature_avg FROM nmv.weather_obs_grid limit 10') j
), lines
AS (SELECT regexp_split_to_table(row_text, E'\n') AS single_line
FROM raw_output
WHERE row_text IS NOT NULL AND row_text != ''
), parsed_arrays
AS (SELECT string_to_array(replace(trim(both '{}' FROM single_line),'"',''),',') AS row_data
FROM lines
WHERE single_line LIKE '{%}'
)
SELECT row_data[1]::BIGINT AS idgrid,
row_data[2]::DATE AS day,
row_data[3]::DECIMAL(3, 1) AS temperature_avg
FROM parsed_arrays;
reacted with thumbs up emoji reacted with thumbs down emoji reacted with laugh emoji reacted with hooray emoji reacted with confused emoji reacted with heart emoji reacted with rocket emoji reacted with eyes emoji
Uh oh!
There was an error while loading. Please reload this page.
-
I created a local duck db database under /var/lib/postgresql/duck/nmv.db docker container volume mounted on a NVME SSD to test performance for analytical workload over a table with 2 billions of records.
I created the view nmv.weather_obs_grid as
SELECT duckdb.raw_query(' ATTACH ''/var/lib/postgresql/duckdb/nmv.db'' AS nmv; CREATE VIEW nmv.weather_obs_grid AS SELECT * FROM read_parquet(''/var/lib/postgresql/parquet/cgms23eur/weather_obs_grid/partitions/**/*.parquet'',
hive_partitioning=true)
');
where the parquet files have been created using pg_parquet extension:
COPY (SELECT idgrid,day,temperature_max,temperature_min,temperature_avg,vapourpressure,windspeed,precipitation,et0,radiation,rh,synchronized_bridge FROM cache_eur_cgms23.weather_obs_grid_2025) TO '/var/lib/postgresql/parquet/cgms23eur/weather_obs_grid/partitions/year=2025/data.parquet' (format 'parquet',compression 'gzip');
now if I want to query the view nmv.weather_obs_grid duckdb.raw_query works fine, duckdb.query simply not:
SELECT * from duckdb.raw_query('ATTACH ''/var/lib/postgresql/duck/nmv.db'' AS nmv;; SELECT idgrid,day,temperature_avg FROM nmv.weather_obs_grid limit 10;'); IS OK
while if I try:
SELECT * from duckdb.raw_query('ATTACH ''/var/lib/postgresql/duck/nmv.db'' AS nmv;');
and
SELECT * from duckdb.query('SELECT idgrid,day,temperature_avg FROM nmv.weather_obs_grid limit 10;');
ERROR: (PGDuckDB/Duckdb_ExecCustomScan_Cpp) Invalid Input Error: Attempting to execute an unsuccessful or closed pending query result
so the only way to get data from the local duck database is to capture and parse the output text of the raw query. This is not so efficient to get a table format using internal structure with C++ function duckdb.query but is a workaround.
First install plsh extension and create the following function:
CREATE OR REPLACE FUNCTION sysadmin."CaptureDuckdbRawQueryOutput"(
db text,
query text)
RETURNS SETOF text
LANGUAGE 'plsh'
COST 100
VOLATILE PARALLEL UNSAFE
ROWS 1000
AS$BODY$
$BODY$ ;
#!/bin/bash
psql -d yourdb -U postgres -q -c "SELECT duckdb.raw_query('ATTACH ''${1}'' AS nmv; ${2};')" 2>&1 | awk '/[ Rows:/{flag=1; next} /raw_query/{flag=0} flag && /^[[:space:]]*[0-9]/ {gsub(/\t/, "",""); print "{"" $0 ""}"}'
after you can invoke the function in this way:
WITH raw_output
AS (SELECT j AS row_text
FROM sysadmin."CaptureDuckdbRawQueryOutput"('/var/lib/postgresql/duck/nmv.db', 'SELECT idgrid,day,temperature_avg FROM nmv.weather_obs_grid limit 10') j
), lines
AS (SELECT regexp_split_to_table(row_text, E'\n') AS single_line
FROM raw_output
WHERE row_text IS NOT NULL AND row_text != ''
), parsed_arrays
AS (SELECT string_to_array(replace(trim(both '{}' FROM single_line),'"',''),',') AS row_data
FROM lines
WHERE single_line LIKE '{%}'
)
SELECT row_data[1]::BIGINT AS idgrid,
row_data[2]::DATE AS day,
row_data[3]::DECIMAL(3, 1) AS temperature_avg
FROM parsed_arrays;
Beta Was this translation helpful? Give feedback.
All reactions