Able to read many data types that exist in both Postgres and DuckDB. The following data types are currently supported for use in queries:

- Integer types (`integer`, `bigint`, etc.)
- Floating point types (`real`, `double precision`)
- `numeric` (might get converted to `double precision` internally, see known limitations below for details)
- `text`/`varchar`/`bpchar`
- `bit` related types, including both fixed and variable-sized bit arrays
- `bytea`/`blob`
- `timestamp`/`timestamptz`/`date`/`interval`/`timestamp_ns`/`timestamp_ms`/`timestamp_s`
- `boolean`
- `uuid`
- `json`/`jsonb`
- `domain`
- Arrays of all of the above types, but see limitations below about multi-dimensional arrays
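As a sketch of what this covers in practice (the table and column names below are illustrative assumptions, not taken from this document), a Postgres table built entirely from supported types can be queried through DuckDB:

```sql
-- Hypothetical table whose columns all use types from the list above
CREATE TABLE events (
    id bigint,
    price numeric(10, 2),
    name text,
    payload jsonb,
    created_at timestamptz,
    tags text[]
);

-- Queries against such a table can be executed by DuckDB
SELECT name, price, created_at
FROM events
WHERE id > 100 AND created_at > now() - interval '1 day';
```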
The following types are supported only within DuckDB queries and MotherDuck tables. These types cannot be stored in regular PostgreSQL tables, but can be used temporarily within queries (e.g., when using `duckdb.query()` or when DuckDB functions return such types):
- `struct` - Complex structured data type with named fields
- `map` - Key-value mapping type
- `union` - Type that can hold values of different types
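As a hedged sketch, such values can be produced inside a `duckdb.query()` call (the struct and map literals below use DuckDB's own syntax; the field names are made up):

```sql
-- struct and map values are created inside the DuckDB query; they can be
-- selected and inspected, but not stored in a regular Postgres table
SELECT r['s'], r['m']
FROM duckdb.query($$
    SELECT {'name': 'DuckDB', 'stars': 10} AS s,
           map([1, 2], ['one', 'two']) AS m
$$) r;
```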
The type support in pg_duckdb is not yet complete (and might never be). The
following are known issues that you might run into. Feel free to contribute PRs
to fix these limitations:
- `enum` types are not supported (PR in progress)
- The DuckDB `decimal` type doesn't support the wide range of values that the Postgres `numeric` type does. To avoid errors when converting between the two, `numeric` is converted to `double precision` internally if DuckDB does not support the required precision. Obviously this might cause precision loss of the values.
- The DuckDB `timestamp_ns` type gets truncated to microseconds when it is converted to the Postgres `timestamp` type, which loses precision in the output. Operations on a `timestamp_ns` value, such as sorting/grouping/comparing, will use the full precision.
- `jsonb` columns are converted to `json` columns when reading from DuckDB. This is because DuckDB does not have a `jsonb` type.
- Many Postgres `json` and `jsonb` functions and operators are not implemented in DuckDB. Instead you can use the DuckDB json functions and operators. See the DuckDB documentation for more information on these functions.
- The DuckDB `tinyint` type is converted to a `char` type in Postgres, because Postgres does not have a `tinyint` type. This causes it to be displayed as a hex code instead of a regular number.
- Conversion between Postgres multi-dimensional arrays and nested DuckDB `LIST`s can run into various problems, because neither database supports exactly what the other does. Specifically, in Postgres it's allowed for different arrays in a column to have a different number of dimensions, e.g. `[1]` and `[[1], [2]]` can both occur in the same column. In DuckDB that's not allowed, i.e. the amount of nesting should always be the same. On the other hand, in DuckDB it's valid for different lists at the same nest-level to contain a different number of elements, e.g. `[[1], [1, 2]]`, which is not allowed in Postgres. So conversion between these types is only possible when the arrays fall within the subset that both databases support. Another possible problem is that pg_duckdb uses the Postgres column metadata to determine the number of dimensions that an array has. Since Postgres doesn't complain when you add arrays of different dimensions, the number of dimensions in the column metadata may not match the actual number of dimensions. To solve this you need to alter the column type:

  ```sql
  -- This configures the column to be a 3-dimensional array of text
  ALTER TABLE s ALTER COLUMN a SET DATA TYPE text[][][];
  ```
- For `domain` types, the `domain` check during an INSERT operation is performed by PostgreSQL rather than DuckDB. When a SELECT queries a field whose type is a `domain`, pg_duckdb converts it to the corresponding base type and lets DuckDB handle it.
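For example, where a Postgres `jsonb` operator is not implemented in DuckDB, a DuckDB json function can often be used instead (a sketch only; `json_extract_string` is a DuckDB json function, and the field name is invented):

```sql
-- Extract a string field with DuckDB's json_extract_string instead of
-- the Postgres ->> operator
SELECT json_extract_string(r['payload'], '$.name')
FROM duckdb.query($$ SELECT '{"name": "DuckDB"}'::json AS payload $$) r;
```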
pg_duckdb introduces a few special Postgres types. You shouldn't create these types explicitly and normally you don't need to know about their existence, but they might show up in error messages from Postgres. These are explained below:
The `duckdb.row` type is returned by functions like `read_parquet`, `read_csv`, `scan_iceberg`, etc. Depending on the arguments of these functions they can return rows with different columns and types. Postgres doesn't support such functions well at this point in time, so for now we return a custom type from them. To then be able to get the actual columns out of these rows you have to use the "square bracket indexing" syntax, similar to how you would extract a field from a `json` value:
```sql
SELECT r['id'], r['name'] FROM read_parquet('file.parquet') r WHERE r['age'] > 21;
```

Using `SELECT *` will result in the columns of this row being expanded, so your query result will never have a column that has `duckdb.row` as its type:
```sql
SELECT * FROM read_parquet('file.parquet');
```

Due to limitations in Postgres, there are some limitations when using a function that returns a `duckdb.row` in a CTE or subquery. The main problem is that pg_duckdb cannot automatically assign useful aliases to the selected columns from the row. So while this query without a CTE/subquery returns the `r['company']` column as `company`:
```sql
SELECT r['company']
FROM duckdb.query($$ SELECT 'DuckDB Labs' company $$) r;
-- company
-- ─────────────
-- DuckDB Labs
```

The same query in a subquery or CTE will return the column simply as `r`:
```sql
WITH mycte AS (
    SELECT r['company']
    FROM duckdb.query($$ SELECT 'DuckDB Labs' company $$) r
)
SELECT * FROM mycte;
-- r
-- ─────────────
-- DuckDB Labs
```

This is easy to work around by adding an explicit alias to the column in the CTE/subquery:
```sql
WITH mycte AS (
    SELECT r['company'] AS company
    FROM duckdb.query($$ SELECT 'DuckDB Labs' company $$) r
)
SELECT * FROM mycte;
-- company
-- ─────────────
-- DuckDB Labs
```

Another limitation that can be similarly confusing is that when using `SELECT *` inside the CTE/subquery, if you want to reference a specific column outside the CTE/subquery, you still need to use the `r['colname']` syntax instead of simply `colname`. So while this works as expected:
```sql
WITH mycte AS (
    SELECT *
    FROM duckdb.query($$ SELECT 'DuckDB Labs' company $$) r
)
SELECT * FROM mycte;
-- company
-- ─────────────
-- DuckDB Labs
```

The following query will throw an error:
```sql
WITH mycte AS (
    SELECT *
    FROM duckdb.query($$ SELECT 'DuckDB Labs' company $$) r
)
SELECT * FROM mycte WHERE company = 'DuckDB Labs';
-- ERROR: 42703: column "company" does not exist
-- LINE 5: SELECT * FROM mycte WHERE company = 'DuckDB Labs';
--                                   ^
-- HINT: If you use DuckDB functions like read_parquet, you need to use the r['colname'] syntax to use columns. If you're already doing that, maybe you forgot to give the function the r alias.
```

This is easy to work around by using the `r['colname']` syntax like so:
```sql
WITH mycte AS (
    SELECT *
    FROM duckdb.query($$ SELECT 'DuckDB Labs' company $$) r
)
SELECT * FROM mycte WHERE r['company'] = 'DuckDB Labs';
-- company
-- ─────────────
-- DuckDB Labs
```

The `duckdb.unresolved_type` type is used to make Postgres understand an expression for which the type is not known at query parse time. This is the type of any column extracted from a `duckdb.row` using the `r['mycol']` syntax. Many operators and aggregates will return a `duckdb.unresolved_type` when one of the sides of the operator is of the type `duckdb.unresolved_type`, for instance `r['age'] + 10`.
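To illustrate (the file and column names are assumptions for the example):

```sql
-- r['age'] has type duckdb.unresolved_type while Postgres parses the
-- query; DuckDB fills in the real type during execution
SELECT r['name'], r['age'] + 10 AS age_in_ten_years
FROM read_parquet('people.parquet') r
WHERE r['age'] > 21;
```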
Once the query gets executed by DuckDB, the actual type will be filled in by DuckDB. So a query result will never contain a column that has `duckdb.unresolved_type` as its type, and generally you shouldn't even realize that this type exists.
You might get errors saying that functions or operators don't exist for the `duckdb.unresolved_type`, such as:

```
ERROR: function some_func(duckdb.unresolved_type) does not exist
LINE 6: some_func(r['somecol']) as somecol
```

In such cases a simple workaround is often to add an explicit cast to the type that the function accepts, such as `some_func(r['somecol']::text) as somecol`. If this happens for builtin Postgres functions, please open an issue on the pg_duckdb repository, so that we can consider adding explicit support for these functions.
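A sketch of that cast workaround, reusing the placeholder names from the error above:

```sql
-- The explicit ::text cast resolves the type at parse time, so Postgres
-- can find a matching signature for some_func (a placeholder function)
SELECT some_func(r['somecol']::text) AS somecol
FROM read_parquet('file.parquet') r;
```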
The `duckdb.json` type is used for the arguments of DuckDB JSON functions. This type exists so that these functions can take values of type `json`, `jsonb`, and `duckdb.unresolved_type`.
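As an illustrative sketch (assuming a DuckDB json function such as `json_extract` is exposed to Postgres queries; the table and column names are invented):

```sql
-- A jsonb column can be passed directly where a duckdb.json argument is
-- expected, without an explicit cast
SELECT json_extract(payload, '$.name') FROM events;
```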