Skip to content

PostgREST/pg_csv

Repository files navigation

pg_csv

PostgreSQL version Coverage Status Tests

Postgres has CSV support on the COPY command, but COPY has problems:

  • It uses a special protocol, so it doesn't work with other standard features like prepared statements, pipeline mode or pgbench.
  • Is not composable. You can't use COPY inside CTEs, subqueries, view definitions or as function arguments.

pg_csv offers flexible CSV processing as a solution.

  • Includes a CSV aggregate that composes with SQL expressions.
  • Native C extension, x2 times faster than SQL queries that try to output CSV (see our CI results).
  • No dependencies except Postgres.

Installation

Clone this repo and run:

make && make install

To install the extension:

create extension pg_csv;

csv_agg

Aggregate that builds a CSV respecting RFC 4180, quoting as required.

create table projects as
select *
from (
  values
    (1, 'Death Star OS', 1),
    (2, 'Windows 95 Rebooted', 1),
    (3, 'Project "Comma,Please"', 2),
    (4, 'Escape ""Plan""', 2),
    (NULL, 'NULL & Void', NULL)
) as _(id, name, client_id);
select csv_agg(x) from projects x;
            csv_agg
--------------------------------
 id,name,client_id             +
 1,Death Star OS,1             +
 2,Windows 95 Rebooted,1       +
 3,"Project ""Comma,Please""",2+
 4,"Escape """"Plan""""",2     +
 ,NULL & Void,
(1 row)

Custom Delimiter

Custom delimiters can be used to produce different formats like pipe-separated values, tab-separated values or semicolon-separated values.

select csv_agg(x, csv_options(delimiter := '|')) from projects x;
           csv_agg
-----------------------------
 id|name|client_id          +
 1|Death Star OS|1          +
 2|Windows 95 Rebooted|1    +
 3|Open Source Lightsabers|2+
 4|Galactic Payroll System|2+
 7|Bugzilla Revival|3
(1 row)

select csv_agg(x, csv_options(delimiter := E'\t')) from projects x;
              csv_agg
-----------------------------------
 id      name    client_id        +
 1       Death Star OS   1        +
 2       Windows 95 Rebooted     1+
 3       Open Source Lightsabers 2+
 4       Galactic Payroll System 2+
 7       Bugzilla Revival        3
(1 row)

Note

  • Newline, carriage return and double quotes are not supported as delimiters to maintain the integrity of the separated values format.
  • The delimiter can only be a single char, if a longer string is specified only the first char will be used.
  • Why use a csv_options constructor function instead of extra arguments? Aggregates don't support named arguments in postgres, see a discussion on #2 (comment).

BOM

You can include a byte-order mark (BOM) to make the CSV compatible with Excel.

select csv_agg(x, csv_options(bom := true)) from projects x;

      csv_agg
-------------------
id,name,client_id+
 1,Death Star OS,1
 2,Windows 95 Rebooted,1
 3,Open Source Lightsabers,2
 4,Galactic Payroll System,2
 5,Bugzilla Revival,3
(1 row)

Header

You can omit or include the CSV header.

select csv_agg(x, csv_options(header := false)) from projects x;

           csv_agg
-----------------------------
 1,Death Star OS,1          +
 2,Windows 95 Rebooted,1    +
 3,Open Source Lightsabers,2+
 4,Galactic Payroll System,2+
 7,Bugzilla Revival,3
(1 row)

Null string

NULL values are represented by an empty string by default. This can be changed with the nullstr option.

SELECT csv_agg(x, csv_options(nullstr:='<NULL>')) AS body
FROM   projects x;

              body
--------------------------------
 id,name,client_id             +
 1,Death Star OS,1             +
 2,Windows 95 Rebooted,1       +
 3,"Project ""Comma,Please""",2+
 4,"Escape """"Plan""""",2     +
 <NULL>,NULL & Void,<NULL>
(1 row)

Limitations

  • For large bulk exports and imports, COPY ... CSV should still be preferred as its faster due to streaming support.