Skip to content

steve-chavez/pg_csv

 
 

Repository files navigation

pg_csv

PostgreSQL version Coverage Status Tests

Postgres has CSV support on the COPY command, but COPY has problems:

  • It uses a special protocol, so it doesn't work with other standard features like prepared statements, pipeline mode or pgbench.
  • Is not composable. You can't use COPY inside CTEs, subqueries, view definitions or as function arguments.

pg_csv offers flexible CSV processing as a solution.

  • Includes a CSV aggregate that composes with SQL expressions.
  • Native C extension, x2 times faster than SQL queries that try to output CSV (see our CI results).
  • No dependencies except Postgres.

Installation

PostgreSQL >= 12 is supported. Clone this repo and run:

make && make install

To install the extension:

create extension pg_csv;

csv_agg

Aggregate that builds a CSV respecting RFC 4180, quoting as required.

create table projects as
select *
from (
  values
    (1, 'Death Star OS', 1),
    (2, 'Windows 95 Rebooted', 1),
    (3, 'Project "Comma,Please"', 2),
    (4, 'Escape ""Plan""', 2),
    (NULL, 'NULL & Void', NULL)
) as _(id, name, client_id);
select csv_agg(x) from projects x;
            csv_agg
--------------------------------
 id,name,client_id             +
 1,Death Star OS,1             +
 2,Windows 95 Rebooted,1       +
 3,"Project ""Comma,Please""",2+
 4,"Escape """"Plan""""",2     +
 ,NULL & Void,
(1 row)

Custom Delimiter

Custom delimiters can be used to produce different formats like pipe-separated values, tab-separated values or semicolon-separated values.

select csv_agg(x, csv_options(delimiter := '|')) from projects x;
           csv_agg
-----------------------------
 id|name|client_id          +
 1|Death Star OS|1          +
 2|Windows 95 Rebooted|1    +
 3|Open Source Lightsabers|2+
 4|Galactic Payroll System|2+
 7|Bugzilla Revival|3
(1 row)

select csv_agg(x, csv_options(delimiter := E'\t')) from projects x;
              csv_agg
-----------------------------------
 id      name    client_id        +
 1       Death Star OS   1        +
 2       Windows 95 Rebooted     1+
 3       Open Source Lightsabers 2+
 4       Galactic Payroll System 2+
 7       Bugzilla Revival        3
(1 row)

Note

  • Newline, carriage return and double quotes are not supported as delimiters to maintain the integrity of the separated values format.
  • The delimiter can only be a single char, if a longer string is specified only the first char will be used.
  • Why use a csv_options constructor function instead of extra arguments? Aggregates don't support named arguments in postgres, see a discussion on PostgREST#2 (comment).

BOM

You can include a byte-order mark (BOM) to make the CSV compatible with Excel.

select csv_agg(x, csv_options(bom := true)) from projects x;

      csv_agg
-------------------
id,name,client_id+
 1,Death Star OS,1
 2,Windows 95 Rebooted,1
 3,Open Source Lightsabers,2
 4,Galactic Payroll System,2
 5,Bugzilla Revival,3
(1 row)

Header

You can omit or include the CSV header.

select csv_agg(x, csv_options(header := false)) from projects x;

           csv_agg
-----------------------------
 1,Death Star OS,1          +
 2,Windows 95 Rebooted,1    +
 3,Open Source Lightsabers,2+
 4,Galactic Payroll System,2+
 7,Bugzilla Revival,3
(1 row)

Null string

NULL values are represented by an empty string by default. This can be changed with the nullstr option.

SELECT csv_agg(x, csv_options(nullstr:='<NULL>')) AS body
FROM   projects x;

              body
--------------------------------
 id,name,client_id             +
 1,Death Star OS,1             +
 2,Windows 95 Rebooted,1       +
 3,"Project ""Comma,Please""",2+
 4,"Escape """"Plan""""",2     +
 <NULL>,NULL & Void,<NULL>
(1 row)

Limitations

  • For large bulk exports and imports, COPY ... CSV should still be preferred as its faster due to streaming support.

About

CSV aggregate for Postgres

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C 59.5%
  • PLpgSQL 18.4%
  • Makefile 13.7%
  • Nix 8.4%