feat: add countdistinct() aggregate function by datavaultbuilder · Pull Request #4957 · PostgREST/postgrest

datavaultbuilder · 2026-05-26T10:34:05Z

Use Case: If I want to know the customers from the order table I need to do a count distinct

Solution: adding new count(distinct field) as countdistinct() and not as count(distinct). I took decision not to need to import params in the aggregates. But if you think count(distinct) would have been better: it would be about 10-15 lines more code and possible. But as I'm not familiar with the code the change looked more risky with count(distinct) as I couldn't foresee if I could potentially create some performance or memory issue.

Tested to work, that is is gated if aggs are off. tested against collisions with tables and columns named countdistincts.

Returns based on postgres rules name count(). (i.e. not countdistinct)

Added tests for the different cases if the amounts are correct.

Docs Updated.

This is my first PR so I'm open for feedback.

added count distinct as countdistinct agg. tested to work, that is is gated if aggs are off. tested against collisions with tables and columns named countdistincts. Returns based on postgres rules name count().

steve-chavez · 2026-05-26T18:34:39Z

Motivation on #915 (comment). Please check if you've followed https://github.com/PostgREST/postgrest/blob/main/CONTRIBUTING.md#ai-policy.

Added count(distinct) as

Added count(distinct) as countdistinct aggregate. Tested to work, and that it is gated when aggregates are off. Tested against collisions with tables and columns named countdistinct. Returns type based on Postgres rules for count(). Extended test coverage to paths including spread path.

Added count(distinct) as countdistinct aggregate. Tested to work, and that it is gated when aggregates are off. Tested against collisions with tables and columns named countdistinct. Returns type based on Postgres rules for count().

wolfgangwalther · 2026-05-30T12:21:51Z

adding new count(distinct field) as countdistinct()

So, the reason we are limited to the few aggregates as avg, min, max, sum and count right now is, I believe, not that we'd never want to allow others - but more that we haven't found a good way to solve some of the challenges around these, yet.

In the general case, the distinct modifier is useful for other aggregates than count, too. With that in mind, I think making this part of the aggregate "name" is the wrong way to do it.

I'm not sure whether it would work to implement this as a "modifier" as count.distinct or count().distinct or so?

datavaultbuilder · 2026-05-31T14:25:33Z

Hi Wolfgang

Thank you for your feedback.

I'm really open what you think is usefull to rewrite the code. That is fast done. Just tell me what fits better in the general architecture.
Even I have never found a use case for sum distinct and avg distinct in some database they were implemented - including Postgres. So could be that somebody is interested in future to implement them but I would do that only if somebody explains what it is used in the real world for so I can add as well a meaningful test case.

Just a gut fealing about: count.distinct or count().distinct => they would be still different with the sql count(distinct ). For the first one would it be: count.distinct() ? For the second: count().distinct => just personal preference: we would then have 2 cases: one with modifier and one without - not impossible but would opt probably for count.distinct().

Just tell me what the prefered way is and I update my PR.

Regards
Petr

steve-chavez · 2026-06-01T15:28:07Z

I'm not sure whether it would work to implement this as a "modifier" as count.distinct or count().distinct or so?
but would opt probably for count.distinct().

We already have a syntax for operators modifiers which unfortunately does not seem usable for aggregates modifiers. IMO count.distinct() seems like the better of choice of the above but that would be inconsistent.

Another idea could be using count~ for count distinct, we're still not using the url safe tilde ~. Before I thought about using it as a shorthand for negated operators (~eq instead of not.eq) but maybe it can fit for this case too?

wolfgangwalther · 2026-06-01T18:32:40Z

Another idea could be using count~ for count distinct, we're still not using the url safe tilde ~. Before I thought about using it as a shorthand for negated operators (~eq instead of not.eq) but maybe it can fit for this case too?

I'm opposed to that. A tilde is really just something, but imho not semantically connected to the distinct operation. It will be hard to understand - and additionally it's limiting, in case we'll want to support other modifiers.

We already have a syntax for operators modifiers which unfortunately does not seem usable for aggregates modifiers. IMO count.distinct() seems like the better of choice of the above but that would be inconsistent.

I think a better analogy for the distinct operation is actually !inner. Here's why: We have long thought about aggregated function calls and resource embeddings as a very similar thing. We made a huge step forward with computed relationships. (note: embeddings, computed or not, always have an aggregation step as well...).

What !inner does is bad - because it also modifies the top-level rows. We had discussed elsewhere that we should have ideally done it differently.

But could we make count!distinct() work?

steve-chavez · 2026-06-01T18:49:32Z

But could we make count!distinct() work?

Agree with count!distinct(), now it's consistent.

What !inner does is bad - because it also modifies the top-level rows. We had discussed elsewhere that we should have ideally done it differently.

The idea was to replace !inner with null filtering on embeds, but IIRC there were some incompatibilities.

wolfgangwalther · 2026-06-26T19:58:25Z

Marking as draft until the feedback has been addressed and/or it has been confirmed that the count!distinct() syntax will actually work.

adde count(distinct) as countdistinct aggregate

6d12f40

added count distinct as countdistinct agg. tested to work, that is is gated if aggs are off. tested against collisions with tables and columns named countdistincts. Returns based on postgres rules name count().

datavaultbuilder added 3 commits May 27, 2026 19:25

feat: add count(distinct) as countdistinct aggregate

24f7d4d

Added count(distinct) as

feat: add count(distinct) as countdistinct aggregate

3142d4c

Added count(distinct) as countdistinct aggregate. Tested to work, and that it is gated when aggregates are off. Tested against collisions with tables and columns named countdistinct. Returns type based on Postgres rules for count().

datavaultbuilder force-pushed the feat/countdistinct-aggregate branch from 233a4f5 to 3142d4c Compare May 27, 2026 17:58

wolfgangwalther marked this pull request as draft June 26, 2026 19:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

feat: add countdistinct() aggregate function #4957

feat: add countdistinct() aggregate function #4957
datavaultbuilder wants to merge 4 commits into
PostgREST:mainfrom
datavaultbuilder:feat/countdistinct-aggregate

datavaultbuilder commented May 26, 2026

Uh oh!

steve-chavez commented May 26, 2026

Uh oh!

wolfgangwalther commented May 30, 2026

Uh oh!

datavaultbuilder commented May 31, 2026

Uh oh!

steve-chavez commented Jun 1, 2026

Uh oh!

wolfgangwalther commented Jun 1, 2026

Uh oh!

steve-chavez commented Jun 1, 2026

Uh oh!

wolfgangwalther commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants

Uh oh!

Uh oh!

Conversation

datavaultbuilder commented May 26, 2026

Uh oh!

steve-chavez commented May 26, 2026

Uh oh!

wolfgangwalther commented May 30, 2026

Uh oh!

datavaultbuilder commented May 31, 2026

Uh oh!

steve-chavez commented Jun 1, 2026

Uh oh!

wolfgangwalther commented Jun 1, 2026

Uh oh!

steve-chavez commented Jun 1, 2026

Uh oh!

wolfgangwalther commented Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

3 participants