@suddendust suddendust commented Nov 3, 2025

Background

Nested collections in Postgres already support GROUP BY on nested fields within JSONB documents using dot notation:

// Nested collection - GROUP BY on scalar field
Query.builder()
    .addAggregation(IdentifierExpression.of("props.seller.address.pincode"))
    .build();

// Nested collection - GROUP BY on array field (with UNNEST)
Query.builder()
    .addFromClause(UnnestExpression.of(IdentifierExpression.of("props.colors"), false))
    .addAggregation(IdentifierExpression.of("props.colors"))
    .build();

Problem

Flat Postgres collections did not support GROUP BY on nested JSONB fields. The FlatPostgresFieldTransformer was missing logic to handle JsonIdentifierExpression when fields were unnested, causing:

  1. Scalar fields: GROUP BY failed with "column does not exist" errors
  2. Array fields: GROUP BY grouped entire arrays instead of individual unnested elements

Solution

When you UNNEST a JSONB array field and then GROUP BY it, there are two ways the SQL can reference that field:

Without the Fix (JSONB Accessor - Wrong)

-- UNNEST creates a column alias
jsonb_array_elements("props" -> 'colors') AS p1(props_colors_encoded)

-- But GROUP BY was still using the JSONB accessor
GROUP BY "props" -> 'colors'

Problem: "props" -> 'colors' returns the entire array ["Blue", "Green"], not the individual unnested values.

Result: Groups by entire arrays → ["Blue", "Green"], ["Black"], ["Orange", "Blue"]
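
To make the difference concrete, here is a small in-memory sketch that mirrors the two grouping behaviours with plain Java streams. It does not touch Postgres or the document-store API; the class and method names are hypothetical, and the rows are just the three sample arrays from the result line above.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of the PR.
final class GroupByUnnestDemo {

  // Groups by the whole array value, analogous to GROUP BY "props" -> 'colors'.
  static Map<List<String>, Long> groupByWholeArray(List<List<String>> rows) {
    return rows.stream()
        .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
  }

  // Flattens every array first, analogous to grouping on the unnested column.
  // A TreeMap keeps the keys sorted so the printed output is deterministic.
  static Map<String, Long> groupByElement(List<List<String>> rows) {
    return rows.stream()
        .flatMap(List::stream)
        .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
  }

  public static void main(String[] args) {
    List<List<String>> rows =
        List.of(List.of("Blue", "Green"), List.of("Black"), List.of("Orange", "Blue"));

    // Three groups, one per distinct array (map ordering is unspecified).
    System.out.println(groupByWholeArray(rows));

    // prints {Black=1, Blue=2, Green=1, Orange=1}
    System.out.println(groupByElement(rows));
  }
}
```

Grouping on the whole array yields one bucket per distinct array, while grouping on the flattened elements counts "Blue" twice, which is the behaviour the UNNEST query is meant to produce.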

With the Fix (Direct Column Reference - Correct)

-- UNNEST creates a column alias
jsonb_array_elements("props" -> 'colors') AS p1(props_colors_encoded)

-- GROUP BY now references the unnested column directly
GROUP BY "props_colors_encoded"

The fix adds unnest-aware logic to FlatPostgresFieldTransformer.visit(JsonIdentifierExpression): it checks pgColMapping and, for fields that have been unnested, returns a direct column reference instead of a JSONB accessor.
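
The lookup described above can be sketched roughly as follows. This is a simplified illustration, not the actual FlatPostgresFieldTransformer code: the class name, the "column.path" key format, the single-level JSON path, and the plain Map standing in for pgColMapping are all assumptions made for the example.

```java
import java.util.Map;

// Hypothetical stand-in for the unnest-aware branch; not the PR's implementation.
final class UnnestAwareFieldResolver {

  // Assumed shape of the mapping: "column.path" -> alias created by UNNEST.
  private final Map<String, String> unnestedAliases;

  UnnestAwareFieldResolver(Map<String, String> unnestedAliases) {
    this.unnestedAliases = Map.copyOf(unnestedAliases);
  }

  // Returns the SQL fragment used to reference a JSONB field.
  String resolve(String column, String jsonPath) {
    String alias = unnestedAliases.get(column + "." + jsonPath);
    if (alias != null) {
      // Field was unnested: reference the alias column directly.
      return "\"" + alias + "\"";
    }
    // Field was not unnested: fall back to the JSONB accessor.
    return "\"" + column + "\" -> '" + jsonPath + "'";
  }

  public static void main(String[] args) {
    UnnestAwareFieldResolver resolver =
        new UnnestAwareFieldResolver(Map.of("props.colors", "props_colors_encoded"));

    // prints "props_colors_encoded"
    System.out.println(resolver.resolve("props", "colors"));

    // prints "props" -> 'brand'
    System.out.println(resolver.resolve("props", "brand"));
  }
}
```

The point of the sketch is only the branch: when a mapping entry exists for the field, the generated GROUP BY refers to the alias, otherwise it keeps the accessor form.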

Usage from Entity Service

GROUP BY on scalar JSONB field:

  Query.builder()
      .addSelection(JsonIdentifierExpression.of("props", "brand"))
      .addSelection(AggregateExpression.of(COUNT, ConstantExpression.of(1)), "count")
      .addAggregation(JsonIdentifierExpression.of("props", "brand"))
      .build();
  // Returns: [{props: {brand: "Dettol"}, count: 1}, {props: {brand: "Sunsilk"}, count: 1}, ...]

GROUP BY on JSONB array field (with UNNEST):

  Query.builder()
      .addSelection(JsonIdentifierExpression.of("props", "colors"), "color")
      .addSelection(AggregateExpression.of(COUNT, ConstantExpression.of(1)), "count")
      .addFromClause(UnnestExpression.of(JsonIdentifierExpression.of("props", "colors"), false))
      .addAggregation(JsonIdentifierExpression.of("props", "colors"))
      .build();
  // Returns: [{color: "Blue", count: 2}, {color: "Green", count: 1}, ...]

Flat collections now behave consistently with nested collections for GROUP BY operations on JSONB fields.

codecov bot commented Nov 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.44%. Comparing base (4192bde) to head (576d542).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main     #247      +/-   ##
============================================
+ Coverage     80.41%   80.44%   +0.02%     
  Complexity     1151     1151              
============================================
  Files           215      215              
  Lines          5495     5503       +8     
  Branches        486      487       +1     
============================================
+ Hits           4419     4427       +8     
  Misses          750      750              
  Partials        326      326              
Flag          Coverage Δ
integration   80.44% <100.00%> (+0.02%) ⬆️
unit          57.69% <45.45%> (-0.05%) ⬇️

@suddendust suddendust changed the title [Draft] Group by on nested jsonb arrays in flat collections Group by on nested jsonb arrays in flat collections Nov 4, 2025
@suddendust suddendust changed the title Group by on nested jsonb arrays in flat collections Support GROUP BY on Nested JSONB Fields in Flat Postgres Collections Nov 4, 2025
@suresh-prakash suresh-prakash merged commit a02a279 into hypertrace:main Nov 4, 2025
6 checks passed