fix(explore): prevent verbose_name from overriding column_name in filter resolution#38689
fix(explore): prevent verbose_name from overriding column_name in filter resolution#38689Mayankaggarwal8055 wants to merge 2 commits intoapache:masterfrom
Conversation
Code Review Agent Run #ed78e8Actionable Suggestions - 0Review Details
Bito Usage GuideCommands Type the following command in the pull request comment and save the comment.
Refer to the documentation for additional commands. Configuration This repository uses Documentation & Help |
Sequence DiagramThis PR updates query building so real column names are always the primary mapping keys, while verbose names are added only as non-conflicting aliases. This keeps filter resolution predictable and prevents verbose labels from overriding actual SQL column identifiers. sequenceDiagram
participant Client
participant QueryBuilder
participant ColumnMap
participant SQLBuilder
Client->>QueryBuilder: Build query with filter field
QueryBuilder->>ColumnMap: Register each column_name as primary key
QueryBuilder->>ColumnMap: Add verbose_name only when key is unused
QueryBuilder->>ColumnMap: Resolve filter field to mapped column
QueryBuilder->>SQLBuilder: Build filter using resolved real column
SQLBuilder-->>Client: Return query with correct filtering
Generated by CodeAnt AI |
Codecov Report❌ Patch coverage is
❌ Your project check has failed because the head coverage (99.92%) is below the target coverage (100.00%). You can increase the head coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## master #38689 +/- ##
==========================================
- Coverage 65.10% 64.31% -0.80%
==========================================
Files 1819 2532 +713
Lines 72670 129707 +57037
Branches 23222 29981 +6759
==========================================
+ Hits 47312 83416 +36104
- Misses 25358 44838 +19480
- Partials 0 1453 +1453
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
superset/models/helpers.py
Outdated
| # Add verbose_name ONLY if it doesn't conflict | ||
| for col in self.columns: | ||
| if col.verbose_name and col.verbose_name not in columns_by_name: | ||
| columns_by_name[col.verbose_name] = col |
There was a problem hiding this comment.
Suggestion: verbose_name aliases are added without checking metric names, so a verbose alias that matches a metric name will be resolved as a column first. This bypasses metric-filter handling and applies the filter in WHERE instead of HAVING, producing incorrect query results. Exclude aliases that collide with metric names when building columns_by_name. [logic error]
Severity Level: Major ⚠️
- ❌ HAVING metric filters downgraded into WHERE predicates.
- ⚠️ `POST /api/v1/chart/data` may return wrong aggregates.| # Add verbose_name ONLY if it doesn't conflict | |
| for col in self.columns: | |
| if col.verbose_name and col.verbose_name not in columns_by_name: | |
| columns_by_name[col.verbose_name] = col | |
| # Add verbose_name ONLY if it doesn't conflict | |
| metric_names = {metric.metric_name for metric in self.metrics} | |
| for col in self.columns: | |
| if ( | |
| col.verbose_name | |
| and col.verbose_name not in columns_by_name | |
| and col.verbose_name not in metric_names | |
| ): | |
| columns_by_name[col.verbose_name] = col |
Steps of Reproduction ✅
1. Call chart data execution via `POST /api/v1/chart/data` in
`superset/charts/data/api.py:199-260`, which builds `ChartDataCommand` and executes query
payloads.
2. Request execution flows through `ChartDataCommand.run()`
(`superset/commands/chart/data/get_data_command.py:40-48`) to
`QueryContextProcessor.get_payload()`
(`superset/common/query_context_processor.py:325-361`) and then `_get_full()`
(`superset/common/query_actions.py:153-161`) which calls
`query_context.get_df_payload(...)`.
3. `get_df_payload()` calls datasource query generation through `get_query_result()`
(`superset/common/query_context_processor.py:131,235-243`) and eventually
`ExploreMixin.get_sqla_query()` (`superset/models/helpers.py:2630+`).
4. In `get_sqla_query()`, `columns_by_name` includes `verbose_name` aliases
(`superset/models/helpers.py:2721-2724`). If a column verbose alias equals a metric name
(cross-type collision is allowed; only per-table uniqueness exists in
`superset/connectors/sqla/models.py:1` for columns and `:246` for metrics), then filter
resolution checks column first (`helpers.py:3015`) and metric second
(`helpers.py:3018-3027`), so `is_metric_filter` stays false.
5. The filter is then appended to `WHERE` (`helpers.py:3046-3048`) instead of `HAVING`,
producing semantically wrong SQL for metric filters (e.g., metric-name filter intended for
aggregate expression applied to base column predicate).Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** superset/models/helpers.py
**Line:** 2721:2724
**Comment:**
*Logic Error: `verbose_name` aliases are added without checking metric names, so a verbose alias that matches a metric name will be resolved as a column first. This bypasses metric-filter handling and applies the filter in `WHERE` instead of `HAVING`, producing incorrect query results. Exclude aliases that collide with metric names when building `columns_by_name`.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.…ns_by_name - Preserve column_name as source of truth - Add verbose_name only when no conflict with columns or metrics - Fix incorrect filter resolution caused by alias collision
What problem does this solve?
Filtering can break when a column has a
verbose_namethat conflicts with an existing column name.In the current implementation, both
column_nameandverbose_nameare added to thecolumns_by_namemapping. If averbose_namematches an existing key, it can override the actual column mapping. This leads to incorrect column resolution during query building, causing filters to fail or behave unexpectedly.What does this PR do?
column_nameremains the primary and authoritative key incolumns_by_nameverbose_nameas a secondary alias only when it does not conflict with existing keysWhy is this change needed?
column_nameis the true identifier used in SQL queries, whileverbose_nameis intended for display purposes. Allowingverbose_nameto overridecolumn_namebreaks this distinction and can result in invalid or missing filters.This change ensures predictable and correct filter behavior even when
verbose_nameis present.Testing
column_nameand non-conflictingverbose_nameRelated
Related to #38339
This addresses part of the issue where filter resolution fails due to mismatches between displayed labels and actual column identifiers. Additional handling may still be required for adhoc column labels.