fix(mcp): accept Superset vocabulary in chart configs and clarify query_dataset metric errors #40972
Draft
richardfogaca wants to merge 1 commit into
Codecov Report
❌ Patch coverage is
❌ Your project check has failed because the head coverage (99.95%) is below the target coverage (100.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## master #40972 +/- ##
==========================================
- Coverage 64.30% 63.71% -0.59%
==========================================
Files 2657 2657
Lines 144059 144101 +42
Branches 33216 33225 +9
==========================================
- Hits 92639 91817 -822
- Misses 49798 50660 +862
- Partials 1622 1624 +2
SUMMARY
LLM clients talking to the MCP service reliably reach for Superset's public vocabulary (`datasource_id`, `viz_type`, plugin viz names like `echarts_timeseries_bar`, form_data fields like `show_legend`) before consulting the MCP tool schema. Today each of those guesses is rejected by validation, and every rejection costs the client a full model round trip (and, in MCP hosts that gate non-read-only tools, a user approval prompt per retry).

We instrumented live agentic sessions against a workspace running this service and found that every `generate_chart` retry across all test conditions was the same request-shape guess, in a small set of recurring forms:

- `datasource_id` instead of `dataset_id` (Superset's REST vocabulary)
- `viz_type: "bar"` / `"dist_bar"` / `"echarts_timeseries_bar"` where the config union expects its `chart_type` discriminator (`xy`, `pie`, ...)
- `show_legend` (and a bool where `LegendConfig` is expected), `chart_orientation`
- bare strings where `ColumnRef` objects are expected (`"x_axis": "genre"`, `"y": ["count"]`)

A typical session burned 2–4 rejected `generate_chart` calls plus `get_chart_type_schema` consultations before converging, roughly doubling wall-clock time for a simple "create a bar chart" request.

Separately, `query_dataset`'s metric validation produced a misleading hint chain: on a dataset with no saved metrics, `Unknown metric: 'sum__global_sales'` carried no guidance, while the adjacent `order_by` error suggested `Did you mean: global_sales?` (a column, valid for `order_by`), which LLM clients then tried as a metric, failing again in a loop. The tool is saved-metrics-only by design, but nothing in the error said so.

This PR makes the validation layer absorb the unambiguous synonyms instead of refusing them, and makes the saved-metrics-only contract explicit:
Chart request schemas (`chart/schemas.py`)

- `ChartRequestNormalizerMixin` on `GenerateChartRequest`, `UpdateChartRequest`, `UpdateChartPreviewRequest`, and `GenerateExploreLinkRequest`:
  - `datasource_id` → `dataset_id` (explicit `dataset_id` wins)
  - `viz_type` key accepted as a `chart_type` synonym (explicit `chart_type` wins)
  - plugin viz names map to a chart type plus `kind`: `bar`/`dist_bar`/`echarts_timeseries_bar` → `xy` + `bar`, `echarts_timeseries_line` → `xy` + `line`, `big_number_total` → `big_number`, etc. An explicitly provided `kind` is never overridden.
- `XYChartConfig`:
  - `x` accepts a bare column-name string (matching the existing `group_by` coercion)
  - `y` accepts bare strings / a single entry (reuses `_normalize_group_by_input`)
  - `legend` gains a `show_legend` alias and accepts a bool (`True` → `{"show": true}`)
  - `orientation` gains a `chart_orientation` alias

The `UnknownFieldCheckMixin` "did you mean?" behavior is unchanged for genuinely unknown fields: `order_desc` still gets a clear rejection (covered by a regression test).

`query_dataset` metric errors (`dataset/tool/query_dataset.py`)

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
N/A (API validation behavior).
Before (live transcript excerpts):
After: each of those payloads validates on the first attempt.
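As a rough illustration of the alias handling described in the summary, the sketch below normalizes a raw request dict before validation. It is not the PR's code: `normalize_request`, `normalize_config`, and `VIZ_TYPE_MAP` are hypothetical names, the mapping covers only the viz names listed in the description, and the `{"name": ...}` shape for a column reference is assumed from the live-check payload.

```python
from __future__ import annotations

from typing import Any, Optional

# Hypothetical mapping: only the plugin viz names mentioned in the PR description.
VIZ_TYPE_MAP: dict[str, tuple[str, Optional[str]]] = {
    "bar": ("xy", "bar"),
    "dist_bar": ("xy", "bar"),
    "echarts_timeseries_bar": ("xy", "bar"),
    "echarts_timeseries_line": ("xy", "line"),
    "big_number_total": ("big_number", None),
}


def normalize_config(config: dict[str, Any]) -> dict[str, Any]:
    """Absorb unambiguous synonyms in a chart config (illustrative only)."""
    cfg = dict(config)

    # viz_type is a synonym for chart_type; an explicit chart_type wins.
    viz_type = cfg.pop("viz_type", None)
    if viz_type is not None and "chart_type" not in cfg:
        chart_type, kind = VIZ_TYPE_MAP.get(viz_type, (viz_type, None))
        cfg["chart_type"] = chart_type
        if kind is not None:
            # An explicitly provided kind is never overridden.
            cfg.setdefault("kind", kind)

    # Field aliases; the canonical field wins when both are present.
    for alias, field in (("show_legend", "legend"), ("chart_orientation", "orientation")):
        value = cfg.pop(alias, None)
        if value is not None and field not in cfg:
            cfg[field] = value

    # A bare bool for the legend becomes a config object.
    if isinstance(cfg.get("legend"), bool):
        cfg["legend"] = {"show": cfg["legend"]}

    # Bare column-name strings become column-reference objects (assumed shape).
    if isinstance(cfg.get("x"), str):
        cfg["x"] = {"name": cfg["x"]}
    y = cfg.get("y")
    if y is not None:
        items = y if isinstance(y, list) else [y]
        cfg["y"] = [{"name": item} if isinstance(item, str) else item for item in items]

    return cfg


def normalize_request(data: dict[str, Any]) -> dict[str, Any]:
    """Map datasource_id to dataset_id (explicit dataset_id wins), then fix the config."""
    out = dict(data)
    alias = out.pop("datasource_id", None)
    if alias is not None and "dataset_id" not in out:
        out["dataset_id"] = alias
    if isinstance(out.get("config"), dict):
        out["config"] = normalize_config(out["config"])
    return out
```

Running the normalization before schema validation, rather than widening the schema itself, would keep the canonical field names as the single validated shape; whether the PR structures it that way is not stated here.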
TESTING INSTRUCTIONS
24 new unit tests cover the alias/coercion paths (including precedence rules and the unknown-field regression); the full chart + dataset MCP suites pass (1006 tests).
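The unknown-field regression can be pictured with a small sketch: accepted aliases pass through, while a genuinely unknown key such as `order_desc` is still reported. This is an assumption-laden illustration, not the actual `UnknownFieldCheckMixin`; the field sets, the `check_unknown_fields` helper, and the use of `difflib` for suggestions are all hypothetical.

```python
from difflib import get_close_matches

# Hypothetical field sets for illustration; the real schemas define their own.
KNOWN_FIELDS = {"dataset_id", "chart_type", "kind", "x", "y", "legend", "orientation", "group_by"}
ACCEPTED_ALIASES = {"datasource_id", "viz_type", "show_legend", "chart_orientation"}


def check_unknown_fields(payload: dict) -> list:
    """Return one error message per key that is neither a known field nor an accepted alias."""
    errors = []
    for key in payload:
        if key in KNOWN_FIELDS or key in ACCEPTED_ALIASES:
            continue
        # Suggest the closest known field, if any is similar enough.
        suggestions = get_close_matches(key, sorted(KNOWN_FIELDS), n=1, cutoff=0.6)
        hint = f" Did you mean: {suggestions[0]}?" if suggestions else ""
        errors.append(f"Unknown field: '{key}'.{hint}")
    return errors
```

A regression test in this style would assert that alias keys produce no errors and that `order_desc` still produces exactly one.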
For a live check: call `generate_chart` with `{"datasource_id": <id>, "config": {"viz_type": "bar", "x_axis": "<column>", "y": [{"name": "<column>", "aggregate": "SUM"}], "show_legend": true}}`; it should produce a chart instead of three validation errors.

ADDITIONAL INFORMATION