Skip to content

feat(mcp): add update_dataset_metric tool for editing saved dataset metrics#40975

Open
msyavuz wants to merge 1 commit into
apache:masterfrom
msyavuz:msyavuz/feat/mcp-update-dataset-metric
Open

feat(mcp): add update_dataset_metric tool for editing saved dataset metrics#40975
msyavuz wants to merge 1 commit into
apache:masterfrom
msyavuz:msyavuz/feat/mcp-update-dataset-metric

Conversation

@msyavuz

@msyavuz msyavuz commented Jun 11, 2026

Copy link
Copy Markdown
Member

SUMMARY

MCP exposes saved dataset metrics read-only via get_dataset_info, but there is no write path — editing a metric's SQL expression, name, label, or format requires the UI. This adds an update_dataset_metric tool: given a dataset (ID/UUID) and a metric (ID/UUID/name), it partially updates the metric's properties (expression, metric_name, verbose_name, description, d3format, metric_type, currency, warning_text, extra).

Design notes:

  • A dedicated tool rather than a generic dataset-PUT: DatasetDAO.update_metrics is full-replacement (metrics omitted from the array are deleted), which is an easy foot-gun for LLM callers. The tool rebuilds the full metrics list internally — untouched metrics as {id, metric_name} stubs, the target with only the explicitly-passed properties merged on top — so it can never create or delete metrics.
  • Goes through UpdateDatasetCommand, which enforces dataset ownership and validates name uniqueness and the SQL expression (validate_stored_expression).
  • exclude_unset semantics distinguish "not provided" (keep stored value) from explicit null (clear it).
  • Also extracts the int-or-UUID dataset resolver from query_dataset into a shared dataset_utils.resolve_dataset.

Example call:

{"dataset_id": 123, "metric": "sum_revenue", "expression": "SUM(net_revenue)", "d3format": "$,.2f"}

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — MCP tool, no UI.

TESTING INSTRUCTIONS

  • pytest tests/unit_tests/mcp_service/dataset/ (97 tests)
  • Manually: connect an MCP client, call update_dataset_metric with a dataset ID and metric name, confirm the change in the dataset editor and that other metrics are untouched.

ADDITIONAL INFORMATION

  • Has associated issue: No
  • Required feature flags: None
  • Changes UI: No
  • Includes DB Migration: No
  • Introduces new feature or API: new MCP tool update_dataset_metric
  • Removes existing feature or API: No

@dosubot dosubot Bot added the data:dataset Related to dataset configurations label Jun 11, 2026
Comment on lines +25 to +44
def resolve_dataset(
identifier: int | str, eager_options: list[Any] | None = None
) -> Any | None:
"""Resolve a dataset by int ID or UUID string.

Replicates the identifier resolution logic from ModelGetInfoCore._find_object().
"""
from superset.daos.dataset import DatasetDAO

opts = eager_options or None

if isinstance(identifier, int):
return DatasetDAO.find_by_id(identifier, query_options=opts)

# Try parsing as int
try:
id_val = int(identifier)
return DatasetDAO.find_by_id(id_val, query_options=opts)
except (ValueError, TypeError):
pass

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Restrict dataset lookup inputs to UUID-only identifiers and remove integer-ID resolution paths so this new helper does not expose internal numeric IDs through a public MCP-facing flow. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This new helper explicitly accepts and resolves integer IDs before UUIDs, which exposes internal numeric identifiers through a public MCP-facing lookup path.
That matches the custom rule about not introducing internal integer IDs when UUID-based identifiers are being used.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/dataset_utils.py
**Line:** 25:44
**Comment:**
	*Custom Rule: Restrict dataset lookup inputs to UUID-only identifiers and remove integer-ID resolution paths so this new helper does not expose internal numeric IDs through a public MCP-facing flow.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +62 to +70
def _metric_not_found_message(metrics: list[Any], identifier: int | str) -> str:
names = [m.metric_name for m in metrics]
msg = f"Metric '{identifier}' not found on this dataset."
if not names:
return f"{msg} This dataset has no saved metrics."
suggestions = difflib.get_close_matches(str(identifier), names, n=3, cutoff=0.6)
if suggestions:
return f"{msg} Did you mean: {', '.join(suggestions)}?"
return f"{msg} Available metrics: {', '.join(sorted(names))}."

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add a concise docstring to this newly added helper function so it is documented consistently with the other new functions in the file. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This is a newly added Python helper function and it does not include a docstring. The custom rule requires new functions and classes to be documented inline, so the suggestion accurately identifies a real violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/tool/update_dataset_metric.py
**Line:** 62:70
**Comment:**
	*Custom Rule: Add a concise docstring to this newly added helper function so it is documented consistently with the other new functions in the file.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +39 to +40
def mcp_server():
return mcp

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add an explicit return type annotation to this fixture function so new Python code is fully typed. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This is a new Python function in the PR and it has no return type annotation.
The rule requires new Python code to be fully typed, so this is a real violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py
**Line:** 39:40
**Comment:**
	*Custom Rule: Add an explicit return type annotation to this fixture function so new Python code is fully typed.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎



@pytest.fixture(autouse=True)
def mock_auth():

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Annotate the return type of this yielding fixture (for example as a generator type) to satisfy the full type-hint requirement for new functions. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This new fixture function lacks a return type annotation. Under the typing rule for new Python code, that omission is a genuine violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py
**Line:** 44:44
**Comment:**
	*Custom Rule: Annotate the return type of this yielding fixture (for example as a generator type) to satisfy the full type-hint requirement for new functions.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +56 to +61
def make_metric(
metric_id=10,
metric_name="count",
uuid="a1b2c3d4-5678-90ab-cdef-1234567890ab",
**overrides,
):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add type hints for all parameters and the return value of this helper to comply with the fully typed new-code rule. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The helper is newly added and its parameters and return value are untyped.
That directly violates the requirement that new Python code be fully typed.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py
**Line:** 56:61
**Comment:**
	*Custom Rule: Add type hints for all parameters and the return value of this helper to comply with the fully typed new-code rule.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

return metric


def make_dataset(dataset_id=1, metrics=None):

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Add parameter and return type annotations to this helper function so it meets the typing requirement for newly added functions. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This new helper function has untyped parameters and no return annotation, so it does violate the new-code typing rule.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py
**Line:** 79:79
**Comment:**
	*Custom Rule: Add parameter and return type annotations to this helper function so it meets the typing requirement for newly added functions.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@netlify

netlify Bot commented Jun 11, 2026

Copy link
Copy Markdown

Deploy Preview for superset-docs-preview ready!

Name Link
🔨 Latest commit 2311482
🔍 Latest deploy log https://app.netlify.com/projects/superset-docs-preview/deploys/6a2aca3cabf1050008c4b3b0
😎 Deploy Preview https://deploy-preview-40975--superset-docs-preview.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.
🤖 Make changes Run an agent on this branch

To edit notification comments on pull requests, go to your Netlify project configuration.

Comment on lines +520 to +524
dataset_id: int | str = Field(
...,
description="Dataset identifier — numeric ID or UUID string. "
"Use list_datasets to find valid IDs.",
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Restrict this request identifier to UUID-only input (or a dedicated UUID field) so the new public API does not continue accepting internal integer IDs. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This is a new public API request field that explicitly accepts numeric internal IDs in addition to UUIDs. The rule says new public API identifiers should not expose or depend on internal integer IDs when UUID-based identifiers are being added or changed. This is a real violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/schemas.py
**Line:** 520:524
**Comment:**
	*Custom Rule: Restrict this request identifier to UUID-only input (or a dedicated UUID field) so the new public API does not continue accepting internal integer IDs.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +525 to +529
metric: int | str = Field(
...,
description="Metric to update — numeric metric ID, metric UUID, or "
"metric_name (e.g. 'sum_revenue'). Numeric strings are treated as IDs. "
"Use get_dataset_info to discover a dataset's saved metrics.",

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Remove numeric metric ID support from this API identifier and require UUID (or non-internal name-only lookup) to avoid exposing or depending on internal integer IDs in the new interface. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This request schema explicitly allows a numeric metric ID and even treats numeric strings as IDs, which exposes and depends on internal integer identifiers in a new public interface. That matches the rule violation.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/schemas.py
**Line:** 525:529
**Comment:**
	*Custom Rule: Remove numeric metric ID support from this API identifier and require UUID (or non-internal name-only lookup) to avoid exposing or depending on internal integer IDs in the new interface.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +592 to +593
id: int | None = Field(None, description="Metric ID")
uuid: str | None = Field(None, description="Metric UUID")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Drop the integer metric ID from the response model and expose only the UUID-based identifier for this newly added metric detail payload. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

The response model includes an integer metric ID alongside a UUID field. Since this is a newly added public API payload and a UUID identifier is being introduced, exposing the internal integer ID violates the rule.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/schemas.py
**Line:** 592:593
**Comment:**
	*Custom Rule: Drop the integer metric ID from the response model and expose only the UUID-based identifier for this newly added metric detail payload.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

class UpdateDatasetMetricResponse(BaseModel):
"""Response schema for update_dataset_metric."""

dataset_id: int | None = Field(None, description="Dataset ID")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Replace this integer dataset identifier field with a UUID-based public identifier in the new response schema to avoid exposing internal numeric IDs. [custom_rule]

Severity Level: Minor ⚠️

Why it matters? 🤔

This response schema exposes an internal integer dataset ID in a new public API response. That is exactly the kind of identifier the rule says to avoid when UUID-based identifiers are being added or changed.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/schemas.py
**Line:** 604:604
**Comment:**
	*Custom Rule: Replace this integer dataset identifier field with a UUID-based public identifier in the new response schema to avoid exposing internal numeric IDs.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@codecov

codecov Bot commented Jun 11, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 36.11111% with 92 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.27%. Comparing base (6c5ad1e) to head (2311482).
⚠️ Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
.../mcp_service/dataset/tool/update_dataset_metric.py 16.66% 70 Missing ⚠️
superset/mcp_service/dataset/dataset_utils.py 20.00% 12 Missing ⚠️
superset/mcp_service/dataset/schemas.py 78.57% 9 Missing ⚠️
superset/mcp_service/dataset/tool/query_dataset.py 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #40975      +/-   ##
==========================================
- Coverage   64.30%   64.27%   -0.03%     
==========================================
  Files        2657     2659       +2     
  Lines      144059   144188     +129     
  Branches    33216    33228      +12     
==========================================
+ Hits        92639    92684      +45     
- Misses      49798    49880      +82     
- Partials     1622     1624       +2     
Flag Coverage Δ
hive 39.44% <36.11%> (-0.01%) ⬇️
mysql 58.15% <36.11%> (-0.04%) ⬇️
postgres 58.22% <36.11%> (-0.05%) ⬇️
presto 41.03% <36.11%> (-0.01%) ⬇️
python 59.69% <36.11%> (-0.05%) ⬇️
sqlite 57.84% <36.11%> (-0.04%) ⬇️
unit 100.00% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines +510 to +511
symbolPosition: Literal["prefix", "suffix"] | None = Field( # noqa: N815
None, description="Where to render the symbol relative to the value."

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Restricting currency.symbolPosition to only "prefix"/"suffix" is incompatible with the existing dataset update contract, which accepts arbitrary non-empty strings (and legacy data may contain other values). This can cause valid update requests or serialization of existing metric currency payloads to fail with validation errors. Align this field type with the broader Superset schema behavior instead of enforcing a strict enum here. [api mismatch]

Severity Level: Major ⚠️
- ❌ update_dataset_metric fails for metrics with custom symbolPosition
- ⚠️ MCP clients see unexpected validation or internal errors
Steps of Reproduction ✅
1. Observe the existing REST dataset update contract in
`superset/datasets/schemas.py:17-20`, where
`DatasetMetricCurrencyPutSchema.symbolPosition` is defined as
`fields.String(validate=Length(1, 128))` with no enum restriction, and in the generated
OpenAPI docs (`docs/static/resources/openapi.json:5205-5208`) describing `symbolPosition`
as a generic string field. This confirms existing APIs accept arbitrary non-empty strings
(not just `"prefix"`/`"suffix"`).

2. Note that metric currency values are stored in the database via `CurrencyType` in
`superset/models/sql_types/base.py:67-86`, which parses legacy string values into dicts
using `parse_currency_string` without constraining `symbolPosition`. As a result, existing
datasets can legitimately contain arbitrary `currency={"symbol": "USD", "symbolPosition":
"before"}` or other non-enum values.

3. Call the MCP tool `update_dataset_metric` (entrypoint
`superset/mcp_service/dataset/tool/update_dataset_metric.py:113-229`) against any dataset
whose metric has such a currency dict (e.g. created previously via the REST API or UI), by
sending a minimal request that does not touch currency: `{"request": {"dataset_id": 1,
"metric": "count", "expression": "COUNT(1)"}}`. The tool loads the dataset metrics (lines
167-179) and locates the target metric (lines 179-183).

4. After `UpdateDatasetCommand.run()` returns the updated dataset (lines 207-213), the
tool serializes the metric via `_serialize_metric` at `update_dataset_metric.py:73-100`.
That calls `MetricCurrency.model_validate(currency)` when `currency` is a dict (lines
93-95); however `MetricCurrency.symbolPosition` is declared as `Literal["prefix",
"suffix"]` in `superset/mcp_service/dataset/schemas.py:35-43`, so any existing metric with
`symbolPosition` not equal to `"prefix"` or `"suffix"` causes a
`pydantic.ValidationError`. This bubbles into the broad `except Exception` block at lines
253-258, which logs `"Unexpected error updating dataset metric"` and re-raises, causing
the MCP tool call to fail with an internal error even though the underlying dataset update
may have succeeded.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/schemas.py
**Line:** 510:511
**Comment:**
	*Api Mismatch: Restricting `currency.symbolPosition` to only `"prefix"`/`"suffix"` is incompatible with the existing dataset update contract, which accepts arbitrary non-empty strings (and legacy data may contain other values). This can cause valid update requests or serialization of existing metric currency payloads to fail with validation errors. Align this field type with the broader Superset schema behavior instead of enforcing a strict enum here.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

Comment on lines +589 to +598
class DatasetMetricDetail(SqlMetricInfo):
"""Full saved-metric details, including identifiers."""

id: int | None = Field(None, description="Metric ID")
uuid: str | None = Field(None, description="Metric UUID")
metric_type: str | None = Field(None, description="Metric type")
currency: MetricCurrency | None = Field(
None, description="Currency formatting configuration"
)
warning_text: str | None = Field(None, description="Warning text")

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The response model for updated metrics omits the extra property even though the request allows updating it and the tool advertises returning the metric after update. This makes the response incomplete for one of the supported update fields and breaks callers that expect to read back the updated extra value. Add extra to the metric detail response schema so response content matches supported updates. [incomplete implementation]

Severity Level: Major ⚠️
- ⚠️ update_dataset_metric response omits updated extra metadata
- ⚠️ Callers cannot confirm saved metric extra changes
Steps of Reproduction ✅
1. Inspect the request schema for the MCP tool in
`superset/mcp_service/dataset/schemas.py:46-92`. `UpdateDatasetMetricRequest` includes
`extra: str | None = Field(None, description="JSON-encoded string with extra metric
metadata.")` (lines 88-92) and `UPDATABLE_METRIC_FIELDS` (lines 20-32) includes `"extra"`,
so callers are explicitly allowed to update `extra`.

2. Inspect the response-side metric model `DatasetMetricDetail` in the same file at
`superset/mcp_service/dataset/schemas.py:120-129`. It subclasses `SqlMetricInfo` (which
defines `metric_name`, `verbose_name`, `expression`, `description`, `d3format` at lines
41-50) and adds `id`, `uuid`, `metric_type`, `currency`, and `warning_text` (lines
123-129), but does not define an `extra` field.

3. Examine the tool implementation in
`superset/mcp_service/dataset/tool/update_dataset_metric.py:73-100`. `_serialize_metric()`
builds a `DatasetMetricDetail` from the ORM metric, setting `id`, `uuid`, `metric_name`,
`verbose_name`, `expression`, `description`, `d3format`, `metric_type`, `currency`, and
`warning_text`, but never includes `metric.extra` when constructing the response object.

4. From an MCP client, call the tool `update_dataset_metric` (entrypoint
`update_dataset_metric.py:113-229`) with a payload that only updates `extra`, for example:
`{"request": {"dataset_id": 1, "metric": "count", "extra": "{\"foo\": \"bar\"}"}}`. The
request validates and `UpdateDatasetMetricRequest.updates()` (lines 94-103) includes
`"extra"` in the metrics payload sent to `UpdateDatasetCommand`. After the update, the
tool returns `UpdateDatasetMetricResponse.metric` as a `DatasetMetricDetail` (lines
220-223), but because `DatasetMetricDetail` has no `extra` field and `_serialize_metric()`
never passes it through, the JSON response lacks any `metric["extra"]` field. Thus callers
that just updated `extra` cannot read back the new value via this tool, despite the schema
advertising that the metric is returned "after the update".

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** superset/mcp_service/dataset/schemas.py
**Line:** 589:598
**Comment:**
	*Incomplete Implementation: The response model for updated metrics omits the `extra` property even though the request allows updating it and the tool advertises returning the metric after update. This makes the response incomplete for one of the supported update fields and breaks callers that expect to read back the updated `extra` value. Add `extra` to the metric detail response schema so response content matches supported updates.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.
Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix
👍 | 👎

@bito-code-review bito-code-review Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Agent Run #513598

Actionable Suggestions - 1
  • superset/mcp_service/dataset/dataset_utils.py - 1
    • Missing unit tests for new utility · Line 25-50
Additional Suggestions - 2
  • tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py - 1
    • Missing error path test coverage · Line 253-378
      The test suite covers DatasetNotFoundError, DatasetForbiddenError, and DatasetInvalidError, but does not test DatasetUpdateFailedError which is handled at update_dataset_metric.py lines 248-252. Adding coverage ensures complete error path validation.
  • superset/mcp_service/app.py - 1
    • Documentation field name mismatch · Line 166-166
      The documentation says 'label' but the actual updatable field is named 'verbose_name' per the schema definition. Update the description to say 'verbose_name' for consistency.
      Code suggestion
      --- a/superset/mcp_service/app.py
      +++ b/superset/mcp_service/app.py
       @@ -163,7 +163,7 @@ Dataset Management:
      - list_datasets: List datasets with advanced filters (1-based pagination)
      - get_dataset_info: Get detailed dataset information by ID (includes columns/metrics)
      - create_virtual_dataset: Save a SQL query as a virtual dataset for charting (requires write access)
      -- update_dataset_metric: Update a saved metric on a dataset — expression, name, label, format (requires dataset ownership)
      +- update_dataset_metric: Update a saved metric on a dataset — expression, name, verbose_name, format (requires dataset ownership)
      - query_dataset: Query a dataset using its semantic layer (saved metrics, dimensions, filters) without needing a saved chart
Filtered by Review Rules

Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.

  • tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py - 1
  • superset/mcp_service/dataset/schemas.py - 1
    • Missing validation_alias for symbolPosition · Line 510-512
Review Details
  • Files reviewed - 8 · Commit Range: 2311482..2311482
    • superset/mcp_service/app.py
    • superset/mcp_service/dataset/dataset_utils.py
    • superset/mcp_service/dataset/schemas.py
    • superset/mcp_service/dataset/tool/__init__.py
    • superset/mcp_service/dataset/tool/query_dataset.py
    • superset/mcp_service/dataset/tool/update_dataset_metric.py
    • tests/unit_tests/mcp_service/dataset/tool/test_query_dataset.py
    • tests/unit_tests/mcp_service/dataset/tool/test_update_dataset_metric.py
  • Files skipped - 0
  • Tools
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
    • MyPy (Static Code Analysis) - ✔︎ Successful
    • Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

  • /review - Manually triggers a full AI review.

  • /pause - Pauses automatic reviews on this pull request.

  • /resume - Resumes automatic reviews.

  • /resolve - Marks all Bito-posted review comments as resolved.

  • /abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by Bito Logo

Comment on lines +25 to +50
def resolve_dataset(
identifier: int | str, eager_options: list[Any] | None = None
) -> Any | None:
"""Resolve a dataset by int ID or UUID string.

Replicates the identifier resolution logic from ModelGetInfoCore._find_object().
"""
from superset.daos.dataset import DatasetDAO

opts = eager_options or None

if isinstance(identifier, int):
return DatasetDAO.find_by_id(identifier, query_options=opts)

# Try parsing as int
try:
id_val = int(identifier)
return DatasetDAO.find_by_id(id_val, query_options=opts)
except (ValueError, TypeError):
pass

# Try UUID
if _is_uuid(str(identifier)):
return DatasetDAO.find_by_id(identifier, id_column="uuid", query_options=opts)

return None

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing unit tests for new utility

The resolve_dataset function is new but lacks dedicated unit tests. Per BITO.md rule [11730], new tools/functions require comprehensive unit tests covering success paths, error scenarios, and edge cases. Tests should be placed in tests/unit_tests/mcp_service/dataset/ following existing MCP test patterns.

Code Review Run #513598


Should Bito avoid suggestions like this for future reviews? (Manage Rules)

  • Yes, avoid them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

data:dataset Related to dataset configurations size/XL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant