mcp-data-platform-v0.17.2

github-actions released this 13 Feb 19:54 · commit 2667559

Performance fix: Disables SELECT COUNT(*) row estimation during DataHub query enrichment by default, eliminating full table scans that caused search timeouts.

Problem

When datahub_query_enrichment: true is set, every DataHub tool call (search, get_entity, get_schema, etc.) enriches results with Trino query context by calling GetTableAvailability() for each URN. This unconditionally ran SELECT COUNT(*) FROM table per URN, so a search returning 10 results triggered 10 sequential COUNT(*) queries. On large PostgreSQL tables, each COUNT(*) caused a full table scan, making most DataHub searches time out.

Fix

Added EstimateRowCounts config flag to the Trino query adapter, defaulting to false (disabled). When disabled, GetTableAvailability() still verifies table existence via DescribeTable but skips the expensive COUNT(*). Table availability, query table paths, and sample SQL are all still returned — only the row count estimate is omitted.

Upgrading

No config changes are needed. The fix takes effect immediately on upgrade: the COUNT(*) queries stop, and DataHub searches no longer wait on full table scans.

To restore row count estimates (if your tables are small enough), explicitly opt in:

injection:
  datahub_query_enrichment: true
  estimate_row_counts: true    # opt-in to COUNT(*) per table

Changes

| File | Change |
| --- | --- |
| pkg/query/trino/adapter.go | Added EstimateRowCounts bool to Config; guarded COUNT(*) behind the flag; extracted estimateRowCount() helper to satisfy complexity limits |
| pkg/platform/config.go | Added EstimateRowCounts to InjectionConfig (yaml: estimate_row_counts) |
| pkg/platform/platform.go | Wired Injection.EstimateRowCounts into Trino adapter config |
| pkg/query/trino/adapter_test.go | Added TestGetTableAvailability_RowCountsDisabled; updated existing tests to explicitly set EstimateRowCounts: true |
| pkg/platform/config_test.go | Added assertion that EstimateRowCounts defaults to false |
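
As a rough illustration of how the config and wiring changes listed above fit together, the sketch below maps an injection-level setting onto the adapter config. It is an assumption-laden outline rather than the project's code: the field set of InjectionConfig, the TrinoConfig name, the DataHubQueryEnrichment field name, and the trinoConfigFrom helper are all hypothetical. Only EstimateRowCounts and the two YAML keys come from these notes.

```go
package main

import "fmt"

// InjectionConfig is a hypothetical subset of the platform config.
// The yaml tags match the keys shown in the upgrade notes.
type InjectionConfig struct {
	DataHubQueryEnrichment bool `yaml:"datahub_query_enrichment"`
	EstimateRowCounts      bool `yaml:"estimate_row_counts"`
}

// TrinoConfig stands in for the Trino adapter's Config type.
type TrinoConfig struct {
	EstimateRowCounts bool
}

// trinoConfigFrom is a hypothetical helper showing the wiring step:
// the injection-level flag is copied into the adapter config.
func trinoConfigFrom(inj InjectionConfig) TrinoConfig {
	return TrinoConfig{EstimateRowCounts: inj.EstimateRowCounts}
}

func main() {
	// A key absent from the YAML leaves the Go zero value, so the
	// default is false without any explicit default-setting code.
	var defaults InjectionConfig
	fmt.Println(trinoConfigFrom(defaults).EstimateRowCounts) // false

	optIn := InjectionConfig{DataHubQueryEnrichment: true, EstimateRowCounts: true}
	fmt.Println(trinoConfigFrom(optIn).EstimateRowCounts) // true
}
```

Because the flag is a plain bool, an existing config file that never mentions estimate_row_counts gets the new, safe behavior, which is consistent with the "no config changes needed" upgrade path.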

Changelog

  • fix: disable COUNT(*) row estimation in query enrichment by default (#92) (@cjimti)

Installation

Homebrew (macOS)

brew install txn2/tap/mcp-data-platform

Claude Code CLI

claude mcp add mcp-data-platform -- mcp-data-platform

Docker

docker pull ghcr.io/txn2/mcp-data-platform:v0.17.2

Verification

All release artifacts are signed with Cosign. Verify with:

cosign verify-blob --bundle mcp-data-platform_0.17.2_linux_amd64.tar.gz.sigstore.json \
  mcp-data-platform_0.17.2_linux_amd64.tar.gz