mcp-data-platform-v0.17.2
Performance fix: Disables SELECT COUNT(*) row estimation during DataHub query enrichment by default, eliminating full table scans that caused search timeouts.
Problem
When datahub_query_enrichment: true, every DataHub tool call (search, get_entity, get_schema, etc.) enriches results with Trino query context by calling GetTableAvailability() for each URN. Before this release, that call unconditionally ran SELECT COUNT(*) FROM table per URN, so a search returning 10 results triggered 10 sequential COUNT(*) queries. On large PostgreSQL tables, each COUNT(*) caused a full table scan, making most DataHub searches time out.
Fix
Added an EstimateRowCounts config flag to the Trino query adapter, defaulting to false (disabled). When disabled, GetTableAvailability() still verifies table existence via DescribeTable but skips the expensive COUNT(*). Table availability, query table paths, and sample SQL are all still returned; only the row count estimate is omitted.
Upgrading
No config changes needed. The fix takes effect immediately on upgrade: the COUNT(*) queries stop, and DataHub searches no longer wait on full table scans.
To restore row count estimates (if your tables are small enough), explicitly opt in:
injection:
  datahub_query_enrichment: true
  estimate_row_counts: true  # opt in to COUNT(*) per table
Changes
| File | Change |
|---|---|
| pkg/query/trino/adapter.go | Added EstimateRowCounts bool to Config; guarded COUNT(*) behind the flag; extracted estimateRowCount() helper to satisfy complexity limits |
| pkg/platform/config.go | Added EstimateRowCounts to InjectionConfig (yaml: estimate_row_counts) |
| pkg/platform/platform.go | Wired Injection.EstimateRowCounts into Trino adapter config |
| pkg/query/trino/adapter_test.go | Added TestGetTableAvailability_RowCountsDisabled; updated existing tests to explicitly set EstimateRowCounts: true |
| pkg/platform/config_test.go | Added assertion that EstimateRowCounts defaults to false |
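As a rough illustration of why no config change is needed on upgrade: a Go bool field left unset keeps its zero value, false, so a config that omits estimate_row_counts ends up with estimation disabled. The sketch below assumes that behaviour. The type TrinoConfig, the function trinoConfigFrom, and the field name DataHubQueryEnrichment are hypothetical. Only EstimateRowCounts, InjectionConfig, and the two YAML keys come from these notes, and the struct tags are shown for orientation without being parsed.

```go
package main

import "fmt"

// InjectionConfig shows only the two settings mentioned in the notes.
// The yaml tags are illustrative; no YAML library is used in this sketch.
type InjectionConfig struct {
	DataHubQueryEnrichment bool `yaml:"datahub_query_enrichment"` // field name is an assumption
	EstimateRowCounts      bool `yaml:"estimate_row_counts"`
}

// TrinoConfig is a hypothetical stand-in for the Trino adapter's Config.
type TrinoConfig struct {
	EstimateRowCounts bool
}

// trinoConfigFrom is a hypothetical helper that copies the flag across.
func trinoConfigFrom(inj InjectionConfig) TrinoConfig {
	return TrinoConfig{EstimateRowCounts: inj.EstimateRowCounts}
}

func main() {
	// An existing config that never mentions estimate_row_counts.
	upgraded := InjectionConfig{DataHubQueryEnrichment: true}
	fmt.Println(trinoConfigFrom(upgraded).EstimateRowCounts) // false

	// An explicit opt-in.
	optedIn := InjectionConfig{DataHubQueryEnrichment: true, EstimateRowCounts: true}
	fmt.Println(trinoConfigFrom(optedIn).EstimateRowCounts) // true
}
```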
Changelog
Installation
Homebrew (macOS)
brew install txn2/tap/mcp-data-platform
Claude Code CLI
claude mcp add mcp-data-platform -- mcp-data-platform
Docker
docker pull ghcr.io/txn2/mcp-data-platform:v0.17.2
Verification
All release artifacts are signed with Cosign. Verify with:
cosign verify-blob --bundle mcp-data-platform_0.17.2_linux_amd64.tar.gz.sigstore.json \
mcp-data-platform_0.17.2_linux_amd64.tar.gz