Add write operations for DataHub entities #41

Merged: cjimti merged 32 commits into main from feature/write-operations on Feb 10, 2026

@cjimti cjimti commented Feb 10, 2026

Summary

Adds 7 write tools to the MCP DataHub toolkit, enabling AI assistants to modify DataHub metadata through the same composable interface used for reads. Write operations are disabled by default and must be enabled explicitly by setting WriteEnabled: true.

Write Tools Added

| Tool | Description |
| --- | --- |
| datahub_update_description | Update entity description |
| datahub_add_tag | Add a tag to an entity |
| datahub_remove_tag | Remove a tag from an entity |
| datahub_add_glossary_term | Add a glossary term to an entity |
| datahub_remove_glossary_term | Remove a glossary term from an entity |
| datahub_add_link | Add a link to an entity |
| datahub_remove_link | Remove a link from an entity |

Architecture

Client Layer (pkg/client/)

  • rest.go — REST API infrastructure for DataHub's /aspects endpoints. Implements getAspect() for reading raw aspect JSON and postIngestProposal() for writing metadata change proposals. All write operations use DataHub's REST API (not GraphQL) for better compatibility.

  • write.go — Seven write methods on *Client using read-modify-write patterns:

    • Tags: reads globalTags aspect, adds/removes tag associations, writes back
    • Glossary Terms: reads glossaryTerms aspect, adds/removes term associations, writes back
    • Links: reads institutionalMemory aspect, adds/removes link elements, writes back
    • Description: writes editableDatasetProperties aspect directly
    • All Add operations are idempotent (duplicate adds are no-ops)
    • All Remove operations are idempotent (removing non-existent items is a no-op)
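
For readers unfamiliar with the pattern, the read-modify-write flow for tags can be sketched roughly as follows. This is an illustration only, not the code in write.go: aspectStore, memoryStore, and the helper names are hypothetical stand-ins for the REST helpers, and the in-memory store replaces the HTTP calls. A missing aspect is treated as empty, and a write is issued only when the tag list actually changes.

```go
package main

import (
	"errors"
	"fmt"
)

// errAspectNotFound models the "aspect does not exist yet" case (hypothetical name).
var errAspectNotFound = errors.New("aspect not found")

// tagAssociation and globalTags loosely mirror the globalTags aspect;
// the Go type names are assumptions made for this sketch.
type tagAssociation struct {
	Tag string
}

type globalTags struct {
	Tags []tagAssociation
}

// aspectStore is a hypothetical stand-in for the REST read/write helpers.
type aspectStore interface {
	getTags(urn string) (globalTags, error)
	putTags(urn string, tags globalTags) error
}

// memoryStore keeps aspects in a map and counts writes so no-ops are visible.
type memoryStore struct {
	data   map[string]globalTags
	writes int
}

func (m *memoryStore) getTags(urn string) (globalTags, error) {
	tags, ok := m.data[urn]
	if !ok {
		return globalTags{}, errAspectNotFound
	}
	return tags, nil
}

func (m *memoryStore) putTags(urn string, tags globalTags) error {
	m.data[urn] = tags
	m.writes++
	return nil
}

// readTags treats a missing aspect as an empty one.
func readTags(store aspectStore, urn string) (globalTags, error) {
	tags, err := store.getTags(urn)
	if errors.Is(err, errAspectNotFound) {
		return globalTags{}, nil
	}
	return tags, err
}

// addTag writes back only if tagURN is not already associated.
func addTag(store aspectStore, entityURN, tagURN string) error {
	current, err := readTags(store, entityURN)
	if err != nil {
		return fmt.Errorf("read globalTags: %w", err)
	}
	for _, t := range current.Tags {
		if t.Tag == tagURN {
			return nil // duplicate add is a no-op
		}
	}
	updated := append(append([]tagAssociation{}, current.Tags...), tagAssociation{Tag: tagURN})
	return store.putTags(entityURN, globalTags{Tags: updated})
}

// removeTag writes back only if tagURN was actually present.
func removeTag(store aspectStore, entityURN, tagURN string) error {
	current, err := readTags(store, entityURN)
	if err != nil {
		return fmt.Errorf("read globalTags: %w", err)
	}
	kept := make([]tagAssociation, 0, len(current.Tags))
	for _, t := range current.Tags {
		if t.Tag != tagURN {
			kept = append(kept, t)
		}
	}
	if len(kept) == len(current.Tags) {
		return nil // removing a non-existent tag is a no-op
	}
	return store.putTags(entityURN, globalTags{Tags: kept})
}

func main() {
	store := &memoryStore{data: map[string]globalTags{}}
	urn := "urn:li:dataset:(urn:li:dataPlatform:hive,db.orders,PROD)"
	_ = addTag(store, urn, "urn:li:tag:pii")
	_ = addTag(store, urn, "urn:li:tag:pii")        // no second write
	_ = removeTag(store, urn, "urn:li:tag:missing") // no write
	fmt.Println(len(store.data[urn].Tags), store.writes) // prints "1 1"
}
```

Glossary terms and links would presumably follow the same shape with their own aspects.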

Tools Layer (pkg/tools/)

  • write_description.go, write_tags.go, write_glossary_terms.go, write_links.go — Tool handler implementations with input validation and WriteEnabled gating via getWriteClient().

  • Toolkit integration — Write tools are registered through the same RegisterAll() / RegisterWith() pattern as read tools. getWriteClient() checks Config.WriteEnabled before dispatching, returning ErrWriteDisabled if writes are not opted in.

  • Multi-server support — Per-connection write_enabled overrides via ConnectionConfig. A connection inherits the global WriteEnabled setting unless explicitly overridden.
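
The gating and inheritance rule described above can be pictured with a small sketch. The lowercase names here are hypothetical, and representing the per-connection override as a *bool (nil meaning "inherit") is an assumption for illustration rather than a statement about how ConnectionConfig is actually defined.

```go
package main

import (
	"errors"
	"fmt"
)

// errWriteDisabled plays the role of the ErrWriteDisabled sentinel in this sketch.
var errWriteDisabled = errors.New("write operations are disabled")

// config holds the global setting (simplified, hypothetical).
type config struct {
	WriteEnabled bool
}

// connectionConfig holds an optional per-connection override.
// A nil pointer means "inherit the global setting" (assumed representation).
type connectionConfig struct {
	WriteEnabled *bool
}

// effectiveWriteEnabled resolves the override against the global default.
func effectiveWriteEnabled(global config, conn connectionConfig) bool {
	if conn.WriteEnabled != nil {
		return *conn.WriteEnabled
	}
	return global.WriteEnabled
}

// checkWriteAllowed is the gate a write handler would call before dispatching.
func checkWriteAllowed(global config, conn connectionConfig) error {
	if !effectiveWriteEnabled(global, conn) {
		return errWriteDisabled
	}
	return nil
}

// boolPtr is a small helper for building overrides.
func boolPtr(b bool) *bool { return &b }

func main() {
	global := config{WriteEnabled: true}

	// No override: the connection inherits the global setting.
	fmt.Println(checkWriteAllowed(global, connectionConfig{})) // prints "<nil>"

	// Explicit override: writes are refused for this connection only.
	staging := connectionConfig{WriteEnabled: boolPtr(false)}
	fmt.Println(checkWriteAllowed(global, staging)) // prints "write operations are disabled"
}
```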

Configuration

# Enable write operations (default: false)
export DATAHUB_WRITE_ENABLED=true

# Per-connection override in additional servers JSON
export DATAHUB_ADDITIONAL_SERVERS='{"staging":{"url":"...","token":"...","write_enabled":false}}'
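
As a rough sketch of how the additional-servers JSON might be decoded so that an omitted write_enabled can be told apart from an explicit false: the field names follow the example above, while serverEntry, parseAdditionalServers, and the example URLs are hypothetical and not taken from the toolkit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serverEntry mirrors the keys shown in the example JSON.
// WriteEnabled is a pointer so a missing key stays nil after decoding.
type serverEntry struct {
	URL          string `json:"url"`
	Token        string `json:"token"`
	WriteEnabled *bool  `json:"write_enabled"`
}

// parseAdditionalServers decodes the JSON object keyed by connection name.
// An empty string is treated as "no additional servers".
func parseAdditionalServers(raw string) (map[string]serverEntry, error) {
	servers := map[string]serverEntry{}
	if raw == "" {
		return servers, nil
	}
	if err := json.Unmarshal([]byte(raw), &servers); err != nil {
		return nil, fmt.Errorf("parse additional servers: %w", err)
	}
	return servers, nil
}

func main() {
	raw := `{
		"staging": {"url": "https://staging.example.invalid", "token": "t", "write_enabled": false},
		"dev":     {"url": "https://dev.example.invalid", "token": "t"}
	}`
	servers, err := parseAdditionalServers(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range []string{"staging", "dev"} {
		entry := servers[name]
		if entry.WriteEnabled == nil {
			fmt.Printf("%s: inherits the global setting\n", name)
		} else {
			fmt.Printf("%s: write_enabled=%t\n", name, *entry.WriteEnabled)
		}
	}
}
```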

Testing

Unit Tests (30 files changed, 3287 insertions)

  • rest_test.go — REST infrastructure tests: getAspect() success/not-found/unauthorized, postIngestProposal() success/error, checkRESTStatus() for all status codes, restBaseURL() derivation.
  • write_test.go — All 7 write methods with httptest mocks covering: successful writes, duplicate detection (idempotency), missing aspects (404 → empty), invalid URNs.
  • write_description_test.go, write_tags_test.go, write_glossary_terms_test.go, write_links_test.go — Tool handler tests covering: success paths, empty URN validation, write-disabled gating, client errors.
  • toolkit_test.go — Extended with write tool registration coverage, WriteEnabled config tests, multi-server write override tests.

Integration Tests (write_integration_test.go)

  • 7 end-to-end tests against a real DataHub instance, gated by //go:build integration
  • Each test follows: Setup → Write → Verify → Cleanup → Verify Cleanup
  • Add operations include idempotency sub-tests
  • t.Cleanup() is registered before each write and uses an independent context
  • Unique values derived from nanosecond timestamps prevent collisions between tests
  • Run via make test-integration with DATAHUB_URL and DATAHUB_TOKEN

Coverage

  • Overall: 85.8% (threshold: 80%)
  • All CI checks pass: golangci-lint (0 issues), go vet, gosec (0 issues), govulncheck (no vulnerabilities)

Code Quality Improvements

  • .golangci.yml — Added additional linters and tighter thresholds
  • revive.toml — Added revive linter configuration
  • codecov.yml — Added codecov configuration
  • Makefile — Enhanced with coverage thresholds, mutation testing, patch coverage, security scanning, and verify target
  • CLAUDE.md — Updated with write tools documentation and verification standards

Test Plan

  • make verify passes (lint, test, coverage, security, deadcode, build)
  • golangci-lint run — 0 issues
  • gosec ./... — 0 issues
  • govulncheck ./... — no vulnerabilities
  • go test -race ./... — all pass (integration tests excluded by build tag)
  • Coverage ≥80% maintained
  • make test-integration against a live DataHub instance

cjimti added 30 commits February 9, 2026 23:28
Seven tests covering all write methods (UpdateDescription, AddTag,
RemoveTag, AddGlossaryTerm, RemoveGlossaryTerm, AddLink, RemoveLink)
against a real DataHub instance. Gated by //go:build integration tag
and run via make test-integration.
@cjimti cjimti self-assigned this Feb 10, 2026

codecov bot commented Feb 10, 2026

Codecov Report

❌ Patch coverage is 88.30409% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 87.75%. Comparing base (5123797) to head (1dec900).
⚠️ Report is 9 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #41      +/-   ##
==========================================
+ Coverage   86.34%   87.75%   +1.41%     
==========================================
  Files          32       37       +5     
  Lines        2299     2744     +445     
==========================================
+ Hits         1985     2408     +423     
+ Misses        215      207       -8     
- Partials       99      129      +30     

Inline revive rules directly in .golangci.yml instead of using
config-file reference, which is not supported in golangci-lint
v2.1.6 (used by CI). Remove strict complexity rules (cognitive-
complexity, cyclomatic, max-public-structs, argument-limit,
function-result-limit) that were never enforced in CI.

Cover error paths in client write operations (invalid URN, server
errors, invalid JSON responses) and REST API functions. Add write tool
integration tests that invoke tools through the MCP server to cover
register function closures.
@cjimti cjimti merged commit a63f8e4 into main Feb 10, 2026
8 checks passed
@cjimti cjimti deleted the feature/write-operations branch March 4, 2026 17:23