
Feat: BurstIQ database Connector #25268

Merged
ulixius9 merged 16 commits into open-metadata:main from akashverma0786:feature/burstiq-database-connector
Feb 14, 2026

Conversation

@akashverma0786
Collaborator

@akashverma0786 akashverma0786 commented Jan 13, 2026

Describe your changes:

[Three screenshots attached, dated 2026-01-15 (10:30:54, 10:30:28, 10:30:07)]

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • New database connector:
    • Complete BurstIQ LifeGraph connector with metadata ingestion, lineage extraction, and test connection validation
  • Multi-tenant configuration:
    • Required fields: username, password, realmName (Keycloak auth), biqSdzName (Secure Data Zone), biqCustomerName (customer identifier)
    • SDZ and customer name passed as HTTP headers (biq_sdz_name, biq_customer_name) for API request routing
  • Python ingestion implementation:
    • client.py (316 lines) with lazy authentication, token refresh, and multi-tenant header management
    • metadata.py (560 lines) extracts databases, schemas, tables, columns, and constraints
    • lineage.py (232 lines) extracts table-to-table lineage from Edge definitions
  • Comprehensive test coverage:
    • 1,280 lines of unit tests across 4 test files with multi-tenant configuration mocking
  • UI integration:
    • Service icon, connector documentation (BurstIQ.md), BETA service designation, and generated TypeScript types
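
The multi-tenant header handling and lazy authentication summarized above can be illustrated with a minimal sketch. Only the header names biq_sdz_name and biq_customer_name come from the summary; the TenantSession class, the injected fetch_token callable, and the Bearer authorization scheme are assumptions made for illustration, not the code in client.py.

```python
from typing import Callable, Dict, List, Optional


class TenantSession:
    """Hypothetical stand-in for the connector client (not the PR's code)."""

    def __init__(
        self,
        sdz_name: str,
        customer_name: str,
        fetch_token: Callable[[], str],
    ) -> None:
        self.sdz_name = sdz_name
        self.customer_name = customer_name
        # Token retrieval is injected so the sketch needs no network access.
        self._fetch_token = fetch_token
        self.access_token: Optional[str] = None

    def headers(self) -> Dict[str, str]:
        # Lazy authentication: fetch the token on first use only.
        if not self.access_token:
            self.access_token = self._fetch_token()
        return {
            # Assumption: a Bearer scheme; the summary does not state it.
            "Authorization": f"Bearer {self.access_token}",
            # Header names taken from the PR summary.
            "biq_sdz_name": self.sdz_name,
            "biq_customer_name": self.customer_name,
        }


fetch_calls: List[int] = []


def fake_fetch() -> str:
    # Records each call so the lazy behavior can be observed.
    fetch_calls.append(1)
    return "demo-token"


session = TenantSession("demo_sdz", "demo_customer", fake_fetch)
first = session.headers()
second = session.headers()
```

In this sketch both calls return the same headers, and fake_fetch runs once because the token is cached after the first request.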

This will update automatically on new commits.


@akashverma0786 akashverma0786 self-assigned this Jan 13, 2026
@akashverma0786 akashverma0786 requested review from a team as code owners January 13, 2026 17:03
@akashverma0786 akashverma0786 added the Ingestion and safe to test labels Jan 13, 2026
@github-actions
Contributor

The Python checkstyle failed.

Please run make py_format and py_format_check in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Python code formatting.

You can install the pre-commit hooks with make install_test precommit_install.

@github-actions
Contributor

github-actions bot commented Jan 13, 2026

Jest test Coverage

UI tests summary

  • Lines: 65% (coverage badge)
  • Statements: 65.65% (56181/85580)
  • Branches: 45.04% (29332/65126)
  • Functions: 47.86% (8868/18530)


@gitar-bot

gitar-bot bot commented Feb 13, 2026

🔍 CI failure analysis for fc6ea7d: All CI failures unrelated to BurstIQ connector. maven-sonarcloud-ci had 8 integration test failures (0.1% failure rate - 7926 tests run, 7918 passed): TableResourceTest I/O errors, DataProductResourceTest status code 400 errors. Previous failures: 2 Python tests (CockroachDB permissions - 99.6% pass rate), 2 playwright shards (E2E timeouts), 2 integration tests (vector embedding), mcp-tests (Docker v1.32), generate-types (git fetch), maven-collate-ci (cascading).

Issue

Current CI Failures: 2 new jobs analyzed (maven-sonarcloud-ci, Test Report) + 10 previous failures

Root Cause

New Failures

maven-sonarcloud-ci (job 63579493666):

  • Integration test failures: 8 out of 7926 tests failed (99.9% pass rate)
  • Test results: 7926 tests run, 7918 passed, 701 skipped, 8 failed
  • Failed tests:
    1. TableResourceTest.java:2670 - I/O error during HTTP request (multiple occurrences)
    2. DataProductResourceTest.java:850 - expected: <success> but was: <failure>
    3. DataProductResourceTest.java:904 - status code: 400, reason phrase: Error reading response
  • Error patterns:
    • Network I/O errors during HTTP requests (TableResourceTest)
    • Status code 400 errors (DataProductResourceTest)
    • Response reading failures
  • Duration: Long-running job (1h 27m build + test time)

Test Report (job 63594857596): Dependent on upstream test failures

Previous Failures (from prior analysis)

py-run-tests (3.11) (job 63579310708) & py-run-tests (3.10) (job 63579310701):

  • Identical CockroachDB permission issues (99.6% pass rate - 549/551 tests passed)

playwright-ci-postgresql (5, 6) (job 63579314432) & (2, 6) (job 63579314426):

  • Multiple E2E test timeouts

integration-tests-mysql-elasticsearch (job 63579282988) & integration-tests-postgres-opensearch (job 63579283248):

  • Vector embedding resource cleanup race conditions

mcp-tests (job 63579282989):

  • Docker testcontainers client v1.32 too old

generate-types (job 63579282980):

  • Git fetch network failure

maven-collate-ci (job 63579282903):

  • Cascading failure from upstream dependencies

Details

BurstIQ Changes Analysis

All PR changes isolated to BurstIQ connector:

  • Python ingestion source files (ingestion/src/metadata/ingestion/source/database/burstiq/)
  • Unit test files (ingestion/tests/unit/topology/database/test_burstiq_*.py)
  • Configuration schemas (openmetadata-spec/.../burstIQConnection.json)
  • UI integration (service icon, docs, TypeScript types)
  • Minor UI component updates for column display panels

No changes to:

  • TableResourceTest or TableResource backend code
  • DataProductResourceTest or DataProductResource backend code
  • Domain functionality
  • HTTP client or network handling
  • Response parsing logic
  • Java integration test framework
  • CockroachDB profiler functionality
  • Playwright E2E tests
  • Vector embedding services
  • MCP module or Docker configurations

maven-sonarcloud-ci Detailed Analysis

The failures show flaky integration test patterns:

TableResourceTest I/O errors:

I/O error during HTTP request

This indicates:

  • Network connectivity issues during test execution
  • HTTP client timeouts or connection failures
  • Common in long-running integration test suites (1h+ runtime)
  • Backend server may be slow to respond under load

DataProductResourceTest errors:

expected: <success> but was: <failure>
status code: 400, reason phrase: Error reading response

This indicates:

  • Backend API returned 400 Bad Request
  • Response parsing failure
  • Possible test data race condition or timing issue
  • May indicate backend service instability during long test runs

High pass rate: 7918/7926 tests passed (99.9%), including:

  • 7900+ integration tests across all OpenMetadata services
  • Database resource tests
  • Team, user, workflow tests
  • Only 8 tests failed out of 7926 total tests

The failures are isolated and flaky:

  • Only 0.1% failure rate
  • Network/HTTP-related errors (not code logic errors)
  • Tests in features completely unrelated to database connectors
  • Similar patterns seen in previous CI runs (I/O errors, timeouts)

Relationship to BurstIQ

None. The BurstIQ connector:

  • Does not modify TableResource or DataProductResource Java code
  • Does not change HTTP client or network handling
  • Does not alter integration test framework
  • Does not affect response parsing logic
  • Does not modify domain functionality
  • Only adds a new Python-based REST API connector for metadata extraction from BurstIQ LifeGraph

These are flaky integration test failures showing network I/O issues and timing problems during long-running test execution, completely unrelated to the BurstIQ connector implementation.

Code Review 👍 Approved with suggestions 2 resolved / 5 findings

Well-structured new BurstIQ connector with comprehensive tests. Two minor findings from previous review remain unresolved (truthiness checks on limit/skip, dict rebuilt per call), plus one new minor edge case around missing token validation after auth.

💡 Edge Case: No validation of access_token after successful auth

📄 ingestion/src/metadata/ingestion/source/database/burstiq/client.py:106

After response.raise_for_status() succeeds, self.access_token is set via token_data.get("access_token") without checking that the value is non-null/non-empty. If the OAuth server returns 200 OK but with a malformed body (missing access_token field), the code will:

  1. Set self.access_token = None
  2. Log "Authentication successful"
  3. On every subsequent API call, _get_auth_header() will see not self.access_token is True and re-trigger _authenticate(), causing repeated auth attempts

Adding a validation check after extracting the token would make this more robust and provide a clear error message rather than a confusing retry loop.

Suggested fix
            self.access_token = token_data.get("access_token")
            if not self.access_token:
                raise ValueError(
                    "Authentication response did not contain an access token"
                )
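
As a self-contained illustration of the suggested check, the sketch below isolates the token extraction in a hypothetical helper named extract_access_token (the helper and the sample payloads are assumptions; the actual code assigns self.access_token inside _authenticate()). It shows a malformed 200 OK body failing fast rather than leaving the token unset.

```python
from typing import Any, Dict


def extract_access_token(token_data: Dict[str, Any]) -> str:
    """Return the access token or raise if the response body lacks one."""
    token = token_data.get("access_token")
    # Covers a missing key, an explicit null, and an empty string.
    if not token:
        raise ValueError(
            "Authentication response did not contain an access token"
        )
    return token


# A well-formed response yields the token.
good_token = extract_access_token({"access_token": "abc123"})

# A malformed response fails once, with a clear message.
try:
    extract_access_token({"token_type": "Bearer"})
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```

Raising at this point means the failure surfaces during authentication instead of as repeated re-authentication on later API calls.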
💡 Edge Case: Truthiness checks on limit/skip exclude valid zero values

📄 ingestion/src/metadata/ingestion/source/database/burstiq/client.py:262-263 📄 ingestion/src/metadata/ingestion/source/database/burstiq/client.py:316-319

In get_dictionaries() and get_edges(), the parameters limit and skip are checked with if limit: and if skip:. Since 0 is falsy in Python, passing limit=0 or skip=0 will silently drop these from the query parameters.

While limit=0 is unusual, skip=0 is a perfectly valid pagination offset meaning "start from the beginning." This could cause subtle bugs if callers explicitly pass skip=0.

Suggested fix: Use is not None checks:

if limit is not None:
    params["limit"] = limit
if skip is not None:
    params["skip"] = skip
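
The difference between the two checks can be seen in a small runnable sketch. The function names build_params and build_params_truthy are hypothetical and stand in for the parameter handling in get_dictionaries() and get_edges().

```python
from typing import Dict, Optional


def build_params_truthy(
    limit: Optional[int] = None, skip: Optional[int] = None
) -> Dict[str, int]:
    """The flagged pattern: a value of 0 is silently dropped."""
    params: Dict[str, int] = {}
    if limit:
        params["limit"] = limit
    if skip:
        params["skip"] = skip
    return params


def build_params(
    limit: Optional[int] = None, skip: Optional[int] = None
) -> Dict[str, int]:
    """The suggested pattern: only None means 'not provided'."""
    params: Dict[str, int] = {}
    if limit is not None:
        params["limit"] = limit
    if skip is not None:
        params["skip"] = skip
    return params


dropped = build_params_truthy(limit=10, skip=0)
kept = build_params(limit=10, skip=0)
```

Here dropped contains only the limit key, while kept preserves the explicit skip=0 offset.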
💡 Quality: _map_burstiq_datatype rebuilds dict on every call

📄 ingestion/src/metadata/ingestion/source/database/burstiq/metadata.py:368

The type_mapping dictionary inside _map_burstiq_datatype() is constructed on every invocation. Since this method is called once per column for every table, this creates unnecessary overhead during metadata ingestion of large schemas (the PR mentions 600+ dictionaries).

Suggested fix: Move type_mapping to a module-level constant or a class attribute so it's allocated once.
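
A minimal sketch of that refactor follows. The type names, the fallback value, and the map_datatype function are placeholders chosen for illustration; the real _map_burstiq_datatype() maps to OpenMetadata data types that are not shown in this review.

```python
from typing import Dict

# Built once at import time instead of on every call.
# All keys and values here are hypothetical examples.
_TYPE_MAPPING: Dict[str, str] = {
    "STRING": "STRING",
    "INTEGER": "INT",
    "BOOLEAN": "BOOLEAN",
    "DATETIME": "DATETIME",
}

# Assumed fallback for types missing from the mapping.
_DEFAULT_TYPE = "UNKNOWN"


def map_datatype(source_type: str) -> str:
    """Look up a source type, case-insensitively, in the shared mapping."""
    return _TYPE_MAPPING.get(source_type.upper(), _DEFAULT_TYPE)


mapped = map_datatype("integer")
fallback = map_datatype("geo_point")
```

Each call is now a single dictionary lookup, which matters when the function runs once per column across hundreds of dictionaries.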

✅ 2 resolved
Bug: fqn.build with entity_type=Table rejects column_name kwarg

📄 ingestion/src/metadata/ingestion/source/database/burstiq/metadata.py:434-442
The foreign key constraint code at lines 434-442 of metadata.py calls fqn.build() with entity_type=Table and passes column_name=attribute.name. However, the Table FQN builder registered in fqn.py only accepts service_name, database_name, schema_name, table_name, fetch_multiple_entities, and skip_es_search as keyword arguments (enforced by * keyword-only syntax). Passing the extra column_name kwarg will raise a TypeError, which gets wrapped in an FQNBuildingException.

This means any dictionary with a referenceDictionaryName attribute will cause a runtime crash when building foreign key constraints, effectively preventing FK metadata from being ingested.
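
The keyword-only rejection described above is plain Python behavior and can be reproduced without OpenMetadata. In the sketch below, build_table_fqn is a hypothetical stand-in for the registered Table FQN builder, and the dot-joined output ignores the quoting rules that the real fqn module applies.

```python
def build_table_fqn(*, service_name, database_name, schema_name, table_name):
    """Hypothetical builder: the bare * makes every argument keyword-only."""
    return ".".join([service_name, database_name, schema_name, table_name])


kwargs = dict(
    service_name="svc",
    database_name="db",
    schema_name="sch",
    table_name="patient",
)

# Passing an argument the signature does not declare raises TypeError.
try:
    build_table_fqn(**kwargs, column_name="patient_id")
    rejected = False
except TypeError:
    rejected = True

# Two-step alternative: build the table FQN, then append the column name.
table_fqn = build_table_fqn(**kwargs)
column_fqn = f"{table_fqn}.patient_id"
```

The two-step form mirrors the shape of the fix: the table FQN is built first and the column name is appended afterwards.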

The correct approach (matching common_db_source.py) is:

  1. Build the table FQN first (without column_name)
  2. Then append the column name using fqn._build(table_fqn, column_name)
  3. Wrap the result in FullyQualifiedEntityName()
referred_table_fqn = fqn.build(
    metadata=self.metadata,
    entity_type=Table,
    service_name=self.context.get().database_service,
    database_name=self.context.get().database,
    schema_name=self.context.get().database_schema,
    table_name=attribute.referenceDictionaryName,
)
if referred_table_fqn:
    col_fqn = fqn._build(referred_table_fqn, attribute.name, quote=False)
    if col_fqn:
        table_constraints.append(
            TableConstraint(
                constraintType=ConstraintType.FOREIGN_KEY,
                columns=[attribute.name],
                referredColumns=[FullyQualifiedEntityName(col_fqn)],
            )
        )
Bug: Foreign key referredColumns needs FQN, not bare column name

📄 ingestion/src/metadata/ingestion/source/database/burstiq/metadata.py:434-442
The referredColumns field in TableConstraint expects fully qualified entity names (FQNs) as defined by the OpenMetadata schema (basic.json#/definitions/fullyQualifiedEntityName). Currently, the code passes just the bare attribute name (e.g., "patient_id"), which won't properly reference the column in the referenced table.

Additionally, the comment "Assume same column name in referenced table" is a risky assumption — the foreign key column in the source table may have a different name than the primary key column in the referenced table.

Suggested fix: Build a proper FQN for the referred column using fqn.build() and the referenceDictionaryName, or omit referredColumns if the target column is unknown:

if attribute.referenceDictionaryName:
    referred_col_fqn = fqn.build(
        metadata=self.metadata,
        entity_type=Table,
        service_name=self.context.get().database_service,
        database_name=self.context.get().database,
        schema_name=self.context.get().database_schema,
        table_name=attribute.referenceDictionaryName,
        column_name=attribute.name,  # or the actual referenced column if available
    )
    table_constraints.append(
        TableConstraint(
            constraintType=ConstraintType.FOREIGN_KEY,
            columns=[attribute.name],
            referredColumns=[referred_col_fqn] if referred_col_fqn else [],
        )
    )



@ulixius9 ulixius9 merged commit 281570e into open-metadata:main Feb 14, 2026
22 of 33 checks passed
akashverma0786 pushed a commit that referenced this pull request Feb 17, 2026
* Feat: BurstIQ database Connector

* ts files

* burstiq python and ui changes

* Added tests

* Added req for sdz and customer name in schema and ts files

* Fixed tests for new changes

* fix code and python checkstyle

---------

Co-authored-by: Akash Verma <akashverma@Akashs-MacBook-Pro-2.local>
Co-authored-by: Sriharsha Chintalapani <harshach@users.noreply.github.com>
Co-authored-by: Mayur Singal <39544459+ulixius9@users.noreply.github.com>

Labels

Ingestion, safe to test


4 participants