@IceS2 IceS2 commented Feb 12, 2026

  1. workflow_output_handler.py
    Fix an IndexError crash in _print_execution_time_summary() when no execution time entries are recorded. The tabulate() call passes a 6-element colalign tuple, but the table has 0 columns when the data is empty, which causes an index-out-of-range error. Fix: return early when the summary table has no rows.
  2. datalake/connection.py
    Move the cloud-provider client imports (DatalakeS3Client, DatalakeGcsClient, DatalakeAzureBlobClient) behind lazy imports inside _get_client(). This avoids importing all three cloud SDKs (boto3, google-cloud, azure) at module load time; only the one actually needed is imported. This reduces import-time errors when optional cloud dependencies aren't installed.
  3. mlflow/metadata.py
    Fix _get_ml_store() to use the run's artifact_uri for the ML store storage location instead of version.source. The model version's source field points to the model artifact path, while run.info.artifact_uri points to the actual storage root. The run object is now passed as a parameter.
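
The early return from item 1 can be illustrated with a minimal sketch. It does not call tabulate: `strict_render` is a hypothetical stand-in that, like the failure described above, looks up column data once per `colalign` entry and therefore raises IndexError when there are no rows. The function name `format_execution_summary` and the two-column layout are also assumptions made for illustration, not the PR's code.

```python
from typing import Dict, List, Optional, Sequence


def strict_render(rows: List[List[str]], colalign: Sequence[str]) -> str:
    """Hypothetical stand-in for a table renderer (not tabulate's real code)."""
    n_cols = len(rows[0]) if rows else 0
    widths = [max(len(row[i]) for row in rows) for i in range(n_cols)]
    # One padder per colalign entry: widths[i] raises IndexError when the
    # table has no columns, mirroring the crash described in the PR.
    padders = [
        (str.ljust if align == "left" else str.rjust, widths[i])
        for i, align in enumerate(colalign)
    ]
    return "\n".join(
        "  ".join(pad(cell, width) for cell, (pad, width) in zip(row, padders))
        for row in rows
    )


def format_execution_summary(timings: Dict[str, float]) -> Optional[str]:
    """Build the summary text, or return None when nothing was recorded."""
    rows = [[name, f"{seconds:.2f}s"] for name, seconds in sorted(timings.items())]
    if not rows:
        # Early return: skip rendering entirely for an empty summary.
        return None
    return strict_render(rows, colalign=("left", "right"))
```

Calling `format_execution_summary({})` returns `None` without touching the renderer, whereas calling `strict_render([], ("left", "right"))` directly raises IndexError.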

Copilot AI left a comment

Pull request overview

This PR applies small robustness and dependency-loading fixes across the ingestion framework to prevent crashes in edge cases, reduce optional-cloud dependency import issues, and correct the MLflow MLStore storage location used during ingestion.

Changes:

  • Prevent _print_execution_time_summary() from calling tabulate() with an empty execution-time summary.
  • Lazy-import cloud-provider Datalake clients in _get_client() to avoid importing unused SDKs at module load time.
  • Use MLflow run artifact_uri (when available) for MlStore.storage, and pass the run into _get_ml_store().
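
The lazy-import change can be sketched as follows. This is a simplified illustration, not the code in connection.py: the standard-library modules `json`, `csv`, and `sqlite3` stand in for the three cloud client modules so the example runs without any SDK installed, and `load_client_module` is a hypothetical name.

```python
import importlib
from types import ModuleType

# Stand-ins only: the real code would name the S3/GCS/Azure client modules.
_PROVIDER_MODULES = {
    "s3": "json",
    "gcs": "csv",
    "azure": "sqlite3",
}


def load_client_module(provider: str) -> ModuleType:
    """Import the module for one provider at call time, not at module load."""
    try:
        module_name = _PROVIDER_MODULES[provider]
    except KeyError:
        raise ValueError(f"Unsupported provider: {provider!r}") from None
    # The import happens here, so an uninstalled optional dependency only
    # fails for users who actually request that provider.
    return importlib.import_module(module_name)
```

With this shape, a missing optional package surfaces as an ImportError only on the branch that needs it, rather than when the connection module itself is imported.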

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File | Description
ingestion/src/metadata/workflow/workflow_output_handler.py | Adds an early return to avoid tabulate() crashing when no execution-time entries exist.
ingestion/src/metadata/ingestion/source/database/datalake/connection.py | Moves S3/GCS/Azure client imports into provider-specific branches to avoid loading optional SDKs unnecessarily.
ingestion/src/metadata/ingestion/source/mlmodel/mlflow/metadata.py | Adjusts MLflow MLStore storage to prefer the run's artifact_uri and updates the method signature accordingly.

Comment on lines 170 to 174
    def _get_ml_store(  # pylint: disable=arguments-differ
        self,
        version: ModelVersion,
        run,
    ) -> Optional[MlStore]:
Copilot AI Feb 12, 2026

run is introduced as a new parameter but is left untyped. This module otherwise imports and uses MLflow entity types (e.g., RunData, ModelVersion), so adding a Run/Optional[Run] annotation would keep the API self-documenting and improve IDE/static analysis support.

github-actions bot commented Feb 12, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

Vulnerabilities (28)

Package Vulnerability ID Severity Installed Version Fixed Version
linux-libc-dev CVE-2024-46786 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-21946 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22022 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22083 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22107 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-22121 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-37926 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38022 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38129 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38361 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-38718 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-39871 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68340 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68349 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-68800 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-71085 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2025-71116 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-22984 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-22990 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23001 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23010 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23054 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23074 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23084 🚨 HIGH 6.1.159-1 6.1.162-1
linux-libc-dev CVE-2026-23097 🚨 HIGH 6.1.159-1 6.1.162-1
pcre2 CVE-2022-1586 🔥 CRITICAL 10.32-3.el8_6 10.40-1
pcre2 CVE-2022-1587 🔥 CRITICAL 10.32-3.el8_6 10.40-1
pcre2 CVE-2019-20454 🚨 HIGH 10.32-3.el8_6 10.34-1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (9)

Package Vulnerability ID Severity Installed Version Fixed Version
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

github-actions bot commented Feb 12, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (7)

Package Vulnerability ID Severity Installed Version Fixed Version
libpam-modules CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-modules-bin CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam-runtime CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
libpam0g CVE-2025-6020 🚨 HIGH 1.5.2-6+deb12u1 1.5.2-6+deb12u2
pcre2 CVE-2022-1586 🔥 CRITICAL 10.32-3.el8_6 10.40-1
pcre2 CVE-2022-1587 🔥 CRITICAL 10.32-3.el8_6 10.40-1
pcre2 CVE-2019-20454 🚨 HIGH 10.32-3.el8_6 10.34-1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (33)

Package Vulnerability ID Severity Installed Version Fixed Version
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.12.7 2.15.0
com.fasterxml.jackson.core:jackson-core CVE-2025-52999 🚨 HIGH 2.13.4 2.15.0
com.fasterxml.jackson.core:jackson-databind CVE-2022-42003 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4.2
com.fasterxml.jackson.core:jackson-databind CVE-2022-42004 🚨 HIGH 2.12.7 2.12.7.1, 2.13.4
com.google.code.gson:gson CVE-2022-25647 🚨 HIGH 2.2.4 2.8.9
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.3.0 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.3.0 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.3.0 3.25.5, 4.27.5, 4.28.2
com.google.protobuf:protobuf-java CVE-2021-22569 🚨 HIGH 3.7.1 3.16.1, 3.18.2, 3.19.2
com.google.protobuf:protobuf-java CVE-2022-3509 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2022-3510 🚨 HIGH 3.7.1 3.16.3, 3.19.6, 3.20.3, 3.21.7
com.google.protobuf:protobuf-java CVE-2024-7254 🚨 HIGH 3.7.1 3.25.5, 4.27.5, 4.28.2
com.nimbusds:nimbus-jose-jwt CVE-2023-52428 🚨 HIGH 9.8.1 9.37.2
com.squareup.okhttp3:okhttp CVE-2021-0341 🚨 HIGH 3.12.12 4.9.2
commons-beanutils:commons-beanutils CVE-2025-48734 🚨 HIGH 1.9.4 1.11.0
commons-io:commons-io CVE-2024-47554 🚨 HIGH 2.8.0 2.14.0
dnsjava:dnsjava CVE-2024-25638 🚨 HIGH 2.1.7 3.6.0
io.netty:netty-codec-http2 CVE-2025-55163 🚨 HIGH 4.1.96.Final 4.2.4.Final, 4.1.124.Final
io.netty:netty-codec-http2 GHSA-xpw8-rcwv-8f8p 🚨 HIGH 4.1.96.Final 4.1.100.Final
io.netty:netty-handler CVE-2025-24970 🚨 HIGH 4.1.96.Final 4.1.118.Final
net.minidev:json-smart CVE-2021-31684 🚨 HIGH 1.3.2 1.3.3, 2.4.4
net.minidev:json-smart CVE-2023-1370 🚨 HIGH 1.3.2 2.4.9
org.apache.avro:avro CVE-2024-47561 🔥 CRITICAL 1.7.7 1.11.4
org.apache.avro:avro CVE-2023-39410 🚨 HIGH 1.7.7 1.11.3
org.apache.derby:derby CVE-2022-46337 🔥 CRITICAL 10.14.2.0 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0
org.apache.ivy:ivy CVE-2022-46751 🚨 HIGH 2.5.1 2.5.2
org.apache.mesos:mesos CVE-2018-1330 🚨 HIGH 1.4.3 1.6.0
org.apache.thrift:libthrift CVE-2019-0205 🚨 HIGH 0.12.0 0.13.0
org.apache.thrift:libthrift CVE-2020-13949 🚨 HIGH 0.12.0 0.14.0
org.apache.zookeeper:zookeeper CVE-2023-44981 🔥 CRITICAL 3.6.3 3.7.2, 3.8.3, 3.9.1
org.eclipse.jetty:jetty-server CVE-2024-13009 🚨 HIGH 9.4.56.v20240826 9.4.57.v20241219
org.lz4:lz4-java CVE-2025-12183 🚨 HIGH 1.8.0 1.8.1

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (19)

Package Vulnerability ID Severity Installed Version Fixed Version
Werkzeug CVE-2024-34069 🚨 HIGH 2.2.3 3.0.3
aiohttp CVE-2025-69223 🚨 HIGH 3.12.12 3.13.3
aiohttp CVE-2025-69223 🚨 HIGH 3.13.2 3.13.3
apache-airflow CVE-2025-68438 🚨 HIGH 3.1.5 3.1.6
apache-airflow CVE-2025-68675 🚨 HIGH 3.1.5 3.1.6
azure-core CVE-2026-21226 🚨 HIGH 1.37.0 1.38.0
cryptography CVE-2026-26007 🚨 HIGH 42.0.8 46.0.5
jaraco.context CVE-2026-23949 🚨 HIGH 5.3.0 6.1.0
jaraco.context CVE-2026-23949 🚨 HIGH 6.0.1 6.1.0
protobuf CVE-2026-0994 🚨 HIGH 4.25.8 6.33.5, 5.29.6
pyasn1 CVE-2026-23490 🚨 HIGH 0.6.1 0.6.2
python-multipart CVE-2026-24486 🚨 HIGH 0.0.20 0.0.22
ray CVE-2025-62593 🔥 CRITICAL 2.47.1 2.52.0
starlette CVE-2025-62727 🚨 HIGH 0.48.0 0.49.1
urllib3 CVE-2025-66418 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2025-66471 🚨 HIGH 1.26.20 2.6.0
urllib3 CVE-2026-21441 🚨 HIGH 1.26.20 2.6.3
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2
wheel CVE-2026-24049 🚨 HIGH 0.45.1 0.46.2

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

gitar-bot bot commented Feb 12, 2026

🔍 CI failure analysis for 831fd0a: All CI failures (Python 3.10 setup failure, Playwright E2E, Python 3.11 Oracle lineage) are unrelated to this PR's Python backend ingestion changes. py-run-tests 3.10 failed due to missing cachetools dependency during environment setup.

Issue

Multiple CI jobs have failed across different test suites:

  • Python Tests (py-run-tests 3.10): Test environment setup failure with missing cachetools dependency
  • Playwright E2E (playwright-ci-postgresql 2, 6): 1 failed + 3 flaky tests (290 passed - 98.6% pass rate)
  • Python Tests (py-run-tests 3.11): 4 Oracle view lineage failures (3821 passed - 99.9% pass rate)

Root Cause

All test failures are unrelated to the changes in this PR. The PR modifies only Python ingestion backend files:

  • ingestion/src/metadata/ingestion/source/database/datalake/connection.py (lazy imports for cloud clients)
  • ingestion/src/metadata/ingestion/source/mlmodel/mlflow/metadata.py (artifact URI fix)
  • ingestion/src/metadata/workflow/workflow_output_handler.py (IndexError fix)

Details

NEW: Python Test Environment Setup Failure (py-run-tests 3.10)

Error Type: Dependency installation failure during test environment setup

Error Message:

ModuleNotFoundError: No module named 'cachetools'

Location: /opt/hostedtoolcache/Python/3.10.19/x64/lib/python3.10/site-packages/metadata/ingestion/lineage/masker.py:22

Import Chain Leading to Failure:

validate_compose.py
  → metadata.utils.logger
    → metadata.__init__
      → metadata.profiler.metrics.registry
        → metadata.profiler.metrics.system.system
          → metadata.utils.importer
            → metadata.ingestion.api.steps
              → metadata.ingestion.api.step
                → metadata.ingestion.ometa.ometa_api
                  → metadata.ingestion.ometa.mixins.mlmodel_mixin
                    → metadata.ingestion.ometa.mixins.lineage_mixin
                      → metadata.ingestion.lineage.parser
                        → metadata.ingestion.lineage.masker
                          → from cachetools import LRUCache  # FAILS HERE

What Happened:

  1. CI job started test environment setup
  2. Docker containers built and started successfully (PostgreSQL, OpenSearch, OpenMetadata server, Airflow)
  3. Pip installed openmetadata-ingestion package
  4. During DAG validation, attempted to import metadata.utils.logger
  5. Import chain reached metadata.ingestion.lineage.masker which tried to import cachetools
  6. cachetools module not found - import failed
  7. DAG validation failed with exit code 1
  8. Setup continued with warning: "⚠ Warning: DAG validation failed with exit code 1"
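
A pre-flight check along these lines could surface a missing dependency such as cachetools before DAG validation starts. It is a sketch only: `missing_modules` is a hypothetical helper, it is not part of the CI scripts, and it is intended for top-level module names.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located for import."""
    # find_spec returns None for a top-level module that is not installed,
    # without importing it.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    absent = missing_modules(["cachetools"])
    if absent:
        print(f"Missing required modules: {', '.join(absent)}")
```

Running such a check right after the pip install step would report the gap directly instead of through a long transitive import chain.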

Additional Setup Issues Noted:

  1. Dependency Conflicts During Installation:

    litellm 1.80.9 requires grpcio<1.68.0,>=1.62.3, but you have grpcio 1.78.0
    opentelemetry-proto 1.27.0 requires protobuf<5.0,>=3.19, but you have protobuf 6.33.5
    pynacl 1.6.1 requires cffi>=2.0.0, but you have cffi 1.17.1
    snowflake-snowpark-python 1.43.0 requires protobuf<6.32,>=3.20, but you have protobuf 6.33.5
    

    These dependency version conflicts suggest broader issues with the pip installation in this environment.

  2. Missing DAG:

    ✗ Failed to unpause index_metadata (HTTP 404)
    Response: {"detail":"Dag with id: index_metadata was not found"}
    

    The index_metadata DAG was not available, suggesting incomplete test environment setup.
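
The four dependency conflicts reported above can be checked by hand with a small comparison. This is a deliberately simplified sketch: it handles only plain dotted numeric versions and the operators shown, it is not a full PEP 440 implementation, and the helper names are hypothetical.

```python
import operator
import re
from typing import Tuple

_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def parse_version(text: str, width: int = 4) -> Tuple[int, ...]:
    """Turn '6.33.5' into (6, 33, 5, 0); zero-padding keeps comparisons fair."""
    parts = [int(part) for part in text.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(installed: str, spec: str) -> bool:
    """Check an installed version against a spec such as '<5.0,>=3.19'."""
    have = parse_version(installed)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(<=|>=|==|<|>)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"Unsupported clause: {clause!r}")
        op, bound = match.groups()
        if not _OPS[op](have, parse_version(bound)):
            return False
    return True


# (requiring package, dependency, required spec, installed version), as reported
REPORTED = [
    ("litellm 1.80.9", "grpcio", "<1.68.0,>=1.62.3", "1.78.0"),
    ("opentelemetry-proto 1.27.0", "protobuf", "<5.0,>=3.19", "6.33.5"),
    ("pynacl 1.6.1", "cffi", ">=2.0.0", "1.17.1"),
    ("snowflake-snowpark-python 1.43.0", "protobuf", "<6.32,>=3.20", "6.33.5"),
]

conflicts = [req for req, _dep, spec, have in REPORTED if not satisfies(have, spec)]
```

All four reported entries fail their specifiers under this check, which matches the pip output quoted above.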

Why This Is Unrelated to PR Changes:

Aspect | This PR | Failure
Files Modified | datalake/connection.py, mlflow/metadata.py, workflow/workflow_output_handler.py | lineage/masker.py (NOT modified)
Changes Made | Lazy imports (S3/GCS/Azure), artifact URI fix, empty data guard | Missing cachetools import
Scope | Datalake connector, MLflow, workflow output | Lineage masking module
Type of Change | Code logic changes | Dependency installation issue

Detailed Analysis:

  1. This PR does NOT modify:

    • metadata/ingestion/lineage/masker.py (the file where import fails)
    • requirements.txt or setup.py (dependency specifications)
    • Installation or setup scripts
    • Any code related to cachetools dependency
  2. This PR modifies:

    • Datalake connection: Moves cloud client imports inside method (lazy loading)
    • MLflow metadata: Uses run.info.artifact_uri instead of version.source
    • Workflow output handler: Adds early return when summary table is empty
  3. The failure is environmental:

    • Occurred during test environment setup, before any tests ran
    • cachetools is a required dependency for the lineage masker module
    • The dependency should have been installed by pip but wasn't
    • Multiple other dependency version conflicts were also noted during installation
    • This indicates a CI environment pip installation problem
  4. No code path connection:

    • The modified files (Datalake, MLflow, workflow output) don't import or use lineage/masker.py
    • The lazy import changes in Datalake are for cloud clients (S3/GCS/Azure), not lineage
    • The MLflow artifact URI change doesn't affect lineage masking
    • The workflow output handler guard doesn't affect dependency installation

Playwright E2E Test Failures (playwright-ci-postgresql 2, 6) - Previously Reported

Test Results: 1 failed, 3 flaky, 290 passed (98.6% pass rate)

Failures:

  1. FAILED: AutoPilot.spec.ts - Service deletion showing wrong service name (Kafka expected, MySQL shown)
  2. FLAKY: AutoPilot.spec.ts - Agent status message timeout
  3. FLAKY: IncidentManager.spec.ts - Browser crash during 9-minute test
  4. FLAKY: TestSuitePipelineRedeploy.spec.ts - Pipeline success message timeout

Why Unrelated: Frontend TypeScript/React UI tests have no connection to Python backend ingestion module changes.

Python Test Failures (py-run-tests 3.11) - Previously Reported

Test Results: 4 failed, 3821 passed (99.9% pass rate)

Failures: All 4 in Oracle view lineage tests (test_view_lineage.py) - empty upstream edges when dependencies expected

Why Unrelated: Oracle connector is completely separate from Datalake/MLflow/workflow modules. Likely regression from main branch merge.

Summary Table: Modified Files vs Failures

Modified File | Changes | py-run-tests 3.10 Failure | Connection
datalake/connection.py | Lazy import S3/GCS/Azure clients | Missing cachetools in lineage/masker.py | None - different modules, no dependency changes
mlflow/metadata.py | Artifact URI resolution | Missing cachetools in lineage/masker.py | None - MLflow unrelated to lineage masker
workflow/workflow_output_handler.py | Empty data guard for tabulate() | Missing cachetools in lineage/masker.py | None - doesn't affect dependency installation

Supporting Evidence

  1. Failure occurred before test execution: py-run-tests 3.10 failed during environment setup, not during actual tests
  2. Dependency installation problem: Missing cachetools + 4 version conflicts indicate pip install issues
  3. Modified files don't touch affected areas:
    • Datalake: S3/GCS/Azure lazy imports
    • MLflow: artifact_uri vs source path
    • Workflow: empty data guard
    • None of these affect: lineage/masker.py, cachetools import, or dependency resolution
  4. File not modified by PR: lineage/masker.py (where error occurs) is not in the PR diff
  5. High pass rates in other jobs: 98.6% Playwright, 99.9% Python 3.11

Code Review: 👍 Approved with suggestions (0 resolved / 1 findings)

Clean, well-scoped bugfixes for three independent issues. The minor type annotation suggestion from the previous review remains unaddressed on the run parameter in _get_ml_store.

💡 Quality: Missing type annotation for run parameter in _get_ml_store

📄 ingestion/src/metadata/ingestion/source/mlmodel/mlflow/metadata.py:173

The new run parameter on _get_ml_store lacks a type annotation, while the rest of the file consistently uses type hints (e.g., version: ModelVersion, data: RunData). Since self.client.get_run() returns mlflow.entities.Run, the parameter should be typed as Optional[Run] to make the defensive if run and run.info guard self-documenting and to maintain consistency with the codebase's typing conventions.

Suggested fix
        run: Optional["Run"] = None,
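
To show how the suggested annotation and the `if run and run.info` guard fit together, here is a minimal sketch. The dataclasses are stand-ins for the MLflow entity types so the example runs without MLflow installed, `resolve_storage` is a hypothetical name, and falling back to `version.source` when no artifact URI is available is an assumption based on the description that the run's artifact_uri is preferred "when available".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunInfo:
    """Stand-in for the MLflow run info object."""
    artifact_uri: Optional[str] = None


@dataclass
class Run:
    """Stand-in for mlflow.entities.Run."""
    info: Optional[RunInfo] = None


@dataclass
class ModelVersion:
    """Stand-in for the MLflow model version entity."""
    source: Optional[str] = None


def resolve_storage(version: ModelVersion, run: Optional[Run] = None) -> Optional[str]:
    """Prefer the run's artifact_uri; otherwise use version.source (assumed fallback)."""
    if run and run.info and run.info.artifact_uri:
        return run.info.artifact_uri
    return version.source
```

Typing the parameter as `Optional[Run]` makes it explicit to readers and static checkers that callers may pass `None`, which is exactly the case the guard handles.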

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs
