
Datalake CSV Ingestion OOM #24886

@keshavmohta09

Description


Affected module
Ingestion Framework (data lake connector).

Describe the bug
An out-of-memory (OOM) error occurs when reading a large CSV file during metadata ingestion in the data lake connector.

To Reproduce

  • Create a CSV file of approximately 3.5 GB.
  • Run metadata ingestion using the data lake connector.


Expected behavior
Metadata ingestion should complete successfully without running out of memory and should handle large CSV files correctly.
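For reference, the OOM typically happens when the entire CSV is materialized in memory just to infer its schema. A common mitigation is to read only a bounded sample for schema inference and, where row-level work is needed, stream the file in chunks. This is a minimal sketch using pandas (`nrows` and `chunksize` are standard `read_csv` parameters); the helper names are hypothetical, not OpenMetadata APIs:

```python
import io

import pandas as pd


def sample_csv_schema(file_obj, sample_rows: int = 100) -> dict:
    """Infer column names and dtypes from only the first rows of a CSV,
    instead of loading a multi-GB file into memory. Hypothetical helper."""
    # nrows bounds how much pandas reads, keeping memory use constant
    df = pd.read_csv(file_obj, nrows=sample_rows)
    return {col: str(dtype) for col, dtype in df.dtypes.items()}


def count_rows_chunked(file_obj, chunk_size: int = 10_000) -> int:
    """Stream the file in fixed-size chunks rather than one giant frame."""
    total = 0
    for chunk in pd.read_csv(file_obj, chunksize=chunk_size):
        total += len(chunk)
    return total


# Usage: a small in-memory CSV stands in for a large file on the lake
csv_text = "id,name,score\n1,alice,9.5\n2,bob,7.0\n"
schema = sample_csv_schema(io.StringIO(csv_text))
rows = count_rows_chunked(io.StringIO(csv_text))
```

Sampling keeps schema extraction O(sample_rows) in memory regardless of file size, which is all metadata ingestion usually needs.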

Version:

  • OS: [e.g. iOS]
  • Python version:
  • OpenMetadata version: [e.g. 0.8]
  • OpenMetadata Ingestion package version: [e.g. openmetadata-ingestion[docker]==XYZ]


Labels: bug (Something isn't working)

Status: In Progress 🏗️