Conversation

@dk107dk dk107dk commented Feb 14, 2025

Added CsvPath Framework to the data ingestion tools with a link to csvpath.org.

David

What is this tool for?

Bringing CSV, Excel, data frames, and other delimited files into the organization in a more controlled way. It adds durable identification, enables validation and data upgrading/canonicalization, and stages data for import into the data lake from an immutable archive. CsvPath Framework is pre-integrated with OpenTelemetry and OpenLineage for observability and tracing.

What's the difference between this tool and similar ones?

There are no known open source or commercial products that take a similar approach to automated delimited-data preboarding built to scale to any number of data partners. For user self-service delimited file uploading, there are Flatfile and OneSchema. The Frictionless Framework plus CKAN provides validated publishing of delimited-file datasets, which is adjacent to preboarding when publishing within the enterprise. (CsvPath Framework is also pre-integrated with CKAN, but it takes a different, more DataOps-oriented infrastructure approach with a focus on volume.)


Anyone who agrees with this pull request could submit an Approve review to it.

