feat(dataset): add collectAllKeys option for full CSV export (#2945) #3007
Merged
janbuchar merged 4 commits into apify:master on Jun 16, 2025
Contributor
Hi! 👋 Just checking in to see if there's anything more I can do on this PR or if any changes are needed. Thanks for reviewing when you get a chance!
janbuchar requested changes on Jun 13, 2025
B4nan requested changes on Jun 16, 2025
Member B4nan left a comment:
please fix the formatting (just run yarn format)
B4nan approved these changes on Jun 16, 2025
Contributor
Author
Hi! The changes are now ready and reviewed. Could someone please approve the CI so the workflow can run? Thanks in advance!
barjin added a commit that referenced this pull request on Aug 13, 2025
Ports the `collectAllKeys` option from #3007 to the `BasicCrawler.exportData` method.
What Changed
This PR adds support for a new option in `Dataset.exportToCSV()`: `collectAllKeys?: boolean`.
When set to `true`, this option ensures that all unique keys across all dataset items are included in the exported CSV header, rather than only using the keys from the first item.
Motivation
Fixes #2945.
When datasets contain heterogeneous objects (e.g., items with different fields), using only the first item’s keys results in incomplete exports. This change allows full field coverage.
Implementation Details
- Added `collectAllKeys?: boolean` to the `DatasetExportToOptions` interface.
- Modified the export logic to:
  - Gather keys across all items using `Array.from(new Set(...))` when the option is `true`.
  - Fall back to the default behavior otherwise.
  - Handle empty dataset case gracefully.
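The key-gathering step listed above can be sketched roughly as follows. This is a minimal standalone illustration, not the code from this PR: `resolveCsvHeader` and the `Item` type are hypothetical names, and returning an empty header for an empty dataset is an assumption about what "gracefully" means here.

```typescript
// Hypothetical item shape; real dataset items may be arbitrary JSON objects.
type Item = Record<string, unknown>;

// Hypothetical helper for illustration only, not part of the Crawlee API.
function resolveCsvHeader(items: Item[], collectAllKeys = false): string[] {
    // Assumed empty-dataset handling: no items means no columns.
    if (items.length === 0) return [];

    if (collectAllKeys) {
        // Union of keys across all items. A Set keeps first-seen insertion
        // order, so columns appear in the order they are first encountered.
        const keys = new Set<string>();
        for (const item of items) {
            for (const key of Object.keys(item)) keys.add(key);
        }
        return Array.from(keys);
    }

    // Default behavior: only the first item's keys form the header.
    return Object.keys(items[0]);
}
```

With the option off, any field that first appears in a later item never gets a column, which is the gap described in the Motivation section.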
Test Coverage
Added a test in `dataset.test.ts` to verify:
- Export includes all fields (`id`, `name`, `age`) from multiple items with varying structures.
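To illustrate the scenario the test covers, here is a standalone sketch that does not reproduce the contents of `dataset.test.ts`. The `toCsv` function is a hypothetical, simplified serializer (no quoting or escaping), the sample rows are invented, and rendering missing fields as empty cells is an assumption.

```typescript
type Row = Record<string, unknown>;

// Hypothetical, simplified CSV serializer for illustration only.
function toCsv(rows: Row[], collectAllKeys: boolean): string {
    if (rows.length === 0) return '';

    // Header is either the union of all keys or just the first row's keys.
    const header = collectAllKeys
        ? Array.from(new Set(rows.reduce<string[]>((acc, row) => acc.concat(Object.keys(row)), [])))
        : Object.keys(rows[0]);

    // Assumption: a field missing from a row is written as an empty cell.
    const lines = rows.map((row) =>
        header.map((key) => (key in row ? String(row[key]) : '')).join(','),
    );

    return [header.join(','), ...lines].join('\n');
}

// Invented sample data: two items with different fields.
const rows: Row[] = [{ id: 1, name: 'Alice' }, { id: 2, age: 30 }];

console.log(toCsv(rows, true));
// id,name,age
// 1,Alice,
// 2,,30
```

Calling `toCsv(rows, false)` on the same data yields the header `id,name`, so the `age` column is dropped.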