6 changes: 3 additions & 3 deletions data/implementations/features/api-features.yaml
@@ -1,8 +1,8 @@
 category_id: api-features
 features:
-  - id: api-external-column-data
-    display_name: External column data
-    note: "In parquet.thrift: ColumnChunk->file_path"
+  - id: api-parquet-summary-file
+    display_name: Parquet summary file
+    note: "Files named _metadata that consolidate footers from multiple parquet files."
Collaborator

I think the reference to the file_path field will help keep this specific, so perhaps something like:

Suggested change
-    note: "Files named _metadata that consolidate footers from multiple parquet files."
+    note: "Files named _metadata that consolidate footers from multiple parquet files, stored in ColumnChunk->file_path"

Contributor

Is "parquet summary file" really the same as supporting file_path?

Maybe summary files are the only known use case that actually uses file_path? But nothing in the spec says file_path is necessarily associated with "summary files". In fact, is there anything in the spec about summary files, or is that just an implementation choice some people made?

I'm not opposed to having a row for support of summary files, but that seems like a different question than file_path support?

Contributor Author

Yeah, I'd err on the side of not including a reference to file_path here.

> In fact, is there anything in the spec about summary files, or is that just an implementation choice some people made?

There is not. apache/parquet-format#542 has some more details.
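
To make the distinction discussed in this thread concrete, here is a toy Python model. It is only a sketch: the classes are hypothetical stand-ins, not the Thrift definitions or any library's API, and it assumes the common convention that a `_metadata` summary file records the originating data file in each column chunk's file_path. It shows that a summary file is one way of using file_path, while "does any column chunk point outside this file" is a separate, more general question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ColumnChunk:
    """Hypothetical stand-in for the ColumnChunk struct in parquet.thrift."""
    column: str
    file_offset: int
    # None means the data lives in the same file as this metadata.
    file_path: Optional[str] = None


@dataclass
class RowGroup:
    num_rows: int
    columns: List[ColumnChunk] = field(default_factory=list)


@dataclass
class Footer:
    row_groups: List[RowGroup] = field(default_factory=list)


def build_summary(footers: Dict[str, Footer]) -> Footer:
    """Consolidate per-file footers into one `_metadata`-style footer.

    Assumption: each copied column chunk gets file_path set to the data
    file it came from, so a reader of the summary can find the pages.
    """
    summary = Footer()
    for path, footer in sorted(footers.items()):
        for rg in footer.row_groups:
            copied = [
                ColumnChunk(c.column, c.file_offset, file_path=path)
                for c in rg.columns
            ]
            summary.row_groups.append(RowGroup(rg.num_rows, copied))
    return summary


def uses_external_column_data(footer: Footer) -> bool:
    """True if any column chunk points outside the file holding the footer."""
    return any(
        c.file_path is not None
        for rg in footer.row_groups
        for c in rg.columns
    )


# Two ordinary data files, each with one row group and no file_path set.
part0 = Footer([RowGroup(100, [ColumnChunk("id", 4)])])
part1 = Footer([RowGroup(50, [ColumnChunk("id", 4)])])
summary = build_summary({"part-0.parquet": part0, "part-1.parquet": part1})
```

In this model a reader could support `uses_external_column_data` footers without ever producing or looking for a `_metadata` file, which is why the two capabilities may deserve separate rows.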


   - id: api-sorting-columns
     display_name: Row group "Sorting column" metadata
2 changes: 1 addition & 1 deletion data/implementations/support/arrow-go.yaml
@@ -125,7 +125,7 @@ support:
     status: read
   format-data-page-v2:
     status: full
-  api-external-column-data:
+  api-parquet-summary-file:
     status: none
   api-sorting-columns:
     status: full
2 changes: 1 addition & 1 deletion data/implementations/support/arrow-rs.yaml
@@ -132,7 +132,7 @@ support:
     status: full
   format-data-page-v2:
     status: full
-  api-external-column-data:
+  api-parquet-summary-file:
     status: none
   api-sorting-columns:
     status: full
2 changes: 1 addition & 1 deletion data/implementations/support/arrow.yaml
@@ -133,7 +133,7 @@ support:
     since_version_write: "19.0.0"
   format-data-page-v2:
     status: full
-  api-external-column-data:
+  api-parquet-summary-file:
     status: full
   api-sorting-columns:
     status: full
2 changes: 1 addition & 1 deletion data/implementations/support/cudf.yaml
@@ -114,7 +114,7 @@ support:
     status: full
   format-data-page-v2:
     status: full
-  api-external-column-data:
+  api-parquet-summary-file:
     status: write
   api-sorting-columns:
     status: write
2 changes: 1 addition & 1 deletion data/implementations/support/duckdb.yaml
@@ -126,7 +126,7 @@ support:
     status: read
   format-data-page-v2:
     status: full
-  api-external-column-data:
+  api-parquet-summary-file:
     status: none
   api-sorting-columns:
     status: read
4 changes: 2 additions & 2 deletions data/implementations/support/hyparquet.yaml
@@ -113,8 +113,8 @@ support:
     status: unknown
   format-data-page-v2:
     status: full
-  api-external-column-data:
-    status: full
+  api-parquet-summary-file:
+    status: none
Contributor Author

@platypii I looked at the hyparquet documentation quickly and didn't see this mentioned, but please let me know if I missed it.

   api-sorting-columns:
     status: none
   api-rowgroup-pruning-stats:
2 changes: 1 addition & 1 deletion data/implementations/support/parquet-java.yaml
@@ -123,7 +123,7 @@ support:
     status: full
   format-data-page-v2:
     status: full
-  api-external-column-data:
+  api-parquet-summary-file:
     status: full
   api-sorting-columns:
     status: none
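
Because this change renames one feature id across the feature list and every support file, a small consistency check can catch a file that was missed. The following is a minimal sketch, not part of this PR: the helper names are hypothetical, the embedded YAML snippets and their indentation are assumptions modelled on the hunks above, and it uses plain regular expressions rather than a YAML parser to stay dependency-free.

```python
import re

# Assumed shape of data/implementations/features/api-features.yaml after the rename.
FEATURES_YAML = """\
category_id: api-features
features:
  - id: api-parquet-summary-file
    display_name: Parquet summary file
  - id: api-sorting-columns
    display_name: Row group "Sorting column" metadata
"""

# Assumed shape of a support file that still uses the old key.
SUPPORT_YAML = """\
support:
  format-data-page-v2:
    status: full
  api-external-column-data:
    status: none
  api-sorting-columns:
    status: full
"""


def feature_ids(text):
    """Collect ids declared as `- id: <name>` list entries."""
    return set(re.findall(r"^[ \t]*-[ \t]+id:[ \t]*(\S+)[ \t]*$", text, flags=re.M))


def support_keys(text, prefix="api-"):
    """Collect indented mapping keys that start with the given prefix."""
    pattern = r"^[ \t]+(" + re.escape(prefix) + r"[\w-]+):[ \t]*$"
    return set(re.findall(pattern, text, flags=re.M))


def stale_keys(features_text, support_text):
    """Support keys that no longer correspond to a declared feature id."""
    return sorted(support_keys(support_text) - feature_ids(features_text))


def rename_key(support_text, old, new):
    """Rename one mapping key, preserving its indentation."""
    pattern = r"^([ \t]+)" + re.escape(old) + r":[ \t]*$"
    return re.sub(pattern, lambda m: m.group(1) + new + ":", support_text, flags=re.M)


fixed = rename_key(SUPPORT_YAML, "api-external-column-data", "api-parquet-summary-file")
print(stale_keys(FEATURES_YAML, SUPPORT_YAML))
print(stale_keys(FEATURES_YAML, fixed))
```

The check only covers keys with the `api-` prefix because the sample feature list contains only that category; a real check would read every feature file and would be better served by a proper YAML parser.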