[pull] main from quickwit-oss:main #186

Merged

pull[bot] merged 3 commits into age-rs:main from quickwit-oss:main
Apr 8, 2026

Conversation


@pull pull bot commented Apr 8, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

g-talbot and others added 3 commits April 8, 2026 07:55
* feat: replace fixed MetricDataPoint fields with dynamic tag HashMap

* feat: replace ParquetField enum with constants and dynamic validation

* feat: derive sort order and bloom filters from batch schema

* feat: union schema accumulation and schema-agnostic ingest validation

* feat: dynamic column lookup in split writer

* feat: remove ParquetSchema dependency from indexing actors

* refactor: deduplicate test batch helpers

* lint

* feat(31): sort schema foundation — proto, parser, display, validation, window, TableConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rustdoc link errors — use backticks for private items

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>

* style: rustfmt long match arm in default_sort_fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
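The first commit's headline change is replacing fixed MetricDataPoint fields with a dynamic tag HashMap. A minimal std-only sketch of that shape (field names here are illustrative assumptions, not the crate's actual definition):

```rust
use std::collections::HashMap;

#[derive(Debug)]
struct MetricDataPoint {
    name: String,
    timestamp_secs: i64,
    value: f64,
    // Arbitrary tag keys instead of a fixed set of columns.
    tags: HashMap<String, String>,
}

fn main() {
    let mut tags = HashMap::new();
    tags.insert("host".to_string(), "web-01".to_string());
    tags.insert("region".to_string(), "eu-west-1".to_string());

    let point = MetricDataPoint {
        name: "cpu.usage".to_string(),
        timestamp_secs: 1_700_000_000,
        value: 0.42,
        tags,
    };

    // Any tag key can be looked up dynamically; unknown keys simply miss,
    // which is what makes schema-agnostic ingest validation possible.
    assert_eq!(point.tags.get("host").map(String::as_str), Some("web-01"));
    assert!(point.tags.get("datacenter").is_none());
    println!("{} = {}", point.name, point.value);
}
```

With tags as a map, downstream code (union schema accumulation, dynamic column lookup in the split writer) can derive columns from the data rather than from a hardcoded enum.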
* feat: replace fixed MetricDataPoint fields with dynamic tag HashMap

* feat: replace ParquetField enum with constants and dynamic validation

* feat: derive sort order and bloom filters from batch schema

* feat: union schema accumulation and schema-agnostic ingest validation

* feat: dynamic column lookup in split writer

* feat: remove ParquetSchema dependency from indexing actors

* refactor: deduplicate test batch helpers

* lint

* feat(31): sort schema foundation — proto, parser, display, validation, window, TableConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rustdoc link errors — use backticks for private items

* feat(31): compaction metadata types — extend split metadata, postgres model, field lookup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>

* style: rustfmt long match arm in default_sort_fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: make parquet_file field backward-compatible in MetricsSplitMetadata

Pre-existing splits were serialized before the parquet_file field was
added, so their JSON doesn't contain it. Adding #[serde(default)]
makes deserialization fall back to an empty string for old splits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
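The backward-compatibility fix above hinges on missing keys falling back to a field's Default. A std-only sketch of the principle, with a hand-rolled map lookup standing in for serde deserialization (the real code just adds #[serde(default)] to the field):

```rust
use std::collections::HashMap;

#[derive(Debug, Default, PartialEq)]
struct MetricsSplitMetadata {
    split_id: String,
    parquet_file: String, // field added after old splits were written
}

// Stand-in for serde deserialization: a missing key falls back to the
// field type's Default (an empty String), which is exactly what
// #[serde(default)] arranges for real JSON input.
fn from_map(map: &HashMap<&str, &str>) -> MetricsSplitMetadata {
    MetricsSplitMetadata {
        split_id: map.get("split_id").map(|s| s.to_string()).unwrap_or_default(),
        parquet_file: map.get("parquet_file").map(|s| s.to_string()).unwrap_or_default(),
    }
}

fn main() {
    // Old split: serialized before parquet_file existed.
    let old = HashMap::from([("split_id", "split-001")]);
    // New split: carries the field.
    let new = HashMap::from([("split_id", "split-002"), ("parquet_file", "part-0.parquet")]);

    assert_eq!(from_map(&old).parquet_file, ""); // falls back instead of erroring
    assert_eq!(from_map(&new).parquet_file, "part-0.parquet");
}
```

Without the default, deserializing an old split's JSON would fail outright on the missing field.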
* feat: replace fixed MetricDataPoint fields with dynamic tag HashMap

* feat: replace ParquetField enum with constants and dynamic validation

* feat: derive sort order and bloom filters from batch schema

* feat: union schema accumulation and schema-agnostic ingest validation

* feat: dynamic column lookup in split writer

* feat: remove ParquetSchema dependency from indexing actors

* refactor: deduplicate test batch helpers

* lint

* feat(31): sort schema foundation — proto, parser, display, validation, window, TableConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: rustdoc link errors — use backticks for private items

* feat(31): compaction metadata types — extend split metadata, postgres model, field lookup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(31): wire TableConfig into sort path, add compaction KV metadata

Wire TableConfig-driven sort order into ParquetWriter and add
self-describing Parquet file metadata for compaction:

- ParquetWriter::new() takes &TableConfig, resolves sort fields at
  construction via parse_sort_fields() + ParquetField::from_name()
- sort_batch() uses resolved fields with per-column direction (ASC/DESC)
- SS-1 debug_assert verification: re-sort and check identity permutation
- build_compaction_key_value_metadata(): embeds sort_fields, window_start,
  window_duration, num_merge_ops, row_keys (base64) in Parquet kv_metadata
- SS-5 verify_ss5_kv_consistency(): kv_metadata matches source struct
- write_to_file_with_metadata() replaces write_to_file()
- prepare_write() shared method for bytes and file paths
- ParquetWriterConfig gains to_writer_properties_with_metadata()
- ParquetSplitWriter passes TableConfig through
- All callers in quickwit-indexing updated with TableConfig::default()
- 23 storage tests pass including META-07 self-describing roundtrip

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
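The self-describing metadata described above boils down to flattening compaction state into string key-value pairs, the shape Parquet accepts as file-level kv_metadata. A std-only sketch under stated assumptions: struct, function, and key names are inferred from the commit message, not the actual API, and the real code additionally base64-encodes row keys.

```rust
struct CompactionMeta {
    sort_fields: Vec<String>,
    window_start: i64,
    window_duration_secs: i64,
    num_merge_ops: u32,
}

// Flatten compaction state into (key, value) string pairs suitable for
// embedding as Parquet file-level key-value metadata.
fn build_compaction_key_value_metadata(m: &CompactionMeta) -> Vec<(String, String)> {
    vec![
        ("sort_fields".to_string(), m.sort_fields.join(",")),
        ("window_start".to_string(), m.window_start.to_string()),
        ("window_duration".to_string(), m.window_duration_secs.to_string()),
        ("num_merge_ops".to_string(), m.num_merge_ops.to_string()),
    ]
}

fn main() {
    let meta = CompactionMeta {
        sort_fields: vec!["timestamp".to_string(), "metric_name".to_string()],
        window_start: 1_700_000_000,
        window_duration_secs: 3_600,
        num_merge_ops: 2,
    };
    let kv = build_compaction_key_value_metadata(&meta);
    // A compactor reading the file back can recover the sort order and
    // window bounds without consulting an external catalog.
    assert_eq!(kv[0].1, "timestamp,metric_name");
    assert_eq!(kv[3].1, "2");
}
```

Making files self-describing is what the META-07 roundtrip test exercises: whatever is written into kv_metadata must match the source struct on read-back (the SS-5 check).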

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>

* Update quickwit/quickwit-parquet-engine/src/table_config.rs

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>

* style: rustfmt long match arm in default_sort_fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: make parquet_file field backward-compatible in MetricsSplitMetadata

Pre-existing splits were serialized before the parquet_file field was
added, so their JSON doesn't contain it. Adding #[serde(default)]
makes deserialization fall back to an empty string for old splits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: handle empty-column batches in accumulator flush

When the commit timeout fires and the accumulator contains only
zero-column batches, union_fields is empty and concat_batches fails
with "must either specify a row count or at least one column".
Now flush_internal treats empty union_fields the same as empty
pending_batches — resets state and returns None.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Matthew Kim <matthew.kim@datadoghq.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
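The flush guard in the final fix can be sketched std-only. The real code concatenates Arrow RecordBatches, which is what errors on zero columns; batches are modeled here as plain row counts, and all names are illustrative.

```rust
#[derive(Default)]
struct Accumulator {
    union_fields: Vec<String>,   // accumulated union schema
    pending_batches: Vec<usize>, // stand-in for buffered RecordBatches
}

impl Accumulator {
    fn flush_internal(&mut self) -> Option<(Vec<String>, usize)> {
        // Empty union_fields is treated the same as empty pending_batches:
        // reset state and return None instead of attempting a concat that
        // would fail with "must either specify a row count or at least
        // one column".
        if self.pending_batches.is_empty() || self.union_fields.is_empty() {
            *self = Accumulator::default();
            return None;
        }
        let rows = self.pending_batches.drain(..).sum();
        let fields = std::mem::take(&mut self.union_fields);
        Some((fields, rows))
    }
}

fn main() {
    // Commit timeout fires with only zero-column batches buffered.
    let mut acc = Accumulator { union_fields: vec![], pending_batches: vec![0, 0] };
    assert!(acc.flush_internal().is_none());

    // Normal case: columns and rows are present, so the flush produces output.
    let mut acc = Accumulator { union_fields: vec!["ts".to_string()], pending_batches: vec![3, 2] };
    assert_eq!(acc.flush_internal(), Some((vec!["ts".to_string()], 5)));
}
```

The design choice is to make the degenerate flush a no-op rather than an error, since a timeout with nothing meaningful buffered is an expected steady-state condition, not a fault.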
@pull pull bot locked and limited conversation to collaborators Apr 8, 2026
@pull pull bot added the ⤵️ pull label Apr 8, 2026
@pull pull bot merged commit 7645703 into age-rs:main Apr 8, 2026
