fix(write-buffer): use explicit capacity instead of defaults #27099
Merged
Conversation
This change is based on an assumed issue with data that spans many months in very small WAL files. We assume the data is present in most of the 10-minute table chunks over the time range: sparse data with only a few rows per 10 minutes. The TableBuffer creates a MutableTableChunk for each 10-minute chunk in the months-long range. With arrow's default 1024-element allocations for our tag and field information, this can mean substantial in-memory use for WAL files that serialize to a few megabytes or less; it is akin to a zip bomb. If the WAL files were larger, snapshots would have been triggered. Instead, the TableBuffer grows to many GBs.

The changes here use the exact capacity that can be determined when rows are added to the MutableTableChunk. Workloads that were coincidentally tuned to the 1024-element default will see more allocations, but, as with Vecs, the exponential reservations catch up fast.

* port of influxdata/influxdb_pro#2071
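A back-of-the-envelope sketch of the effect described above, in plain std Rust (not the actual arrow builders): default-sized allocations blow up across many sparse chunks, while exact sizing stays small, and Vec-style exponential growth keeps the allocation count low for dense workloads that start from a small capacity. The chunk and row counts are illustrative assumptions, not measurements from the issue.

```rust
fn main() {
    const ARROW_DEFAULT_CAPACITY: usize = 1024; // arrow's default element allocation
    let rows_per_chunk = 3; // sparse: only a few rows per 10-minute chunk
    let chunks = 6 * 30 * 24 * 6; // ~6 months of 10-minute chunks = 25_920

    // Reserving the default per chunk vs. reserving exactly what the rows
    // need (assuming 8-byte values for simplicity):
    let default_bytes = chunks * ARROW_DEFAULT_CAPACITY * 8; // ~212 MB per column
    let exact_bytes = chunks * rows_per_chunk * 8; // ~0.6 MB per column
    assert!(default_bytes > 300 * exact_bytes);

    // Starting small is fine for dense workloads too: Vec's exponential
    // growth needs only ~log2(n) reallocations to absorb n pushes.
    let mut v: Vec<u64> = Vec::new();
    let (mut reallocs, mut cap) = (0, v.capacity());
    for i in 0..100_000u64 {
        v.push(i);
        if v.capacity() != cap {
            reallocs += 1;
            cap = v.capacity();
        }
    }
    assert!(reallocs <= 20); // a handful of doublings, not 100_000 allocations

    println!("default: {default_bytes} B, exact: {exact_bytes} B, reallocs: {reallocs}");
}
```

The ~340x gap between the two reservation strategies is what makes the small-WAL-file case look like a zip bomb, while the reallocation count shows why losing the 1024-element head start costs little.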
waynr approved these changes Jan 7, 2026
waynr pushed a commit that referenced this pull request Jan 12, 2026
waynr added a commit that referenced this pull request Jan 12, 2026
* fix: improve various error messages (#27084)
  * fix: show WriteLineError.error_message in ParseError display
  * fix: show source of reqwest::Error in influxdb3_client
* feat: add --tls-no-verify option to CLI subcommands (#2096) (#27102)
  * feat: add --tls-no-verify option to non-serve subcommands
  * test: validate --tls-no-verify flag
* fix: ignore cargo audit for unmaintained crate bincode (#27101)
  Ignores RUSTSEC-2025-0141 until we can migrate away. We have previously accepted unmaintained crates via #27009 and #26112.
* chore: point install script to 3.8.0 (#27030)
* feat(port): backport retention and delete fixes from ent to core (#27074)
  This PR backports several bug fixes and improvements related to retention and deletion from the `influxdata/influxdb_pro` repository:
  - influxdb_pro PR #1986: fix catalog to prevent deleting tables from already deleted databases
  - influxdb_pro PR #1991: update error message from "delete" to "modify" for `AlreadyDeleted` error
  - influxdb_pro PR #2043: add resource name to `AlreadyDeleted` error for better error messages
  - influxdb_pro PR #2046: set default retention period for `_internal` database to 7 days
  Changes:
  - Add check in soft_delete_table to return `AlreadyDeleted` error if the database is already deleted
  - Change `CatalogError::AlreadyDeleted` to include the resource name
  - Update all `AlreadyDeleted` error sites to include the resource name
  - Add `INTERNAL_DB_RETENTION_PERIOD` constant (7 days)
  - Update `create_internal_db` to use the retention period
* fix(tests): use a 2125 date for "future dates" instead of 2025 (#27080)
* fix: add EOF marker to end of metrics scrape (#27083)
* chore(deps): bump rsa from 0.9.9 to 0.9.10 (#27088)
  Bumps [rsa](https://github.com/RustCrypto/RSA) from 0.9.9 to 0.9.10.
  - [Changelog](https://github.com/RustCrypto/RSA/blob/v0.9.10/CHANGELOG.md)
  - [Commits](RustCrypto/RSA@v0.9.9...v0.9.10)
  updated-dependencies: rsa 0.9.10 (indirect)
* fix(write-buffer): use explicit capacity instead of defaults (#27099)
  Uses the exact capacity determinable when rows are added to the MutableTableChunk; port of influxdata/influxdb_pro#2071 (full description above).
* feat: add retention to show output (#1680) (#27107)
  * chore: additional retention logging
  * feat: retention show output
  * chore: display retention period in human readable format

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Phil Bracikowski <[email protected]>
Co-authored-by: Chunchun Ye <[email protected]>
Co-authored-by: Joe-Blount <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Lili Cosic <[email protected]>