Unbounded object_cache size #1720

@c-thiel

Description

Apache Iceberg Rust version

None

Describe the bug

Currently we use a weighted moka Cache to store read objects:

cache: moka::future::Cache::builder()
    .weigher(|_, val: &CachedItem| match val {
        CachedItem::ManifestList(item) => size_of_val(item.as_ref()),
        CachedItem::Manifest(item) => size_of_val(item.as_ref()),
    } as u32)
    .max_capacity(cache_size_bytes)
    .build(),

However, the current implementation only accounts for the shallow size of the struct itself (what size_of_val reports), not for the heap allocations it owns.
Thus, a cached ManifestList reports a small size_of_val, while its many entries can occupy a significant amount of memory on the heap.
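
To illustrate the gap, here is a minimal standalone sketch. FakeManifestList is a hypothetical stand-in, not the real iceberg-rust type, and approx_deep_size is only one rough way to count owned heap buffers. It shows that the shallow size stays constant no matter how many entries the list holds:

```rust
use std::mem::{size_of, size_of_val};
use std::sync::Arc;

// Hypothetical stand-in for a cached manifest list; not the real iceberg-rust type.
struct FakeManifestList {
    entries: Vec<String>,
}

/// Shallow size, computed the same way as the weigher above.
fn shallow_size(item: &Arc<FakeManifestList>) -> usize {
    size_of_val(item.as_ref())
}

/// Rough deep size: shallow size plus the heap buffers owned by the Vec and its Strings.
fn approx_deep_size(item: &Arc<FakeManifestList>) -> usize {
    let list = item.as_ref();
    let vec_buffer = list.entries.capacity() * size_of::<String>();
    let string_bytes: usize = list.entries.iter().map(|s| s.capacity()).sum();
    size_of_val(list) + vec_buffer + string_bytes
}

fn main() {
    let small = Arc::new(FakeManifestList { entries: Vec::new() });
    let large = Arc::new(FakeManifestList {
        entries: (0..10_000)
            .map(|i| format!("s3://bucket/manifest-{i}.avro"))
            .collect(),
    });
    // The shallow sizes are identical; only the deep estimate grows with the entries.
    println!("shallow: small={} large={}", shallow_size(&small), shallow_size(&large));
    println!("deep:    small={} large={}", approx_deep_size(&small), approx_deep_size(&large));
}
```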

I am unsure if an exact calculation is feasible. I would opt for a rough estimate, such as 2KB for a ManifestList and 8KB for a Manifest.
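
A minimal sketch of what such a flat estimate could look like. CachedKind, estimated_weight and max_items are hypothetical names used for illustration; the real code would match on the CachedItem variants instead, and the constants are just the guesses from above, not measured values:

```rust
/// Hypothetical stand-in for the variants of `CachedItem`; payloads are omitted.
#[derive(Clone, Copy)]
enum CachedKind {
    ManifestList,
    Manifest,
}

// Assumed flat estimates taken from the suggestion above, not measurements.
const MANIFEST_LIST_WEIGHT: u32 = 2 * 1024;
const MANIFEST_WEIGHT: u32 = 8 * 1024;

/// Flat per-item weight in bytes, returned as u32 as moka's weigher expects.
fn estimated_weight(kind: CachedKind) -> u32 {
    match kind {
        CachedKind::ManifestList => MANIFEST_LIST_WEIGHT,
        CachedKind::Manifest => MANIFEST_WEIGHT,
    }
}

/// How many items of one kind fit under a given byte budget with these estimates.
fn max_items(capacity_bytes: u64, kind: CachedKind) -> u64 {
    capacity_bytes / u64::from(estimated_weight(kind))
}

fn main() {
    let capacity_bytes: u64 = 32 * 1024 * 1024; // example budget of 32 MiB
    println!("manifest lists: {}", max_items(capacity_bytes, CachedKind::ManifestList));
    println!("manifests:      {}", max_items(capacity_bytes, CachedKind::Manifest));
}
```

With flat weights the cache effectively bounds the number of entries rather than their true memory footprint, so an unusually large ManifestList would still be under-counted; scaling the estimate by the number of entries might be a middle ground.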

Let me know what you think!

To Reproduce

Create a ManifestList with many entries. Read it through the cache and watch memory usage significantly overshoot the configured max_capacity.

Expected behavior

No response

Willingness to contribute

None

Labels: bug
