Description
Hi, I'm hoping someone will be able to help us resolve a memory issue with TileDB-Py...
When running a large number of queries against a TileDB dense array, memory usage steadily accumulates. It can grow to several GB and eventually causes an out-of-memory error.
Periodically re-instantiating the TileDB context object keeps memory usage under control. This suggests the memory is held by the context object and is correctly released when the context is garbage collected. Unfortunately, periodically re-instantiating the context isn't a practical workaround for our more complex scripts (e.g. multiple open arrays inside PyTorch Datasets, the need to track the number of queries between re-instantiations, etc.), so we're hoping the underlying issue can be resolved.
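To illustrate the awkwardness, here is a rough, hypothetical sketch of what the workaround looks like inside a PyTorch Dataset (class and parameter names are ours for illustration, not our actual code): every dataset instance has to carry its own query counter and rebuild both its context and its array handle.

import tiledb
from torch.utils.data import Dataset

class TileDBDataset(Dataset):
    """Hypothetical sketch of the workaround inside a PyTorch Dataset."""

    def __init__(self, path, refresh_every=10000):
        self.path = path
        self.refresh_every = refresh_every
        self.num_queries = 0
        self._reopen()

    def _reopen(self):
        # Re-instantiating the context releases the accumulated memory, but
        # every array handle bound to the old context must be rebuilt as well.
        self.ctx = tiledb.Ctx({"sm.tile_cache_size": 0})
        self.array = tiledb.open(self.path, mode="r", ctx=self.ctx)

    def __len__(self):
        return self.array.shape[0]

    def __getitem__(self, idx):
        # Each dataset instance has to count its own queries.
        self.num_queries += 1
        if self.num_queries % self.refresh_every == 0:
            self._reopen()
        return self.array[idx]["test_value"]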
This occurs even when sm.tile_cache_size is set to 0.
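For reference, one way to confirm the setting is actually applied is to read the configuration back from the context:

ctx = tiledb.Ctx({"sm.tile_cache_size": 0})
# Config values are stored as strings; this should print '0'.
print(ctx.config()["sm.tile_cache_size"])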
This may be related to #150 or #440.
Please see a reproducible example below.
Thanks everyone for your help!
Python version:
Python 3.10.12
Python environment:
numpy==2.2.5
packaging==25.0
psutil==7.0.0
tiledb==0.34.0
Create a test array:
import numpy as np
import tiledb
import os
import psutil
import datetime

x = np.ones(10000000)
ctx = tiledb.default_ctx({"sm.tile_cache_size": 0, "sm.io_concurrency_level": 1, "sm.compute_concurrency_level": 1})
path = 'test_tile_db'
d1 = tiledb.Dim(
    'test_domain', domain=(0, x.shape[0] - 1), tile=10000, dtype="uint32"
)
domain = tiledb.Domain(d1)
v = tiledb.Attr(
    'test_value',
    dtype="float32",
)
schema = tiledb.ArraySchema(
    domain=domain, attrs=(v,), cell_order="row-major", tile_order="row-major"
)
tiledb.DenseArray.create(path, schema)
values = x.astype(np.float32)
with tiledb.DenseArray(path, mode="w", ctx=ctx) as A:
    A[:] = {'test_value': values}
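Not part of the repro, but a quick sanity check that the write succeeded:

with tiledb.open(path, mode='r', ctx=ctx) as A:
    # Reads return an OrderedDict keyed by attribute name.
    print(A[0:5]['test_value'])  # expected: [1. 1. 1. 1. 1.]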
Run a large number of queries and track memory usage:
ctx = tiledb.Ctx({"sm.tile_cache_size": 0, "sm.io_concurrency_level": 1, "sm.compute_concurrency_level": 1})
data = tiledb.open(path, mode='r', ctx=ctx)
for i in range(100000):
    array = data[0]
    if i % 10000 == 0:
        process = psutil.Process(os.getpid())
        ram_usage = process.memory_info().rss / 1e6
        print(datetime.datetime.now(), ram_usage, 'MB', 'after', i, 'queries')
2025-05-15 10:37:23.794463 157.769728 MB after 0 queries
2025-05-15 10:37:38.979566 283.578368 MB after 10000 queries
2025-05-15 10:37:52.696840 413.016064 MB after 20000 queries
2025-05-15 10:38:06.442493 542.4128 MB after 30000 queries
2025-05-15 10:38:19.806429 671.66208 MB after 40000 queries
2025-05-15 10:38:33.460704 801.017856 MB after 50000 queries
2025-05-15 10:38:47.761755 930.164736 MB after 60000 queries
2025-05-15 10:39:02.624914 1059.328 MB after 70000 queries
2025-05-15 10:39:17.331571 1189.036032 MB after 80000 queries
2025-05-15 10:39:33.014144 1318.25664 MB after 90000 queries
Controlled memory usage when periodically re-instantiating the context object:
ctx = tiledb.Ctx({"sm.tile_cache_size": 0, "sm.io_concurrency_level": 1, "sm.compute_concurrency_level": 1})
data = tiledb.open(path, mode='r', ctx=ctx)
for i in range(100000):
    array = data[0]
    if i % 10000 == 0:
        ctx = tiledb.Ctx({"sm.tile_cache_size": 0, "sm.io_concurrency_level": 1, "sm.compute_concurrency_level": 1})
        data = tiledb.open(path, mode='r', ctx=ctx)
        process = psutil.Process(os.getpid())
        ram_usage = process.memory_info().rss / 1e6
        print(datetime.datetime.now(), ram_usage, 'MB', 'after', i, 'queries')
2025-05-15 10:41:44.234509 161.562624 MB after 0 queries
2025-05-15 10:41:57.731925 290.267136 MB after 10000 queries
2025-05-15 10:42:10.923391 290.267136 MB after 20000 queries
2025-05-15 10:42:24.230450 290.312192 MB after 30000 queries
2025-05-15 10:42:37.653962 290.033664 MB after 40000 queries
2025-05-15 10:42:51.223860 284.045312 MB after 50000 queries
2025-05-15 10:43:04.061728 285.372416 MB after 60000 queries
2025-05-15 10:43:17.623785 284.352512 MB after 70000 queries
2025-05-15 10:43:31.396821 283.860992 MB after 80000 queries
2025-05-15 10:43:45.231651 284.598272 MB after 90000 queries
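The workaround above relies on the old context being garbage collected once the names ctx and data are rebound. For completeness, a more explicit variant of the same idea (a diagnostic sketch, not production code) makes the release deterministic:

import gc

data.close()   # close the array bound to the old context
del data, ctx  # drop the last references to the context
gc.collect()   # force collection so the context is freed immediately

ctx = tiledb.Ctx({"sm.tile_cache_size": 0, "sm.io_concurrency_level": 1, "sm.compute_concurrency_level": 1})
data = tiledb.open(path, mode='r', ctx=ctx)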