Commit 6ac39f7

FIX: Fix to_ndarray implementation
1 parent bee69d3 commit 6ac39f7

2 files changed: +14 -11 lines changed

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -5,6 +5,9 @@
 #### Breaking changes
 - Renamed the `TimeSeriesHttpAPI` class to `TimeseriesHttpAPI`
 
+#### Bug Fixes
+- Fixed an issue where `DBNStore.to_csv()`, `DBNStore.to_df()`, `DBNStore.to_json()`, and `DBNStore.to_ndarray()` would consume large amounts of memory.
+
 ## 0.17.0 - 2023-08-10
 
 This release includes improvements to the ergonomics of the clients metadata API, you can read more about the changes [here](https://databento.com/blog/api-improvements-august-2023).

databento/common/dbnstore.py

Lines changed: 11 additions & 11 deletions
@@ -2,7 +2,6 @@
 
 import abc
 import datetime as dt
-import functools
 import logging
 from collections.abc import Generator
 from io import BytesIO
@@ -1069,15 +1068,16 @@ def to_ndarray(
                 raise ValueError("a schema must be specified for mixed DBN data")
             schema = self.schema
 
-        schema_records = filter(
-            lambda r: isinstance(r, SCHEMA_STRUCT_MAP[schema]),  # type: ignore
-            self,
-        )
-
-        decoder = functools.partial(np.frombuffer, dtype=SCHEMA_DTYPES_MAP[schema])
-        result = tuple(map(decoder, map(bytes, schema_records)))
+        record_buffer = BytesIO()
+        num_records = 0
+        for record in filter(lambda r: isinstance(r, SCHEMA_STRUCT_MAP[schema]), self):  # type: ignore [arg-type]
+            num_records += 1
+            record_buffer.write(bytes(record))
 
-        if not result:
-            return np.empty(shape=(0, 1), dtype=SCHEMA_DTYPES_MAP[schema])
+        result = np.frombuffer(
+            record_buffer.getvalue(),
+            dtype=SCHEMA_DTYPES_MAP[schema],
+            count=num_records,
+        )
 
-        return np.ravel(result)
+        return result
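
The pattern adopted in this hunk can be tried in isolation. The sketch below is not the library's code: `RECORD_DTYPE` and `records_to_ndarray` are hypothetical names, and the three-field layout is invented for illustration rather than taken from a real DBN schema. It appends fixed-size records to a single `BytesIO` and calls `np.frombuffer` once, instead of building one small array per record and flattening the tuple afterwards.

```python
from io import BytesIO

import numpy as np

# Hypothetical record layout, for illustration only; not a real DBN schema.
RECORD_DTYPE = np.dtype([("ts_event", "<u8"), ("price", "<i8"), ("size", "<u4")])


def records_to_ndarray(records):
    """Pack an iterable of fixed-size byte records into one structured array."""
    buffer = BytesIO()
    count = 0
    for raw in records:
        # Append the raw bytes; no per-record ndarray is created.
        buffer.write(raw)
        count += 1
    # Decode the whole buffer in a single call.
    return np.frombuffer(buffer.getvalue(), dtype=RECORD_DTYPE, count=count)


# Build three encoded records to stand in for decoded DBN records.
raw_records = [
    np.array([(i, 100 + i, 10 * i)], dtype=RECORD_DTYPE).tobytes() for i in range(3)
]
arr = records_to_ndarray(iter(raw_records))
```

Because `np.frombuffer` is given an immutable `bytes` object here, the resulting array is read-only; a caller that needs to modify it would have to copy it first. The result is also one-dimensional, with one element per record.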
