Commit e627523

Author: Benjamin Moody
_rd_compressed_file: read signals in 1048576-sample chunks.

A bug in libsndfile [1] means that if we try to read a chunk of more than 16777216 (2**24) total samples (across all channels) from a FLAC file, and the number of channels is not a power of two, then sf_read_short will return zero. Work around this bug by allocating an array ourselves and reading a small block of samples at a time (a FLAC file has at most 8 channels, at most 7 if not a power of two, and 7 * 1048576 < 16777216).

[1] libsndfile/libsndfile#431

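The bound quoted in the commit message can be checked with a few lines of arithmetic. This is only a sketch: the names `LIMIT`, `MAX_FLAC_CHANNELS`, and `is_power_of_two` are illustrative and do not come from wfdb or libsndfile, and the threshold is taken from the commit message rather than measured.

```python
# Threshold from the commit message: reads of more than 2**24 total samples
# (frames * channels) are reported to fail for non-power-of-two channel counts.
LIMIT = 2**24
CHUNK_SIZE = 1024 * 1024   # frames per read, as chosen in this commit
MAX_FLAC_CHANNELS = 8      # per the commit message


def is_power_of_two(n):
    """Return True for 1, 2, 4, 8, ... (illustrative helper)."""
    return n > 0 and n & (n - 1) == 0


# Channel counts the commit message describes as affected by the bug.
affected = [c for c in range(1, MAX_FLAC_CHANNELS + 1) if not is_power_of_two(c)]

# Largest number of total samples a single chunked read can request.
worst_case = max(affected) * CHUNK_SIZE

print(affected, worst_case, worst_case < LIMIT)
```

With 7 channels as the worst affected case, one chunk requests 7340032 samples in total, which stays below the 16777216 threshold.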
1 parent 78a1a86 commit e627523

1 file changed: +17 -1 lines changed

wfdb/io/_signal.py

Lines changed: 17 additions & 1 deletion
@@ -1875,7 +1875,23 @@ def _rd_compressed_file(
             start_samp = start_frame * samps_per_frame[0]
             end_samp = end_frame * samps_per_frame[0]
             sf.seek(start_samp + sample_offset)
-            sig_data = sf.read(end_samp - start_samp, dtype=read_dtype)
+
+            # We could do this:
+            # sig_data = sf.read(end_samp - start_samp, dtype=read_dtype)
+            # However, sf.read fails for huge blocks (over 2**24 total
+            # samples) due to a bug in libsndfile:
+            # https://github.com/libsndfile/libsndfile/issues/431
+            # So read the data in chunks instead.
+            n_samp = end_samp - start_samp
+            sig_data = np.empty((n_samp, n_sig), dtype=read_dtype)
+            CHUNK_SIZE = 1024 * 1024
+            for chunk_start in range(0, n_samp, CHUNK_SIZE):
+                chunk_end = chunk_start + CHUNK_SIZE
+                chunk_data = sf.read(out=sig_data[chunk_start:chunk_end])
+                samples_read = chunk_data.shape[0]
+                if samples_read != CHUNK_SIZE:
+                    sig_data = sig_data[: chunk_start + samples_read]
+                    break
 
             # If we read an 8-bit stream as int16 or a 24-bit stream as
             # int32, soundfile shifts each sample left by 8 bits. We
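
The pattern in this hunk (preallocate the output, fill it one slice at a time, trim on a short read) can be tried without soundfile or a FLAC file. The sketch below is an analogy, not the wfdb code: it uses the standard library's `io.BytesIO.readinto` in place of `sf.read(out=...)`, counts bytes rather than frames, and `read_in_chunks` is a hypothetical helper name.

```python
import io

CHUNK_SIZE = 4  # tiny on purpose; the commit uses 1024 * 1024 frames


def read_in_chunks(stream, n_bytes, chunk_size=CHUNK_SIZE):
    """Fill a preallocated buffer chunk by chunk; trim it on a short read."""
    buf = bytearray(n_bytes)
    view = memoryview(buf)
    for chunk_start in range(0, n_bytes, chunk_size):
        # Clamp the last chunk so the slice never extends past the buffer.
        chunk_end = min(chunk_start + chunk_size, n_bytes)
        # readinto() writes directly into the slice and returns the byte count.
        n_read = stream.readinto(view[chunk_start:chunk_end])
        if n_read != chunk_end - chunk_start:
            # Short read: the stream ended early, so keep only what was filled.
            return bytes(view[: chunk_start + n_read])
    return bytes(buf)


# The stream holds everything that was requested.
full = read_in_chunks(io.BytesIO(bytes(range(10))), 10)

# The stream ends after 6 bytes although 10 were requested.
short = read_in_chunks(io.BytesIO(bytes(range(6))), 10)
```

One difference from the hunk: the sketch compares the count against the clamped slice length, whereas the commit compares against `CHUNK_SIZE`, so in the commit a final partial chunk also takes the trim branch. That is harmless there, because the trimmed slice then covers the whole array.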
