Commit 026c745

Add chunk_size to remote.rst
1 parent f91089a commit 026c745

1 file changed: +8 -1 lines changed

docs/remote.rst

Lines changed: 8 additions & 1 deletion
@@ -56,7 +56,7 @@ spatial patch from a large whole slide image from the IDC.
 )

 # Read directly from the blob object using lazy frame retrieval
-with blob.open(mode="rb") as reader:
+with blob.open(mode="rb", chunk_size=500_000) as reader:
     im = hd.imread(reader, lazy_frame_retrieval=True)

 # Grab an arbitrary region of tile full pixel matrix
@@ -80,6 +80,13 @@ spatial patch from a large whole slide image from the IDC.
 Figure produced by the above code snippet showing an arbitrary spatial
 region of a slide loaded directly from a Google Cloud bucket

+It is important to set the `chunk_size` parameter carefully. This value is the
+number of bytes that are retrieved in a single request (set to around 500kB in
+the above example). Ideally this should be just large enough to retrieve a
+single frame of the image in one request, but any larger leads to unnecessary
+data being retrieved. The default value is 40MiB, which is orders of magnitude
+larger than the size of most image frames and therefore will be very inefficient.
+
 As a further example, we use lazy frame retrieval to load only a specific set
 of segments from a large multi-organ segmentation of a CT image in the IDC
 stored in binary format (meaning each segment is stored using a separate set of
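
The sizing advice in the added paragraph can be sketched as a small calculation. This is only an illustration under stated assumptions: the helper names `uncompressed_frame_bytes` and `suggest_chunk_size` are hypothetical and not part of highdicom or the Google Cloud client, the 25% headroom is an arbitrary choice, and the 512 x 512 RGB frame is an assumed example. For compressed images the stored frame size is smaller and varies per frame, so the uncompressed size is best treated as an upper bound.

```python
# Hypothetical helpers for picking a chunk_size close to one frame.
# None of these names come from highdicom or google-cloud-storage.

DEFAULT_CHUNK_SIZE = 40 * 1024 * 1024  # 40 MiB default quoted in the docs text


def uncompressed_frame_bytes(rows, columns, samples_per_pixel=3, bits_allocated=8):
    """Return the size in bytes of a single uncompressed frame."""
    bytes_per_sample = (bits_allocated + 7) // 8  # round bits up to whole bytes
    return rows * columns * samples_per_pixel * bytes_per_sample


def suggest_chunk_size(frame_bytes, headroom_percent=25):
    """Return a chunk size slightly larger than one frame.

    The headroom (an assumed 25% by default) makes it more likely that a
    frame arrives in one request, without fetching much more than needed.
    """
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    if headroom_percent < 0:
        raise ValueError("headroom_percent must be non-negative")
    scaled = frame_bytes * (100 + headroom_percent)
    return -(-scaled // 100)  # integer ceiling division


# Assumed example: a 512 x 512 RGB tile with 8 bits per sample.
frame_bytes = uncompressed_frame_bytes(rows=512, columns=512)
chunk_size = suggest_chunk_size(frame_bytes)

# How many such chunks fit in a single default-sized request.
default_to_suggested = DEFAULT_CHUNK_SIZE // chunk_size

print(f"frame: {frame_bytes} B, suggested chunk_size: {chunk_size} B")
print(f"default chunk is about {default_to_suggested}x larger")
```

The resulting value would then be passed as `chunk_size` to `blob.open`, as in the changed line of the diff. If the frames are stored compressed, a measured or estimated encoded frame size could be passed to `suggest_chunk_size` instead of the uncompressed figure.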
