@@ -51,7 +51,7 @@ The dual-store architecture allows you to use different store types for source a
 such as a remote store for source data and a local store for persistent caching.
 
 Performance Benefits
--------------------
+--------------------
 
 The CacheStore provides significant performance improvements for repeated data access:
 
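The speedup this section describes comes from serving repeated reads out of a fast local store instead of the slower source. A minimal sketch of that read-through pattern, with plain dicts standing in for the two stores (all names here are illustrative, not part of the zarr API):

```python
class ReadThroughCache:
    """Illustrative read-through cache over two mapping-style stores."""

    def __init__(self, source, cache):
        self.source = source  # stands in for the remote/source store
        self.cache = cache    # stands in for the local cache store
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:     # cache hit: no source access needed
            self.hits += 1
            return self.cache[key]
        self.misses += 1          # cache miss: read source, then populate cache
        value = self.source[key]
        self.cache[key] = value
        return value


source = {"chunk/0.0": b"data"}
store = ReadThroughCache(source, cache={})
first = store.get("chunk/0.0")   # miss: fetched from the source store
second = store.get("chunk/0.0")  # hit: served from the cache store
```

Repeated reads of the same key touch only the cache, which is why cache effectiveness grows with repeated access to the same chunks.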
@@ -70,7 +70,8 @@ The CacheStore provides significant performance improvements for repeated data a
 ...     _ = zarr_array_nocache[:]
 >>> elapsed_nocache = time.time() - start
 >>>
->>> print(f"Speedup: {elapsed_nocache/elapsed_cache:.2f}x")
+>>> # Cache provides speedup for repeated access
+>>> speedup = elapsed_nocache / elapsed_cache  # doctest: +SKIP
 
 Cache effectiveness is particularly pronounced with repeated access to the same data chunks.
@@ -103,7 +104,7 @@ to the same chunk will be served from the local cache, providing dramatic speedu
 The cache persists between sessions when using a LocalStore for the cache backend.
 
 Cache Configuration
------------------
+-------------------
 
 The CacheStore can be configured with several parameters:
 
@@ -156,7 +157,7 @@ The CacheStore can be configured with several parameters:
 ... )
 
 Cache Statistics
---------------
+----------------
 
 The CacheStore provides statistics to monitor cache performance and state:
 
@@ -166,28 +167,38 @@ The CacheStore provides statistics to monitor cache performance and state:
 >>>
 >>> # Get comprehensive cache information
 >>> info = cached_store.cache_info()
->>> print(f"Cache store type: {info['cache_store_type']}")
->>> print(f"Max age: {info['max_age_seconds']} seconds")
->>> print(f"Max size: {info['max_size']} bytes")
->>> print(f"Current size: {info['current_size']} bytes")
->>> print(f"Tracked keys: {info['tracked_keys']}")
->>> print(f"Cached keys: {info['cached_keys']}")
->>> print(f"Cache set data: {info['cache_set_data']}")
+>>> info['cache_store_type']  # doctest: +SKIP
+'MemoryStore'
+>>> isinstance(info['max_age_seconds'], (int, str))
+True
+>>> isinstance(info['max_size'], (int, type(None)))
+True
+>>> info['current_size'] >= 0
+True
+>>> info['tracked_keys'] >= 0
+True
+>>> info['cached_keys'] >= 0
+True
+>>> isinstance(info['cache_set_data'], bool)
+True
 
 The `cache_info()` method returns a dictionary with detailed information about the cache state.
 
 Cache Management
----------------
+----------------
 
 The CacheStore provides methods for manual cache management:
 
 >>> # Clear all cached data and tracking information
->>> await cached_store.clear_cache()
+>>> import asyncio
+>>> asyncio.run(cached_store.clear_cache())  # doctest: +SKIP
 >>>
->>> # Check cache info after clearing
->>> info = cached_store.cache_info()
->>> print(f"Tracked keys after clear: {info['tracked_keys']}")  # Should be 0
->>> print(f"Current size after clear: {info['current_size']}")  # Should be 0
+>>> # Check cache info after clearing
+>>> info = cached_store.cache_info()  # doctest: +SKIP
+>>> info['tracked_keys'] == 0  # doctest: +SKIP
+True
+>>> info['current_size'] == 0  # doctest: +SKIP
+True
 
 The `clear_cache()` method is an async method that clears both the cache store
 (if it supports the `clear` method) and all internal tracking data.
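Because `clear_cache()` is a coroutine, it must be awaited inside an event loop, or driven by `asyncio.run()` from synchronous code. The shape of that call, sketched with a hypothetical stand-in class (only `asyncio` itself is real here; the cache class and its methods are illustrative):

```python
import asyncio


class TinyCache:
    """Illustrative stand-in mirroring an awaitable clear_cache()."""

    def __init__(self):
        self._data = {"chunk/0.0": b"value"}
        self._tracked = {"chunk/0.0"}

    async def clear_cache(self):
        # Clears both the cached values and the internal tracking state.
        self._data.clear()
        self._tracked.clear()

    def cache_info(self):
        return {
            "tracked_keys": len(self._tracked),
            "current_size": sum(len(v) for v in self._data.values()),
        }


cache = TinyCache()
asyncio.run(cache.clear_cache())  # from sync code; inside a loop, use `await`
info = cache.cache_info()
```

After the clear, both counters drop to zero, which is what the post-clear checks in the doctest verify.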
@@ -249,7 +260,7 @@ The dual-store architecture provides flexibility in choosing the best combinatio
 of source and cache stores for your specific use case.
 
 Examples from Real Usage
-----------------------
+------------------------
 
 Here's a complete example demonstrating cache effectiveness:
 
@@ -270,24 +281,20 @@ Here's a complete example demonstrating cache effectiveness:
 >>> zarr_array[:] = np.random.random((100, 100))
 >>>
 >>> # Demonstrate cache effectiveness with repeated access
->>> print("First access (cache miss):")
 >>> start = time.time()
->>> data = zarr_array[20:30, 20:30]
+>>> data = zarr_array[20:30, 20:30]  # First access (cache miss)
 >>> first_access = time.time() - start
 >>>
->>> print("Second access (cache hit):")
 >>> start = time.time()
->>> data = zarr_array[20:30, 20:30]  # Same data should be cached
+>>> data = zarr_array[20:30, 20:30]  # Second access (cache hit)
 >>> second_access = time.time() - start
 >>>
->>> print(f"First access time: {first_access:.4f}s")
->>> print(f"Second access time: {second_access:.4f}s")
->>> print(f"Cache speedup: {first_access/second_access:.2f}x")
->>>
 >>> # Check cache statistics
 >>> info = cached_store.cache_info()
->>> print(f"Cached keys: {info['cached_keys']}")
->>> print(f"Current cache size: {info['current_size']} bytes")
+>>> info['cached_keys'] > 0  # Should have cached keys
+True
+>>> info['current_size'] > 0  # Should have cached data
+True
 
 This example shows how the CacheStore can significantly reduce access times for repeated
 data reads, particularly important when working with remote data sources. The dual-store
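The revised doctests above assert on stable properties (key counts, sizes) rather than printing wall-clock timings, which vary run to run and would make doctest output flaky. The same measure-then-assert-only-what-is-stable idea in plain Python, with a hypothetical source store whose artificial latency stands in for a remote round-trip (none of these names come from the zarr API):

```python
import time


class SlowSource(dict):
    """Illustrative source store with artificial per-read latency."""

    def __getitem__(self, key):
        time.sleep(0.01)  # simulate a remote round-trip
        return super().__getitem__(key)


def timed_read(store, key):
    """Return (value, elapsed_seconds) for a single read."""
    start = time.perf_counter()
    value = store[key]
    return value, time.perf_counter() - start


source = SlowSource({"chunk/0.0": b"data"})
cache = {}

# First access: cache miss, pays the simulated source latency.
value, first_access = timed_read(source, "chunk/0.0")
cache["chunk/0.0"] = value

# Second access: cache hit, served from the local dict.
value, second_access = timed_read(cache, "chunk/0.0")

# Assert only the stable fact, never an exact timing or speedup figure.
assert second_access < first_access
```

The simulated latency guarantees the ordering the assertion checks, while the exact speedup ratio is left unasserted for the same reason the doctests skip it.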