When working with chained urlpaths I noticed that the SimpleCacheFileSystem._cache dict kept growing.
The root cause is that the "fo" item in storage_options causes fsspec.utils.tokenize to return tokens that depend on the path being accessed:
# cat fsspec-simplecache-tokenize.py
import fsspec
from fsspec.utils import tokenize
fs0 = fsspec.open('simplecache::memory:///foo').fs
fs1 = fsspec.open('simplecache::memory:///bar').fs
print("fs0.storage_options", fs0.storage_options)
print("fs1.storage_options", fs1.storage_options)
print("id(fs0)", id(fs0))
print("id(fs1)", id(fs1))
print("fs0._fs_token_", fs0._fs_token_)
print("fs1._fs_token_", fs1._fs_token_)
$ python fsspec-simplecache-tokenize.py
fs0.storage_options {'target_options': {}, 'target_protocol': 'memory', 'fo': '/foo'}
fs1.storage_options {'target_options': {}, 'target_protocol': 'memory', 'fo': '/bar'}
id(fs0) 124024007322416
id(fs1) 124024006983344
fs0._fs_token_ e838936f5e20160de437f995501e2d03
fs1._fs_token_ 995b48f48e0829c5b85cc01f83d2099a
Should the "fo" item be popped from the kwargs provided to fsspec.utils.tokenize (and from all nested target_options)?
I wonder if this should be applied in general or only for filesystems that have a passthrough behavior like simplecache.
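A minimal sketch of the proposal, for illustration only. The helper strip_fo is hypothetical and not part of fsspec, and the direct tokenize call here only stands in for how the filesystem token is derived internally, which may involve more inputs. It shows that removing "fo" (recursively through target_options) before tokenizing makes the two sets of storage options from the output above hash to the same value:

```python
import hashlib

try:
    from fsspec.utils import tokenize
except ImportError:
    # Simplified stand-in, used only when fsspec is not installed:
    # hash the string form of the arguments.
    def tokenize(*args, **kwargs):
        if kwargs:
            args += (kwargs,)
        return hashlib.md5(str(args).encode()).hexdigest()


def strip_fo(options):
    """Hypothetical helper: copy `options` without "fo", recursing into target_options."""
    cleaned = {}
    for key, value in options.items():
        if key == "fo":
            continue  # drop the path-dependent entry
        if key == "target_options" and isinstance(value, dict):
            value = strip_fo(value)  # handle nested chained filesystems
        cleaned[key] = value
    return cleaned


# Storage options as printed in the output above.
opts_foo = {"target_options": {}, "target_protocol": "memory", "fo": "/foo"}
opts_bar = {"target_options": {}, "target_protocol": "memory", "fo": "/bar"}

# With "fo" present the tokens differ; with it stripped they match.
raw_tokens_differ = tokenize(**opts_foo) != tokenize(**opts_bar)
clean_tokens_match = tokenize(**strip_fo(opts_foo)) == tokenize(**strip_fo(opts_bar))
print(raw_tokens_differ, clean_tokens_match)
```

Whether stripping "fo" is safe for filesystems where it identifies a distinct underlying resource, rather than just the path being opened, is exactly the open question above, so the sketch should not be read as a general fix.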
Thanks,
Andreas