Skip to content

Conversation

milahu
Copy link

@milahu milahu commented Aug 7, 2025

fix #1904

also fix the memory filesystem:

AttributeError: 'MemoryFileSystem' object has no attribute '_intrans'

all tests are passing:

nix-shell -p python3.pkgs.{aiohttp,pytest,requests,numpy,pytest-asyncio} --run 'pytest -k localmemory -k memory --maxfail=1'

@milahu milahu force-pushed the localmemory-init branch from 024a418 to 169cb50 Compare August 7, 2025 18:58
@milahu milahu force-pushed the localmemory-init branch from 169cb50 to 5953c86 Compare August 7, 2025 19:00
Copy link
Member

@martindurant martindurant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't expecting a new implementation - I thought you were thinking to change the existing one. This, of course, never has a chance to affect the existing tests and fixtures.

I wouldn't even bother putting this in a new module, since it's so similar to rmemoryFS.

point to different memory filesystems.
"""

pseudo_dirs = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why set the class variable even if none of the instances see it?

cachable = False # same as: skip_instance_cache = True

def __init__(self, *args, **kwargs):
self.logger = logger # global
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why bother having this on this instance? We only even have one logger anyway.

Comment on lines 30 to 32
def __init__(self, *args, **kwargs):
self.logger = logger
super().__init__(*args, **kwargs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def __init__(self, *args, **kwargs):
self.logger = logger
super().__init__(*args, **kwargs)

@milahu
Copy link
Author

milahu commented Aug 8, 2025

I wasn't expecting a new implementation

yes... that was a quickfix

i tried to implement this via the skip_instance_cache keyword
but kwargs are not passed to init...

# fsspec/implementations/memory.py

class MemoryFileSystem(AbstractFileSystem):

    store: ClassVar[dict[str, Any]] = {}  # global, do not overwrite!
    pseudo_dirs = [""]  # global, do not overwrite!
    protocol = "memory"
    root_marker = "/"
    _intrans = False

    def __init__(self, *args, **kwargs):
        self.logger = logger
        print("init kwargs", kwargs) # FIXME this always prints {}
        skip_instance_cache = kwargs.get("skip_instance_cache", False)
        if skip_instance_cache:
            self.store = {}
            self.pseudo_dirs = [""]
        super().__init__(*args, **kwargs)
# fsspec/implementations/tests/test_memory.py

import os
from pathlib import PurePosixPath, PureWindowsPath

import pytest

from fsspec.implementations.local import LocalFileSystem, make_path_posix
from fsspec.implementations.memory import MemoryFileSystem


def test_identical_instances():
    fs1 = MemoryFileSystem()
    fs2 = MemoryFileSystem()

    assert id(fs1) == id(fs2)
    assert id(fs1.store) == id(fs2.store)

    fs1.touch("/fs1.txt")
    fs2.touch("/fs2.txt")
    assert fs1.ls("/", detail=False) == ["/fs1.txt", "/fs2.txt"]
    assert fs2.ls("/", detail=False) == ["/fs1.txt", "/fs2.txt"]


def test_separate_instances():
    fs1 = MemoryFileSystem(skip_instance_cache=True)
    fs2 = MemoryFileSystem(skip_instance_cache=True)

    assert id(fs1) != id(fs2)
    assert id(fs1.store) != id(fs2.store) # FIXME this fails

    fs1.touch("/fs1.txt")
    fs2.touch("/fs2.txt")
    assert fs1.ls("/", detail=False) == ["/fs1.txt"]
    assert fs2.ls("/", detail=False) == ["/fs2.txt"]

@martindurant
Copy link
Member

i tried to implement this via the skip_instance_cache keyword
but kwargs are not passed to init

Correct, this happens in __new__, so you would either have to have a new store every time or share the one - not a mix of both.
Are you happy with this PR the way things are?

@milahu
Copy link
Author

milahu commented Aug 11, 2025

success. merged with MemoryFileSystem
i only had to change fsspec/spec.py

-        skip = kwargs.pop("skip_instance_cache", False)
+        skip = kwargs.get("skip_instance_cache", False)

to pass skip_instance_cache to init

i wanted to avoid adding a new parameter like local_memory=True
because this would have required setting two parameters:
local_memory=True, skip_instance_cache=True

@martindurant
Copy link
Member

Note the failure in s3fs:

FAILED s3fs/tests/test_s3fs.py::test_async_stream - TypeError: AioSession.__init__() got an unexpected keyword argument 'skip_instance_cache'

This is expected - now you pass on an extra argument that implementations don't know about. Of course, this kwarg isn't much used in the tests, but the same could happen for other implementations that don't explicitly discard unneeded kwargs.

@martindurant
Copy link
Member

How about either

  • a flag in init which, if True, assigns the instance-local variables or otherwise (the default) does not, and so they will still point to the global ones
  • make the new filesystem as you have, but have it a subclass of MemFS in the same module, so you end up adding far fewer LoC.

@milahu
Copy link
Author

milahu commented Sep 2, 2025

sorry, abandoned

(i would prefer "a flag in init" to not pollute the classes namespace)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

create multiple separate memory filesystems

2 participants