ReturnnConfig ignores file caching #310

@albertz

When you put a tk.Path(..., cached=True) object somewhere into your ReturnnConfig, it will not use file caching.

Is this expected behavior? At least I did not expect this.

The reason for that:

config = instanciate_delayed(config)

This code recursively goes through config, and does:

if isinstance(o, DelayedBase):
    o = o.get()

Now, tk.Path derives from DelayedBase, and has this get:

def get(self):
    return self.get_path()

And get_path returns the uncached file path.

So, in the written RETURNN config, you will not have any hashed file paths.
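To make the difference concrete, here is a minimal sketch with simplified stand-ins. These are not the real Sisyphus / i6_core classes: the `Path` constructor, the dict/list recursion in `instanciate_delayed`, and the file name `/data/train.hdf` are assumptions for illustration only. It just mirrors the behavior described above, where `get()` goes through `get_path()` and `__str__` goes through `get_cached_path()`.

```python
# Simplified stand-ins for illustration, NOT the real Sisyphus / i6_core code.


class DelayedBase:
    def get(self):
        raise NotImplementedError


def file_caching(path):
    # Same shape as the common file_caching function quoted in this issue.
    return f"`cf {path}`"


class Path(DelayedBase):
    def __init__(self, path, cached=False):
        # Hypothetical constructor; the real tk.Path takes more arguments.
        self.path = path
        self.cached = cached

    def get_path(self):
        return self.path  # always the uncached path

    def get_cached_path(self):
        return file_caching(self.path) if self.cached else self.path

    def get(self):
        return self.get_path()

    def __str__(self):
        return self.get_cached_path()


def instanciate_delayed(o):
    # Assumed recursion over dicts, lists and tuples; resolves delayed objects via get().
    if isinstance(o, DelayedBase):
        o = o.get()
    elif isinstance(o, dict):
        o = {k: instanciate_delayed(v) for k, v in o.items()}
    elif isinstance(o, (list, tuple)):
        o = type(o)(instanciate_delayed(v) for v in o)
    return o


config = {"train": {"files": [Path("/data/train.hdf", cached=True)]}}
resolved = instanciate_delayed(config)

via_get = resolved["train"]["files"][0]      # what ends up in the RETURNN config
via_str = str(config["train"]["files"][0])   # what str(path) would give
print(via_get)
print(via_str)
```

In this sketch, `cached=True` only has an effect on the `str(path)` route; the `get()` route used when resolving the config never looks at it.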

(Btw, in the RasrConfig, I think it calls str(path) instead, and Path.__str__ returns self.get_cached_path().)

Another note: I just realized that writing the RETURNN config is a separate Task anyway, which likely runs on a different node, so file caching could not be done directly there. However, the common file_caching function is implemented like:

def file_caching(path):
  return f"`cf {path}`"

So it does not return the file path of a cached file anyway, but just this specially formatted string, which only RASR can really handle properly. So using Path.__str__ here would likely break other tools.
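A small sketch of why that string is a problem for anything other than RASR: to a generic tool it is just a string containing backticks, not a path on disk. The temporary file here is only a placeholder for some real data file.

```python
import os
import tempfile


def file_caching(path):
    # Same shape as the common file_caching function quoted above.
    return f"`cf {path}`"


# Create a real file so that the plain path definitely exists.
with tempfile.NamedTemporaryFile(suffix=".hdf", delete=False) as f:
    real_path = f.name

wrapped = file_caching(real_path)

plain_exists = os.path.exists(real_path)   # the plain path is a real file
wrapped_exists = os.path.exists(wrapped)   # the wrapped string is not a path

print(real_path, plain_exists)
print(wrapped, wrapped_exists)

os.remove(real_path)
```

Any tool that passes the value straight to `open()` or a similar call would fail on the wrapped string, unless it knows how to interpret the `cf` form itself.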

I don't really have a good solution or suggestion currently.
