Open
Description
I got this error for the first time, and only once; when I ran the same command again, it did not recur:
$ python ./sis m recipe/i6_experiments/...
[2026-02-11 20:40:47,000] WARNING: get_executable_path: use of gs is deprecated, please provide a Path object for gs.RETURNN_PYTHON_EXE
[2026-02-11 20:40:47,000] WARNING: get_executable_path: use of gs is deprecated, please provide a Path object for gs.RETURNN_ROOT
[2026-02-11 20:40:57,733] INFO: Loaded config: recipe/i6_experiments/users/zeyer/experiments/exp2025_11_11_ctc_lm_search.py (loaded module: i6_experiments.users.zeyer.experiments.exp2025_11_11_ctc_lm_search)
[2026-02-11 20:40:57,733] INFO: Config loaded (time needed: 11.73)
[2026-02-11 20:41:01,470] CRITICAL: Trying to create a task with an invalid function name
[2026-02-11 20:41:01,470] CRITICAL: Job name: Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>
[2026-02-11 20:41:01,470] CRITICAL: Function name: run
[2026-02-11 20:41:01,470] ERROR: Exception in thread <DummyProcess(Thread-3 (worker), started daemon 23359902389824)>:
EXCEPTION
Traceback (most recent call last):
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/tools.py", line 304, in default_handle_exception_interrupt_main_thread.<locals>.wrapped_func
line: return func(*args, **kwargs)
locals:
func = <local> <function sisyphus.graph.SISGraph.for_all_nodes.<locals>.runner_helper>
args = <local> (Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>,)
kwargs = <local> {}
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/graph.py", line 596, in SISGraph.for_all_nodes.<locals>.runner_helper
line: res = f(job)
locals:
f = <local> <function sisyphus.graph.SISGraph.get_jobs_by_status.<locals>.get_unfinished_jobs>
job = <local> Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/graph.py", line 472, in SISGraph.get_jobs_by_status.<locals>.get_unfinished_jobs
line: for task in job._sis_tasks():
locals:
job = <local> Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/job.py", line 838, in Job._sis_tasks
line: task.set_job(self)
locals:
task = <local> <Task 'run' job=Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>>
self = <local> Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/task.py", line 85, in Task.set_job
line: getattr(self._job, name)
locals:
self = <local> <Task 'run' job=Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>>
self._job = <local> Job<work/i6_core/returnn/training/GetBestPtCheckpointJob.xUyBGR1LEpaY>
name = <local> 'run'
AttributeError: 'GetBestPtCheckpointJob' object has no attribute 'run'
[2026-02-11 20:41:01,620] ERROR: Main thread unhandled exception:
EXCEPTION
Traceback (most recent call last):
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/__main__.py", line 234, in main
line: args.func(args)
locals:
args = <local> Namespace(log_level=20, config_files=['recipe/i6_experiments/users/zeyer/experiments/exp2025_11_11_ctc_lm_search.py'], run=False, clear_errors_once=False, clear_interrupts_once=False, ignore_once=False, http_port=None, filesystem=None, interactive=False, ui=False, argv=['recipe/i6_experiments/use...
args.func = <local> <function sisyphus.manager.manager>
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 114, in manager
line: manager = Manager(
sis_graph=sis_graph,
job_engine=job_engine,
link_outputs=args.run,
clear_errors_once=args.clear_errors_once,
clear_interrupts_once=args.clear_interrupts_once,
ignore_once=args.ignore_once,
start_computations=args.run,
interative=args.interactive,
ui=args.ui,
)
locals:
manager = <local> None
sis_graph = <local> <sisyphus.graph.SISGraph object at 0x153fb8d776b0>
job_engine = <local> <sisyphus.engine.EngineSelector object at 0x153ee6f9f260>
args = <local> Namespace(log_level=20, config_files=['recipe/i6_experiments/users/zeyer/experiments/exp2025_11_11_ctc_lm_search.py'], run=False, clear_errors_once=False, clear_interrupts_once=False, ignore_once=False, http_port=None, filesystem=None, interactive=False, ui=False, argv=['recipe/i6_experiments/use...
args.run = <local> False
args.clear_errors_once = <local> False
args.clear_interrupts_once = <local> False
args.ignore_once = <local> False
args.interactive = <local> False
args.ui = <local> False
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 236, in Manager.__init__
line: self.update_jobs()
locals:
self = <local> <Manager(Thread-2, initial)>
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/manager.py", line 253, in Manager.update_jobs
line: self.jobs = jobs = self.sis_graph.get_jobs_by_status(engine=self.job_engine, skip_finished=skip_finished)
locals:
self = <local> <Manager(Thread-2, initial)>
self.jobs = <local> !AttributeError: 'Manager' object has no attribute 'jobs'
self.sis_graph = <local> <sisyphus.graph.SISGraph object at 0x153fb8d776b0>
self.job_engine = <local> <sisyphus.engine.EngineSelector object at 0x153ee6f9f260>
skip_finished = <local> True
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/graph.py", line 502, in SISGraph.get_jobs_by_status
line: self.for_all_nodes(get_unfinished_jobs, nodes=nodes)
locals:
self = <local> <sisyphus.graph.SISGraph object at 0x153fb8d776b0>
get_unfinished_jobs = <local> <function sisyphus.graph.SISGraph.get_jobs_by_status.<locals>.get_unfinished_jobs>
nodes = <local> None
File "/rwthfs/rz/cluster/home/az668407/setups/combined/2021-05-31/tools/sisyphus/sisyphus/graph.py", line 611, in SISGraph.for_all_nodes
line: time.sleep(0.1)
KeyboardInterrupt
[2026-02-11 20:41:01,623] WARNING: Main thread exit. Still running non-daemon threads: {<LocalEngine(Thread-1, started 23359904515648)>}
I have no idea what caused this, but I also did not really investigate. It might be a very rare random hiccup or a very rare race condition. I am just reporting it in case anyone stumbles upon this again.
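For anyone reading the traceback: the check that failed is the `getattr(self._job, name)` lookup in `Task.set_job`, which raised `AttributeError` for the name `run`. Below is a minimal sketch of that kind of check, not the actual Sisyphus code. The classes `Task`, `JobWithRun` and `JobWithoutRun` are hypothetical stand-ins, and the sketch only illustrates what the check reports, not why the lookup failed in this one run.

```python
import logging


class Task:
    """Hypothetical stand-in for a task that refers to a job method by name."""

    def __init__(self, start: str):
        self._start = start
        self._job = None

    def set_job(self, job) -> None:
        self._job = job
        try:
            # Same kind of lookup as `getattr(self._job, name)` in the traceback:
            # it fails if the job object has no attribute with that name.
            getattr(self._job, self._start)
        except AttributeError:
            logging.critical("Trying to create a task with an invalid function name")
            logging.critical("Job name: %s", job)
            logging.critical("Function name: %s", self._start)
            raise


class JobWithRun:
    """Hypothetical job that defines the method the task refers to."""

    def run(self):
        return "ok"


class JobWithoutRun:
    """Hypothetical job that lacks the method, which triggers the error path."""


# The lookup succeeds when the method exists.
Task("run").set_job(JobWithRun())

# The lookup fails when it does not; the exception is re-raised after logging.
try:
    Task("run").set_job(JobWithoutRun())
    message = None
except AttributeError as exc:
    message = str(exc)
```

In the sketch, the second call ends up in the `except` branch and `message` holds the `AttributeError` text, analogous to the `'GetBestPtCheckpointJob' object has no attribute 'run'` line in the log above.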