Fix rest parallel #652

Merged
jkrue merged 9 commits into dev from fix-rest-parallel on May 13, 2025

@XaverStiensmeier
Contributor

Two errors caused BiBiGrid to fail when running in parallel:

  1. Many configuration files (e.g. site.yaml, hosts.yaml, ...) were created locally and then pushed to the remote. This caused issues when more than one BiBiGrid process tried to write and/or upload these files at the same time.
  2. A bug caused a global variable that was intended to be static to be changed. Because of the nature of the REST API, global variables persist across multiple runs, so the change carried over from one run into others.
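
The second error is easy to reproduce in miniature. The sketch below is not BiBiGrid code: `DEFAULT_SETTINGS`, `default_settings`, the `roles` key, and the role names are all hypothetical. It only assumes a long-lived process in which module-level objects survive between requests, and contrasts mutating a shared global with building the same content inside a function and returning a local object.

```python
# Hypothetical module-level default that is meant to stay constant.
DEFAULT_SETTINGS = {"roles": ["base"]}


def build_settings_buggy(extra_role):
    """Bug: appends to the shared global, so every later run sees the change."""
    settings = DEFAULT_SETTINGS  # same object, not a copy
    settings["roles"].append(extra_role)
    return settings


def default_settings():
    """Create the static content inside a function; each caller gets its own object."""
    return {"roles": ["base"]}


def build_settings_fixed(extra_role):
    """Works on a local object only, so runs cannot affect each other."""
    settings = default_settings()
    settings["roles"].append(extra_role)
    return settings


if __name__ == "__main__":
    # Two "runs" inside the same long-lived process.
    print(build_settings_buggy("slurm")["roles"])  # ['base', 'slurm']
    print(build_settings_buggy("nfs")["roles"])    # ['base', 'slurm', 'nfs']

    print(build_settings_fixed("slurm")["roles"])  # ['base', 'slurm']
    print(build_settings_fixed("nfs")["roles"])    # ['base', 'nfs']
```

In a one-shot command-line run the buggy variant goes unnoticed because the process exits after one call; it only surfaces when the same process serves several runs.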

Both errors are fixed with this pull request. This fix discontinues the branch https://github.com/BiBiServ/bibigrid/tree/hotfix-rest-parallel.

  1. Is fixed by preparing the file contents in memory and writing them directly to the remote.
  2. Is fixed by no longer changing the static variable; instead, its static content is loaded in a method and returned as a local variable.
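
As a rough illustration of the first fix, the sketch below writes prepared content straight to a remote path instead of creating a local file and uploading it. It is an assumption-laden sketch, not the code from this pull request: `write_remote_file`, `InMemorySftp`, the paths, and the file content are hypothetical. The function only needs an object with an `open(path, mode)` method that returns a writable context manager; paramiko's `SFTPClient.open` has that shape, while the in-memory stand-in lets the example run without a server.

```python
import io


def write_remote_file(sftp, remote_path, content):
    """Write already-prepared text to remote_path; no local file is created."""
    with sftp.open(remote_path, "w") as remote_file:
        remote_file.write(content.encode("utf-8"))


class _MemoryHandle(io.BytesIO):
    """File-like object that stores its bytes in a dict when it is closed."""

    def __init__(self, store, path):
        super().__init__()
        self._store = store
        self._path = path

    def close(self):
        if not self.closed:
            self._store[self._path] = self.getvalue()
        super().close()


class InMemorySftp:
    """Hypothetical stand-in for an SFTP client so the sketch runs offline."""

    def __init__(self):
        self.files = {}

    def open(self, path, mode="r"):
        return _MemoryHandle(self.files, path)


if __name__ == "__main__":
    # Each run prepares its own content and talks to its own remote,
    # so there is no shared local file for parallel runs to overwrite.
    remote_a, remote_b = InMemorySftp(), InMemorySftp()
    write_remote_file(remote_a, "/example/hosts.yaml", "cluster: a\n")
    write_remote_file(remote_b, "/example/hosts.yaml", "cluster: b\n")
    print(remote_a.files["/example/hosts.yaml"])  # b'cluster: a\n'
    print(remote_b.files["/example/hosts.yaml"])  # b'cluster: b\n'
```

With a real connection, the object returned by paramiko's `SSHClient.open_sftp()` could be passed as `sftp` in place of the stand-in.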

@jkrue merged commit f1a97ae into dev on May 13, 2025
4 checks passed
@XaverStiensmeier added this to the SimpleVM Ready milestone on Jun 23, 2025
@XaverStiensmeier deleted the fix-rest-parallel branch on June 30, 2025 08:22
jkrue added a commit that referenced this pull request Sep 24, 2025
* Fix rest parallel (#652)

* added write to remote

* first attempt at remote writing

* changed instances of write_yaml that happen for creation to direct remote writes.

* added TODO

* changed os_versions in cloud_node_requirements.yaml

* changed version number. Went down to 3 to align with github repository

* fixed global variable not static causing parallel create runs to affect each other

* pleasing linter

* pleasing linter

* Fixing node memory handling (#655)

* Attempt using TRES_CORE_MEMORY

* instead of //32 and capped at 2000 //4 +1000 sounds more reasonable

* fixed equation

* log warning if ram < 4096

* added unit

* line too long

* Address volumes by id not name internally (#656)

* fixed volume name or id

* added new volume key "id" to schema

* added new volume key "id" to rest model

* moved rest models to models/ folder

* fixed tests, improved function naming

* improved readability. Ignored pylint multiple branches for volume creation

* duplicate ignore

* added disable duplicate code

* removed code duplicate

* fixed name not set bug, renamed path of integration_test bibigrid.yaml to bibigrid_test.yaml

* added info in configuration.md

* updated bibigrid.yaml

* Draft a POC for SOCKS support

* improve socks connection retry logic

* Add documentation for using socks5Proxy

* remove support of focal repositories

* add handler for apt cache update

* tpyo, flush handlers

* added newline

* Restructure package to allow building with uv/pip etc (#660)

* Add build system

- use uv for dependency management
- add bibigrid entrypoint
- use click for command line parsing

* Adapt pyproject.toml

* adapt pyproject.toml

* Restructure package

- add pyproject.toml and use uv as build system
- move ansible resources into package and adapt paths

* remove auto-generated files

* remove auto-generated file again

* updated resources paths to bibigrid/resources

* Updated CLI click. Changed to match structure and argument

* dirty fix

* minor updates to usability and documentation

* fixed path for integration_test

* fixed startup to new run_action structure

* pleased pylint

* updated documentation from 'bibigrid -c' to 'bibigrid create'

* rebuild uv.lock

* updated version in pyproject.toml to match version change to align with future bibigrid releases

* changed cli to main to be more explicit

* added a single line to explain how to install BiBiGrid as a package. Can be improved upon in the future.

---------

Co-authored-by: Xaver Stiensmeier <xaverstiensmeier@gmx.de>

* changed dependency installation for linting workflow due to pytoml instead of requirements files

* fixing ansible lint not being installed

* fixed path for linter

* disabled pylint too many arguments for cli

* implemented/adapted paramiko sock5 connection

* ide now supports socks5

* removed extra debug messages

* fixed tests

* minor readme fixes

* added very basic documentation

* added information on how to establish the connection

* improved socks5 documentation

* set mem to * 0.98 - 150

* please linter

* fix default partition warning

* force template overwrite

* early state. Not working because it is unclear how we want to dynamically load file path

* fix(Dockerfile):updated dockerfile to work with new pyproject.toml

* Fix docker restarts even when no change (#673)

* fix docker restarts even when no change

* changed to better notify structure

* fixed name of handler

* Allow setting custom cluster_id when creating resources (#661)

* Allow setting custom cluster_id

* added log message for cluster id in startup

* improved cluster_id checking and allowed passing cluster_id to create from CLI

* improved error message for malformed cluster ids

* improved error message for malformed cluster ids

* updated uv.lock and re-added requirements files in case someone prefers them.

* enabled MAX_ID_LENGTH check again.

* improved log message

---------

Co-authored-by: Xaver Stiensmeier <xaverstiensmeier@gmx.de>

* fix(Locks):waiting for all locks to avoid held locks

* updated packages

* pleased ansible lint by adding bibigrid_ to runtime defined variables

* fixing line too long

* Fixed paramiko version

* pleased linter

---------

Co-authored-by: Manuel Koesters <17874544+MKoesters@users.noreply.github.com>
Co-authored-by: Jan Krueger <jkrueger@cebitec.uni-bielefeld.de>
Co-authored-by: Manuel <mkoes@protonmail.com>
Co-authored-by: dweinholz <david-weinholz@web.de>
jkrue added a commit that referenced this pull request Oct 8, 2025
* Fix rest parallel (#652)

* added write to remote

* first attempt at remote writing

* changed instances of write_yaml that happen for creation to direct remote writes.

* added TODO

* changed os_versions in cloud_node_requirements.yaml

* changed version number. Went down to 3 to align with github repository

* fixed global variable not static causing parallel create runs to affect each other

* pleasing linter

* pleasing linter

* Fixing node memory handling (#655)

* Attempt using TRES_CORE_MEMORY

* instead of //32 and capped at 2000 //4 +1000 sounds more reasonable

* fixed equation

* log warning if ram < 4096

* added unit

* line too long

* Address volumes by id not name internally (#656)

* fixed volume name or id

* added new volume key "id" to schema

* added new volume key "id" to rest model

* moved rest models to models/ folder

* fixed tests, improved function naming

* improved readability. Ignored pylint multiple branches for volume creation

* duplicate ignore

* added disable duplicate code

* removed code duplicate

* fixed name not set bug, renamed path of integration_test bibigrid.yaml to bibigrid_test.yaml

* added info in configuration.md

* updated bibigrid.yaml

* Draft a POC for SOCKS support

* improve socks connection retry logic

* Add documentation for using socks5Proxy

* remove support of focal repositories

* add handler for apt cache update

* tpyo, flush handlers

* added newline

* Restructure package to allow building with uv/pip etc (#660)

* Add build system

- use uv for dependency management
- add bibigrid entrypoint
- use click for command line parsing

* Adapt pyproject.toml

* adapt pyproject.toml

* Restructure package

- add pyproject.toml and use uv as build system
- move ansible resources into package and adapt paths

* remove auto-generated files

* remove auto-generated file again

* updated resources paths to bibigrid/resources

* Updated CLI click. Changed to match structure and argument

* dirty fix

* minor updates to usability and documentation

* fixed path for integration_test

* fixed startup to new run_action structure

* pleased pylint

* updated documentation from 'bibigrid -c' to 'bibigrid create'

* rebuild uv.lock

* updated version in pyproject.toml to match version change to align with future bibigrid releases

* changed cli to main to be more explicit

* added a single line to explain how to install BiBiGrid as a package. Can be improved upon in the future.

---------

Co-authored-by: Xaver Stiensmeier <xaverstiensmeier@gmx.de>

* changed dependency installation for linting workflow due to pytoml instead of requirements files

* fixing ansible lint not being installed

* fixed path for linter

* disabled pylint too many arguments for cli

* implemented/adapted paramiko sock5 connection

* ide now supports socks5

* removed extra debug messages

* fixed tests

* minor readme fixes

* added very basic documentation

* added information on how to establish the connection

* improved socks5 documentation

* set mem to * 0.98 - 150

* please linter

* fix default partition warning

* force template overwrite

* early state. Not working because it is unclear how we want to dynamically load file path

* fix(Dockerfile):updated dockerfile to work with new pyproject.toml

* Fix docker restarts even when no change (#673)

* fix docker restarts even when no change

* changed to better notify structure

* fixed name of handler

* Allow setting custom cluster_id when creating resources (#661)

* Allow setting custom cluster_id

* added log message for cluster id in startup

* improved cluster_id checking and allowed passing cluster_id to create from CLI

* improved error message for malformed cluster ids

* improved error message for malformed cluster ids

* updated uv.lock and re-added requirements files in case someone prefers them.

* enabled MAX_ID_LENGTH check again.

* improved log message

---------

Co-authored-by: Xaver Stiensmeier <xaverstiensmeier@gmx.de>

* fix(Locks):waiting for all locks to avoid held locks

* updated packages

* pleased ansible lint by adding bibigrid_ to runtime defined variables

* fixing line too long

* Fixed paramiko version

* pleased linter

* added zenodo link

---------

Co-authored-by: Manuel Koesters <17874544+MKoesters@users.noreply.github.com>
Co-authored-by: Jan Krueger <jkrueger@cebitec.uni-bielefeld.de>
Co-authored-by: Manuel <mkoes@protonmail.com>
Co-authored-by: dweinholz <david-weinholz@web.de>

Development

Successfully merging this pull request may close these issues.

REST: BiBiGrid Can't Run In Parallel Due To Created And Uploaded Configuration Files