- Added support for submitting jobs to different batch servers on Metacentrum-family clusters. See the manual for more information.
- Added support for specifying a custom interpreter when submitting a job (e.g. Python, Julia). See the manual for more information.
- Breaking Python API change: Renamed methods from camelCase to snake_case.
- Paths are now resolved to absolute paths without following symlinks, ensuring compatibility across machines with different mount points (e.g. Robox and Sokar).
- Improved hostname resolution to allow accessing worker nodes of the Sokar cluster from machines outside the `ncbr.muni.cz` domain.
- Fixed log lines in `qqout` files being truncated.
- Single-node qq jobs should no longer fail on Metacentrum when a batch server is temporarily unreachable during job initialization.
- On LUMI, `qq nodes` now shows the physical number of CPU cores available on each node, not the number of threads.
- qq no longer includes completed array tasks within uncompleted array jobs in the output of `qq jobs` and `qq stat`, unless the `-a`/`--all` option is used.
- Bug fix: Fixed incorrect conversion of default walltime of Slurm partitions.
- Added `continuous` jobs: a lightweight alternative to loop jobs. Continuous jobs automatically submit their continuation but neither track their cycle nor perform archival operations. See the manual for more information.
- Added the `--transfer-mode` and `--archive-mode` options, which allow automatically transferring (and archiving, respectively) files from the working directory for jobs other than those that finished successfully. See the manual for more information.
- As a consequence of the above change, the behavior of `qq go`, `qq sync`, and `qq wipe` has been slightly adjusted.
- Breaking change: In `qq submit`, list options (e.g., `--include`, `--exclude`, `--depend`, `--props`) are now sourced exclusively from either the command line or the submitted script, if specified. Values from both sources are no longer merged. The previous behavior was inconsistent and could cause confusion and bugs, such as duplicated resources in loop jobs.
- Bug fix: Fixed an issue where autocomplete for the script name in `qq submit` did not work if a previous option's value contained `=`.
- Bug fix: Fixed parsing of `qq` directives in submitted scripts containing numeric values.
- Bug fix: Updated installation scripts so that installation works even on nodes that open a login shell.
- Bug fix: Fixed script name autocomplete so it works after options and with `--option=value` syntax.
- Updated installation scripts to use the updated link to the repository.
- Added passive support for array jobs. In the output of `qq jobs` and `qq stat`, individual sub-jobs are displayed for all array jobs.
- Added autocomplete for script name in `qq submit` and `qq shebang`.
- Some rewordings.
- The operation for obtaining the list of working nodes at job start is now retried, potentially decreasing the number of failures on unstable systems (like Metacentrum).
- `qq cd -h` now properly prints help.
- The number of CPU cores, the number of GPUs, the amount of memory, and the amount of storage can now be requested per node using the submission options `ncpus-per-node`, `ngpus-per-node`, `mem-per-node`, and `work-size-per-node`. Per-node properties override per-cpu properties (`mem-per-cpu`, `work-size-per-cpu`) but are overridden by "total" properties (`ncpus`, `ngpus`, `mem`, `work-size`).
- The scripts now by default try to allocate the maximum possible number of MPI ranks.
- Numbers of MPI ranks are now specified per node (in `*_md` scripts) or per client (in `*_re` scripts).
- The available types of working directories for the current environment are now shown in the output of `qq submit -h`.
- Fixed a regression from v0.5: missing size property in `qq nodes` is now correctly interpreted as zero size.
- When a job is killed, runtime files are copied to the input directory only after the executed process finishes.
- Changed the way working directories on Karolina and LUMI are created, allowing their complete removal.
- Collection of Slurm jobs (which is complicated by Slurm's architecture) is now performed in parallel and is consequently much faster.
- `Wiper.delete` method has been renamed to `Wiper.wipe`.
- `Killer.terminate` method has been renamed to `Killer.kill`.
- `SubmitterFactory` no longer requires a list of supported parameters and instead loads it itself.
- Added getter methods to `Submitter`.
- `Submitter` no longer requires the "command line" to be provided. The command line is no longer written into qq info files.
- If no info file is detected when running `qq go`, `qq info`, `qq kill`, `qq sync`, or `qq wipe`, an error message is printed. (This fixes a regression in v0.5.0.)
- qq is now fully compatible with the LUMI supercomputer.
- The `.err` and `.out` runtime files are now copied from the working directory to the input directory even when a job fails or is killed. This makes it easier to inspect what went wrong while keeping the input directory in a consistent state: all other files remain in the working directory.
- Added the `qq wipe` command for safely deleting the working directories of failed or killed jobs.
- `qq info` now displays the status of individual Slurm job steps when multiple steps exist and the information is available from the batch system.
- The Comment column is now hidden when no queues include a comment.
- Added a new Max Nodes column showing the maximum number of nodes that can be requested in each queue. This column is hidden if no queue has a set maximal number of nodes.
- You can now use the `--include` option to specify additional files or directories outside the job's input directory. These will be copied into the working directory upon submission.
- Added support for the `-h` flag as a shorthand for `--help`.
- Added shell autocomplete for qq commands.
- Fixed incorrect naming of loop jobs when the job script had a file extension.
- Made it possible to submit qq jobs from directories other than the current working directory.
- `get_info_files_from_job_id_or_dir` now properly catches `PermissionError` when reading restricted info files.
- Retrieving job lists from Slurm is now significantly faster (still limited by Slurm performance).
- Fixed an issue preventing jobs from using multiple MPI ranks on some PBS clusters.
- Improved the dynamic output of `qq jobs`: unused columns are now hidden.
- Operations on job IDs are now faster.
- Job comments are now shown in the output of `qq jobs -e` and `qq stat -e` (if available).
- `qq sync` now correctly synchronizes contents of selected directories when using the `-f` option.
- Most methods in `BatchJobInterface`, `BatchQueueInterface`, and `BatchNodeInterface` now have optional return values.
- qq can now be used on IT4Innovations clusters with the Slurm batch scheduler.
- A new `qq submit` option, `--account`, has been added to allow submitting jobs on IT4I.
- Introduced a new command, `qq shebang`, which makes it easier to add the required `qq run` shebang line to your scripts.
- Added a flag `-e`/`--extra` for `qq jobs` and `qq stat`, which makes qq print additional information about each job. Currently, the input machine and input directory are printed (if available), but the list may be expanded in the future.
- The environment variables `QQ_NCPUS` (number of allocated CPU cores), `QQ_NGPUS` (number of allocated GPU cores), `QQ_NNODES` (number of allocated nodes), and `QQ_WALLTIME` (walltime in hours) are now exported to the job environment.
- When `scratch_shm` or `input_dir` is requested, both `work-size` and `work-size-per-cpu` properties are now properly removed from the list of resources and are no longer displayed in the output of `qq info`.
- Fixed occasional SSH authentication failures by explicitly enabling GSSAPI authentication.
- Fixed current cycle identification in loop jobs. Only a partial match in archived files is now required to consider them.
- Jobs obtained using `qq jobs` and `qq stat` are now always sorted by job ID.
- The number of queued jobs shown in the output of `qq queues` now always includes both queued and held jobs. The column title was changed to 'QH' to reflect this.
- Refactored the loading of the YAML Dumper and SafeLoader.
- Removed the 'QQ' prefix from all custom class names (excluding errors).
- Added support for manually disabling automatic resubmission in loop jobs by returning the value of the `QQ_NO_RESUBMIT` environment variable from within the job script.
- Fixed a bug that prevented files from being rsynced when the user’s group differed between the computing node and the filesystem containing the input directory.
- Renamed PBSJobInfo to PBSJob.
- Set up GitHub Actions to take care of releases.
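
The precedence rule from the entry on per-node submission options (per-node properties override per-cpu properties but are overridden by "total" properties) can be illustrated with a minimal sketch. This is not qq's implementation: `pick_memory_option`, `MEMORY_PRECEDENCE`, and the value format (`"2gb"`) are hypothetical and only show which of several requested memory options would take effect under the stated rule.

```python
from typing import Optional

# Hypothetical illustration, not part of qq.
# Documented order: total ("mem") > per-node ("mem-per-node") > per-cpu ("mem-per-cpu").
MEMORY_PRECEDENCE = ("mem", "mem-per-node", "mem-per-cpu")


def pick_memory_option(requested: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return the (option, value) pair that takes effect, or None if none was requested."""
    for option in MEMORY_PRECEDENCE:
        if option in requested:
            return option, requested[option]
    return None


if __name__ == "__main__":
    # Per-node wins over per-cpu.
    print(pick_memory_option({"mem-per-cpu": "2gb", "mem-per-node": "16gb"}))
    # The total wins over per-node.
    print(pick_memory_option({"mem": "64gb", "mem-per-node": "16gb"}))
```

The same ordering would apply to the storage options (`work-size`, `work-size-per-node`, `work-size-per-cpu`) by swapping the tuple of option names.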