teuthology/suite/run.py: Improve ScheduleFail exception#1924
teuthology/suite/run.py: Improve ScheduleFail exception#1924
Conversation
9ca70bd to
5fe4921
Compare
5fe4921 to
3f548c6
Compare
zmc
left a comment
There was a problem hiding this comment.
Just a couple minor changes suggested; have you scheduled a run with this branch? I'd be curious to see it if so
3f548c6 to
27db304
Compare
|
integration test failed: |
|
jenkins retest this please |
47ddc26 to
27db304
Compare
fc1bcd3 to
d4df210
Compare
d4df210 to
a2bc796
Compare
a2bc796 to
340e02b
Compare
340e02b to
13d1c6d
Compare
do it for /teuthology/teuthlogy.sh && /start.sh Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
13d1c6d to
a498fef
Compare
There was a problem hiding this comment.
LGTM! To be sure, I tested it on a few runs:
https://pulpito.ceph.com/vallariag-2026-03-12_22:38:13-nvmeof-wip-rocky10-branch-of-the-day-2026-03-09-1773079353-tentacle-distro-default-trial/
https://pulpito.ceph.com/vallariag-2026-03-12_19:02:32-nvmeof-wip-rocky10-branch-of-the-day-2026-03-09-1773079353-tentacle-distro-default-trial/
https://pulpito.ceph.com/vallariag-2026-03-12_15:19:21-nvmeof-wip-rocky10-branch-of-the-day-2026-03-09-1773079353-tentacle-distro-default-trial/
teuthology/suite/__init__.py
Outdated
| conf[key] = value | ||
| except ValueError: | ||
| log.error(" --{} value has incorrect type/format".format(key)) | ||
| raise ScheduleFailError("--{} value has incorrect type/format".format(key),'') |
There was a problem hiding this comment.
Do we need to log and raise? I'd also maybe just use an f-string here.
|
Do we have any idea why the integration test is failing here? I'm not seeing a useful clue |
Added more loggings and utilizes exceptions e.g., ScheduleFail, GitError Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
Changes: - Added tmp_path fixture to all four TestScheduleSuite test methods - Create the expected suite directory structure (tmp_path/suites/suite) - Update self.args.suite_dir to point to the temporary directory This ensures that when schedule_suite() constructs the suite path: suite_path = os.path.join(suite_dir, suite_relpath, 'suites', suite_name) the directory actually exists and passes the os.path.exists() check. Signed-off-by: Kamoltat Sirivadhna <ksirivad@redhat.com>
9ccd2e0 to
844fa7e
Compare
I'll check back here once the newly triggered CI finishes, Thanks! @zmc |
|
@zmc yeah the failure are not super helpful, seems like the job got scheduled but teuthology container exited with code 1 mean something could have potentially gone wrong after job is scheduled? |
|
I see the integration test passes on main (triggered it just now): https://github.com/ceph/teuthology/actions/runs/23331414332 So I tried the following locally on soko04 machine: When I checkout to latest main branch locally: But when I check out to this PR branch and run same teuthology-suite command: Not sure what this means yet, but I don't see any errors in teuthology-suite output. |
|
I think the integration failure is caused by one behavior change in this PR: teuthology.suite.main() now returns job_count, and scripts/suite.py forwards that return value from the teuthology-suite CLI entrypoint. For console scripts, that value becomes the process exit code (sys.exit(return_value) behavior). That matches with what we’re seeing in integration (teuthology container exits with code 1 despite successful scheduling/log output). How about we keep job_count for API/pulpito consumers, but make the CLI path return 0 on success (and non-zero only for real errors or exceptions) |
(Merge this first before ceph/teuthology-api#51 and ceph/pulpito-ng#23)
As per: ceph/pulpito-ng#23
Fixes: https://tracker.ceph.com/issues/64820