
chore: disable colcon-powershell #1143

Merged
bjsowa merged 26 commits into ros2 from chore/fix-ci on Feb 10, 2026

Conversation

@bjsowa
Member

@bjsowa bjsowa commented Jan 27, 2026

This is a hotfix for the CI failing with the error ModuleNotFoundError: No module named 'ament_package' (see here for example). The workflow has been failing for about a month now, and I suspect the cause is a recent change to the GitHub runner images. I couldn't find any solution other than running the job in a container.

The other test is failing due to recent changes in rclpy that changed some executor behavior (which I authored 😅). It will be fixed in another PR later.

@MatthijsBurgh
Contributor

In my mind it is very unlikely an update in a runner image causes the CI to break. But here is the diff between the image releases, link

You can check it to see whether you see something interesting.

@bjsowa
Member Author

bjsowa commented Jan 28, 2026

> In my mind it is very unlikely an update in a runner image causes the CI to break. But here is the diff between the image releases, link
>
> You can check it to see whether you see something interesting.

I looked through the diff but couldn't find anything interesting. I'm suspicious of the runner image for two reasons:

  • Running the CI in an ubuntu:noble container image or on the ubuntu-slim runner (see here) does not result in the error.
  • I have a private repo where the CI builds the whole of ROS (including ament_package) in a Nix devshell, which freezes all the dependencies, and it started suffering from the same error around the same time.

@MatthijsBurgh
Contributor

Can we print the filesystem tree or similar just before building? I am just looking for a way to debug this issue.
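
One way to print such a tree from a workflow step is a few lines of standard-library Python. This is only a sketch: the helper name `format_tree` and the depth limit are illustrative assumptions, not something taken from this PR or from action-ros-ci.

```python
import os


def format_tree(root: str, max_depth: int = 2) -> list[str]:
    """Return an indented listing of `root`, descending at most `max_depth` levels."""
    root = os.path.abspath(root)
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        dirnames.sort()  # deterministic traversal order
        if depth >= max_depth:
            dirnames[:] = []  # do not descend below the depth limit
        indent = "  " * depth
        lines.append(f"{indent}{os.path.basename(dirpath) or dirpath}/")
        for name in sorted(filenames):
            lines.append(f"{indent}  {name}")
    return lines


if __name__ == "__main__":
    # Print the current working directory one level deep.
    print("\n".join(format_tree(".", max_depth=1)))
```

Run just before the build step, the output would land in the job log, where it can be compared between a failing runner and a working container.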

@bjsowa
Member Author

bjsowa commented Feb 2, 2026

@MatthijsBurgh GitHub Actions seems to be down for now

@MatthijsBurgh
Contributor

Yes, indeed 😢

@bjsowa
Member Author

bjsowa commented Feb 3, 2026

@MatthijsBurgh I have no more ideas for things I can check for now.

@sea-bass
Contributor

sea-bass commented Feb 3, 2026

honestly? I had CI fail on one of my repos and deleting all the github actions caches and rerunning solved it for me. Try that?

EDIT: oh, there are no caches here...

@bjsowa
Member Author

bjsowa commented Feb 3, 2026

> honestly? I had CI fail on one of my repos and deleting all the github actions caches and rerunning solved it for me. Try that?
>
> EDIT: oh, there are no caches here...

The only cache is from the pre-commit job. It shouldn't affect anything, but I cleared it anyway and nothing changed.

@bjsowa
Member Author

bjsowa commented Feb 4, 2026

@MatthijsBurgh Any idea how to debug this further?

@MatthijsBurgh
Contributor

Preferably we can print the state midway through the ros-ci-action. Are there any hooks we can define?

@bjsowa
Member Author

bjsowa commented Feb 5, 2026

> Preferably we can print the state midway through the ros-ci-action. Are there any hooks we can define?

We could just try running the commands manually, instead of using the action

@MatthijsBurgh
Contributor

We can try checking the colcon logs.

See here how to store the logs as artifacts

@MatthijsBurgh
Contributor

@bjsowa when running on the runner the logs show the following PYTHONPATH

PYTHONPATH=/home/runner/work/rosbridge_suite/rosbridge_suite/ros_ws/install/rosbridge_test_msgs\lib/python3.12/site-packages;${PYTHONPATH}

The backslash and the semicolon show that some part of the tooling, either colcon/ament or action-ros-ci, thinks we are on Windows. I think this is something we should investigate.
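
The same check can be automated against the logged value. The sketch below assumes a POSIX runner, where PYTHONPATH entries are separated by `:` and a `;` or `\` inside an entry is suspicious; the helper name `windows_style_entries` is made up for illustration.

```python
def windows_style_entries(pythonpath: str) -> list[str]:
    """Return PYTHONPATH entries that look Windows-formatted on a POSIX system."""
    suspicious = []
    # On POSIX the separator is ':'; a ';' or a backslash inside an entry
    # suggests the value was generated with Windows conventions.
    for entry in pythonpath.split(":"):
        if ";" in entry or "\\" in entry:
            suspicious.append(entry)
    return suspicious


# The value reported in the colcon logs above.
logged = (
    "/home/runner/work/rosbridge_suite/rosbridge_suite/ros_ws/install/"
    "rosbridge_test_msgs\\lib/python3.12/site-packages;${PYTHONPATH}"
)

if __name__ == "__main__":
    for entry in windows_style_entries(logged):
        print("suspicious entry:", entry)
```

Because the logged value contains no `:`, the whole string is treated as a single entry and flagged. This is also why Python cannot find ament_package: the previous PYTHONPATH was joined with the wrong separator, so none of its entries are searched.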

@MatthijsBurgh
Contributor

@bjsowa I don't know what changed in December to cause the issues. But the issue is that colcon-powershell is not being skipped. It wasn't skipped because PSModulePath is set in the runner. See the check. For me it is hard to say whether that check is okay or too much of a shortcut.

But by overriding it in the CI config, we make it work for now.

Might be fixed by colcon/colcon-powershell#42
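
To illustrate why an environment-based check can misfire, here is a minimal sketch. It is not the actual colcon-powershell code: the function `powershell_hooks_expected`, its signature, and the example environment values are assumptions made for illustration only.

```python
def powershell_hooks_expected(env: dict[str, str], platform: str) -> bool:
    """Hypothetical stand-in for a 'should the PowerShell extension run?' check."""
    if platform == "win32":
        return True
    # Assumed heuristic: off Windows, the mere presence of PSModulePath is
    # taken as a sign that PowerShell is in use.
    return "PSModulePath" in env


# Made-up environment resembling a Linux runner that has PowerShell installed.
runner_env = {"SHELL": "/bin/bash", "PSModulePath": "/example/powershell/Modules"}

# The same environment with the variable removed, as a CI override might do.
cleaned_env = {k: v for k, v in runner_env.items() if k != "PSModulePath"}

if __name__ == "__main__":
    print(powershell_hooks_expected(runner_env, "linux"))
    print(powershell_hooks_expected(cleaned_env, "linux"))
```

Under this assumed heuristic, a bash job on Linux is still treated as a PowerShell session whenever the variable happens to be set, which matches the symptom described above.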

@MatthijsBurgh MatthijsBurgh changed the title from "chore: Run CI test job in a container" to "chore: disable colcon-powershell" Feb 9, 2026
MatthijsBurgh previously approved these changes Feb 9, 2026
@bjsowa
Member Author

bjsowa commented Feb 9, 2026

Thank you @MatthijsBurgh for finding the culprit! Should we remove the other changes or leave it as it is?

@MatthijsBurgh
Contributor

I kept these changes as I think both changes (bash as default shell; upload log artifacts) are useful. So you can merge if you agree.

@bjsowa bjsowa merged commit 9c70641 into ros2 Feb 10, 2026
2 of 6 checks passed
@bjsowa bjsowa deleted the chore/fix-ci branch February 10, 2026 07:30
bjsowa added a commit that referenced this pull request Feb 10, 2026
Co-authored-by: Matthijs van der Burgh <MatthijsBurgh@outlook.com>
bjsowa added a commit that referenced this pull request Feb 10, 2026
Co-authored-by: Matthijs van der Burgh <MatthijsBurgh@outlook.com>
MatthijsBurgh added a commit that referenced this pull request Feb 10, 2026
Co-authored-by: Matthijs van der Burgh <MatthijsBurgh@outlook.com>
MatthijsBurgh added a commit that referenced this pull request Feb 10, 2026
Co-authored-by: Matthijs van der Burgh <MatthijsBurgh@outlook.com>
Roald-Schaum added a commit to SHC-ASTRA/astra_rosbridge_suite that referenced this pull request Feb 17, 2026

3 participants