@lool lool commented Mar 28, 2025

During some GitHub workflow runs, incus exec commands are randomly
interrupted by:
    Error: websocket: close 1006 (abnormal closure): unexpected EOF

It's possible that this is due to an OS update happening during the run,
so pre-emptively do the updates first, which is a good idea for bug
fixes and predictability anyway.

Signed-off-by: Loïc Minier <[email protected]>
@ricardosalveti

It's possible that this is due to an OS update happening during the run,
so pre-emptively do the updates first, which is a good idea for bug
fixes and predictability anyway.

Is there a way to block the auto update? Doing apt-get update / upgrade on every run will cause the builder behavior to potentially change as well.

lool commented Mar 28, 2025

Is there a way to block the auto update? Doing apt-get update / upgrade on every run will cause the builder behavior to potentially change as well.

(NB: we're not even sure this is the issue.)

That said, the builder environment is already potentially changing for every build, as we pull build-dependencies:

  • Incus is pre-installed in our images, but the workflow still apt installs it
  • the workflow pulls a trixie image that is updated regularly
  • the workflow then installs the debos package and its dependencies in the trixie container, and these packages get updated regularly
  • debos builds the image from trixie packages, which get updated regularly

So I guess we can worry about the moving part of the OS install on the GH runner, but it's 20% of the moving parts :)

To your point, we could research cloud-init / unattended-upgrades / snapd and similar in our base image and see if any of them fires some updates. The easiest way to do that would be to launch such an image manually, wait 60 minutes, and then check the dpkg/apt, cloud-init and snapd logs to see if anything got updated.
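As a rough sketch of the dpkg part of that check, the snippet below scans dpkg log text for package changes recorded after a given start time. It assumes the usual dpkg log line layout (date, time, action, package, versions); the function name `package_changes_since` and the choice of actions to report are illustrative, not something the workflow already has.

```python
from datetime import datetime

# dpkg actions that indicate a package actually changed on disk.
# (Lines such as "status ..." or "configure ..." are ignored here.)
CHANGE_ACTIONS = {"install", "upgrade", "remove", "purge"}


def package_changes_since(log_text, since):
    """Return (timestamp, action, package) tuples logged at or after `since`.

    `log_text` is the content of a dpkg log, assumed to use lines of the form
    "YYYY-MM-DD HH:MM:SS <action> <package> <old-version> <new-version>".
    """
    changes = []
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) < 4 or parts[2] not in CHANGE_ACTIONS:
            continue
        try:
            stamp = datetime.strptime(f"{parts[0]} {parts[1]}", "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if stamp >= since:
            changes.append((stamp, parts[2], parts[3]))
    return changes
```

One way to use it would be to note the time the image was launched, wait, then pass the content of /var/log/dpkg.log and the launch time to the function. A non-empty result would suggest something installed or upgraded packages in the background. It says nothing about snapd refreshes or cloud-init activity, which keep their own logs.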

lool commented Mar 28, 2025

There might also be a GCP agent doing some magic.

doanac commented Mar 28, 2025

I could also update the base OS image. Then your CI jobs could go a little faster and be on something more predictable. And if there's some way to disable any type of background update (assuming that could be at play; I'm somewhat skeptical), I'm happy to add that.
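For the apt side of "disable any type of background update", one possible sketch is to generate an APT periodic configuration that turns off the automatic package list refresh and unattended upgrades. This assumes a Debian or Ubuntu base image where the `APT::Periodic` settings are honored; the helper name `render_apt_periodic_disable` is hypothetical, and whether apt's periodic jobs are actually behind the interruptions is unconfirmed.

```python
def render_apt_periodic_disable():
    """Return apt.conf content that disables periodic updates and unattended upgrades."""
    settings = {
        # Do not refresh the package lists in the background.
        "APT::Periodic::Update-Package-Lists": "0",
        # Do not run unattended-upgrades in the background.
        "APT::Periodic::Unattended-Upgrade": "0",
    }
    return "".join(f'{key} "{value}";\n' for key, value in settings.items())
```

On Debian and Ubuntu these settings conventionally live in /etc/apt/apt.conf.d/20auto-upgrades, so the returned text could be written there when the image is built. This would not cover snapd refreshes or any cloud provider agent.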

@ricardosalveti

Yeah, let's try with this update approach on every run to see if that works around the issue, then we can isolate.

@lool lool merged commit e74cb12 into qualcomm-linux:main Mar 28, 2025
3 checks passed
@lool lool deleted the update-os branch May 28, 2025 09:04