cirrus-run thinks the build is running long after it's done on Cirrus CI side #8

@sio

Less than 1% of cirrus-run invocations keep waiting long after Cirrus CI has finished the corresponding build. Eventually CIRRUS_TIMEOUT is reached and the job is reported as failed.

This issue needs further investigation. Is the API server sometimes reporting an incorrect build status? Is this some kind of cache/CDN issue?

Troubleshooting is difficult because the failure is rare and because most invocations of cirrus-run happen non-interactively (via another CI service, e.g. GitLab).

Anyone who encounters a cirrus-run timeout needs to act quickly:

  • Confirm that the build has in fact finished on the Cirrus CI side. A link to the build is usually printed to stdout by cirrus-run.
  • Check the API response for that particular build's status:
    • Optional: Ensure that the CIRRUS_API_TOKEN environment variable is set to a valid token. Without a token only public repos will be viewable, and API rate limits will probably be stricter.
    • Execute make debug/build_status DEBUG_BUILD_ID=5735044040884224 from the top-level directory of the repo (replace the number with your build ID)
  • Report the output here. If the script just keeps repeating the same "EXECUTING" status, you can interrupt it with Ctrl+C or keep it running to see when/if it fails.
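
For anyone who cannot run the make target, the status check above can be approximated with a short standalone script. This is only a sketch, not the repository's actual debug script: the endpoint URL, the GraphQL query shape, the Bearer authorization header, and every status name other than "EXECUTING" are assumptions about the Cirrus CI API, and the function names are made up for illustration.

```python
import json
import os
import time
import urllib.request

# Assumption: Cirrus CI serves its GraphQL API at this URL.
API_URL = "https://api.cirrus-ci.com/graphql"
# Assumption: the schema exposes build(id) with a status field.
QUERY = "query BuildStatus($id: ID!) { build(id: $id) { status } }"
# "EXECUTING" comes from the issue text; the other two names are assumptions.
RUNNING = {"CREATED", "TRIGGERED", "EXECUTING"}


def build_request(build_id, token=None):
    """Prepare the HTTP POST request for a single status query."""
    payload = json.dumps({"query": QUERY, "variables": {"id": str(build_id)}})
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the API accepts the token as a Bearer credential.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        API_URL, data=payload.encode("utf-8"), headers=headers
    )


def fetch_status(build_id, token=None):
    """Ask the API for the current status of one build."""
    token = token or os.environ.get("CIRRUS_API_TOKEN")
    request = build_request(build_id, token)
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    return body["data"]["build"]["status"]


def watch(build_id, fetch=fetch_status, interval=10, max_polls=None,
          sleep=time.sleep):
    """Print each reported status with a timestamp until the build stops running.

    Returns the list of statuses seen, in order.
    """
    seen = []
    while max_polls is None or len(seen) < max_polls:
        status = fetch(build_id)
        seen.append(status)
        print(time.strftime("%H:%M:%S"), build_id, status)
        if status not in RUNNING:
            break
        sleep(interval)
    return seen
```

Calling watch("5735044040884224") (with your own build ID) prints one timestamped line per poll. A long run of identical "EXECUTING" lines for a build that the web UI already shows as finished would be the evidence worth reporting here.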
