Reducing number of randomly failing github actions - proposal - batch1 cache/hbar limiter large contract #4658

@jasuwienas

Problem

I've been losing time debugging failing GitHub Actions. I can't restart them myself, so I try to confirm that the failures aren't caused by my changes. But some tests fail so often that it has been hard to tell. We can't make the pipelines 100% reliable, but after some analysis on Friday I have an idea for a few adjustments.

Solution

  1. "should not cache" (API Batch 1 -> acceptance-workflow)
it('should not cache "safe" block in "eth_getBlockByNumber"', async function () {
  const blockResult = await relay.call(RelayCalls.ETH_ENDPOINTS.ETH_GET_BLOCK_BY_NUMBER, ['safe', false]);
  await Utils.wait(1000);
  const blockResult2 = await relay.call(RelayCalls.ETH_ENDPOINTS.ETH_GET_BLOCK_BY_NUMBER, ['safe', false]);
  expect(blockResult).to.not.deep.equal(blockResult2);
});

At least one of the "should not cache" tests fails often, and simply increasing the wait time should fix it. A single second is sometimes not enough for the returned block to change.
For example, it failed here:
https://github.com/hiero-ledger/hiero-json-rpc-relay/actions/runs/19759973622/job/56628197548?pr=4613
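
As an alternative to a longer fixed wait, the test could poll until the result changes and give up only after a timeout. The following is a minimal sketch of that idea, not code from the repository: `pollUntilDifferent`, its parameters, and the default timings are hypothetical, and the `JSON.stringify` comparison is a simplification that assumes both responses serialize their keys in the same order.

```typescript
// Hypothetical helper (not part of the relay test utilities): re-fetch a value
// until it differs from a baseline, or throw once the timeout has elapsed.
async function pollUntilDifferent<T>(
  fetchValue: () => Promise<T>,
  baseline: T,
  timeoutMs: number = 10000,
  intervalMs: number = 500,
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  const baselineJson = JSON.stringify(baseline);
  for (;;) {
    const current = await fetchValue();
    // Simplified structural comparison; assumes a stable key order.
    if (JSON.stringify(current) !== baselineJson) {
      return current;
    }
    if (Date.now() >= deadline) {
      throw new Error(`value did not change within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Possible use inside the test above (assumes the same relay and RelayCalls objects):
//   const blockResult = await relay.call(RelayCalls.ETH_ENDPOINTS.ETH_GET_BLOCK_BY_NUMBER, ['safe', false]);
//   const blockResult2 = await pollUntilDifferent(
//     () => relay.call(RelayCalls.ETH_ENDPOINTS.ETH_GET_BLOCK_BY_NUMBER, ['safe', false]),
//     blockResult,
//   );
//   expect(blockResult).to.not.deep.equal(blockResult2);
```

With this shape the test finishes as soon as a different block is returned, and a real caching bug still fails the test, just with a timeout error instead of an equality assertion.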

  2. HBAR Limiter Batch 1 - large contract deployment

This fails for me in every full local run (but passes when run alone):

AssertionError: expected 6548219490 to be close to 6468385768 +/- 26580711.6

As a hotfix, maybe simply increase the tolerance ~4× to reduce the failure rate?
It failed here:
Acceptance Tests / HBar Limiter Batch 1 / acceptance-workflow (pull_request)
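
To sanity-check the ~4× figure, the numbers from the assertion message above can be plugged in directly. This is only arithmetic on that single failure, so it shows that a 4× tolerance would have covered this sample, not that it covers every run; the variable names are illustrative.

```typescript
// Values copied from the AssertionError quoted above.
const actual = 6548219490;
const expected = 6468385768;
const tolerance = 26580711.6;

// How far the observed value was from the expected one.
const deviation = Math.abs(actual - expected); // 79833722

// Deviation expressed as a multiple of the current tolerance (roughly 3.0).
const ratio = deviation / tolerance;

// Proposed hotfix: widen the tolerance four times.
const widenedTolerance = 4 * tolerance;
const wouldPass = deviation <= widenedTolerance;

console.log(`deviation = ${deviation}`);
console.log(`deviation / tolerance = ${ratio.toFixed(2)}`);
console.log(`passes with 4x tolerance: ${wouldPass}`);
```

The sample sits at about three times the current tolerance, so a 4× tolerance leaves limited headroom; collecting the deviations from a few more failed runs would show whether 4× is enough.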

I also hit many other random failures, but those are easy to spot (e.g. timeouts, 502 responses from remote APIs). The tests listed above are problematic because they look like real bugs rather than randomness. For example, a "wrongly cached block" failure on a cache-related PR was genuinely alarming. The same goes for incorrect HBAR usage on a tx-creation-related PR, at least until I realized that I had seen the same failures on my other PRs as well.

Alternatives

No response

Labels

enhancement (New feature or request)
