@teogeb teogeb commented Nov 20, 2024

Error

If an Operator.fetchRedundancyFactor() call throws an error during the inspectRandomNode task, the task is unable to complete. The scheduler won't start subsequent inspections because the frozen task is still pending.

This kind of error is shown in logs:

INFO [2024-11-16T23:42:11.049] (inspectOverTime          ): Inspecting target {"traceId":"iSu6S5","attemptNo":1,"target":{"sponsorshipAddress":"0x335370713cea4321330cbcb5f2e2b87fe6a33e7c","operatorAddress":"0x22f694c92b74a31dc0af06cff53586ac84b2c9fc","streamPart":"streamr.eth/demos/radio#0"}}
WARN [2024-11-16T23:42:16.110] (inspectOverTime          ): Error encountered {"traceId":"iSu6S5"}
    err: {
      "type": "Error",
      "message": "Error while executing contract call \"operator.metadata\", code=TIMEOUT",
      "stack":
          Error: Error while executing contract call "operator.metadata", code=TIMEOUT
              at withErrorHandling (/home/streamr/network/packages/sdk/dist/src/contracts/contract.js:52:30)
              at async Object.fn [as metadata] (/home/streamr/network/packages/sdk/dist/src/contracts/contract.js:62:29)
              at async Operator.fetchRedundancyFactor (/home/streamr/network/packages/sdk/dist/src/contracts/Operator.js:407:34)
              at async InspectionOverTimeTask.findNodesForTargetGivenFleetState [as findNodesForTargetGivenFleetStateFn] (/home/streamr/network/packages/node/dist/src/plugins/operator/inspectionUtils.js:66:31)
              at async InspectionOverTimeTask.run (/home/streamr/network/packages/node/dist/src/plugins/operator/inspectOverTime.js:96:43)
      "reason": {
        "type": "Error",
        "message": "request timeout (code=TIMEOUT, version=6.13.1)",
        "stack":
            Error: request timeout (code=TIMEOUT, version=6.13.1)
                at makeError (/home/streamr/network/node_modules/ethers/lib.commonjs/utils/errors.js:129:21)
                at ClientRequest.<anonymous> (/home/streamr/network/node_modules/ethers/lib.commonjs/utils/geturl.js:59:50)
                at ClientRequest.emit (node:events:519:28)
                at TLSSocket.emitRequestTimeout (node:_http_client:856:9)
                at Object.onceWrapper (node:events:633:28)
                at TLSSocket.emit (node:events:531:35)
                at Socket._onTimeout (node:net:591:8)
                at listOnTimeout (node:internal/timers:581:17)
                at process.processTimers (node:internal/timers:519:7)
        "code": "TIMEOUT",
        "shortMessage": "request timeout"
      }

Fix

Added a doneGate.open() call to InspectionOverTimeTask#destroy():

  • the error handler created in start() calls the destroy() method
  • because destroy() now opens the gate, the await task.waitUntilPassOrDone() at line 35 is no longer blocked
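
The pattern behind the fix can be sketched roughly as follows. This is a simplified illustration, not the actual InspectionOverTimeTask code: the Gate class, SketchInspectionTask, waitUntilDone() (standing in for waitUntilPassOrDone()) and runInspection() are hypothetical names made up for this example. Only doneGate, start() and destroy() mirror names mentioned above.

```typescript
// Hypothetical minimal gate: waiters stay blocked until open() is called.
class Gate {
    private isOpen = false
    private waiters: Array<() => void> = []

    open(): void {
        this.isOpen = true
        // Release everyone who is currently waiting
        for (const release of this.waiters.splice(0)) {
            release()
        }
    }

    waitUntilOpen(): Promise<void> {
        if (this.isOpen) {
            return Promise.resolve()
        }
        return new Promise<void>((resolve) => {
            this.waiters.push(() => resolve())
        })
    }
}

// Hypothetical, heavily simplified stand-in for the inspection task.
class SketchInspectionTask {
    private readonly doneGate = new Gate()
    private readonly work: () => Promise<void>
    private destroyed = false

    constructor(work: () => Promise<void>) {
        this.work = work
    }

    start(): void {
        this.work().then(
            () => this.doneGate.open(),
            // Error handler: a failed contract call ends up here
            () => this.destroy()
        )
    }

    destroy(): void {
        this.destroyed = true
        // The fix: without this call, anyone awaiting waitUntilDone()
        // after a failure would stay blocked forever.
        this.doneGate.open()
    }

    waitUntilDone(): Promise<void> {
        return this.doneGate.waitUntilOpen()
    }

    isDestroyed(): boolean {
        return this.destroyed
    }
}

// Hypothetical caller: awaits the task the way a scheduler would.
async function runInspection(work: () => Promise<void>): Promise<string> {
    const task = new SketchInspectionTask(work)
    task.start()
    await task.waitUntilDone()
    return task.isDestroyed() ? 'destroyed' : 'completed'
}
```

With the doneGate.open() line removed from destroy(), a call such as runInspection(() => Promise.reject(new Error('TIMEOUT'))) would never resolve, which is the freeze described in the Error section.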

@github-actions github-actions bot added the node label Nov 20, 2024
@github-actions github-actions bot added the docs label Nov 20, 2024
@teogeb teogeb requested a review from harbu November 20, 2024 23:26
@teogeb teogeb merged commit 42d48ae into main Nov 21, 2024
23 checks passed
@teogeb teogeb deleted the inspectOverTime-freeze-NET-1377 branch November 21, 2024 10:45