
fix(dvcr): set Cap value for default backoff in dvcr importer#1992

Open
diafour wants to merge 1 commit into main from fix/dvcr/add-cap-value-for-default-backoff-in-importer

diafour commented Feb 13, 2026

Description

  • Set Cap to 1m to limit the delay of each step.
  • Set Steps to 20 to limit the overall timeout to ~17 minutes, which is long enough to survive the DVCR cleanup procedure.
  • Change the logic behind Cap and Steps: all steps always run, and Cap limits the delay of a single step instead of being a maximum duration that stops retries.

Why do we need it, and what problem does it solve?

Without Cap, the default backoff lets delays grow to more than 30m:

I0212 15:49:13.258399       1 backoff.go:138] Failed to execute: error copying from the source: io: read/write on closed pipe:
retry in 4m19.124687232s...


I0212 15:54:31.109497       1 backoff.go:138] Failed to execute: error copying from the source: io: read/write on closed pipe:
retry in 12m17.741731905s...


I0212 16:07:41.794799       1 backoff.go:138] Failed to execute: error copying from the source: io: read/write on closed pipe:
retry in 38m45.973894501s...

What is the expected result?

  • dvcr-importer retry delay is limited to around 1m

  • Retries after fix:

I0216 09:41:49.626651       1 backoff.go:143] Failed to execute attempt 1 of 20: error copying from the source: io: read/write on closed pipe: retry in 1s...
I0216 09:42:20.858769       1 backoff.go:143] Failed to execute attempt 2 of 20: error uploading layer: Get "https://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:443: connect: connection refused; Get "http://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:80: i/o timeout: retry in 3.1s...
I0216 09:42:54.345948       1 backoff.go:143] Failed to execute attempt 3 of 20: error uploading layer: Get "https://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:443: connect: connection refused; Get "http://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:80: i/o timeout: retry in 9.6s...
I0216 09:43:34.153793       1 backoff.go:143] Failed to execute attempt 4 of 20: error uploading layer: Get "https://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:443: connect: connection refused; Get "http://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:80: i/o timeout: retry in 27.3s...
I0216 09:44:31.530157       1 backoff.go:143] Failed to execute attempt 5 of 20: error uploading layer: Get "https://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:443: connect: connection refused; Get "http://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:80: i/o timeout: retry in 1m3s...
I0216 09:46:04.830973       1 backoff.go:143] Failed to execute attempt 6 of 20: error uploading layer: Get "https://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:443: connect: connection refused; Get "http://dvcr.d8-virtualization.svc/v2/": dial tcp 10.222.143.242:80: i/o timeout: retry in 1m4.5s...
...

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: dvcr
type: fix
summary: Set Cap to limit delays to around 1m for each retry step.

diafour added this to the v1.6.0 milestone Feb 13, 2026

- Set Cap of 1m to limit delays for each step.
- Set Steps to 20 to limit overall timeout to ~17 minutes to survive DVCR cleanup procedure.
- Change logic behind Cap and Steps: always retry all steps; Cap is the maximum delay for a single attempt, not a maximum delay that stops retrying.

Signed-off-by: Иван Михейкин <ivan.mikheykin@flant.com>
diafour force-pushed the fix/dvcr/add-cap-value-for-default-backoff-in-importer branch from 7d35c18 to 2d5cad2 on February 16, 2026 11:00