fix: Disable partial download if file being downloaded is modified between retries #3941
Conversation
    let mut response = self.request_range_from(url, offset).await?;
    if was_resource_modified(&response, &prev_response) {
There is nothing wrong with this impl. But, instead of this manual check and adjusting the start position, using the IfRange header along with the ETag seems like the "accepted convention" to resume a partial download. Avoids that additional HTTP call as well (not that it really matters in the context of a file download).
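To make the suggestion concrete, here is a minimal sketch of what an `If-Range` based resume could look like, reduced to plain header construction (function name and the `HashMap` representation are illustrative, not the PR's actual code). With `If-Range`, the server itself decides: if the ETag still matches it honours the `Range` and returns 206, otherwise it returns 200 with the full body, so no separate modification check is needed.

```rust
use std::collections::HashMap;

// Hedged sketch: build headers for a resumable range request using the
// If-Range conditional, as suggested in the review. If the validator
// (the ETag) no longer matches on the server, the Range is ignored and
// the full resource is returned instead.
fn range_request_headers(etag: Option<&str>, offset: u64) -> HashMap<String, String> {
    let mut headers = HashMap::new();
    // Request everything from `offset` to the end of the resource.
    headers.insert("Range".to_string(), format!("bytes={offset}-"));
    if let Some(etag) = etag {
        // Conditional: only honour the Range if the ETag is unchanged.
        headers.insert("If-Range".to_string(), etag.to_string());
    }
    headers
}
```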
IfRange indeed is something that I wanted to add as well, but in a follow-up PR.
    (None, None) => {
        // no etags in either request, assume resource is unchanged
        false
    }
It would be safer to assume the contrary. But, I also understand this choice: if the source doesn't even try to provide support to detect changes, then why bother?
Yeah, particularly with downloading files from Cumulocity where ETag is not given, I didn't want to disable partial downloads there, as people might be surprised why it stopped working... But in cases where ETag is actually available we shouldn't ignore it.
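The trade-off discussed above can be sketched as a small comparison function (names and signature are illustrative, not the PR's actual implementation): only matching ETags on both responses prove the resource is unchanged, absent ETags on both sides are treated optimistically, and an ETag appearing or disappearing is treated as a modification.

```rust
// Hedged sketch of the modification check discussed above: compare the
// ETag values between the previous and the current response.
fn was_resource_modified(prev_etag: Option<&str>, curr_etag: Option<&str>) -> bool {
    match (prev_etag, curr_etag) {
        // Both responses carry an ETag: the resource changed iff they differ.
        (Some(prev), Some(curr)) => prev != curr,
        // No ETag on either response: the server offers no way to detect
        // changes, so assume unchanged rather than disabling partial
        // downloads (e.g. Cumulocity Inventory Binaries).
        (None, None) => false,
        // An ETag appeared or disappeared between requests: safest to
        // treat the resource as modified.
        _ => true,
    }
}
```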
albinsuresh left a comment
Approving the impl though there are a few minor things to fix.
There was one additional issue: if the file was modified but the server returned a full response anyway, we would discard the response and retry instead of downloading the new content.
albinsuresh left a comment
Re-confirming my approval for the updated logic.
    use reqwest::StatusCode;
    use reqwest::{header, Response};

    pub(super) enum PartialResponse {
Optional:

    - pub(super) enum PartialResponse {
    + pub(super) enum RangeResponse {
IMO better to keep the name PartialResponse as it's related to the Partial Content status code.
    StatusCode::PARTIAL_CONTENT => {
        if was_resource_modified(response, prev_response) {
            return Ok(PartialResponse::ResourceModified);
At first, I thought this inefficiency (where the already fetched partial content has to be discarded after the Etag check) could be completely avoided with the usage of If-Range header. But I just realised that a server supporting ETag doesn't guarantee that it supports If-Range as well. So, we'll need this to handle that corner case. But yeah, for the servers that support it, this path would be skipped.
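The corner case above can be summarised in a small classification sketch (the enum mirrors the `PartialResponse` type in the diff, but variant payloads and the `classify` helper are illustrative): even on a 206, the ETag must still be checked, because a server that supports ETag is not guaranteed to support `If-Range`; on a 200 the full body was returned and writing restarts from offset zero.

```rust
// Hedged sketch of the status-code handling described above.
// Variants and names are illustrative, not the PR's exact definitions.
#[allow(dead_code)]
enum PartialResponse {
    PartialContent(u64), // 206: resume writing at this offset
    CompleteContent,     // 200: full body returned, restart from 0
    ResourceModified,    // 206 but ETag changed: discard and re-request
}

fn classify(status: u16, offset: u64, etag_changed: bool) -> PartialResponse {
    match status {
        // The server honoured the range, but the resource changed under
        // us: the partial body belongs to a different version.
        206 if etag_changed => PartialResponse::ResourceModified,
        206 => PartialResponse::PartialContent(offset),
        // Anything else (typically 200 OK): treat as a full download.
        _ => PartialResponse::CompleteContent,
    }
}
```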
    use reqwest::header;
    use reqwest::header::HeaderValue;
    use reqwest::StatusCode;
    use reqwest::{header, Response};
To be fixed to make the formatter happy.
Signed-off-by: Marcel Guzik <[email protected]>
    let mut response = self.request_range_from(url, request_offset).await?;
    let offset = match partial_response::response_range_start(&response, &prev_response)? {
        PartialResponse::CompleteContent => 0,
I don't fully get how we are sure the download is making progress, as we restart here from zero in a loop, with the backoff retry being inner to this loop.
We restart from zero only if the server returns 200 OK and we have to download the entire file again. self.request_range_from performs the HTTP request, and is subject to the backoff retry policy, but reading response body and writing it to file happens in save_chunks_to_file_at, which if it completes without errors, breaks out of the loop.
which if it completes without errors, breaks out of the loop.
That if is my concern. What if we repeatedly fail to consume from the network the last bytes of a file that is served in its entirety each time? Having an infinite retry loop is never safe, even when the likelihood is low.
Indeed you've convinced me that the loop being unbounded is a problem and it should be bounded somehow, but how? Should we just limit the number of iterations the loop can do, such that if we retry too many times we fail, even if theoretically every retry could make a little progress before e.g. timing out? Or should we try to do something smarter, like only count towards the limit requests that haven't made any progress (although it's not clear to me how to precisely define it)?
Ah, this whole thing turns out to be more complicated than initially anticipated; everywhere there is some little edge case. Maybe there are some dependencies that could help with this, or if not, I should take a more thorough look at how projects like wget are doing this.
Should we just limit the number of iterations the loop can do
Simple and effective. I don't think we need something smarter as the point is only to avoid insane traps.
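The agreed-upon bound can be sketched as a simple capped loop (names are illustrative; `try_download_once` stands in for the real request-plus-save step): the loop terminates either on success or after a fixed number of attempts, so a server that keeps invalidating our partial progress cannot trap us forever.

```rust
// Hedged sketch of the bounded download loop discussed above: cap the
// number of restarts instead of retrying indefinitely.
fn download_with_restart_limit<F>(max_attempts: u32, mut try_download_once: F) -> Result<(), String>
where
    F: FnMut() -> Result<(), String>,
{
    for _ in 0..max_attempts {
        // One attempt = one request plus writing the body to the file;
        // on failure (resource modified, transfer error) we restart.
        if try_download_once().is_ok() {
            return Ok(());
        }
    }
    Err("download did not complete within the attempt limit".to_string())
}
```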
If during the download the resource was modified on the server to be smaller, and we've already written more bytes to disk than the new size of the file, then after retrying and overwriting the file with the new version we could end up with garbage data from the old version of the file at the end. To fix this, after completing the download we call `set_len` to discard any extra bytes that might be present after the cursor when we finish writing.

Signed-off-by: Marcel Guzik <[email protected]>
The issue with leftover bytes from older versions of the downloaded resource was addressed in a7763e4.
Proposed changes
In the current download implementation, where in case of interruption we try to resume the download using an HTTP range request to avoid re-downloading parts we already have, there is a potential issue: if the file is updated on the server between retries, we can miss this and corrupt the file.
This PR adds a check: if the file was modified, we abort the range request and request the full file again with a normal GET request. In particular, the modification check compares the ETag header value, if it exists, between the current and the previous request. If it is different from the ETag of the previous request, we request the full range of the file again. If there's no ETag, the behaviour is unchanged and we proceed with the range request, so as to not disable partial requests for servers that don't send an ETag, for example Cumulocity Inventory Binaries.

Types of changes
Paste Link to the issue
Checklist

- `just prepare-dev` (once)
- `just format` as mentioned in CODING_GUIDELINES
- `just check` as mentioned in CODING_GUIDELINES

Further comments
One more thing we could do is check if the Content-Length of the resource is the same, hopefully catching some updates where the size of the file changes. Unfortunately, I am currently unable to test this, because to test partial requests we use chunked transfer encoding, and when it's in use, reqwest ignores the Content-Length header and doesn't report it (the idea maybe being that chunked transfer encoding is inherently used for streaming requests, where you most often don't know the size ahead of time). So currently this check is not added, and we only check the ETag if it's present.

Some servers also send a Last-Modified header, support for which could be added (unfortunately Cumulocity also doesn't support it).

There's also the option to use the Want-Content-Digest header to request a digest from the server, which we could then also compute locally to perform an integrity check before finishing the download, but so far I haven't seen a server that supports it.
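The Content-Length idea floated above, if reqwest did report the header, would reduce to a small comparison (a sketch only; this check is explicitly not part of the PR, and the function name is hypothetical): differing sizes prove the resource changed, while equal or absent sizes are inconclusive.

```rust
// Hedged sketch of the Content-Length check discussed above. A size
// mismatch between the previous and current response proves the resource
// changed; equal or missing lengths prove nothing (the content could
// still differ at the same size).
fn size_definitely_changed(prev_len: Option<u64>, curr_len: Option<u64>) -> bool {
    match (prev_len, curr_len) {
        (Some(prev), Some(curr)) => prev != curr,
        // Length missing on either side (e.g. chunked transfer encoding):
        // cannot tell, so fall back to the ETag check.
        _ => false,
    }
}
```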