85 changes: 84 additions & 1 deletion fetch.bs
@@ -22,6 +22,8 @@ urlPrefix:https://httpwg.org/specs/rfc9651.html#;type:dfn;spec:rfc9651
url:text-parse;text:parsing structured fields
url:;text:structured header
url:token;text:structured field token
url:binary;text:structured field byte sequence
url:dictionary;text:structured field dictionary

urlPrefix:https://httpwg.org/specs/rfc9110.html#;type:dfn;spec:http
url:method.overview;text:method
@@ -66,6 +68,10 @@ urlPrefix:https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-layered-cooki
url:name-retrieve-cookies;text:retrieve cookies
url:name-serialize-cookies;text:serialize cookies
url:name-garbage-collect-cookies;text:garbage collect cookies

urlPrefix:https://httpwg.org/http-extensions/draft-ietf-httpbis-unencoded-digest.html#;spec:unencoded-digest
type:http-header
url:name-the-unencoded-digest-field;text:unencoded-digest
</pre>

<pre class=biblio>
@@ -117,6 +123,11 @@ urlPrefix:https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-layered-cooki
"href": "https://www.kb.cert.org/vuls/id/150227",
"title": "HTTP proxy default configurations allow arbitrary TCP connections."
},
"UNENCODED-DIGEST": {
"authors": ["Lucas Pardue", "Mike West"],
"href": "https://httpwg.org/http-extensions/draft-ietf-httpbis-unencoded-digest.html",
"title": "HTTP Unencoded Digest"
},
"WEBTRANSPORT-HTTP3": {
"authors": ["V. Vasiliev"],
"href": "https://datatracker.ietf.org/doc/html/draft-ietf-webtrans-http3",
@@ -4483,6 +4494,72 @@ indicates the request’s purpose is to fetch a resource that is anticipated to
prefetch, or to treat it differently when counting page visits.


<h3 id=unencoded-digest-header>`<code>Unencoded-Digest</code>` header</h3>

<p>The `<a http-header><code>Unencoded-Digest</code></a>` header field represents a server's
assertions about the integrity of a response's content. It is a <a>structured header</a> whose value
must be a <a data-lt="structured field dictionary">dictionary</a> whose keys specify hashing
algorithms, and whose values are <a data-lt="structured field byte sequence">byte sequences</a>
that represent a digest of the response produced via the specified algorithm. [[!UNENCODED-DIGEST]]


<div algorithm>
<p>To <dfn export>verify `<code>Unencoded-Digest</code>` assertions</dfn>, given a
<a>byte sequence</a> <var>bytes</var> and a <a for=/>header list</a> <var>list</var>, run these
steps:

<ol>
<li><p>Let <var>header</var> be the result of
<a for="header list" lt="get a structured field value">getting</a> the
`<a http-header><code>Unencoded-Digest</code></a>` header as a "<code>dictionary</code>" from
<var>list</var>.

<li><p>If <var>header</var> is null, then return <b>verified</b>.
Member

I think we should restructure this to return a boolean. <b>return value</b> is not part of Infra and it's not necessarily that much clearer. I suppose we could return an enum instead, but we don't really have enums with only two values. (Maybe we should and maybe we should refactor everything away from booleans after mostly refactoring away from flags, but that would require a bit of investigation as to what that would look like.)

Member Author

Booleans require reading the algorithm name carefully, so I find it clearer to name the output in some way. We do this already in at least one place in Fetch (https://fetch.spec.whatwg.org/#cross-origin-resource-policy-check), and I'd prefer that we do that more often.

I could put up a PR against Infra that attempted to generalize this pattern (define some set of values (Success and Failure, Allowed and Blocked?) that can be returned from algorithms, and use them in the algorithm linked above?) if you'd be interested in that. We might find more values over time, but the things I can easily remember from CSP and SRI would fall into one of those categories...

Member

That would improve it, though I worry what happens when we suddenly need more than two values. I guess we switch to "success", "failure", "report", or some such at that point?

Member Author

No reason we couldn't define multiple common patterns. The two I noted above (allow/block from https://fetch.spec.whatwg.org/#cross-origin-resource-policy-check, and failure from all over URL and a number of HTML algorithms) are ones I could easily find examples for. The latter in particular seems like it would benefit from putting something in Infra.

Filed whatwg/infra#686 to think about it some more.


<li>
<p><a for="list">For each</a> <var>alg</var> → <var>digest</var> of <var>header</var>:

<ol>
<li>
<p>Switch on <var>alg</var>:

<dl class=switch>
<dt>"<code>sha-256</code>"
<dd>
<ol>
<li><p>Let <var>body digest</var> be the result of executing the SHA-256 algorithm on
Member

bodyDigest*

<var>bytes</var>. [[!FIPS-180-4]]

<li><p>If <var>body digest</var> matches <var>digest</var>, <a for="iteration">continue</a>.
Member

bodyDigest

What does "matches" mean here? Can we say "is"?

Also, perhaps we should pass algorithm and bytes to SRI so it can have the dependencies on SHA? Perhaps this entire algorithm can be in SRI and we only have the processing model hooks in fetch? Hmm.

Member Author

"is" is accurate.

I thought about folding this into SRI as well. That might make more sense. My feeling is that this header is more closely tied to the fetching mechanism, as it's a server's declaration of the intended response body's content, but I suppose that's true for the signature mechanism as well that would come in a subsequent PR, and that probably doesn't fit into Fetch terribly cleanly. Hrm.

@mozfreddyb might have an opinion here?

</ol>

<dt>"<code>sha-512</code>"
<dd>
<ol>
<li><p>Let <var>body digest</var> be the result of executing the SHA-512 algorithm on
<var>bytes</var>. [[!FIPS-180-4]]

<li><p>If <var>body digest</var> matches <var>digest</var>, <a for="iteration">continue</a>.
</ol>

<dt><b>Otherwise</b>
Member

No need for <b>

<dd><p><a for="iteration">Continue</a>.
</dl>

<li><p>Return <b>failed</b>.
</ol>

<li><p>Return <b>verified</b>.
</ol>

<p class="note">This algorithm requires all valid digests delivered via
`<a http-header><code>Unencoded-Digest</code></a>` to match the response’s decoded body, while
ignoring unknown algorithms. Since the server controls both the body and the headers, it seems
unnecessary to allow the flexibility of allowing the asserted digests to match more than one
resource (as we do in client-initiated checks via [[SRI]], which need to support servers'
Member

References go at the end of a paragraph. If you want something inline you have to use <cite> and such.

content negotiation).
</div>
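
For readers following the review, the algorithm in this hunk can be sketched in Python. This is a minimal sketch and not part of the PR: the function name `verify_unencoded_digest` and the string return values are illustrative, and it assumes the header has already been parsed into a mapping from algorithm name to raw digest bytes (structured field parsing is out of scope here).

```python
import hashlib
from typing import Mapping, Optional

# Hash constructors for the two algorithms named in the switch statement.
_KNOWN_ALGORITHMS = {
    "sha-256": hashlib.sha256,
    "sha-512": hashlib.sha512,
}


def verify_unencoded_digest(
    body: bytes, digests: Optional[Mapping[str, bytes]]
) -> str:
    """Return "verified" or "failed" (illustrative names).

    `digests` stands in for the parsed Unencoded-Digest dictionary;
    None models a missing or unparseable header.
    """
    if digests is None:
        return "verified"
    for alg, expected in digests.items():
        hasher = _KNOWN_ALGORITHMS.get(alg)
        if hasher is None:
            # Unknown algorithms are skipped, as in the "Otherwise" branch.
            continue
        if hasher(body).digest() != expected:
            # Any recognized digest that does not equal the computed one fails.
            return "failed"
    return "verified"
```

Under these assumptions, a header that lists only unrecognized algorithms yields "verified", while a single mismatching `sha-256` or `sha-512` entry yields "failed" even if other entries match.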


<h2 id=fetching>Fetching</h2>

@@ -5022,7 +5099,9 @@ steps:
<p class=note>This standardizes the error handling for servers that violate HTTP.

<li>
<p>If <var>request</var>'s <a for=request>integrity metadata</a> is not the empty string, then:
<p>If <var>request</var>'s <a for=request>integrity metadata</a> is not the empty string, or if
Member

Suggested change
- <p>If <var>request</var>'s <a for=request>integrity metadata</a> is not the empty string, or if
+ <p>If <var>request</var>'s <a for=request>integrity metadata</a> is not the empty string or

<var>internalResponse</var>'s <a for="response">header list</a> <a for="header list">contains</a>
`<a http-header><code>Unencoded-Digest</code></a>`, then:
Comment on lines +5101 to +5103
Member

This means we enforce it for opaque responses. Is that what we want? What about requests that didn't care about integrity enforcement?

Member Author

The way this is currently defined would apply the check to responses regardless of a page's integrity assertions. It's based on a server's assertions, not on integrity metadata associated with the request. That layering makes the SRI checks quite straightforward (similar conceptually to CSP's layering on top of SRI), as the algorithms can assume that the unencoded-digest header will be enforced, so they only need to concern themselves with signature verification.

This layering also gives servers the same ability as clients to make binding assertions about responses. Philosophically, I find that appealing.

Member

But we don't want SRI to work for opaque responses so there will be some mismatches in the overall model. Is the idea that SRI also ends up poking at the header list?

I'm not a big fan of us poking at the internal header list as it means this is yet another header that cannot be hidden from web content processes.

Member

Suggested change
- `<a http-header><code>Unencoded-Digest</code></a>`, then:
+ `<a http-header><code>Unencoded-Digest</code></a>`:


<ol>
<li><p>Let <var>processBodyError</var> be this step: run <a>fetch response handover</a> given
@@ -5035,6 +5114,10 @@ steps:
<p>Let <var>processBody</var> given <var>bytes</var> be these steps:

<ol>
<li><p><a>Verify `<code>Unencoded-Digest</code>` assertions</a> given <var>bytes</var> and
<var>internalResponse</var>'s <a for="response">header list</a>. If the result is not
<b>verified</b>, then run <var>processBodyError</var> and abort these steps.

<li><p>If <var>bytes</var> do not <a lt="Do bytes match metadataList?">match</a>
<var>request</var>'s <a for=request>integrity metadata</a>, then run
<var>processBodyError</var> and abort these steps. [[!SRI]]
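
The ordering of the two checks in <var>processBody</var> can be sketched as follows. This is an illustration only: `process_body`, its parameters, and the `"ok"` / `"network error"` result strings are hypothetical names, and both checks are passed in as callables so that the sketch does not claim to implement either the digest verification or the SRI matching algorithm.

```python
from typing import Callable


def process_body(
    body: bytes,
    integrity_metadata: str,
    verify_server_digests: Callable[[bytes], str],
    matches_integrity_metadata: Callable[[bytes, str], bool],
) -> str:
    """Run the server-asserted digest check, then the client-side SRI check.

    Returns "network error" where the spec text runs processBodyError,
    and "ok" otherwise (both labels are illustrative).
    """
    # Step added by this PR: the server's Unencoded-Digest assertions come first.
    if verify_server_digests(body) != "verified":
        return "network error"
    # Pre-existing step: the request's integrity metadata is checked second.
    if not matches_integrity_metadata(body, integrity_metadata):
        return "network error"
    return "ok"
```

With this ordering, a body that fails the server's assertions is rejected before the request's integrity metadata is consulted at all.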