Commit 76d7b66

Merge pull request #412 from ipfs/ipip-car-order-signaling
IPIP-412: Signaling Block Order in CARs on HTTP Gateways
2 parents 7545f1e + 0b1d0e2 commit 76d7b66

3 files changed: +383 -43 lines changed

src/http-gateways/path-gateway.md

Lines changed: 1 addition & 5 deletions
@@ -595,11 +595,7 @@ The following response types require an explicit opt-in, can only be requested w
595595
- Raw Block (`?format=raw`)
596596
- Opaque bytes, see [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw).
597597
- CAR (`?format=car`)
598-
- A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and :cite[trustless-gateway].
599-
- **Note:** by default, block order in CAR response is not deterministic,
600-
blocks can be returned in different order, depending on implementation
601-
choices (traversal, speed at which blocks arrive from the network, etc).
602-
An opt-in ordered CAR responses MAY be introduced in a future IPIP.
598+
- A CAR file or a stream that contains all blocks required to trustlessly verify the requested content path query, see [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) and Section 5 (CAR Responses) at :cite[trustless-gateway].
603599
- TAR (`?format=tar`)
604600
- Deserialized UnixFS files and directories as a TAR file or a stream, see :cite[ipip-0288].
605601
- IPNS Record

src/http-gateways/trustless-gateway.md

Lines changed: 176 additions & 38 deletions
@@ -13,6 +13,10 @@ editors:
1313
- name: Henrique Dias
1414
github: hacdias
1515
url: https://hacdias.com/
16+
xref:
17+
- url
18+
- path-gateway
19+
- ipip-0412
1620
tags: ['httpGateways', 'lowLevelHttpGateways']
1721
order: 1
1822
---
@@ -25,31 +29,31 @@ The minimal implementation means:
2529

2630
- response type is always fully verifiable: client can decide between a raw block or a CAR stream
2731
- no UnixFS/IPLD deserialization
28-
- for CAR files:
29-
- the behavior is identical to :cite[path-gateway]
3032
- for raw blocks:
3133
- data is requested by CID, only supported path is `/ipfs/{cid}`
3234
- no path traversal or recursive resolution
35+
- for CAR files:
36+
- the pathing behavior is identical to :cite[path-gateway]
3337

3438
# HTTP API
3539

3640
A subset of "HTTP API" of :cite[path-gateway].
3741

3842
## `GET /ipfs/{cid}[/{path}][?{params}]`
3943

40-
Downloads verifiable data for the specified **immutable** content path.
44+
Downloads verifiable, content-addressed data for the specified **immutable** content path.
4145

42-
Optional `path` is permitted for requests that specify CAR format (`application/vnd.ipld.car`).
46+
Optional `path` is permitted for requests that specify CAR format (`?format=car` or `Accept: application/vnd.ipld.car`).
4347

44-
For RAW requests, only `GET /ipfs/{cid}[?{params}]` is supported.
48+
For block requests (`?format=raw` or `Accept: application/vnd.ipld.raw`), only `GET /ipfs/{cid}[?{params}]` is supported.
4549

4650
## `HEAD /ipfs/{cid}[/{path}][?{params}]`
4751

4852
Same as GET, but does not return any payload.
4953

5054
## `GET /ipns/{key}[?{params}]`
5155

52-
Downloads data at specified IPNS Key. Verifiable :cite[ipns-record] can be requested via `?format=ipns-record`
56+
Downloads data at specified IPNS Key. Verifiable :cite[ipns-record] can be requested via `?format=ipns-record` or `Accept: application/vnd.ipfs.ipns-record`.
5357

5458
## `HEAD /ipns/{key}[?{params}]`
5559

@@ -63,17 +67,26 @@ Same as in :cite[path-gateway], but with limited number of supported response ty
6367

6468
### `Accept` (request header)
6569

66-
This HTTP header is required when running in a strict, trustless mode.
70+
A Client SHOULD send this HTTP header to leverage content type negotiation
71+
based on section 12.5.1 of :cite[rfc9110].
72+
73+
Below response types MUST be supported:
6774

68-
Below response types MUST to be supported:
69-
- [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw) – requests a single, verifiable raw block to be returned
75+
- [application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw)
76+
- A single, verifiable raw block to be returned.
7077

71-
Below response types SHOULD to be supported:
72-
- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car) – disables IPLD/IPFS deserialization, requests a verifiable CAR stream to be returned
73-
- [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record) – requests a verifiable :cite[ipns-record] (multicodec `0x0300`).
78+
Below response types SHOULD be supported:
7479

75-
Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless
76-
mode (no deserialized responses) and `Accept` header is missing.
80+
- [application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)
81+
- Disables IPLD/IPFS deserialization and requests a verifiable CAR stream to be
82+
returned. Implementations MAY support optional CAR content type parameters
83+
(:cite[ipip-0412]) and the explicit [CAR format signaling in HTTP Request](#car-format-signaling-in-request).
84+
85+
- [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record)
86+
- A verifiable :cite[ipns-record] (multicodec `0x0300`).
87+
88+
A Gateway SHOULD return HTTP 400 Bad Request when running in strict trustless
89+
mode (no deserialized responses) and `Accept` header is missing.
7790

7891
## Request Query Parameters
7992

@@ -113,7 +126,7 @@ When the terminating entity at the end of the specified content path:
113126
specified byte range of that entity.
114127

115128
- When dealing with a sharded UnixFS file (`dag-pb`, `0x70`) and a non-zero
116-
`from` value, the UnixFS data and `blocksizes` determine the
129+
`from` value, the UnixFS data and `blocksizes` determine the
117130
corresponding starting block for a given `from` offset.
118131

119132
- cannot be interpreted as a continuous array of bytes (such as a DAG-CBOR/JSON
@@ -150,14 +163,14 @@ that includes enough blocks for the client to understand why the requested
150163
returned:
151164

152165
- If the requested `entity-bytes` resolves to a range that partially falls
153-
outside of the entity's byte range, the response MUST include the subset of
166+
outside the entity's byte range, the response MUST include the subset of
154167
blocks within the entity's bytes.
155168
- This allows clients to request valid ranges of the entity without needing
156169
to know its total size beforehand, and it does not require the Gateway to
157170
buffer the entire entity before returning the response.
158171

159172
- If the requested `entity-bytes` resolves to a zero-length range or falls
160-
fully outside of the entity's bytes, the response is equivalent to
173+
fully outside the entity's bytes, the response is equivalent to
161174
`dag-scope=block`.
162175
- This allows the client to produce a meaningful error (e.g., in case of UnixFS,
163176
leverage `Data.blocksizes` information present in the root `dag-pb` block).
@@ -180,69 +193,194 @@ Below MUST be implemented **in addition** to "HTTP Response" of :cite[path-gatew
180193

181194
MUST be returned and include additional format-specific parameters when possible.
182195

183-
If a CAR stream was requested, the response MUST include the parameter specifying CAR version.
184-
For example: `Content-Type: application/vnd.ipld.car; version=1`
196+
If a CAR stream was requested:
197+
- the response MUST include the parameter specifying CAR version. For example:
198+
`Content-Type: application/vnd.ipld.car; version=1`
199+
- the response SHOULD include additional content type parameters, as noted in
200+
[CAR format signaling in Response](#car-format-signaling-in-response).
185201

186202
### `Content-Disposition` (response header)
187203

188204
MUST be returned and set to `attachment` to ensure requested bytes are not rendered by a web browser.
189205

190-
## Response Payload
191-
192-
### Block Response
206+
# Block Responses (application/vnd.ipld.raw)
193207

194208
Opaque bytes matching the requested block CID
195209
([application/vnd.ipld.raw](https://www.iana.org/assignments/media-types/application/vnd.ipld.raw)).
196210

197211
The Body hash MUST match the Multihash from the requested CID.
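
As an illustration of the check above, here is a minimal Python sketch of how a client might compare a block response body against the requested CID. It is not taken from the spec: the function names are hypothetical, and it assumes a base32 CIDv1 (multibase prefix `b`) whose multihash is sha2-256 (code `0x12`). Other encodings and hash functions would need a full CID/multihash library.

```python
import base64
import hashlib
from typing import Tuple


def _read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode an unsigned LEB128 varint; return (value, next position)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def block_matches_cid(cid: str, body: bytes) -> bool:
    """Return True if sha2-256(body) equals the digest embedded in the CID.

    Assumption: `cid` is a base32 CIDv1 string with a sha2-256 multihash.
    """
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles base32 CIDv1 strings")
    b32 = cid[1:].upper()
    raw = base64.b32decode(b32 + "=" * (-len(b32) % 8))  # restore padding
    version, pos = _read_varint(raw, 0)
    _codec, pos = _read_varint(raw, pos)       # e.g. 0x55 (raw), 0x70 (dag-pb)
    hash_code, pos = _read_varint(raw, pos)    # multihash function code
    digest_len, pos = _read_varint(raw, pos)   # multihash digest length
    digest = raw[pos:pos + digest_len]
    if version != 1 or hash_code != 0x12:
        raise ValueError("this sketch only handles CIDv1 with sha2-256")
    return hashlib.sha256(body).digest() == digest
```

A client would call `block_matches_cid(requested_cid, response_body)` and discard the response when it returns `False`.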
198212

199-
### CAR Response
213+
# CAR Responses (application/vnd.ipld.car)
200214

201215
A CAR stream for the requested
202216
[application/vnd.ipld.car](https://www.iana.org/assignments/media-types/application/vnd.ipld.car)
203-
content type, path and optional `dag-scope` and `entity-bytes` URL parameters.
217+
content type (with optional `order` and `dups` params), path and optional
218+
`dag-scope` and `entity-bytes` URL parameters.
204219

205-
#### CAR version
220+
## CAR version
206221

207222
Value returned in
208223
[`CarV1Header.version`](https://ipld.io/specs/transport/car/carv1/#header)
209224
field MUST match the `version` parameter returned in `Content-Type` header.
210225

211-
#### CAR roots
226+
## CAR roots
212227

213228
The behavior associated with the
214229
[`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field
215230
is not currently specified.
216231

217-
Clients MAY ignore it.
232+
The lack of a standard here means a Client MUST assume that different Gateways could return different values.
233+
234+
A Client SHOULD ignore this field.
218235

219236
:::issue
220237

221238
As of 2023-06-20, the behavior of the `roots` CAR field remains an [unresolved item within the CARv1 specification](https://web.archive.org/web/20230328013837/https://ipld.io/specs/transport/car/carv1/#unresolved-items).
222239

223240
:::
224241

225-
#### CAR determinism
242+
## CAR `order` (content type parameter)
243+
244+
The `order` parameter allows clients to specify the desired block order in the
245+
response. It supports the following values:
246+
247+
- `dfs`: [Depth-First Search](https://en.wikipedia.org/wiki/Depth-first_search)
248+
order, enables streaming responses with minimal memory usage.
249+
- `unk` (or missing): Unknown order, which serves as the implicit default when the `order`
250+
parameter is unspecified. In this case, the client cannot make any assumptions
251+
about the block order: blocks may arrive in a random order or be a result of
252+
a custom DAG traversal algorithm.
253+
254+
A Gateway SHOULD always return explicit `order` in CAR's `Content-Type` response header.
255+
256+
A Gateway MAY skip `order` in CAR response if no order was explicitly requested
257+
by the client and the default order is unknown.
258+
259+
A Client MUST assume implicit `order=unk` when `order` is missing, unknown, or empty.
260+
261+
## CAR `dups` (content type parameter)
262+
263+
The `dups` parameter specifies whether duplicate blocks (the same block
264+
occurring multiple times in the requested DAG) will be present in the CAR
265+
response. Useful when a deterministic block order is used.
266+
267+
It accepts two values:
268+
- `y`: Duplicate blocks MUST be sent every time they occur during the DAG walk.
269+
- `n`: Duplicate blocks MUST be sent only once.
270+
271+
When set to `y`, light clients are able to discard blocks after
272+
reading them, removing the need for caching in-memory or on-disk.
273+
274+
Setting to `n` allows for more efficient data transfer of certain types of
275+
data, but introduces additional resource cost on the receiving end, as each
276+
block needs to be kept around in case its CID appears again.
277+
278+
If the `dups` parameter is absent from the `Accept` request header, the
279+
behavior is unspecified. In such cases, a Gateway should respond with `dups=n`
280+
if it has control over the duplicate status, or without `dups` parameter if it
281+
does not.
282+
Defaulting to the inclusion of duplicate blocks (`dups=y`) SHOULD only be
283+
implemented by Gateway systems that exclusively support `dups=y` and do not
284+
support any other behavior.
285+
286+
A Client MUST NOT assume any implicit behavior when `dups` is missing.
287+
288+
If the `dups` parameter is absent from the `Content-Type` response header, the
289+
behavior is unspecified, and the CAR response includes an arbitrary list of
290+
blocks. In this unknown state, the client MUST assume duplicates are not sent,
291+
but also MUST ignore duplicates and other unexpected blocks if they are present.
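
The client-side trade-off between `dups=y` and `dups=n` (or a missing `dups`) can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions, not part of the spec: `next_block` is a hypothetical helper, CIDs are plain strings, the stream yields already-verified `(cid, data)` pairs, and unexpected blocks are skipped in every mode.

```python
from typing import Dict, Iterator, Optional, Tuple


def next_block(
    expected_cid: str,
    stream: Iterator[Tuple[str, bytes]],
    cache: Dict[str, bytes],
    dups: Optional[str],
) -> bytes:
    """Return the block the DAG walk needs next.

    dups == "y": every occurrence is re-sent, so nothing is cached.
    dups == "n" or missing (None): a block may be sent only once, so every
    block read from the stream is kept in case its CID is needed again.
    """
    keep = dups != "y"
    if keep and expected_cid in cache:
        return cache[expected_cid]  # duplicate link: reuse the cached copy
    for cid, data in stream:
        if keep:
            cache[cid] = data
        if cid == expected_cid:
            return data
        # Any other block is unexpected at this point and is skipped.
    raise ValueError(f"block {expected_cid} not found in CAR stream")
```

With `dups="y"` the cache stays empty for the whole walk, which is the memory saving described above. In the other modes the cache grows with the number of distinct blocks received.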
292+
293+
A Gateway MUST always return `dups` in `Content-Type` response header
294+
when the duplicate status is known at the time of processing the request.
295+
A Gateway SHOULD NOT return `dups` if determining the duplicate status is not
296+
possible at the time of processing the request.
297+
298+
A Gateway MUST NOT include virtual blocks identified by identity CIDs
299+
(multihash with `0x00` code) in CAR responses. This exclusion applies regardless
300+
of their presence in the DAG or the value assigned to the `dups` parameter, as
301+
the raw data is already present in the parent block that links to the identity
302+
CID.
226303

227-
The default CAR header and block order in a CAR response is not specified and is non-deterministic.
304+
## CAR format parameters and determinism
305+
306+
The default header and block order of the CAR format are not specified by IPLD specifications.
228307

229308
Clients MUST NOT assume that CAR responses are deterministic (byte-for-byte identical) across different gateways.
230309

231310
Clients MUST NOT assume that CAR includes CIDs and their blocks in the same order across different gateways.
232311

312+
Clients MUST NOT assume any block order or duplicate status unless the `Content-Type` returned with the CAR response includes the optional `order` or `dups` parameters, as specified by :cite[ipip-0412].
313+
314+
A Gateway SHOULD support some aspects of determinism by implementing content type negotiation and signaling via `Accept` and `Content-Type` headers.
315+
233316
:::issue
234317

235-
In controlled environments, clients MAY choose to rely on undocumented CAR determinism,
236-
subject to the agreement of the following conditions between the client and the
237-
gateway:
318+
In controlled environments, clients MAY choose to rely on implicit and
319+
undocumented CAR determinism, subject to the agreement of the following
320+
conditions between the client and the gateway:
238321
- CAR version
239322
- content of [`CarV1Header.roots`](https://ipld.io/specs/transport/car/carv1/#header) field
240-
- order of blocks
241-
- status of duplicate blocks
323+
- order of blocks (`order` from :cite[ipip-0412])
324+
- status of duplicate blocks (`dups` from :cite[ipip-0412])
242325

243-
In the future, there may be an introduction of a convention to indicate aspects
244-
of determinism in CAR responses. Please refer to
245-
[IPIP-412](https://github.com/ipfs/specs/pull/412) for potential developments
246-
in this area.
326+
Note that this is undocumented behavior and MUST NOT be relied upon on public networks.
247327

248328
:::
329+
330+
### CAR format signaling in Request
331+
332+
Content type negotiation is based on section 12.5.1 of :cite[rfc9110].
333+
334+
Clients MAY indicate their preferred block order by sending an `Accept` header in
335+
the HTTP request. The `Accept` header format is as follows:
336+
337+
```
338+
Accept: application/vnd.ipld.car; version=1; order=dfs; dups=y
339+
```
340+
341+
In the future, when more orders or parameters exist, clients will be able to
342+
specify a list of preferences, for example:
343+
344+
```
345+
Accept: application/vnd.ipld.car;order=foo, application/vnd.ipld.car;order=dfs;dups=y;q=0.5
346+
```
347+
348+
The above example is a list of preferences: the client would prefer to use
349+
the hypothetical `order=foo`; however, if this isn't available, it would accept
350+
`order=dfs` with `dups=y` instead (lower priority indicated via `q` parameter,
351+
as noted in :cite[rfc9110]).
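
A small Python sketch of how a client might assemble such an `Accept` header from an ordered preference list. The helper name `car_accept` and its input shape are assumptions made for illustration. Only the media type, the `order`/`dups` parameters, and the `q` weight come from the text above.

```python
from typing import Dict, List, Optional, Tuple

CAR_MEDIA_TYPE = "application/vnd.ipld.car"


def car_accept(preferences: List[Tuple[Dict[str, str], Optional[float]]]) -> str:
    """Build an Accept header value from (params, q) pairs.

    Each entry is a dict of CAR content type parameters plus an optional
    q weight (None means the default weight, so no q is emitted).
    """
    variants = []
    for params, q in preferences:
        fields = [CAR_MEDIA_TYPE] + [f"{name}={value}" for name, value in params.items()]
        if q is not None:
            fields.append(f"q={q:g}")  # q goes last, after the media type parameters
        variants.append(";".join(fields))
    return ", ".join(variants)


# Reproduces the preference list shown above.
header = car_accept([
    ({"order": "foo"}, None),
    ({"order": "dfs", "dups": "y"}, 0.5),
])
```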
352+
353+
### CAR format signaling in Response
354+
355+
The Trustless Gateway MUST always respond with a `Content-Type` header that includes
356+
information about all supported and known parameters, even if the client did not
357+
specify them in the request.
358+
359+
The `Content-Type` header format is as follows:
360+
361+
```
362+
Content-Type: application/vnd.ipld.car;version=1;order=dfs;dups=n
363+
```
364+
365+
Gateway implementations SHOULD decide on the implicit default ordering or
366+
other parameters, and use it in responses when the client did not explicitly
367+
specify any matching preference.
368+
369+
A Gateway MAY choose to implement only some parameters and return HTTP
370+
400 Bad Request or 406 Not Acceptable when a client requests a response with
371+
an unsupported content type variant.
372+
373+
A Client MUST verify `Content-Type` returned with CAR response before
374+
processing the payload, as a legacy gateway may not support optional content
375+
type parameters like `order` and `dups` and return plain
376+
`application/vnd.ipld.car`.
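
One way a client could perform this check is sketched below in Python. The function name and return shape are hypothetical. The defaults follow the rules stated earlier in this change: a missing, empty, or unrecognized `order` is treated as `unk`, and a missing `dups` is reported as `None` so that no implicit behavior is assumed.

```python
from typing import Optional, Tuple


def parse_car_content_type(header: str) -> Tuple[str, Optional[str]]:
    """Return (order, dups) signaled by a CAR response Content-Type header."""
    media_type, *raw_params = [part.strip() for part in header.split(";")]
    if media_type.lower() != "application/vnd.ipld.car":
        raise ValueError(f"not a CAR response: {media_type!r}")

    params = {}
    for item in raw_params:
        name, _, value = item.partition("=")
        params[name.strip().lower()] = value.strip().strip('"').lower()

    order = params.get("order", "")
    if order not in ("dfs", "unk"):
        order = "unk"  # missing, empty, or unknown value: assume unknown order

    dups = params.get("dups")
    if dups not in ("y", "n"):
        dups = None  # duplicate status not signaled: assume nothing

    return order, dups
```

A legacy gateway that returns plain `application/vnd.ipld.car` yields `("unk", None)`, so the client falls back to its most conservative processing path.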
377+
378+
# IPNS Record Responses (application/vnd.ipfs.ipns-record)
379+
380+
Opaque bytes matching the [Signed IPNS Record](https://specs.ipfs.tech/ipns/ipns-record/#ipns-record)
381+
for the requested [IPNS Name](https://specs.ipfs.tech/ipns/ipns-record/#ipns-name)
382+
returned as [application/vnd.ipfs.ipns-record](https://www.iana.org/assignments/media-types/application/vnd.ipfs.ipns-record).
383+
384+
A Client MUST confirm the record signature matches the `libp2p-key` from the requested IPNS Name.
385+
386+
A Client MUST [perform additional record verification according to the IPNS specification](https://specs.ipfs.tech/ipns/ipns-record/#record-verification).
