@@ -110,61 +110,67 @@ <h2>Fetching RASL</h2>
110110 retrieved, without having to worry about operating any infrastructure
111111 beyond the web server they already have.
112112 </p>
113- <div class="flag">
114- <p>
115- RASL retrieval works this way:
116- </p>
117- <ul>
118- <li>
119- Obtain the [<a href="#ref-cid" class="ref">cid</a>] by extracting the authority from the URL (or
120- whatever other way).
121- </li>
122- <li>
123- If there are hints, you can use them as hosts to construct a
124- retrieval request from. But you don't have to.
125- </li>
126- <li>
127- Constructing a request works by constructing an HTTPS URL this way:
128- <ul>
129- <li>Always use <code>https</code></li>
130- <li>Use the host you have (from hint or yours)</li>
131- <li>Path is <code>/.well-known/rasl/${cid}</code></li>
132- <li>No further pathing information is provided</li>
133- </ul>
134- </li>
135- <li>
136- Use that URL to make a stateless HTTP request (no cookies, nothing
137- gets saved), don't use conneg, just the most vanilla side-effect free
138- <code>GET</code> that money can buy.
139- </li>
140- <!--
141- - also support HEAD
142- -->
143- <li>
144- The <code>.well-known</code> path may redirect, so be ready to handle
145- that. This makes it possible to create sites that are published
146- the usual way and to have a RASL that is simply a redirect to the
147- resource. So for instance, you may have an existing
148- <code>https://berjon.com/kitten.jpg</code> the CID for which is
149- <code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
150- This can be published as this RASL URL:
151- <code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
152- A client can retrieve it by constructing a request to this URL:
153- <code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
154- In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
155- (Yes, this is HTTP with extra steps, but the extra steps get you
156- self-certifying content.)
157- </li>
158- <li>
159- If there's a redirect and it's not a 307, the client should treat
160- it as such anyway.
161- </li>
162- <li>
163- Note that the response media type for ALL RASL requests is <code>application/octet-stream</code>.
164- This is done explicitly to avoid people using RASL endpoints to serve sites directly.
165- </li>
166- </ul>
167- </div>
113+ <p>
114+ Use the following steps to <dfn id="dfn-fetch-a-rasl-url">fetch a RASL URL</dfn>:
115+ </p>
116+ <ol>
117+ <li>Accept a string <var>url</var> and parse it according to the steps to <a href="#dfn-parse-a-rasl-url" class="dfn-ref">parse a RASL URL</a>.</li>
118+ <li>
119+ Construct a <var>request</var> using <var>cid</var> from the <var>url</var> as well as <var>hints</var> that may
120+ be from the URL or from elsewhere (this is entirely up to you):
121+ <ol>
122+ <li>
123+ For each hint, construct a request URL that is the concatenation of <code>https://</code>,
124+ the hint as host, <code>/.well-known/rasl/</code>, and the <var>cid</var>.
125+ </li>
126+ <li>
127+ Prepare the request such that it has a method of either <code>GET</code> or <code>HEAD</code>,
128+ that it is stateless (no cookies, no credentials of any kind), and that it uses no content
129+ negotiation.
130+ </li>
131+ </ol>
132+ </li>
133+ <li>
134+ Fetch the <var>request</var>s. How these get prioritised is entirely up to the implementation. It
135+ is common to run them all in parallel and abort the rest on the first successful response.
136+ Note that the <code>.well-known</code> path may redirect, so be ready to handle
137+ that. This makes it possible to create sites that are published
138+ the usual way and to have a RASL that is simply a redirect to the
139+ resource. So for instance, you may have an existing
140+ <code>https://berjon.com/kitten.jpg</code> the CID for which is
141+ <code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
142+ This can be published as this RASL URL:
143+ <code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
144+ A client can retrieve it by constructing a request to this URL:
145+ <code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
146+ In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
147+ (Yes, this is HTTP with extra steps, but the extra steps get you
148+ self-certifying content.)
149+ </li>
150+ <li>
151+ If the response is a redirect but not a 307, the client should treat it as if it
152+ had been a 307 anyway.
153+ </li>
154+ <li>
155+ If none of the responses are successful, return failure.
156+ </li>
157+ <li>
158+ Set the response's media type to <code>application/octet-stream</code>. (The server should have
159+ done that already, but may not have done so, notably if it relied on a redirect.) The purpose
160+ of RASL is to retrieve data in ways that are independent of the server, so any media type
161+ processing must take place at another layer. Without this, we lose the self-certifying
162+ nature of the system. (Note that servers are encouraged to enforce this media type so as not to have their
163+ RASL endpoints used for general-purpose web serving, which can be a security vector depending on
164+ where the data being served came from.)
165+ </li>
166+ <li>
167+ Produce a CID for the retrieved data. If that CID does not match the requested <var>cid</var>,
168+ return failure.
169+ </li>
170+ <li>
171+ Return the data.
172+ </li>
173+ </ol>
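For reviewers, the fetch steps in this hunk can be sketched roughly as follows. This is a minimal illustration, not the spec's reference implementation. The function names (`request_urls`, `cid_for`, `fetch_rasl`) and the injectable `opener` parameter are hypothetical. The CID computation assumes a CIDv1 with the raw codec and a SHA-256 multihash, base32-encoded, which is what the `bafkrei` prefix of the example CID suggests.

```python
import base64
import hashlib
import urllib.request


def request_urls(cid, hints):
    """Build one HTTPS request URL per hint: https:// + host + /.well-known/rasl/ + cid."""
    return [f"https://{host}/.well-known/rasl/{cid}" for host in hints]


def cid_for(data):
    """Produce a CID for raw bytes.

    Assumption: CIDv1 (0x01), raw codec (0x55), sha2-256 multihash (0x12, length 0x20),
    multibase base32 lower-case without padding (prefix "b").
    """
    digest = hashlib.sha256(data).digest()
    raw = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


def fetch_rasl(cid, hints, opener=urllib.request.urlopen):
    """Try each hint in turn; return the data only if its CID matches, else None."""
    for url in request_urls(cid, hints):
        # Plain GET with no cookies or credentials; urlopen follows redirects for GET.
        request = urllib.request.Request(url, method="GET")
        try:
            with opener(request) as response:
                data = response.read()
        except OSError:
            # Network and HTTP errors both land here; move on to the next hint.
            continue
        # The media type is ignored: the bytes are treated as an opaque octet stream.
        if cid_for(data) == cid:
            return data
    return None
```

This sketch tries hints one after another rather than in parallel, and it moves on to the next hint when the CID check fails instead of returning failure right away. Both are implementation choices that the steps above leave open or that differ slightly from them.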
168174 </section>
169175 </section>
170176 <section>