Commit 7700949

author
Robin Berjon
committed
done with RASL
1 parent 9a75ef0 commit 7700949

2 files changed

+122
-110
lines changed

rasl.html

Lines changed: 61 additions & 55 deletions
@@ -110,61 +110,67 @@ <h2>Fetching RASL</h2>
 retrieved, without having to worry about operating any infrastructure
 beyond the web server they already have.
 </p>
-<div class="flag">
-<p>
-RASL retrieval works this way:
-</p>
-<ul>
-<li>
-Obtain the [<a href="#ref-cid" class="ref">cid</a>] by extracting the authority from the URL (or
-whatever other way).
-</li>
-<li>
-If there are hints, you can use them as hosts to construct a
-retrieval request from. But you don't have to.
-</li>
-<li>
-Constructing a request works by constructing an HTTPS URL this way:
-<ul>
-<li>Always use <code>https</code></li>
-<li>Use the host you have (from hint or yours)</li>
-<li>Path is <code>/.well-known/rasl/${cid}</code></li>
-<li>No further pathing information is provided</li>
-</ul>
-</li>
-<li>
-Use that URL to make a stateless HTTP request (no cookies, nothing
-gets saved), don't use conneg, just the most vanilla side-effect free
-<code>GET</code> that money can buy.
-</li>
-<!--
-- also support HEAD
--->
-<li>
-The <code>.well-known</code> path may redirect, so be ready to handle
-that. This makes it possible to create sites that are published
-the usual way and to have a RASL that is simply a redirect to the
-resource. So for instance, you may have an existing
-<code>https://berjon.com/kitten.jpg</code> the CID for which is
-<code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
-This can be published as this RASL URL:
-<code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
-A client can retrieve it by constructing a request to this URL:
-<code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
-In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
-(Yes, this is HTTP with extra steps, but the extra steps get you
-self-certifying content.)
-</li>
-<li>
-If there's a redirect and it's not a 307, the client should treat
-it as such anyway.
-</li>
-<li>
-Note that the response media type for ALL RASL requests is <code>application/octet-stream</code>.
-This is done explicitly to avoid people using RASL endpoints to serve sites directly.
-</li>
-</ul>
-</div>
+<p>
+Use the following steps to <dfn id="dfn-fetch-a-rasl-url">fetch a RASL URL</dfn>:
+</p>
+<ol>
+<li>Accept a string <var>url</var> and parse it according to the steps to <a href="#dfn-parse-a-rasl-url" class="dfn-ref">parse a RASL URL</a>.</li>
+<li>
+Construct a <var>request</var> using <var>cid</var> from the <var>url</var> as well as <var>hints</var> that may
+be from the URL or from elsewhere (this is entirely up to you):
+<ol>
+<li>
+For each hint, construct a request URL that is the concatenation of <code>https://</code>,
+the hint as host, <code>/.well-known/rasl/</code>, and the <var>cid</var>.
+</li>
+<li>
+Prepare the request such that it has a method of either <code>GET</code> or <code>HEAD</code>,
+that it is stateless (no cookies, no credentials of any kind), and that it uses no content
+negotiation.
+</li>
+</ol>
+</li>
+<li>
+Fetch the <var>request</var>s. How these get prioritised is entirely up to the implementation. It
+is common to run them all in parallel and abort them with the first success response.
+Note that the <code>.well-known</code> path may redirect, so be ready to handle
+that. This makes it possible to create sites that are published
+the usual way and to have a RASL that is simply a redirect to the
+resource. So for instance, you may have an existing
+<code>https://berjon.com/kitten.jpg</code> the CID for which is
+<code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
+This can be published as this RASL URL:
+<code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
+A client can retrieve it by constructing a request to this URL:
+<code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
+In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
+(Yes, this is HTTP with extra steps, but the extra steps get you
+self-certifying content.)
+</li>
+<li>
+If the response is a redirect but not a 307, the client should treat it as if it
+had been a 307 anyway.
+</li>
+<li>
+If none of the responses are successful, return failure.
+</li>
+<li>
+Set the response's media type to <code>application/octet-stream</code>. (The server should have
+done that already, but may not have done so, notably if it relied on a redirect.) The purpose
+of RASL is to retrieve data in ways that are independent of the server — any media type
+processing must therefore take place at another layer. Without this, we lose the self-certifying
+nature of the system. (Note that servers are encouraged to enforce that so as not to have their
+RASL endpoints used for general-purpose web serving, which can be a security vector depending on
+where the data being served came from.)
+</li>
+<li>
+Produce a CID for the retrieved data. If that CID does not match the requested <var>cid</var>,
+return failure.
+</li>
+<li>
+Return the data.
+</li>
+</ol>
 </section>
 </section>
 <section>
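
The request-construction step added in this diff can be sketched in a few lines of Python. This is only an illustration, not code from the commit: the names `build_request_urls` and `prepare_request` are hypothetical, and the `cid` and `hints` are assumed to have been obtained already (parsing the RASL URL is out of scope here).

```python
from urllib.request import Request

# Path prefix taken from the spec text in the diff.
WELL_KNOWN_PREFIX = "/.well-known/rasl/"


def build_request_urls(cid: str, hints: list[str]) -> list[str]:
    """Return one request URL per hint: https:// + host + prefix + cid."""
    return [f"https://{host}{WELL_KNOWN_PREFIX}{cid}" for host in hints]


def prepare_request(url: str, method: str = "GET") -> Request:
    """Build a stateless GET or HEAD request (hypothetical helper).

    No Cookie, Authorization, or Accept-* headers are set, which is one
    way to read "stateless" and "no content negotiation".
    """
    if method not in ("GET", "HEAD"):
        raise ValueError("RASL requests use GET or HEAD")
    return Request(url, method=method)
```

With the example from the diff, `build_request_urls("bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4", ["berjon.com"])` yields the `https://berjon.com/.well-known/rasl/...` URL quoted in the spec text.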

rasl.src.html

Lines changed: 61 additions & 55 deletions
@@ -110,61 +110,67 @@ <h2>Fetching RASL</h2>
 retrieved, without having to worry about operating any infrastructure
 beyond the web server they already have.
 </p>
-<div class="flag">
-<p>
-RASL retrieval works this way:
-</p>
-<ul>
-<li>
-Obtain the [[cid]] by extracting the authority from the URL (or
-whatever other way).
-</li>
-<li>
-If there are hints, you can use them as hosts to construct a
-retrieval request from. But you don't have to.
-</li>
-<li>
-Constructing a request works by constructing an HTTPS URL this way:
-<ul>
-<li>Always use <code>https</code></li>
-<li>Use the host you have (from hint or yours)</li>
-<li>Path is <code>/.well-known/rasl/${cid}</code></li>
-<li>No further pathing information is provided</li>
-</ul>
-</li>
-<li>
-Use that URL to make a stateless HTTP request (no cookies, nothing
-gets saved), don't use conneg, just the most vanilla side-effect free
-<code>GET</code> that money can buy.
-</li>
-<!--
-- also support HEAD
--->
-<li>
-The <code>.well-known</code> path may redirect, so be ready to handle
-that. This makes it possible to create sites that are published
-the usual way and to have a RASL that is simply a redirect to the
-resource. So for instance, you may have an existing
-<code>https://berjon.com/kitten.jpg</code> the CID for which is
-<code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
-This can be published as this RASL URL:
-<code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
-A client can retrieve it by constructing a request to this URL:
-<code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
-In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
-(Yes, this is HTTP with extra steps, but the extra steps get you
-self-certifying content.)
-</li>
-<li>
-If there's a redirect and it's not a 307, the client should treat
-it as such anyway.
-</li>
-<li>
-Note that the response media type for ALL RASL requests is <code>application/octet-stream</code>.
-This is done explicitly to avoid people using RASL endpoints to serve sites directly.
-</li>
-</ul>
-</div>
+<p>
+Use the following steps to <dfn>fetch a RASL URL</dfn>:
+</p>
+<ol>
+<li>Accept a string <var>url</var> and parse it according to the steps to <a>parse a RASL URL</a>.</li>
+<li>
+Construct a <var>request</var> using <var>cid</var> from the <var>url</var> as well as <var>hints</var> that may
+be from the URL or from elsewhere (this is entirely up to you):
+<ol>
+<li>
+For each hint, construct a request URL that is the concatenation of <code>https://</code>,
+the hint as host, <code>/.well-known/rasl/</code>, and the <var>cid</var>.
+</li>
+<li>
+Prepare the request such that it has a method of either <code>GET</code> or <code>HEAD</code>,
+that it is stateless (no cookies, no credentials of any kind), and that it uses no content
+negotiation.
+</li>
+</ol>
+</li>
+<li>
+Fetch the <var>request</var>s. How these get prioritised is entirely up to the implementation. It
+is common to run them all in parallel and abort them with the first success response.
+Note that the <code>.well-known</code> path may redirect, so be ready to handle
+that. This makes it possible to create sites that are published
+the usual way and to have a RASL that is simply a redirect to the
+resource. So for instance, you may have an existing
+<code>https://berjon.com/kitten.jpg</code> the CID for which is
+<code>bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
+This can be published as this RASL URL:
+<code>web+rasl://bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4;berjon.com/</code>.
+A client can retrieve it by constructing a request to this URL:
+<code>https://berjon.com/.well-known/rasl/bafkreifn5yxi7nkftsn46b6x26grda57ict7md2xuvfbsgkiahe2e7vnq4</code>.
+In turn, the latter may simply 307 back to <code>https://berjon.com/kitten.jpg</code>.
+(Yes, this is HTTP with extra steps, but the extra steps get you
+self-certifying content.)
+</li>
+<li>
+If the response is a redirect but not a 307, the client should treat it as if it
+had been a 307 anyway.
+</li>
+<li>
+If none of the responses are successful, return failure.
+</li>
+<li>
+Set the response's media type to <code>application/octet-stream</code>. (The server should have
+done that already, but may not have done so, notably if it relied on a redirect.) The purpose
+of RASL is to retrieve data in ways that are independent of the server — any media type
+processing must therefore take place at another layer. Without this, we lose the self-certifying
+nature of the system. (Note that servers are encouraged to enforce that so as not to have their
+RASL endpoints used for general-purpose web serving, which can be a security vector depending on
+where the data being served came from.)
+</li>
+<li>
+Produce a CID for the retrieved data. If that CID does not match the requested <var>cid</var>,
+return failure.
+</li>
+<li>
+Return the data.
+</li>
+</ol>
 </section>
 </section>
 <section>
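
The last steps of the new algorithm (take a successful response, produce a CID for the data, compare it with the requested one) might look roughly like the sketch below. It rests on several assumptions that the diff does not state: the CID is a CIDv1 with the raw codec and a SHA-256 hash, encoded as lowercase base32, which is what the `bafkrei` prefix of the example suggests. The names `raw_sha256_cid` and `fetch_rasl` are hypothetical, and the `fetch` callable stands in for whatever HTTP client is used. Hints are tried one after another here rather than in parallel.

```python
import base64
import hashlib
from typing import Callable, Optional


def raw_sha256_cid(data: bytes) -> str:
    """Compute a CIDv1 string for data, assuming raw codec and SHA-256."""
    digest = hashlib.sha256(data).digest()
    # 0x01 = CID version 1, 0x55 = raw codec,
    # 0x12 = sha2-256 multihash code, 0x20 = digest length (32 bytes).
    cid_bytes = bytes([0x01, 0x55, 0x12, 0x20]) + digest
    encoded = base64.b32encode(cid_bytes).decode("ascii").rstrip("=").lower()
    return "b" + encoded  # "b" is the multibase prefix for base32


def fetch_rasl(
    cid: str,
    hints: list[str],
    fetch: Callable[[str], Optional[bytes]],
) -> Optional[bytes]:
    """Return the data for cid, or None on failure.

    `fetch` is an assumed callable that returns the response body for a
    URL (after following redirects), or None if the request failed.
    """
    for host in hints:
        body = fetch(f"https://{host}/.well-known/rasl/{cid}")
        if body is None:
            continue  # this hint failed, try the next one
        # First successful response: verify it, as the steps describe.
        return body if raw_sha256_cid(body) == cid else None
    return None  # none of the responses were successful
```

A real `fetch` could wrap `urllib.request.urlopen`, which follows redirects by default. Whether a client should move on to the next hint after a CID mismatch, instead of failing as this sketch does, is a design choice the steps leave open.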
