Commit c33a96f

Merge pull request #310 from WICG/access-handle-proposal
Add AccessHandle proposal
2 parents c3ef0e2 + e88a265 commit c33a96f

1 file changed

+277
-0
lines changed

AccessHandle.md

Lines changed: 277 additions & 0 deletions
@@ -0,0 +1,277 @@
# AccessHandle Proposal

## Authors:

* Emanuel Krivoy ([email protected])
* Richard Stotz ([email protected])

## Participate

* [Issue tracker](https://github.com/WICG/file-system-access/issues)

## Table of Contents

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->

- [Introduction](#introduction)
- [Goals & Use Cases](#goals--use-cases)
- [Non-goals](#non-goals)
- [What makes the new surface fast?](#what-makes-the-new-surface-fast)
- [Proposed API](#proposed-api)
  - [New data access surface](#new-data-access-surface)
  - [Locking semantics](#locking-semantics)
- [Open Questions](#open-questions)
  - [Naming](#naming)
  - [Exposing AccessHandles on all filesystems](#exposing-accesshandles-on-all-filesystems)
  - [Assurances on non-awaited consistency](#assurances-on-non-awaited-consistency)
- [Appendix](#appendix)
  - [AccessHandle IDL](#accesshandle-idl)
- [References & acknowledgements](#references--acknowledgements)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

## Introduction

We propose augmenting the Origin Private File System (OPFS) with a new surface
that provides high-performance access to data. This new surface differs from
existing ones by offering in-place and exclusive write access to a file’s
content. This change, along with the ability to consistently read unflushed
modifications and the availability of a synchronous variant on dedicated
workers, significantly improves performance and unblocks new use cases for the
File System Access API.

More concretely, we would add a *createAccessHandle()* method to the
*FileSystemFileHandle* object. It would return an *AccessHandle* that contains
a [duplex stream](https://streams.spec.whatwg.org/#other-specs-duplex) and
auxiliary methods. The readable/writable pair in the duplex stream communicates
with the same backing file, allowing the user to read unflushed contents.
Another new method, *createSyncAccessHandle()*, would only be exposed on Worker
threads. This method would offer a more buffer-based surface with synchronous
reading and writing. Creating an *AccessHandle* also takes a lock that prevents
other write access to the file, both across execution contexts and within the
same one.

This proposal is part of our effort to integrate the [Storage Foundation
API](https://github.com/WICG/storage-foundation-api-explainer) into the File
System Access API. For more context on the origins of this proposal and the
alternatives considered, see [Merging Storage Foundation API and the Origin
Private File
System](https://docs.google.com/document/d/121OZpRk7bKSF7qU3kQLqAEUVSNxqREnE98malHYwWec)
and [Recommendation for Augmented
OPFS](https://docs.google.com/document/d/1g7ZCqZ5NdiU7oqyCpsc2iZ7rRAY1ZXO-9VoG4LfP7fM).

## Goals & Use Cases

Our goal is to give developers flexibility by providing generic, simple, and
performant primitives upon which they can build higher-level storage
components. The new surface is particularly well suited for Wasm-based
libraries and applications that want to use custom storage algorithms to
fine-tune execution speed and memory usage.

A few examples of what could be done with *AccessHandles*:

* Distribute a performant Wasm port of SQLite. This gives developers the
  ability to use a persistent and fast SQL engine without having to rely on
  the deprecated WebSQL API.
* Allow a music production website to operate on large amounts of media, by
  relying on the new surface's performance and direct buffered access to
  offload sound segments to disk instead of holding them in memory.
* Provide a fast and persistent [Emscripten](https://emscripten.org/)
  filesystem to act as generic and easily accessible storage for Wasm.

## Non-goals

This proposal is focused only on additions to the [Origin Private File
System](https://wicg.github.io/file-system-access/#sandboxed-filesystem), and
doesn't currently consider changes to the rest of the File System Access API or
to how files on the host machine are accessed.

## What makes the new surface fast?

There are a few design choices that primarily contribute to the performance of
AccessHandles:

* Write operations are not guaranteed to be immediately persistent; rather,
  persistence is achieved through calls to *flush()*. At the same time, data
  can be consistently read before flushing. This allows applications to only
  schedule time-consuming flushes when they are required for long-term data
  storage, and not as a precondition to operate on recently written data.
* The exclusive write lock held by the AccessHandle saves implementations
  from having to provide a central data access point across execution
  contexts. In multi-process browsers, such as Chrome, this helps avoid costly
  inter-process communication (IPC) between renderer and browser processes.
* Data copies are avoided when reading or writing. In the async surface this
  is achieved through SharedArrayBuffers and BYOB readers. In the sync
  surface, we rely on user-allocated buffers to hold the data.

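The first design choice can be illustrated with a small in-memory model. This is only a sketch: `ModelFile`, its `working` and `persisted` fields, and the use of plain `Uint8Array`s are assumptions made for illustration and are not part of the proposed API.

```javascript
// Hypothetical in-memory model of "write now, flush later" semantics.
// `working` stands in for the file as the handle sees it; `persisted`
// stands in for what has reached durable storage.
class ModelFile {
  constructor() {
    this.working = new Uint8Array(0);
    this.persisted = new Uint8Array(0);
  }

  // Writes `bytes` at offset `at`, growing the file if needed.
  // Returns the number of bytes written.
  write(bytes, { at = 0 } = {}) {
    const end = at + bytes.length;
    if (end > this.working.length) {
      const grown = new Uint8Array(end); // new space is zero-filled
      grown.set(this.working);
      this.working = grown;
    }
    this.working.set(bytes, at);
    return bytes.length;
  }

  // Reads into `buffer` from the working copy, so unflushed writes are
  // visible. Returns the number of bytes read.
  read(buffer, { at = 0 } = {}) {
    const slice = this.working.subarray(at, at + buffer.length);
    buffer.set(slice);
    return slice.length;
  }

  // Makes everything written so far durable.
  flush() {
    this.persisted = this.working.slice();
  }
}

const file = new ModelFile();
file.write(new Uint8Array([1, 2, 3]));

const out = new Uint8Array(3);
const bytesRead = file.read(out);               // sees the unflushed bytes
const persistedBefore = file.persisted.length;  // nothing durable yet
file.flush();
const persistedAfter = file.persisted.length;   // now durable
```

In this model an application can keep reading and writing the working copy and call `flush()` only at the points where durability matters.
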
For more information on what affects the performance of similar storage APIs,
see [Design considerations for the Storage Foundation
API](https://docs.google.com/document/d/1cOdnvuNIWWyJHz1uu8K_9DEgntMtedxfCzShI7d01cs).

## Proposed API

### New data access surface

```javascript
// In all contexts
// For details on the `mode` parameter see "Exposing AccessHandles on all
// filesystems" below
const handle = await file.createAccessHandle({ mode: 'in-place' });
await handle.writable.getWriter().write(buffer);
const reader = handle.readable.getReader({mode: "byob"});
// Assumes seekable streams and SharedArrayBuffer support are available
await reader.read(buffer, {at: 1});

// Only in a worker context
const syncHandle = await file.createSyncAccessHandle();
const writtenBytes = syncHandle.write(buffer);
const readBytes = syncHandle.read(buffer, {at: 1});
```

As mentioned above, a new *createAccessHandle()* method would be added to
*FileSystemFileHandle*. Another method, *createSyncAccessHandle()*, would only
be exposed on Worker threads. An IDL description of the new interface can be
found in the [Appendix](#appendix).

The reason for offering a Worker-only synchronous interface is that consuming
asynchronous APIs from Wasm has severe performance implications (more details
[here](https://docs.google.com/document/d/1lsQhTsfcVIeOW80dr467Auud_VCeAUv2ZOkC63oSyKo)).
Since this overhead is most impactful on methods that are called often, we've
only made *read()* and *write()* synchronous. This allows us to keep a simpler
mental model (where the sync and async handles are identical except for reading
and writing) and reduce the number of new sync methods, while avoiding the most
important performance penalties.

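As a rough illustration of how a storage engine might use synchronous *read()* and *write()*, the sketch below does page-based access with caller-owned buffers. It assumes a handle exposing `read(buffer, {at})` and `write(buffer, {at})` as in the IDL; `PAGE_SIZE`, `readPage()`, `writePage()`, and `FakeSyncHandle` are hypothetical names, and `FakeSyncHandle` exists only so that the snippet runs without the proposed API.

```javascript
const PAGE_SIZE = 4096; // hypothetical page size chosen for this example

// Memory-backed stand-in for a sync access handle. Only read() and write()
// are modeled, and the capacity is fixed up front.
class FakeSyncHandle {
  constructor(capacity) {
    this.bytes = new Uint8Array(capacity);
  }
  write(buffer, { at = 0 } = {}) {
    this.bytes.set(buffer, at); // throws RangeError past the capacity
    return buffer.length;
  }
  read(buffer, { at = 0 } = {}) {
    const slice = this.bytes.subarray(at, at + buffer.length);
    buffer.set(slice);
    return slice.length;
  }
}

// Writes one full page from a caller-owned buffer.
function writePage(handle, pageNumber, page) {
  if (page.length !== PAGE_SIZE) throw new RangeError('page has wrong size');
  return handle.write(page, { at: pageNumber * PAGE_SIZE });
}

// Reads one page into a caller-owned buffer, so no buffer is allocated
// per call.
function readPage(handle, pageNumber, page) {
  return handle.read(page, { at: pageNumber * PAGE_SIZE });
}

const handle = new FakeSyncHandle(4 * PAGE_SIZE);
writePage(handle, 2, new Uint8Array(PAGE_SIZE).fill(7));

const scratch = new Uint8Array(PAGE_SIZE);
const got = readPage(handle, 2, scratch); // scratch now holds page 2
```

Because both helpers are plain synchronous functions, they could be called directly from Wasm imports without any async bridging.
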
This proposal assumes that [seekable
streams](https://github.com/whatwg/streams/issues/1128) will be available. If
this doesn’t happen, we can emulate the seeking behavior by extending the
default reader and writer with a *seek()* method.

### Locking semantics

```javascript
const handle1 = await file.createAccessHandle({ mode: 'in-place' });
try {
  const handle2 = await file.createAccessHandle({ mode: 'in-place' });
} catch(e) {
  // This catch will always be executed, since there is an open access handle
}
await handle1.close();
// Now a new access handle may be created
```

*createAccessHandle()* would take an exclusive write lock on the file that
prevents the creation of any other access handles or *WritableFileStreams*.
Similarly, *createWritable()* would take a shared write lock that blocks the
creation of access handles, but not of other writable streams. This prevents
the file from being modified from multiple contexts, while still being
backwards compatible with the current OPFS spec and supporting multiple
*WritableFileStreams* at once.

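The interaction between the exclusive and shared write locks can be sketched with a small lock table. This is a model under stated assumptions, not how an implementation would work: `FileLocks` and `tryAcquire()` are hypothetical names, and the model is synchronous and throws where the real methods would presumably reject their promises.

```javascript
// Hypothetical lock table keyed by file name. Each entry is either
// { kind: 'exclusive' } or { kind: 'shared', count: n }.
class FileLocks {
  constructor() {
    this.locks = new Map();
  }

  // Models createAccessHandle(): fails if any lock is held on the file.
  acquireExclusive(name) {
    if (this.locks.has(name)) throw new Error(`${name} is locked`);
    this.locks.set(name, { kind: 'exclusive' });
  }

  // Models createWritable(): fails only if an exclusive lock is held.
  acquireShared(name) {
    const lock = this.locks.get(name);
    if (lock && lock.kind === 'exclusive') throw new Error(`${name} is locked`);
    if (lock) lock.count += 1;
    else this.locks.set(name, { kind: 'shared', count: 1 });
  }

  // Models close() on an access handle or on a writable stream.
  release(name) {
    const lock = this.locks.get(name);
    if (!lock) return;
    if (lock.kind === 'shared' && lock.count > 1) lock.count -= 1;
    else this.locks.delete(name);
  }
}

// Returns true if `fn` ran without throwing.
function tryAcquire(fn) {
  try { fn(); return true; } catch { return false; }
}

const locks = new FileLocks();
locks.acquireExclusive('a.db');
const secondHandle = tryAcquire(() => locks.acquireExclusive('a.db')); // false
const writable = tryAcquire(() => locks.acquireShared('a.db'));        // false
locks.release('a.db');

locks.acquireShared('a.db');
const secondWritable = tryAcquire(() => locks.acquireShared('a.db'));        // true
const handleWhileWriting = tryAcquire(() => locks.acquireExclusive('a.db')); // false
```
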
Creating a [File](https://www.w3.org/TR/FileAPI/#dfn-file) through *getFile()*
would still be possible while a lock is in place. The returned File behaves as
it currently does in OPFS, i.e., it is invalidated if file contents are changed
after it was created. It is worth noting that these Files could be used to
observe changes done through the new API, even if a lock is still being held.

## Open Questions

### Naming

The exact names of the new methods haven’t been decided. The current
placeholders for data access are *createAccessHandle()* and
*createSyncAccessHandle()*. *createUnflushedStreams()* and
*createDuplexStream()* have also been suggested.

### Exposing AccessHandles on all filesystems

This proposal currently only considers additions to OPFS, but it would probably
be worthwhile to expand the new functionality to arbitrary file handles. While
the exact behavior of *AccessHandles* outside of OPFS would need to be defined
in detail, it's almost certain that the one described in this proposal should
not be the default. To avoid setting it as such, we propose adding an optional
*mode* string parameter to *createAccessHandle()* and
*createSyncAccessHandle()*. Some possible values *mode* could take are:

* 'shared': The current behavior seen in the File System Access API in general:
  there is no locking and modifications are atomic (meaning that they would
  only actually change the file when the *AccessHandle* is closed). This mode
  would be a safe choice as a default value.
* 'exclusive': An exclusive write lock is taken on the file, but modifications
  are still atomic. This is a useful mode for developers that want to
  coordinate various writing threads but still want "all or nothing" writes.
* 'in-place': The behavior described in this proposal, allowing developers to
  use high performance access to files at the cost of not having atomic writes.
  It's possible that this mode would only be allowed in OPFS.

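The difference between atomic and in-place modifications can be modeled in a few lines. The sketch below is illustrative only: `ModelStore`, `AtomicWriter`, and `InPlaceWriter` are hypothetical names, locking is left out, and the "file" is just a `Uint8Array` in memory.

```javascript
// Hypothetical model contrasting atomic and in-place writes on a shared
// in-memory "file". None of these names come from the proposal.
class ModelStore {
  constructor(bytes) {
    this.bytes = Uint8Array.from(bytes);
  }
}

// 'in-place': writes go straight to the file contents.
class InPlaceWriter {
  constructor(store) {
    this.store = store;
  }
  write(bytes, at) {
    this.store.bytes.set(bytes, at);
  }
  close() {}
}

// Atomic ('shared' or 'exclusive'): writes go to a private copy that
// replaces the file contents only when the writer is closed.
class AtomicWriter {
  constructor(store) {
    this.store = store;
    this.copy = store.bytes.slice();
  }
  write(bytes, at) {
    this.copy.set(bytes, at);
  }
  close() {
    this.store.bytes = this.copy;
  }
}

const store = new ModelStore([0, 0, 0, 0]);

const atomic = new AtomicWriter(store);
atomic.write([9, 9], 0);
const beforeClose = Array.from(store.bytes); // [0, 0, 0, 0]
atomic.close();
const afterClose = Array.from(store.bytes);  // [9, 9, 0, 0]

const inPlace = new InPlaceWriter(store);
inPlace.write([5], 3);
const immediately = Array.from(store.bytes); // [9, 9, 0, 5]
```

The private copy is what makes atomic modes "all or nothing", and it is also the cost that 'in-place' avoids.
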
Both the naming and semantics of the *mode* parameter have to be more concretely
defined.

### Assurances on non-awaited consistency

It would be possible to clearly specify the behavior of an immediate async read
operation after a non-awaited write operation, by serializing file operations
(as is currently done in Storage Foundation API). We should decide whether this
is convenient, from both a specification and a performance point of view.

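One way to picture serialized file operations is a per-file promise chain, where each operation starts only after the previous one has settled. This is a sketch of the general technique, not of how Storage Foundation API or any browser implements it; `SerializedFile`, `enqueue()`, and the simulated delay are assumptions made for illustration.

```javascript
// Hypothetical wrapper that serializes operations on one file: each call is
// chained onto the previous one, so operations run in call order even when
// the caller does not await them.
class SerializedFile {
  constructor() {
    this.bytes = new Uint8Array(8);
    this.tail = Promise.resolve(); // settles when the last queued op is done
  }

  // Queues `operation` behind everything already queued.
  enqueue(operation) {
    const result = this.tail.then(operation);
    // Keep the chain usable even if an operation rejects.
    this.tail = result.catch(() => {});
    return result;
  }

  write(bytes, at) {
    return this.enqueue(async () => {
      // Simulate slow I/O so that ordering actually matters.
      await new Promise((resolve) => setTimeout(resolve, 10));
      this.bytes.set(bytes, at);
      return bytes.length;
    });
  }

  read(length, at) {
    return this.enqueue(async () => this.bytes.slice(at, at + length));
  }
}

async function demo() {
  const file = new SerializedFile();
  file.write([1, 2, 3], 0);            // deliberately not awaited
  const seen = await file.read(3, 0);  // queued behind the write
  return Array.from(seen);             // [1, 2, 3]
}
```

The trade-off is visible in the model: the read cannot start until the slow write finishes, which is the performance side of the question raised above.
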
## Appendix

### AccessHandle IDL

```webidl
interface FileSystemFileHandle : FileSystemHandle {
  Promise<File> getFile();
  Promise<FileSystemWritableFileStream> createWritable(optional FileSystemCreateWritableOptions options = {});

  Promise<FileSystemAccessHandle> createAccessHandle(optional FileSystemFileHandleCreateAccessHandleOptions options = {});
  [Exposed=DedicatedWorker]
  Promise<FileSystemSyncAccessHandle> createSyncAccessHandle(optional FileSystemFileHandleCreateAccessHandleOptions options = {});
};

dictionary FileSystemFileHandleCreateAccessHandleOptions {
  AccessHandleMode mode;
};

// For more details and possible modes, see "Exposing AccessHandles on all
// filesystems" above
enum AccessHandleMode { "in-place" };

interface FileSystemAccessHandle {
  // Assumes seekable streams are available. The
  // Seekable extended attribute is ad-hoc notation for this proposal.
  [Seekable] readonly attribute WritableStream writable;
  [Seekable] readonly attribute ReadableStream readable;

  // Resizes the file to be size bytes long. If size is larger than the current
  // size the file is padded with null bytes, otherwise it is truncated.
  Promise<undefined> truncate([EnforceRange] unsigned long long size);
  // Returns the current size of the file.
  Promise<unsigned long long> getSize();
  // Persists to disk any changes that have been written
  Promise<undefined> flush();
  // Flushes and closes the streams, then releases the lock on the file
  Promise<undefined> close();
};

[Exposed=DedicatedWorker]
interface FileSystemSyncAccessHandle {
  // The options argument is optional, matching the write(buffer) call in the
  // example above.
  unsigned long long read([AllowShared] BufferSource buffer,
                          optional FilesystemReadWriteOptions options = {});
  unsigned long long write([AllowShared] BufferSource buffer,
                           optional FilesystemReadWriteOptions options = {});

  Promise<undefined> truncate([EnforceRange] unsigned long long size);
  Promise<unsigned long long> getSize();
  Promise<undefined> flush();
  Promise<undefined> close();
};

dictionary FilesystemReadWriteOptions {
  [EnforceRange] unsigned long long at;
};
```

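The resizing rule described in the *truncate()* comment can be written out as a pure function over a byte array. This is a minimal sketch of the stated semantics only; `resized()` is a hypothetical helper, and a real implementation would resize the backing file rather than copy it.

```javascript
// Returns a copy of `bytes` resized to `size`: truncated when `size` is
// smaller, padded with null (zero) bytes when it is larger.
function resized(bytes, size) {
  if (!Number.isInteger(size) || size < 0) throw new RangeError('invalid size');
  const result = new Uint8Array(size); // zero-filled by default
  result.set(bytes.subarray(0, Math.min(size, bytes.length)));
  return result;
}

const original = Uint8Array.of(1, 2, 3, 4);
const shorter = Array.from(resized(original, 2)); // [1, 2]
const longer = Array.from(resized(original, 6));  // [1, 2, 3, 4, 0, 0]
```
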
## References & acknowledgements

Many thanks for valuable feedback and advice from:

Domenic Denicola, Marijn Kruisselbrink, Victor Costan
