Commit 6c5f744

Update documentation (#794)
1 parent c6d1c8b commit 6c5f744

File tree

5 files changed: +15 −15 lines


docs/source/code-of-conduct.rst

Lines changed: 3 additions & 3 deletions
@@ -42,8 +42,8 @@ incident to the fsspec core team.
 Reporting
 ---------
 
-If you believe someone is violating theCode of Conduct we ask that you report it
-to the Project by emailing community@anaconda.com. All reports will be kept
+If you believe someone is violating the Code of Conduct we ask that you report it
+to the Project by emailing community@anaconda.com. All reports will be kept
 confidential. In some cases we may determine that a public statement will need
 to be made. If that's the case, the identities of all victims and reporters
 will remain confidential unless those individuals instruct us otherwise.
@@ -93,7 +93,7 @@ Following this declaration, they will not be provided with any confidential
 details from the reporter.
 
 Once the working group has a complete account of the events they will make a
-decision as to how to response. Responses may include:
+decision as to how to respond. Responses may include:
 
 - Nothing (if we determine no violation occurred).
 - A private reprimand from the working group to the individual(s) involved.

docs/source/fuse.rst

Lines changed: 1 addition & 1 deletion
@@ -49,7 +49,7 @@ expect exceptions.
 Furthermore:
 
 - although mutation operations tentatively work, you should not at the moment
-  depend on gcsfuse as a reliable system that won't loose your data.
+  depend on gcsfuse as a reliable system that won't lose your data.
 
 - permissions on GCS are complicated, so all files will be shown as fully-open
   0o777, regardless of state. If a read fails, you likely don't have the right

docs/source/hns_buckets.rst

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ Important Differences to Keep in Mind
 While ``gcsfs`` aims to abstract the differences via the ``fsspec`` API, you should be aware of standard HNS limitations imposed by the Google Cloud Storage API:
 
 1. **Implicit directories:** In standard GCS, you can create an object ``a/b/c.txt`` without the directories ``a/`` or ``a/b/`` physically existing. In HNS, the parent folder resources must exist (or be created) before the object can be written. ``gcsfs`` handles parent folder creation natively under the hood.
-2. **``mkdir`` behavior:** Previously, in a flat namespace, calling ``mkdir`` on a path could only ensure the underlying bucket exists. With HNS enabled, calling ``mkdir`` will create an actual folder resource in GCS. Furthermore, if you want to create nested folders (eg: bucket/a/b/c/d) pass ``create_parents=True``, it will physically create all intermediate folder resources along the specified path.
+2. **``mkdir`` behavior:** Previously, in a flat namespace, calling ``mkdir`` on a path could only ensure the underlying bucket exists. With HNS enabled, calling ``mkdir`` will create an actual folder resource in GCS. Furthermore, if you want to create nested folders (eg: bucket/a/b/c/d), pass ``create_parents=True``, it will physically create all intermediate folder resources along the specified path.
 3. **No mixing or toggling:** You cannot toggle HNS on an existing flat-namespace bucket. You must create a new HNS bucket and migrate your data.
 4. **Object naming:** Object names in HNS cannot end with a slash (``/``) unless without the creation of physical folder resources.
 5. **Rename Operation Benchmarks**
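The ``create_parents=True`` semantics in item 2 of the hunk above parallel ``pathlib``'s ``parents=True``. A minimal local sketch of the same idea, assuming nothing about the GCS calls themselves (the paths here are illustrative only):

```python
import tempfile
from pathlib import Path

# Local analogy for HNS mkdir semantics: with parents=True, every
# intermediate directory along the path is physically created, just as
# gcsfs creates intermediate folder resources when create_parents=True.
root = Path(tempfile.mkdtemp())
nested = root / "a" / "b" / "c" / "d"

nested.mkdir(parents=True)  # creates a/, a/b/, a/b/c/ and a/b/c/d/
print(all(p.is_dir() for p in (root / "a", root / "a" / "b" / "c", nested)))  # True
```

Without ``parents=True`` (the flat-namespace analogue), creating ``a/b/c/d`` directly would fail because the intermediate directories do not exist.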

docs/source/index.rst

Lines changed: 4 additions & 4 deletions
@@ -171,7 +171,7 @@ Async
 -----
 
 ``gcsfs`` is implemented using ``aiohttp``, and offers async functionality.
-A number of methods of ``GCSFileSystem`` are ``async``, for for each of these,
+A number of methods of ``GCSFileSystem`` are ``async``, and for each of these,
 there is also a synchronous version with the same name and lack of a "_"
 prefix.
 
@@ -195,10 +195,10 @@ from normal code. If you are *not*
 using async-style programming, you do not need to know about how this
 works, but you might find the implementation interesting.
 
-For every synchronous function there is asynchronous one prefixed by ``_``, but
+For every synchronous function there is an asynchronous one prefixed by ``_``, but
 the ``open`` operation does not support async operation. If you need it to open
-some file in async manner, it's better to asynchronously download it to
-temporary location and working with it from there.
+some file in an async manner, it's better to asynchronously download it to
+a temporary location and work with it from there.
 
 Proxy
 -----
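The Async hunks above describe gcsfs's convention of pairing each ``_``-prefixed async coroutine with a same-named synchronous wrapper. A minimal sketch of that pattern, assuming nothing about gcsfs internals (``MiniFS`` and ``_cat`` are illustrative names, not the real implementation):

```python
import asyncio

class MiniFS:
    """Toy filesystem illustrating the async/sync pairing convention:
    the real work lives in an async method prefixed with "_", and a
    synchronous wrapper with the same name minus the "_" drives it."""

    async def _cat(self, path: str) -> bytes:
        await asyncio.sleep(0)  # stand-in for an aiohttp request
        return f"contents of {path}".encode()

    def cat(self, path: str) -> bytes:
        # Synchronous callers get the same result without touching asyncio.
        return asyncio.run(self._cat(path))

print(MiniFS().cat("bucket/key"))  # b'contents of bucket/key'
```

gcsfs itself dispatches the coroutine onto a dedicated event loop rather than calling ``asyncio.run`` per call, but the naming contract is the same.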

docs/source/rapid_storage_support.rst

Lines changed: 6 additions & 6 deletions
@@ -99,14 +99,14 @@ The table below highlights how core filesystem and file-level operations change
     - Closes streams but leaves the object unfinalized (appendable) by default. Use ``finalize_on_close=True`` when opening file or calling ``close()`` or use ``.commit()`` to finalize. Note that ``autocommit`` does not work for Rapid buckets.
   * - **mv**
     - Object-level copy-and-delete logic.
-    - Uses native, atomic ``rename_folder`` API for folders. All directory semantics described in the :doc:`HNS documentation <hns_buckets>` also apply For Rapid.
+    - Uses native, atomic ``rename_folder`` API for folders. All directory semantics described in the :doc:`HNS documentation <hns_buckets>` also apply for Rapid.
 
 Performance Benchmarks
 ----------------------
 
 Rapid Storage via gRPC significantly improves read and write performance compared to standard HTTP regional buckets.
-Here are the microbenchmarks
-Rapid drastically outperform standard buckets across different read patterns, including both sequential and random reads, as well as for writes.
+Here are the microbenchmarks.
+Rapid drastically outperforms standard buckets across different read patterns, including both sequential and random reads, as well as for writes.
 To reproduce using more combinations, please see the `gcsfs/perf/microbenchmarks <https://github.com/fsspec/gcsfs/tree/main/gcsfs/tests/perf/microbenchmarks>`_ directory.
 
 .. list-table:: **Sequential Reads**
@@ -182,11 +182,11 @@ Because `gcsfs` relies on gRPC to interact with Rapid storage, developers must b
 However, gRPC Python wraps gRPC core, which uses internal multithreading for performance, and hence doesn't support `fork()`.
 Using `fork()` for multi-processing can lead to hangs or segmentation faults when child processes attempt to use the network layer
 where the application creates gRPC Python objects (e.g., client channel)before invoking `fork()`. However, if the application only
-instantiate gRPC Python objects after calling `fork()`, then `fork()` will work normally, since there is no C extension binding at this point.
+instantiates gRPC Python objects after calling `fork()`, then `fork()` will work normally, since there is no C extension binding at this point.
 
 **Alternative: Use `forkserver` or `spawn` instead of `fork`**
 
-To resolve `fork` issue, you can use `forkserver` or `spawn` instead of `fork` where the child process will create their own grpc connection.
+To resolve the `fork` issue, you can use `forkserver` or `spawn` instead of `fork` where the child processes will create their own gRPC connections.
 You can configure Python's `multiprocessing` module to override the start method as shown in the snippet below.
 For example while using data loaders in frameworks like PyTorch
 (e.g., `torch.utils.data.DataLoader` with `num_workers > 0`) alongside `gcsfs` with Rapid storage:
@@ -198,7 +198,7 @@ For example while using data loaders in frameworks like PyTorch
     # This must be done before other imports or initialization
     try:
         torch.multiprocessing.set_start_method('forkserver', force=True)
-        # or use torch.multiprocessing.set_start_method('forkserver', force=True)
+        # or use torch.multiprocessing.set_start_method('spawn', force=True)
     except RuntimeError:
         pass # Context already set
 
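The PyTorch snippet in the last hunk pins the start method globally. Outside PyTorch, the same fix can be scoped to a single pool with a stdlib ``multiprocessing`` context; a sketch under the assumption that each worker constructs its own client after startup (``read_blob`` is a hypothetical stand-in, not a gcsfs API):

```python
import multiprocessing as mp

def read_blob(name: str) -> str:
    # A real worker would construct its own GCSFileSystem (and hence its
    # own gRPC channel) here, after the child process has started.
    return f"read:{name}"

if __name__ == "__main__":
    # "spawn" (or "forkserver" on Unix) starts fresh interpreters, so no
    # gRPC state from the parent is inherited across a fork().
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        print(pool.map(read_blob, ["a.bin", "b.bin"]))
```

Using ``get_context`` instead of ``set_start_method`` avoids mutating global interpreter state, which matters when other libraries in the process also create worker pools.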
