docs/architecture.md (9 additions, 0 deletions)
@@ -56,6 +56,15 @@ This is implemented using the `S3ClientMap` in `src/s3_client.rs` and benchmarke
Downloaded storage chunk data is returned to the request handler as a [Bytes](https://docs.rs/bytes/latest/bytes/struct.Bytes.html) object, which is a wrapper around a `u8` (byte) array.

## S3 object caching

A cache can optionally be enabled to store downloaded S3 objects on disk. This allows the Reductionist to repeat operations on already downloaded data objects using disk I/O, which is typically faster than network I/O.
Authentication is passed through to the S3 object store, and users other than the original requestor can access cached data if S3 authentication permits. Authentication can optionally be disabled for further cache speedup in trusted environments.

A [Tokio MPSC channel](https://docs.rs/tokio/latest/tokio/sync/mpsc/index.html) bridges write access between the requests of the asynchronous [Axum](https://docs.rs/axum) web framework and synchronous writes to the disk cache. This allows requests to the Reductionist to continue unblocked along their operation pipeline whilst their downloaded data is queued for cache storage.
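
The queueing pattern described above can be sketched with the Rust standard library alone. This is not the Reductionist's implementation: it swaps Tokio's MPSC channel for `std::sync::mpsc::sync_channel`, and the `CacheWrite` type and all function names are hypothetical. It only illustrates how a bounded queue lets a request hand data to a single synchronous writer without waiting for it.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::thread;

/// A downloaded object waiting to be written to the cache (hypothetical type).
struct CacheWrite {
    key: String,
    data: Vec<u8>,
}

/// Create a bounded queue; `size` plays the role of a configurable queue size.
fn new_cache_queue(size: usize) -> (SyncSender<CacheWrite>, Receiver<CacheWrite>) {
    sync_channel(size)
}

/// Try to queue an object for caching without blocking the caller.
/// Returns false when the queue is full or the writer has gone away; the
/// request can continue either way because caching is best-effort here.
fn queue_for_cache(tx: &SyncSender<CacheWrite>, key: &str, data: &[u8]) -> bool {
    let item = CacheWrite { key: key.to_string(), data: data.to_vec() };
    match tx.try_send(item) {
        Ok(()) => true,
        // Queue full: skip caching this object rather than stall the request.
        Err(TrySendError::Full(_)) => false,
        // Writer thread has stopped: nothing can be cached.
        Err(TrySendError::Disconnected(_)) => false,
    }
}

/// Spawn the single synchronous writer. It drains the queue until every
/// sender has been dropped, then returns the keys it handled, in order.
fn spawn_writer(rx: Receiver<CacheWrite>) -> thread::JoinHandle<Vec<String>> {
    thread::spawn(move || {
        let mut written = Vec::new();
        for item in rx {
            // A real cache would write `item.data` to disk at this point.
            let _bytes = item.data.len();
            written.push(item.key);
        }
        written
    })
}
```

Using `try_send` rather than a blocking send is the design choice that keeps request handling independent of disk speed: when the queue is full the object is simply not cached this time.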

The disk cache can be managed by overall size and by a time to live (TTL) on individual data objects, with automatic pruning removing expired objects. Cache state is maintained on disk, allowing the cache to be reused across restarts of the Reductionist.
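
As a rough illustration of pruning by TTL and by overall size, the sketch below keeps an in-memory index of cached objects. The `CacheIndex` and `Entry` types are hypothetical, and the size policy (evicting the entries closest to expiry first) is an assumption, since the text above does not say how the size limit is enforced.

```rust
use std::collections::HashMap;

/// Metadata for one cached object (hypothetical layout).
struct Entry {
    size_bytes: u64,
    expires_at: u64, // seconds on whatever clock the caller uses
}

/// In-memory index describing the contents of a disk cache.
struct CacheIndex {
    entries: HashMap<String, Entry>,
    ttl_secs: u64,
    size_limit_bytes: u64,
}

impl CacheIndex {
    fn new(ttl_secs: u64, size_limit_bytes: u64) -> Self {
        CacheIndex { entries: HashMap::new(), ttl_secs, size_limit_bytes }
    }

    /// Record an object stored at time `now`; it expires `ttl_secs` later.
    fn insert(&mut self, key: &str, size_bytes: u64, now: u64) {
        let entry = Entry { size_bytes, expires_at: now + self.ttl_secs };
        self.entries.insert(key.to_string(), entry);
    }

    fn total_size(&self) -> u64 {
        self.entries.values().map(|e| e.size_bytes).sum()
    }

    /// Remove expired entries, then (assumed policy) evict the entries closest
    /// to expiry until the total size fits the limit. Returns the removed keys,
    /// sorted alphabetically.
    fn prune(&mut self, now: u64) -> Vec<String> {
        let mut removed: Vec<String> = self
            .entries
            .iter()
            .filter(|(_, e)| e.expires_at <= now)
            .map(|(k, _)| k.clone())
            .collect();
        for key in &removed {
            self.entries.remove(key);
        }

        // Order the survivors by expiry time, breaking ties by key.
        let mut by_expiry: Vec<(String, u64)> = self
            .entries
            .iter()
            .map(|(k, e)| (k.clone(), e.expires_at))
            .collect();
        by_expiry.sort_by(|a, b| a.1.cmp(&b.1).then_with(|| a.0.cmp(&b.0)));

        for (key, _) in by_expiry {
            if self.total_size() <= self.size_limit_bytes {
                break;
            }
            self.entries.remove(&key);
            removed.push(key);
        }
        removed.sort();
        removed
    }
}
```

A periodic task would call `prune` at a fixed interval and delete the corresponding files for every key returned.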
67
+
59
68
## Filters and compression
When a variable in a netCDF, HDF5 or Zarr dataset is created, it may be compressed to reduce storage requirements.
docs/deployment.md (52 additions, 1 deletion)
@@ -193,13 +193,64 @@ Note, this is the default.
Create a `certs` directory under the home directory of the non-privileged deployment user. If Step is deployed, this is done automatically and the following files are added.
If using third-party certificates, the following files must be added manually using the file names shown:

| Filename | Description |
| -------- | ------- |
| certs/key.pem | Private key file |
| certs/cert.pem | Certificate file including any intermediates |

Certificates can be added after the Reductionist has been deployed, but the Reductionist's container will need to be restarted afterwards.

## Reductionist Configuration

In addition to the `certs` configuration above, the file `deployment/group_vars/all` covers the following configuration.

| Ansible Parameter | Description |
| - | - |
| reductionist_build_image | Whether to locally build the Reductionist container |
| reductionist_src_url | Source URL for the Reductionist repository |
| reductionist_src_version | Repository branch to use for local builds |
| reductionist_repo_location | Where to clone the Reductionist repository |
| reductionist_clone_repo | Controls repository cloning, which by default overwrites local changes; use this to disable it |
| reductionist_name | Name for Reductionist container |
| reductionist_image | Container URL if downloading and not building |
| reductionist_tag | Container tag |
| reductionist_networks | List of container networks |
| reductionist_env | Configures the Reductionist environment, see table of environment variables below |
| reductionist_remote_certs_path | Path to certificates on the host |
| reductionist_container_certs_path | Path to certificates within the container |
| reductionist_remote_cache_path | Path to cache on host filesystem |
| reductionist_container_cache_path | Path to cache within the container |
| reductionist_volumes | Volumes to map from host to container |
| reductionist_host | Used when deploying HAProxy to test connectivity to backend Reductionist(s) |

| Environment Variable | Description |
| - | - |
| REDUCTIONIST_THREAD_LIMIT | Thread limit for CPU-bound tasks |
| REDUCTIONIST_USE_CHUNK_CACHE | Whether to enable caching of downloaded data objects to disk |
| REDUCTIONIST_CHUNK_CACHE_PATH | Absolute filesystem path used for the cache. Defaults to the container cache path, see Ansible parameters above |
| REDUCTIONIST_CHUNK_CACHE_AGE | Time in seconds a chunk is kept in the cache |
| REDUCTIONIST_CHUNK_CACHE_PRUNE_INTERVAL | Time in seconds between periodic pruning of the cache |
| REDUCTIONIST_CHUNK_CACHE_SIZE_LIMIT | Maximum cache size, e.g. "100GB" |
| REDUCTIONIST_CHUNK_CACHE_QUEUE_SIZE | Tokio MPSC buffer size used to queue downloaded objects between the asynchronous web engine and the synchronous cache |
| REDUCTIONIST_CHUNK_CACHE_BYPASS_AUTH | Allow bypassing of S3 authentication when accessing cached data |

Note that after changing any of the above parameters, the Reductionist must be deployed, or redeployed, using the Ansible playbook for the change to take effect.
Because the Ansible playbook is idempotent, a running Reductionist container must be removed first when redeploying.
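
The `REDUCTIONIST_CHUNK_CACHE_AGE` and `REDUCTIONIST_CHUNK_CACHE_SIZE_LIMIT` variables described above carry a number of seconds and a human-readable size such as "100GB". The sketch below shows one way such values could be interpreted. It is an illustration only: the `CacheLimits` type and both function names are hypothetical, and treating units as decimal (1 GB = 10^9 bytes) is an assumption, not a statement of how the Reductionist parses them.

```rust
/// Parsed cache limits (hypothetical type).
struct CacheLimits {
    age_secs: Option<u64>,
    size_limit_bytes: Option<u64>,
}

/// Parse a size such as "100GB" into bytes.
/// Assumes decimal units; returns None for unknown units or overflow.
fn parse_size(text: &str) -> Option<u64> {
    let text = text.trim();
    // Split at the first character that is not an ASCII digit.
    let split = text.find(|c: char| !c.is_ascii_digit()).unwrap_or(text.len());
    let (digits, unit) = text.split_at(split);
    let value: u64 = digits.parse().ok()?;
    let multiplier: u64 = match unit.trim().to_ascii_uppercase().as_str() {
        "" | "B" => 1,
        "KB" => 1_000,
        "MB" => 1_000_000,
        "GB" => 1_000_000_000,
        "TB" => 1_000_000_000_000,
        _ => return None,
    };
    value.checked_mul(multiplier)
}

/// Read the limits through a lookup function so that the source of the
/// variables (process environment, test fixture) can be swapped.
fn load_cache_limits(get: impl Fn(&str) -> Option<String>) -> CacheLimits {
    CacheLimits {
        age_secs: get("REDUCTIONIST_CHUNK_CACHE_AGE")
            .and_then(|v| v.trim().parse().ok()),
        size_limit_bytes: get("REDUCTIONIST_CHUNK_CACHE_SIZE_LIMIT")
            .and_then(|v| parse_size(&v)),
    }
}
```

To read from the real process environment, the lookup can be `|name| std::env::var(name).ok()`.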
## Usage
Once deployed, the Reductionist API is accessible on port 8080 by HAProxy. The Prometheus UI is accessible on port 9090 on the host running Prometheus. The Jaeger UI is accessible on port 16686 on the host running Jaeger.