Commit 84c8aa7
feat: add background processing jobs (#5432)
# Description
This PR adds the following changes:
- [x] Add `rq` to help us execute background jobs.
- [x] Add a background job to update all records for a dataset when the
dataset distribution strategy is updated.
- [x] Change HuggingFace Dockerfile to install Redis and run `rq`
workers inside honcho Procfile.
- [x] Add documentation about new `ARGILLA_REDIS_URL` environment
variable.
- [x] Add ping to Redis so Argilla server is not started if Redis is not
ready.
- [x] Change Argilla docker compose file to include a container with
Redis and rq workers.
- [x] Update Argilla server `README.md` file adding Redis as dependency
to install.
- [x] Add documentation about Redis being a new Argilla server
dependency.
- [x] Add `BACKGROUND_NUM_WORKERS` environment variable to specify the
number of workers in the HF Space container.
- [ ] ~~Modify `Dockerfile` template on HF to include the environment
variable~~ #5443
```
# (since: v2.2.0) Uncomment the next line to specify the number of background job workers to run (default: 2).
# ENV BACKGROUND_NUM_WORKERS=2
```
- [ ] Remove some `TODO` sections before merging.
- [ ] Review K8s documentation (maybe delete it?).
- [ ] If we want to persist Redis data on HF Spaces we can change our
`Procfile` Redis process to the following:
```
redis: /usr/bin/redis-server --dbfilename argilla-redis.rdb --dir ${ARGILLA_HOME_PATH}
```
- [ ] <del>Allow tests to run job workers synchronously (with pytest)</del>
It's not working because of asyncio: it ends up running one asynchronous
event loop inside another (more info in rq/rq#1986).
Closes #5431
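
The Redis readiness check from the list above can be sketched as a small retry loop. This is a minimal illustration, not the code added in this PR: the function name `wait_for_redis`, its parameters, and the `FakeRedis` stand-in are hypothetical. The only assumption about the client is that it exposes a `ping()` method, as a redis-py client created with `redis.Redis.from_url(...)` does.

```python
import time


def wait_for_redis(client, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Return True once client.ping() succeeds, False after all attempts fail.

    `client` is any object exposing a `ping()` method, for example a
    redis-py client built from the ARGILLA_REDIS_URL value.
    """
    for attempt in range(1, attempts + 1):
        try:
            if client.ping():
                return True
        except Exception:
            # A real redis-py client raises redis.exceptions.ConnectionError
            # while the server is unreachable; any failure counts as "not ready".
            pass
        if attempt < attempts:
            sleep(delay_seconds)
    return False


class FakeRedis:
    """Hypothetical stand-in that fails a fixed number of pings before succeeding."""

    def __init__(self, failures_before_ready):
        self.remaining_failures = failures_before_ready

    def ping(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            raise ConnectionError("Redis is not ready")
        return True
```

A server entry point could call `wait_for_redis` with a real client and refuse to start when it returns `False`. Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays.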
# Benchmarks
The following timings were obtained by updating the distribution strategy
of a dataset with 100 and 10,000 records on HF Spaces, using a basic and an
upgraded CPU, with and without persistent storage. Each value is the time
the background job takes to complete.

Hardware tiers: CPU basic has 2 vCPU and 16GB RAM; CPU upgrade has 8 vCPU
and 32GB RAM.

* CPU basic (with persistent storage):
  * 100 records dataset: ~8 seconds.
  * 10,000 records dataset: ~9 minutes.
* CPU upgrade (with persistent storage):
  * 100 records dataset: ~5 seconds.
  * 10,000 records dataset: ~6 minutes.
* CPU basic (no persistent storage):
  * 10,000 records dataset: ~101 seconds.
* CPU upgrade (no persistent storage):
  * 10,000 records dataset: ~62 seconds.
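
To make the timings easier to compare, the short calculation below converts the reported durations for the 10,000-record dataset into records per second and into a slowdown factor for persistent storage. It uses only the approximate figures listed above; the variable names are illustrative.

```python
# Reported job durations for the 10,000-record dataset, in seconds.
RECORDS = 10_000
timings_seconds = {
    ("CPU basic", "persistent storage"): 9 * 60,
    ("CPU upgrade", "persistent storage"): 6 * 60,
    ("CPU basic", "no persistent storage"): 101,
    ("CPU upgrade", "no persistent storage"): 62,
}

# Throughput in records per second for each configuration.
throughput = {key: RECORDS / seconds for key, seconds in timings_seconds.items()}

# How many times slower the job is with persistent storage on the same CPU.
slowdown = {
    cpu: timings_seconds[(cpu, "persistent storage")]
    / timings_seconds[(cpu, "no persistent storage")]
    for cpu in ("CPU basic", "CPU upgrade")
}

for (cpu, storage), rate in throughput.items():
    print(f"{cpu}, {storage}: {rate:.1f} records/s")
for cpu, factor in slowdown.items():
    print(f"{cpu}: {factor:.1f}x slower with persistent storage")
```

Since the inputs are rounded ("~9 minutes", "~101 seconds"), the derived rates are rough estimates rather than precise measurements.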
**Type of change**
- New feature (non-breaking change which adds functionality)
**How Has This Been Tested**
- [x] Testing it on HF Spaces.
**Checklist**
- I added relevant documentation
- I followed the style guidelines of this project
- I did a self-review of my code
- I made corresponding changes to the documentation
- I confirm my changes generate no new warnings
- I have added tests that prove my fix is effective or that my feature
works
- I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)
---------
Co-authored-by: Damián Pumar <[email protected]>
1 parent fee1f5a commit 84c8aa7
File tree
24 files changed (+308, -107 lines changed)
- .github/workflows
- argilla-frontend/v1/infrastructure/repositories
- argilla-server
- docker
- argilla-hf-spaces
- server
- src/argilla_server
- cli
- contexts
- jobs
- models
- validators
- tests/unit/api/handlers/v1/datasets
- argilla/docs/reference/argilla-server
- docs/_source/community
- examples/deployments/docker