Commit bc1e08c
make user data persistent and standardized (#175)
* delete old notebooks
* delete old models dir+artifacts
* delete old pdfestrian test data artifacts
* add todo comments to study ranker model
this is unrelated to the current changes, i just want to commit it somewhere
* delete old citations test data
* delete old dedupe model artifacts
* delete old train deduper script
* docs: update todos deployment readme
* refactor: move ranker classes for readability
* remove old ranker model
* tests: delete old ranker model tests
* fix: use consistent ranker model dir
* refactor: avoid hard-coded ranker col names
* include bigrams in study ranker features
* add sranker prop to get num texts learned
and retrain from scratch every 100 texts learned
* change+hide file handling in study ranker api
* build: add persistent file data volume
* update file storage dirs in app config
* start storing raw citation upload files to storage
* tests: update fs config in conftest
* fix: create/remove fs dirs in cli+api
* ci: add fs root dir to checks env
* use filesystem in study ranker io
* tests: update study ranker init calls
1 parent aeb6928 commit bc1e08c
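The "retrain from scratch every 100 texts learned" item above could be sketched roughly as follows. This is a minimal illustration, not the code in this commit: the class name `TextsLearnedCounter`, the `retrain_every` parameter, and the `record` method are all hypothetical, and only the counting logic is shown (no model fitting).

```python
class TextsLearnedCounter:
    """Hypothetical helper: tracks how many texts a ranker has learned from
    and signals when a from-scratch retrain is due."""

    def __init__(self, retrain_every: int = 100) -> None:
        if retrain_every <= 0:
            raise ValueError("retrain_every must be a positive integer")
        self.retrain_every = retrain_every
        self.num_texts_learned = 0

    def record(self, num_new_texts: int) -> bool:
        """Add newly learned texts to the running total.

        Returns True when the total has crossed a multiple of
        `retrain_every`, i.e. when a full retrain would be triggered.
        """
        if num_new_texts < 0:
            raise ValueError("num_new_texts must be non-negative")
        before = self.num_texts_learned // self.retrain_every
        self.num_texts_learned += num_new_texts
        after = self.num_texts_learned // self.retrain_every
        return after > before


counter = TextsLearnedCounter(retrain_every=100)
for batch_size in (40, 40, 40):
    if counter.record(batch_size):
        # A real ranker would rebuild its model from all stored texts here.
        print(f"retrain from scratch at {counter.num_texts_learned} texts")
```

Comparing the integer quotient before and after each update means a batch that jumps past the threshold (for example from 80 to 120 texts) still triggers a retrain, which a plain `total % 100 == 0` check would miss.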
File tree
33 files changed, +204 -818062 lines changed
- .github/workflows
- colandr_data
- citations
- dedupe
- colandr
- api/v1/routes
- lib/models
- deployment
- models
- notebooks
- pdfestrian/json
- scripts
- tests
- lib/models