Commit 350bb1d
enhancement: clean pdf elements (bump unstructured-inference) (#790)
More deterministic element ordering when using hi_res PDF parsing strategy (from unstructured-inference bump to 0.5.4)
Make large model available (from unstructured-inference bump to 0.5.3)
Combine inferred elements with extracted elements (from unstructured-inference bump to 0.5.2)
---------
Co-authored-by: Roman Isecke <[email protected]>
Co-authored-by: Crag Wolfe <[email protected]>1 parent 642562b commit 350bb1d
File tree
26 files changed
+6475
-2832
lines changed- requirements
- test_unstructured_ingest/expected-structured-output
- azure
- biomed-api
- 65/11
- 75/29
- biomed-path/07/07
- s3/small-pdf-set
- test_unstructured/partition
- unstructured
- file_utils
- partition
26 files changed
+6475
-2832
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
5 | 8 | | |
6 | 9 | | |
7 | 10 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
6 | 6 | | |
7 | 7 | | |
8 | 8 | | |
9 | | - | |
| 9 | + | |
10 | 10 | | |
11 | 11 | | |
12 | 12 | | |
| |||
53 | 53 | | |
54 | 54 | | |
55 | 55 | | |
56 | | - | |
| 56 | + | |
57 | 57 | | |
58 | 58 | | |
59 | 59 | | |
| |||
130 | 130 | | |
131 | 131 | | |
132 | 132 | | |
133 | | - | |
| 133 | + | |
134 | 134 | | |
135 | | - | |
| 135 | + | |
136 | 136 | | |
137 | 137 | | |
138 | 138 | | |
139 | | - | |
140 | 139 | | |
141 | 140 | | |
142 | 141 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
82 | 82 | | |
83 | 83 | | |
84 | 84 | | |
85 | | - | |
| 85 | + | |
86 | 86 | | |
87 | 87 | | |
88 | 88 | | |
| |||
121 | 121 | | |
122 | 122 | | |
123 | 123 | | |
124 | | - | |
| 124 | + | |
125 | 125 | | |
126 | 126 | | |
127 | 127 | | |
| |||
147 | 147 | | |
148 | 148 | | |
149 | 149 | | |
150 | | - | |
| 150 | + | |
151 | 151 | | |
152 | 152 | | |
153 | 153 | | |
| |||
219 | 219 | | |
220 | 220 | | |
221 | 221 | | |
222 | | - | |
| 222 | + | |
223 | 223 | | |
224 | 224 | | |
225 | 225 | | |
| |||
352 | 352 | | |
353 | 353 | | |
354 | 354 | | |
355 | | - | |
| 355 | + | |
356 | 356 | | |
357 | 357 | | |
358 | 358 | | |
359 | 359 | | |
360 | | - | |
| 360 | + | |
361 | 361 | | |
362 | 362 | | |
363 | 363 | | |
| |||
369 | 369 | | |
370 | 370 | | |
371 | 371 | | |
372 | | - | |
| 372 | + | |
373 | 373 | | |
374 | 374 | | |
375 | 375 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
32 | 32 | | |
33 | 33 | | |
34 | 34 | | |
35 | | - | |
| 35 | + | |
36 | 36 | | |
37 | 37 | | |
38 | 38 | | |
| |||
92 | 92 | | |
93 | 93 | | |
94 | 94 | | |
95 | | - | |
| 95 | + | |
96 | 96 | | |
97 | 97 | | |
98 | 98 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
93 | 93 | | |
94 | 94 | | |
95 | 95 | | |
96 | | - | |
| 96 | + | |
97 | 97 | | |
98 | 98 | | |
99 | 99 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
16 | 16 | | |
17 | 17 | | |
18 | 18 | | |
19 | | - | |
| 19 | + | |
20 | 20 | | |
21 | 21 | | |
22 | 22 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
40 | 40 | | |
41 | 41 | | |
42 | 42 | | |
43 | | - | |
| 43 | + | |
44 | 44 | | |
45 | 45 | | |
46 | 46 | | |
| |||
51 | 51 | | |
52 | 52 | | |
53 | 53 | | |
54 | | - | |
| 54 | + | |
55 | 55 | | |
56 | 56 | | |
57 | 57 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
17 | 17 | | |
18 | 18 | | |
19 | 19 | | |
20 | | - | |
| 20 | + | |
21 | 21 | | |
22 | | - | |
| 22 | + | |
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
36 | | - | |
| 36 | + | |
37 | 37 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | | - | |
| 21 | + | |
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| |||
52 | 52 | | |
53 | 53 | | |
54 | 54 | | |
55 | | - | |
| 55 | + | |
56 | 56 | | |
57 | 57 | | |
58 | 58 | | |
| |||
0 commit comments