Commit bdfd975
chore: change table extraction defaults (#2588)
Change default values for table extraction - works in pair with
[this](Unstructured-IO/unstructured-api#370)
`unstructured-api` PR
We want to move away from `pdf_infer_table_structure` parameter, in this
PR:
- We change how it's treated wrt `skip_infer_table_types` parameter.
Whether to extract tables from pdf now follows from the rule:
`pdf_infer_table_structure && "pdf" not in skip_infer_table_types`
- We set it to `pdf_infer_table_structure=True` and
`skip_infer_table_types=[]` by default
- We remove it from the examples in documentation
- We describe it as deprecated in favor of `skip_infer_table_types` in
documentation
More detailed description of how we want parameters to interact
- if `pdf_infer_table_structure` is False tables will never extracted
from pdf
- if `pdf_infer_table_structure` is True tables will be extracted from
pdf unless it's skipped via `skip_infer_table_types`
- on default `pdf_infer_table_structure=True` and
`skip_infer_table_types=[]`
---------
Co-authored-by: Filip Knefel <[email protected]>
Co-authored-by: ryannikolaidis <[email protected]>
Co-authored-by: ds-filipknefel <[email protected]>
Co-authored-by: Ronny H <[email protected]>1 parent 4ff6a5b commit bdfd975
File tree
16 files changed
+55
-41
lines changed- docs/source
- apis
- best_practices
- core
- examples
- ingest/configs
- test_unstructured_ingest/expected-structured-output
- gcs/nested-2
- onedrive/utic-test-ingest-fixtures
- test_unstructured/partition
- unstructured
- ingest
- partition
16 files changed
+55
-41
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
2 | 2 | | |
3 | | - | |
| 3 | + | |
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| |||
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
| 16 | + | |
16 | 17 | | |
17 | 18 | | |
18 | 19 | | |
| |||
66 | 67 | | |
67 | 68 | | |
68 | 69 | | |
| 70 | + | |
69 | 71 | | |
70 | 72 | | |
71 | 73 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
60 | 60 | | |
61 | 61 | | |
62 | 62 | | |
63 | | - | |
| 63 | + | |
64 | 64 | | |
65 | 65 | | |
66 | 66 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
24 | 24 | | |
25 | 25 | | |
26 | 26 | | |
27 | | - | |
| 27 | + | |
28 | 28 | | |
29 | 29 | | |
30 | 30 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
33 | 33 | | |
34 | 34 | | |
35 | 35 | | |
36 | | - | |
| 36 | + | |
37 | 37 | | |
38 | 38 | | |
39 | 39 | | |
| |||
46 | 46 | | |
47 | 47 | | |
48 | 48 | | |
49 | | - | |
50 | 49 | | |
51 | 50 | | |
52 | 51 | | |
| |||
65 | 64 | | |
66 | 65 | | |
67 | 66 | | |
68 | | - | |
69 | 67 | | |
70 | | - | |
71 | | - | |
72 | | - | |
73 | | - | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
872 | 872 | | |
873 | 873 | | |
874 | 874 | | |
875 | | - | |
| 875 | + | |
876 | 876 | | |
877 | 877 | | |
878 | 878 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
47 | 47 | | |
48 | 48 | | |
49 | 49 | | |
50 | | - | |
51 | 50 | | |
52 | 51 | | |
53 | 52 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
74 | 74 | | |
75 | 75 | | |
76 | 76 | | |
77 | | - | |
78 | 77 | | |
79 | 78 | | |
80 | 79 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
9 | 9 | | |
10 | 10 | | |
11 | 11 | | |
12 | | - | |
| 12 | + | |
13 | 13 | | |
14 | 14 | | |
15 | 15 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
356 | 356 | | |
357 | 357 | | |
358 | 358 | | |
359 | | - | |
| 359 | + | |
360 | 360 | | |
361 | 361 | | |
362 | 362 | | |
| |||
Lines changed: 8 additions & 4 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
18 | 18 | | |
19 | 19 | | |
20 | 20 | | |
21 | | - | |
| 21 | + | |
| 22 | + | |
22 | 23 | | |
23 | 24 | | |
24 | 25 | | |
| |||
42 | 43 | | |
43 | 44 | | |
44 | 45 | | |
45 | | - | |
| 46 | + | |
| 47 | + | |
46 | 48 | | |
47 | 49 | | |
48 | 50 | | |
| |||
66 | 68 | | |
67 | 69 | | |
68 | 70 | | |
69 | | - | |
| 71 | + | |
| 72 | + | |
70 | 73 | | |
71 | 74 | | |
72 | 75 | | |
| |||
90 | 93 | | |
91 | 94 | | |
92 | 95 | | |
93 | | - | |
| 96 | + | |
| 97 | + | |
94 | 98 | | |
95 | 99 | | |
96 | 100 | | |
| |||
0 commit comments