Commit b515630
fix(paddle-ocr): correct dict index offset, default angle cls off, configurable padding (#395)
- Prepend CTC blank token and append space token in read_keys_from_file()
to match get_keys() layout, fixing off-by-one character mapping errors
- Default use_angle_cls to false (misfires on short text regions)
- Replace hardcoded padding=50 with configurable padding (default 10)
- Add unit tests for key loading and CTC decoding

1 parent 5bb4af2 commit b515630
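
To illustrate the index fix described in the commit message, here is a minimal sketch, not the code from this commit. It assumes a dictionary file with one character per line, a model whose class 0 is the CTC blank, and greedy decoding. The names `keys_from_dict`, `ctc_greedy_decode`, and `OcrOptions`, and the `"#"` placeholder for the blank, are hypothetical; only `use_angle_cls`, `padding`, and the defaults `false` and `10` come from the message above.

```rust
/// Hypothetical stand-in for the key-loading step: builds the character
/// table from dictionary contents (one character per line).
/// Index 0 is reserved for the CTC blank and a space token is appended
/// last, so model class `i` maps to dictionary line `i - 1`.
fn keys_from_dict(contents: &str) -> Vec<String> {
    let mut keys = Vec::new();
    // Placeholder for the CTC blank; the actual symbol is an assumption.
    keys.push("#".to_string());
    keys.extend(
        contents
            .lines()
            .filter(|line| !line.is_empty())
            .map(|line| line.to_string()),
    );
    // Trailing space token.
    keys.push(" ".to_string());
    keys
}

/// Greedy CTC decoding: collapse consecutive repeats, then drop blanks
/// (class 0). Out-of-range indices are skipped rather than panicking.
fn ctc_greedy_decode(indices: &[usize], keys: &[String]) -> String {
    let mut out = String::new();
    let mut prev = 0usize; // start as if the previous class was blank
    for &idx in indices {
        if idx != 0 && idx != prev {
            if let Some(key) = keys.get(idx) {
                out.push_str(key);
            }
        }
        prev = idx;
    }
    out
}

/// Hypothetical options struct mirroring the new defaults.
#[derive(Debug, Clone, PartialEq)]
struct OcrOptions {
    /// Angle classification, off by default.
    use_angle_cls: bool,
    /// Padding around the input image; the `u32` type is an assumption.
    padding: u32,
}

impl Default for OcrOptions {
    fn default() -> Self {
        Self { use_angle_cls: false, padding: 10 }
    }
}
```

With the dictionary `a`, `b`, `c`, the table becomes blank, `a`, `b`, `c`, space, so class 1 decodes to `a`. Without the prepended blank, class 1 would land on `b`, which is the off-by-one mapping the commit message describes.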
File tree
4 files changed, +91 -7 lines changed
- crates
  - kreuzberg-paddle-ocr/src
  - kreuzberg/src/paddle_ocr