Commit e7e3b42
Implement dataset export/import (#1946)
This pull request adds support for importing and exporting datasets. An
exported dataset includes all of its images: images and masks that are
provided through callables are saved as PNG files, and images that are
referenced by path are copied. The export also includes metadata, such
as the categories, the schema of the dataset, and a version number, in a
JSON file. The polars dataframe is stored in parquet format. Optionally,
everything can be exported as a single zip file, which can be imported
as is without extracting it manually first.
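As a rough illustration of the layout described above (not the actual
datumaro implementation), the sketch below writes a versioned JSON
metadata file and some image bytes into a zip, then reads them back
without extracting the archive. The file name `metadata.json`, the
`images/` prefix, the `FORMAT_VERSION` constant, and both helper
functions are assumptions made for this example; the parquet file is
left out to keep the sketch dependency-free.

```python
import io
import json
import zipfile

FORMAT_VERSION = 1  # hypothetical version number, not datumaro's actual value


def write_export(target, metadata, images):
    """Write metadata and image bytes into a zip archive (illustrative layout)."""
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        payload = {"version": FORMAT_VERSION, **metadata}
        zf.writestr("metadata.json", json.dumps(payload, indent=2))
        for name, data in images.items():
            zf.writestr(f"images/{name}", data)


def read_export(source):
    """Read the archive in place, without extracting it to disk first."""
    prefix = "images/"
    with zipfile.ZipFile(source, "r") as zf:
        metadata = json.loads(zf.read("metadata.json"))
        # Reject archives written with a format version we do not know.
        if metadata["version"] != FORMAT_VERSION:
            raise ValueError(f"unsupported export version: {metadata['version']}")
        images = {
            name[len(prefix):]: zf.read(name)
            for name in zf.namelist()
            if name.startswith(prefix)
        }
    return metadata, images


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
write_export(buffer, {"categories": ["cat", "dog"]}, {"0.png": b"fake-png-bytes"})
buffer.seek(0)
metadata, images = read_export(buffer)
```

Storing the version number next to the metadata lets an importer
reject archives it does not understand before touching the data.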
**Serialization and Deserialization Enhancements**
* Added `to_dict` and `from_dict` methods to `Categories`,
`LabelCategories`, `HierarchicalLabelCategories`, and `MaskCategories`,
enabling polymorphic serialization and reconstruction of category
objects.
* Implemented `to_dict` and `from_dict` methods for the `Field` base
class, using dataclass introspection to automatically serialize and
reconstruct field attributes, including special handling for semantic
and dtype fields.
* Added serialization and deserialization logic to the `Schema` class,
including storing type information and reconstructing attribute types
via module introspection.
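The serialization pattern in the bullets above can be sketched roughly
as follows. This is a minimal stand-in, not the code from this pull
request: the `Toy*` classes, the `type` tag, the `_REGISTRY` dict, and
the `module:qualname` string format are all assumptions chosen for
illustration.

```python
import importlib
from dataclasses import dataclass, field, fields

_REGISTRY = {}  # hypothetical: maps a type tag to its concrete class


def register(cls):
    """Make a class reconstructible by its name."""
    _REGISTRY[cls.__name__] = cls
    return cls


@dataclass
class ToyCategories:
    def to_dict(self):
        # Dataclass introspection: dump every declared field plus a type tag.
        data = {f.name: getattr(self, f.name) for f in fields(self)}
        data["type"] = type(self).__name__
        return data

    @classmethod
    def from_dict(cls, data):
        # Polymorphic reconstruction: the tag selects the concrete subclass.
        data = dict(data)
        target = _REGISTRY[data.pop("type")]
        return target(**data)


@register
@dataclass
class ToyLabelCategories(ToyCategories):
    labels: list = field(default_factory=list)


@register
@dataclass
class ToyMaskCategories(ToyCategories):
    colormap: dict = field(default_factory=dict)


def type_to_str(tp):
    """Store a type as 'module:qualname' so it can be found again later."""
    return f"{tp.__module__}:{tp.__qualname__}"


def type_from_str(ref):
    """Resolve a stored type reference via module introspection."""
    module_name, _, qualname = ref.partition(":")
    obj = importlib.import_module(module_name)
    for part in qualname.split("."):
        obj = getattr(obj, part)
    return obj


original = ToyLabelCategories(labels=["cat", "dog"])
restored = ToyCategories.from_dict(original.to_dict())
```

Calling `from_dict` on the base class returns the matching subclass,
so callers do not need to know the concrete category type in advance.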
**Dataset Import/Export API**
* Exposed `export_dataset` and `import_dataset` functions at the module
level, and added `export` and `from_file` methods to the `Dataset` class
for saving/loading datasets in a structured format.
These changes lay the groundwork for robust dataset interchange and
future compatibility across different category and schema types.
### Checklist
- [ ] I have added tests to cover my changes or documented any manual
tests.
- [ ] I have updated the
[documentation](https://github.com/open-edge-platform/datumaro/tree/develop/docs)
accordingly
---------
Signed-off-by: Albert van Houten <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
1 parent 411dcb5 commit e7e3b42
File tree
7 files changed, +2458 -5 lines changed
- src/datumaro/experimental
- tests
  - integration/experimental
  - unit/experimental