Restore XML source from UIMA CAS files with XML structure during import #5970

@reckart

Is your feature request related to a problem? Please describe.
We can import XML files, in which case the structure of the XML file gets encoded into the UIMA CAS. We also have the option of exporting UIMA CAS JSON/XML files with the XML structure intact. However, when we import those CAS+XML files again, their file type in INCEpTION becomes CAS JSON/XML. As a result, INCEpTION no longer knows that the proper editor for those files is an HTML-based editor (or possibly a custom editor plugin).

Describe the solution you'd like
It would be nice if there was an option to import CAS+XML files in such a way that the XML structure is extracted from the CAS during import and used as the source document for the file, while the CAS (including any pre-annotations potentially included in it) becomes the initial CAS.
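
To make the idea more concrete, here is a rough sketch of what the import-time decision could look like. It deliberately does not use the real UIMA or INCEpTION APIs: `ImportedCas`, `XmlElementSpan`, `ImportResult` and the format IDs `"xml"` and `"cas"` are all hypothetical stand-ins invented for illustration. It also assumes the XML structure can be recovered from element names and character offsets alone, ignoring attributes and explicit parent/child links.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch only: none of these types exist in UIMA or INCEpTION.
class CasXmlImportSketch {
    // Simplified stand-in for an XML element annotation stored in the CAS.
    record XmlElementSpan(String name, int begin, int end) {}

    // Simplified stand-in for a CAS: document text plus XML element spans.
    record ImportedCas(String text, List<XmlElementSpan> xmlElements) {}

    // What the import would store: format ID, source document, initial CAS.
    record ImportResult(String formatId, String sourceDocument, ImportedCas initialCas) {}

    static ImportResult importCas(ImportedCas cas, String originalFileContent) {
        if (cas.xmlElements().isEmpty()) {
            // No XML structure: keep the current behaviour (hypothetical ID "cas").
            return new ImportResult("cas", originalFileContent, cas);
        }
        // XML structure present: the restored XML becomes the source document,
        // while the CAS (with any pre-annotations) becomes the initial CAS.
        return new ImportResult("xml", restoreXml(cas), cas);
    }

    static String restoreXml(ImportedCas cas) {
        String text = cas.text();
        List<XmlElementSpan> sorted = new ArrayList<>(cas.xmlElements());
        // Outer elements first: ascending begin, then descending end.
        sorted.sort((a, b) -> a.begin() != b.begin()
                ? Integer.compare(a.begin(), b.begin())
                : Integer.compare(b.end(), a.end()));

        StringBuilder out = new StringBuilder();
        Deque<XmlElementSpan> open = new ArrayDeque<>();
        int pos = 0;
        for (XmlElementSpan e : sorted) {
            // Close every open element that ends before the next one starts.
            while (!open.isEmpty() && open.peek().end() <= e.begin()) {
                pos = close(open.pop(), text, pos, out);
            }
            out.append(escape(text.substring(pos, e.begin())));
            pos = e.begin();
            out.append('<').append(e.name()).append('>');
            open.push(e);
        }
        while (!open.isEmpty()) {
            pos = close(open.pop(), text, pos, out);
        }
        out.append(escape(text.substring(pos)));
        return out.toString();
    }

    // Emits the remaining text of an element plus its end tag; returns the new position.
    private static int close(XmlElementSpan e, String text, int pos, StringBuilder out) {
        out.append(escape(text.substring(pos, e.end())));
        out.append("</").append(e.name()).append('>');
        return e.end();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

In this sketch, a CAS without any XML elements is imported exactly as today, so the new behaviour would only kick in when XML structure is actually present. An actual implementation would presumably reuse whatever serialization the existing XML export already performs rather than rebuilding the markup from offsets like this.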

Additional context
We could consider a similar approach for PDF+CAS files where the CAS is embedded as an attachment in the PDF. The PDF (maybe minus the attachment) could become the source document and the embedded CAS could become the INITIAL_CAS.
