Is your feature request related to a problem? Please describe.
We can import XML files, in which case the structure of the XML file gets encoded into the UIMA CAS. We also have the option of exporting UIMA CAS JSON/XML files with the XML structure intact. However, when we import those CAS+XML files again, their file type in INCEpTION becomes CAS JSON/XML. As a result, INCEpTION no longer knows that the proper editor for those files is an HTML-based editor (or possibly a custom editor plugin).
Describe the solution you'd like
It would be nice to have an option to import CAS+XML files in such a way that the XML structure is extracted from the CAS during import and used as the source document for the file, while the CAS (including any pre-annotations it may contain) becomes the initial CAS.
Additional context
We could consider a similar approach for PDF+CAS files where the CAS is embedded as an attachment in the PDF. The PDF (maybe minus the attachment) could become the source document and the embedded CAS could become the INITIAL_CAS.