UD_Italian-Valico, a treebank of L2 Italian, treats syntactic calques (typically from the learners' L1s) similarly to how foreign material is treated under the guidelines for code-switched analysis (example from Valico here). We followed the same approach in UD_Swedish-SweLL, e.g.:
The problem (which hasn't occurred yet, but is bound to happen) is that "borrowing" guidelines from other languages might result in validation errors, as the categories and structures used don't match the language-specific guidelines.
A solution could be to mark syntactic calques with Lang=CODE_OF_THE_CALQUED_LANG in the MISC field of each of the tokens that make them up, but I'm afraid that could be misleading, as that attribute is currently reserved for actual foreign words. Alternatively, these cases could be assimilated to those mentioned in #1178 (there is definitely an overlap!), but that would make the rationale for the chosen analysis less transparent.
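To make the MISC-based option concrete, here is a minimal sketch of what tagging a calqued span could look like. It is only an illustration, not an existing tool: the helper name `mark_calque`, the placeholder tokens `w1`..`w3`, and the language code `en` are all hypothetical, and the sketch assumes plain tab-separated CoNLL-U with ten columns.

```python
from typing import Iterable


def mark_calque(conllu_sentence: str, token_ids: Iterable[str], lang_code: str) -> str:
    """Add Lang=<lang_code> to the MISC column (10th) of the given token IDs.

    Hypothetical helper for illustration; comment lines and any line that
    does not have exactly ten tab-separated columns are left untouched.
    """
    ids = set(token_ids)
    out = []
    for line in conllu_sentence.splitlines():
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 10 or cols[0] not in ids:
            out.append(line)
            continue
        # "_" means an empty MISC field in CoNLL-U.
        misc = [] if cols[9] == "_" else cols[9].split("|")
        # Replace any existing Lang value rather than duplicating the key.
        misc = [m for m in misc if not m.startswith("Lang=")]
        misc.append(f"Lang={lang_code}")
        cols[9] = "|".join(misc)
        out.append("\t".join(cols))
    return "\n".join(out)


# Placeholder sentence: tokens 2-3 stand in for a calqued construction.
EXAMPLE = "\n".join([
    "# text = w1 w2w3",
    "1\tw1\tw1\tX\t_\t_\t0\troot\t_\t_",
    "2\tw2\tw2\tX\t_\t_\t1\tdep\t_\tSpaceAfter=No",
    "3\tw3\tw3\tX\t_\t_\t1\tdep\t_\t_",
])

if __name__ == "__main__":
    print(mark_calque(EXAMPLE, ["2", "3"], "en"))
```

Note that this only shows the mechanics; it does nothing to distinguish a calque from an actual foreign word, which is exactly the ambiguity I'm worried about.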
Any thoughts?