All Unicode data read by the [JacksonCsvFileReader](https://github.com/gbif/gbif-common/blob/master/src/main/java/org/gbif/utils/file/tabular/JacksonCsvFileReader.java) should be normalized to at least NFC (Normalization Form C); see [Unicode Standard Annex #15, Unicode Normalization Forms](https://www.unicode.org/reports/tr15).
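
As a rough illustration of what NFC normalization means for a cell value, here is a minimal sketch using the JDK's `java.text.Normalizer`. The class `NfcExample` and the helper `toNfc` are hypothetical names chosen for this example and are not part of gbif-common; where such a call would best be wired into the reader is left open.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only; not part of gbif-common.
final class NfcExample {

    // Returns the NFC form of the given value, or the value itself
    // when it is null or already normalized.
    public static String toNfc(String value) {
        if (value == null || Normalizer.isNormalized(value, Normalizer.Form.NFC)) {
            return value;
        }
        return Normalizer.normalize(value, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // "Cafe" followed by a combining acute accent (decomposed form)
        String decomposed = "Cafe\u0301";
        String composed = toNfc(decomposed);

        System.out.println(decomposed.length());          // prints 5
        System.out.println(composed.length());            // prints 4
        System.out.println(composed.equals("Caf\u00e9")); // prints true
    }
}
```

Without this step, two values that render identically (for example a precomposed and a decomposed "é") compare as different strings, which can affect matching and deduplication of the data read from a file.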