This repository accompanies the paper "Annotating Personal Information in Swedish Texts with SPARV" by Maria Irena Szawerna, David Alfter, and Elena Volodina.
This repository includes the code needed to convert SPARV XML output to the JSONL format required by the (Im)Personal Data visualization tool, the code used to obtain predictions from Microsoft Presidio and the LLM Gemma, and the sample text used in the article together with its translation. It also contains a directory of files derived from the sample text that can be loaded directly into the visualization tool.
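To give a rough idea of what the XML-to-JSONL conversion involves, here is a minimal sketch. It is not the converter shipped in this repository. The element name `token`, the attribute name `pii`, the label values, and the output keys `text`, `tokens`, and `labels` are all assumptions made for illustration; the actual SPARV export and the schema expected by the visualization tool may differ.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical Sparv-style XML. The element name "token", the attribute
# "pii", and the label values are assumptions, not the real export schema.
SAMPLE_XML = """<text>
  <sentence>
    <token pii="firstname_female">Anna</token>
    <token pii="O">bor</token>
    <token pii="O">i</token>
    <token pii="city">Uppsala</token>
  </sentence>
</text>"""


def xml_to_jsonl_line(xml_string, label_attr="pii"):
    """Turn one XML document into a single JSONL line (one JSON object)."""
    root = ET.fromstring(xml_string)
    tokens = []
    labels = []
    # Walk every <token> element in document order.
    for token in root.iter("token"):
        tokens.append((token.text or "").strip())
        # Tokens without the label attribute fall back to "O" (no label).
        labels.append(token.get(label_attr, "O"))
    # The output keys are hypothetical; adapt them to the tool's schema.
    record = {"text": " ".join(tokens), "tokens": tokens, "labels": labels}
    # ensure_ascii=False keeps Swedish characters (å, ä, ö) readable.
    return json.dumps(record, ensure_ascii=False)


line = xml_to_jsonl_line(SAMPLE_XML)
print(line)
```

Writing one such line per input document to a file yields a JSONL file. Refer to the conversion code in this repository for the field names the visualization tool actually expects.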
💡 Sparv-plugin