This repository accompanies the paper "A systematic survey of natural language processing for the Greek language" by Juli Bakagianni, Kanella Pouli, Maria Gavriilidou, and John Pavlopoulos, published in Patterns (2025). The article is open access.
Comprehensive monolingual natural language processing (NLP) surveys are essential for assessing language-specific challenges, resource availability, and research gaps. This work introduces a generalizable framework for systematic monolingual NLP surveys, applied here to Greek NLP (2012–2023).
This repository contains the structured data collected during the survey, which is continuously updated to provide an evergreen resource for the community.
- greek_nlp_articles.csv
  A curated list of research articles (2012–2023) relevant to Greek NLP, including metadata such as title, year, venue, and task coverage.
- greek_nlp_datasets.csv
  A catalog of datasets used in Greek NLP research, annotated with task type, availability, and other key attributes.
You can easily explore the data using Python and pandas:
# Uncomment the two lines below to download the repository (inside a notebook)
#!git clone https://github.com/greek-nlp/survey.git
#%cd survey
import pandas as pd
articles = pd.read_csv("greek_nlp_articles.csv")
datasets = pd.read_csv("greek_nlp_datasets.csv")
print("Articles:", articles.shape)  # (rows, columns)
print("Datasets:", datasets.shape)
datasets.sample(3)

If you use this resource, please cite the paper:
@article{bakagianni2025systematic,
title={A systematic survey of natural language processing for the Greek language},
author={Bakagianni, Juli and Pouli, Kanella and Gavriilidou, Maria and Pavlopoulos, John},
journal={Patterns},
year={2025},
publisher={Elsevier}
}

This work is licensed under a Creative Commons Attribution 4.0 International License.
