Repository accompanying our paper titled Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora, presented at the 6th Workshop on Gender Bias in Natural Language Processing at ACL 2025. The research and experimental evaluation have been conducted for Spanish and Valencian.
Paper Authors: Erik Derner, Sara Sansalvador de la Fuente, Yoan Gutiérrez, Paloma Moreda, Nuria Oliver
Code and Data Authors: Erik Derner, Sara Sansalvador de la Fuente, Elena Maestre Hernández; includes sampled data from OPUS
Contact: erik@ellisalicante.org
- `code`: Code for dataset analysis, continual pretraining, inference, and validation
  - `bias-quantification`: Gender representation bias quantification in a given dataset
  - `continual-pretraining`: Continual pretraining and model inference to evaluate how gender representation bias in the training data propagates to model inference
  - `validation`: Validation of the gender representation bias quantification method on an annotated dataset
- `data`: Samples of corpora, annotated datasets, prompts, few-shot examples, a word skiplist, and stories for continual pretraining
  - `continual-pretraining`: Biased and balanced story datasets generated for the continual pretraining experiments
  - `corpora-en-es`: Samples of aligned parallel corpora in English and Spanish used in the experiments in the paper
  - `dataset-analysis`: Prompts, few-shot examples, and the skiplist for bias evaluation
  - `validation`: Annotated (ground-truth) data for validating the gender representation bias quantification method
- Python 3.12
- CUDA 12.1 to use GPU
- Clone or download the repository.
- Install the required packages:

  ```
  pip install -r requirements.txt
  ```

- Set the following environment variables if you want to use API inference with either of the providers:
  - `OPENAI_API_KEY` – OpenAI API key
  - `GROQ_API_KEY` – Groq API key
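On Linux or macOS, the variables can be exported in the shell before launching the notebooks (the key values below are placeholders, not real keys):

```shell
# Placeholder values -- replace with your actual API keys
export OPENAI_API_KEY="your-openai-api-key"
export GROQ_API_KEY="your-groq-api-key"
```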
In `code/bias-quantification`, use:
- `dataset-analysis-openai.ipynb` to analyze datasets in gendered languages using the OpenAI API
- `dataset-analysis-groq.ipynb` to analyze datasets in gendered languages using the Groq API
- `dataset-analysis-gp.ipynb` to analyze datasets in English using the Gender Polarity method
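Gender Polarity is, roughly, lexicon-based counting of gendered tokens in English text. A minimal illustrative sketch follows; the word lists and the normalized-difference score here are simplified assumptions, not the exact implementation in the notebook:

```python
import re

# Simplified gendered word lists (illustrative only, not the notebook's lexicon)
MALE_WORDS = {"he", "him", "his", "man", "men", "boy", "father", "brother"}
FEMALE_WORDS = {"she", "her", "hers", "woman", "women", "girl", "mother", "sister"}

def gender_polarity(text: str) -> float:
    """Score in [-1, 1]: positive = more male terms, negative = more female terms."""
    tokens = re.findall(r"[a-z']+", text.lower())
    male = sum(t in MALE_WORDS for t in tokens)
    female = sum(t in FEMALE_WORDS for t in tokens)
    if male + female == 0:
        return 0.0  # no gendered tokens found
    return (male - female) / (male + female)

print(gender_polarity("He said his brother met her sister."))  # (3 - 2) / 5 = 0.2
```

A corpus-level score can then be obtained by averaging the per-document scores.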
To extract a sample subset from a (potentially multilingual, aligned) text corpus, use `dataset-extraction.ipynb` in `code/bias-quantification`.
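Extracting an aligned subset amounts to sampling the same line indices from each language's file so the parallel alignment is preserved. A minimal sketch under that assumption (the helper name and seed are illustrative, not taken from the notebook):

```python
import random

def sample_parallel(en_lines, es_lines, n, seed=42):
    """Sample the same n line indices from two line-aligned sentence lists."""
    assert len(en_lines) == len(es_lines), "corpora must be aligned line by line"
    rng = random.Random(seed)  # fixed seed for a reproducible sample
    idx = sorted(rng.sample(range(len(en_lines)), n))
    return [en_lines[i] for i in idx], [es_lines[i] for i in idx]

# Toy aligned corpus: sentence i in English pairs with sentence i in Spanish
en = [f"sentence {i}" for i in range(100)]
es = [f"oración {i}" for i in range(100)]
en_sample, es_sample = sample_parallel(en, es, 10)
```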
In `code/continual-pretraining`, use:
- `continual-pretraining.ipynb` to continually pretrain a HuggingFace model on a given raw text dataset
- `inference.ipynb` to run inference with a base or continually pretrained model
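Continual pretraining on raw text typically involves packing the corpus into fixed-length token blocks before feeding it to the model. A stdlib-only sketch of that preprocessing step, using whitespace tokenization as a stand-in for the model's tokenizer (the notebook itself handles this via HuggingFace tooling):

```python
def pack_into_blocks(texts, block_size):
    """Concatenate tokenized texts and split into fixed-length blocks,
    dropping the trailing remainder (as is common in causal LM training)."""
    tokens = [tok for text in texts for tok in text.split()]
    n_blocks = len(tokens) // block_size
    return [tokens[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]

blocks = pack_into_blocks(["a b c d", "e f g"], block_size=3)
print(blocks)  # [['a', 'b', 'c'], ['d', 'e', 'f']]
```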
To validate the gender representation bias quantification method on an annotated dataset, use the notebooks in `code/validation`:
- `validation-gt-openai.ipynb` to perform the validation using the OpenAI API
- `validation-gt-groq.ipynb` to perform the validation using the Groq API
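Validation against ground-truth annotations boils down to comparing predicted labels with annotated ones. A minimal sketch computing accuracy and a confusion tally (the label values and helper name are illustrative assumptions, not the notebooks' exact metrics):

```python
from collections import Counter

def validate(predicted, ground_truth):
    """Return (accuracy, Counter of (gold, predicted) label pairs)."""
    assert len(predicted) == len(ground_truth)
    confusion = Counter(zip(ground_truth, predicted))
    correct = sum(p == g for p, g in zip(predicted, ground_truth))
    return correct / len(predicted), confusion

# Toy example with hypothetical "M"/"F" labels
gold = ["F", "M", "M", "F"]
pred = ["F", "M", "F", "F"]
acc, confusion = validate(pred, gold)
print(acc)  # 0.75
```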
This project is licensed under the MIT License. See the LICENSE.txt file for details.
If you use this code or data, please cite our paper:
@inproceedings{derner2025leveraging,
author = {Derner, Erik and Sansalvador de la Fuente, Sara and Guti{\'e}rrez, Yoan and Moreda, Paloma and Oliver, Nuria},
title = {Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora},
booktitle = {Proceedings of the 6th Workshop on Gender Bias in Natural Language Processing (GeBNLP)},
pages = {468--483},
publisher = {Association for Computational Linguistics},
address = {Vienna, Austria},
year = {2025},
month = {Aug}
}