Commit 7d99dec

Fix occurrences of David Adelani & co-authors
1 parent 6d3e96c commit 7d99dec

15 files changed: 120 additions & 94 deletions

data/xml/2020.emnlp.xml

Lines changed: 1 addition & 1 deletion
@@ -3118,7 +3118,7 @@
 <paper id="204">
 <title>Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on <fixed-case>A</fixed-case>frican Languages</title>
 <author><first>Michael A.</first><last>Hedderich</last></author>
-<author><first>David</first><last>Adelani</last></author>
+<author><first>David I.</first><last>Adelani</last></author>
 <author><first>Dawei</first><last>Zhu</last></author>
 <author><first>Jesujoba</first><last>Alabi</last></author>
 <author><first>Udia</first><last>Markus</last></author>

data/xml/2020.lrec.xml

Lines changed: 2 additions & 2 deletions
@@ -4123,9 +4123,9 @@
 </paper>
 <paper id="335">
 <title>Massive vs. Curated Embeddings for Low-Resourced Languages: the Case of <fixed-case>Y</fixed-case>orùbá and <fixed-case>T</fixed-case>wi</title>
-<author><first>Jesujoba</first><last>Alabi</last></author>
+<author><first>Jesujoba O.</first><last>Alabi</last></author>
 <author><first>Kwabena</first><last>Amponsah-Kaakyire</last></author>
-<author><first>David</first><last>Adelani</last></author>
+<author><first>David I.</first><last>Adelani</last></author>
 <author><first>Cristina</first><last>España-Bonet</last></author>
 <pages>2754–2762</pages>
 <abstract>The success of several architectures to learn semantic representations from unannotated text and the availability of these kind of texts in online multilingual resources such as Wikipedia has facilitated the massive and automatic creation of resources for multiple languages. The evaluation of such resources is usually done for the high-resourced languages, where one has a smorgasbord of tasks and test sets to evaluate on. For low-resourced languages, the evaluation is more difficult and normally ignored, with the hope that the impressive capability of deep learning architectures to learn (multilingual) representations in the high-resourced setting holds in the low-resourced setting too. In this paper we focus on two African languages, Yorùbá and Twi, and compare the word embeddings obtained in this way, with word embeddings obtained from curated corpora and a language-dependent processing. We analyse the noise in the publicly available corpora, collect high quality and noisy data for the two languages and quantify the improvements that depend not only on the amount of data but on the quality too. We also use different architectures that learn word representations both from surface forms and characters to further exploit all the available information which showed to be important for these languages. For the evaluation, we manually translate the wordsim-353 word pairs dataset from English into Yorùbá and Twi. We extend the analysis to contextual word embeddings and evaluate multilingual BERT on a named entity recognition task. For this, we annotate with named entities the Global Voices corpus for Yorùbá. As output of the work, we provide corpora, embeddings and the test suits for both languages.</abstract>

data/xml/2021.emnlp.xml

Lines changed: 1 addition & 1 deletion
@@ -10761,7 +10761,7 @@
 </paper>
 <paper id="684">
 <title>Preventing Author Profiling through Zero-Shot Multilingual Back-Translation</title>
-<author><first>David</first><last>Adelani</last></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last></author>
 <author><first>Miaoran</first><last>Zhang</last></author>
 <author><first>Xiaoyu</first><last>Shen</last></author>
 <author><first>Ali</first><last>Davody</last></author>

data/xml/2021.mtsummit.xml

Lines changed: 2 additions & 2 deletions
@@ -76,9 +76,9 @@
 </paper>
 <paper id="6">
 <title>The Effect of Domain and Diacritics in <fixed-case>Y</fixed-case>oruba–<fixed-case>E</fixed-case>nglish Neural Machine Translation</title>
-<author><first>David</first><last>Adelani</last></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last></author>
 <author><first>Dana</first><last>Ruiter</last></author>
-<author><first>Jesujoba</first><last>Alabi</last></author>
+<author><first>Jesujoba O.</first><last>Alabi</last></author>
 <author><first>Damilola</first><last>Adebonojo</last></author>
 <author><first>Adesina</first><last>Ayeni</last></author>
 <author><first>Mofe</first><last>Adeyemi</last></author>

data/xml/2022.emnlp.xml

Lines changed: 14 additions & 4 deletions
@@ -3912,15 +3912,15 @@
 </paper>
 <paper id="298">
 <title><fixed-case>M</fixed-case>asakha<fixed-case>NER</fixed-case> 2.0: <fixed-case>A</fixed-case>frica-centric Transfer Learning for Named Entity Recognition</title>
-<author><first>David</first><last>Adelani</last><affiliation>University College London</affiliation></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last><affiliation>University College London</affiliation></author>
 <author><first>Graham</first><last>Neubig</last><affiliation>Carnegie Mellon University</affiliation></author>
 <author><first>Sebastian</first><last>Ruder</last><affiliation>Google</affiliation></author>
 <author><first>Shruti</first><last>Rijhwani</last><affiliation>Carnegie Mellon University</affiliation></author>
 <author><first>Michael</first><last>Beukman</last><affiliation>University of the Witwatersrand</affiliation></author>
 <author><first>Chester</first><last>Palen-Michel</last><affiliation>Brandeis University</affiliation></author>
 <author><first>Constantine</first><last>Lignos</last><affiliation>Brandeis University</affiliation></author>
-<author><first>Jesujoba</first><last>Alabi</last><affiliation>Saarland University</affiliation></author>
-<author><first>Shamsuddeen</first><last>Muhammad</last><affiliation>Bayero University, Kano</affiliation></author>
+<author><first>Jesujoba O.</first><last>Alabi</last><affiliation>Saarland University</affiliation></author>
+<author><first>Shamsuddeen H.</first><last>Muhammad</last><affiliation>Bayero University, Kano</affiliation></author>
 <author><first>Peter</first><last>Nabende</last><affiliation>Makerere University</affiliation></author>
 <author><first>Cheikh M. Bamba</first><last>Dione</last><affiliation>University of Bergen</affiliation></author>
 <author><first>Andiswa</first><last>Bukula</last><affiliation>SADiLaR</affiliation></author>
@@ -3942,11 +3942,21 @@
 <author><first>Allahsera Auguste</first><last>Tapo</last><affiliation>Rochester Institute of Technology</affiliation></author>
 <author><first>Tebogo</first><last>Macucwa</last><affiliation>University of Pretoria, Masakhane</affiliation></author>
 <author><first>Vukosi</first><last>Marivate</last><affiliation>Department of Computer Science, University of Pretoria</affiliation></author>
-<author><first>Mboning Tchiaze</first><last>Elvis</last><affiliation>NTeALan</affiliation></author>
+<author><first>Elvis</first><last>Mboning</last><affiliation>NTeALan</affiliation></author>
 <author><first>Tajuddeen</first><last>Gwadabe</last><affiliation>University of Chinese Academy of Science</affiliation></author>
 <author><first>Tosin</first><last>Adewumi</last><affiliation>Luleå University of Technology</affiliation></author>
 <author><first>Orevaoghene</first><last>Ahia</last><affiliation>University of Washington</affiliation></author>
 <author><first>Joyce</first><last>Nakatumba-Nabende</last><affiliation>Makerere University</affiliation></author>
+<author><first>Neo L.</first><last>Mokono</last></author>
+<author><first>Ignatius</first><last>Ezeani</last></author>
+<author><first>Chiamaka</first><last>Chukwuneke</last></author>
+<author><first>Mofetoluwa</first><last>Adeyemi</last></author>
+<author><first>Gilles Q.</first><last>Hacheme</last></author>
+<author><first>Idris</first><last>Abdulmumim</last></author>
+<author><first>Odunayo</first><last>Ogundepo</last></author>
+<author><first>Oreen</first><last>Yousuf</last></author>
+<author><first>Tatiana</first><last>Moteu Ngoli</last></author>
+<author><first>Dietrich</first><last>Klakow</last></author>
 <pages>4488-4508</pages>
 <abstract>African languages are spoken by over a billion people, but they are under-represented in NLP research and development. Multiple challenges exist, including the limited availability of annotated training and evaluation datasets as well as the lack of understanding of which settings, languages, and recently proposed methods like cross-lingual transfer will be effective. In this paper, we aim to move towards solutions for these challenges, focusing on the task of named entity recognition (NER). We present the creation of the largest to-date human-annotated NER dataset for 20 African languages. We study the behaviour of state-of-the-art cross-lingual transfer methods in an Africa-centric setting, empirically demonstrating that the choice of source transfer language significantly affects performance. While much previous work defaults to using English as the source language, our results show that choosing the best transfer language improves zero-shot F1 scores by an average of 14% over 20 languages as compared to using English.</abstract>
 <url hash="51dbe768">2022.emnlp-main.298</url>

data/xml/2022.findings.xml

Lines changed: 3 additions & 3 deletions
@@ -106,13 +106,13 @@
 </paper>
 <paper id="6">
 <title>Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?</title>
-<author><first>En-Shiun</first><last>Lee</last></author>
+<author><first>En-Shiun Annie</first><last>Lee</last></author>
 <author><first>Sarubi</first><last>Thillainathan</last></author>
 <author><first>Shravan</first><last>Nayak</last></author>
 <author><first>Surangika</first><last>Ranathunga</last></author>
-<author><first>David</first><last>Adelani</last></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last></author>
 <author><first>Ruisi</first><last>Su</last></author>
-<author><first>Arya</first><last>McCarthy</last></author>
+<author><first>Arya D.</first><last>McCarthy</last></author>
 <pages>58-67</pages>
 <abstract>What can pre-trained multilingual sequence-to-sequence models like mBART contribute to translating low-resource languages? We conduct a thorough empirical experiment in 10 languages to ascertain this, considering five factors: (1) the amount of fine-tuning data, (2) the noise in the fine-tuning data, (3) the amount of pre-training data in the model, (4) the impact of domain mismatch, and (5) language typology. In addition to yielding several heuristics, the experiments form a framework for evaluating the data sensitivities of machine translation systems. While mBART is robust to domain differences, its translations for unseen and typologically distant languages remain below 3.0 BLEU. In answer to our title’s question, mBART is not a low-resource panacea; we therefore encourage shifting the emphasis from new models to new data.</abstract>
 <url hash="57ee5031">2022.findings-acl.6</url>

data/xml/2022.insights.xml

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@
 <author><first>Dawei</first><last>Zhu</last></author>
 <author><first>Michael A.</first><last>Hedderich</last></author>
 <author><first>Fangzhou</first><last>Zhai</last></author>
-<author><first>David</first><last>Adelani</last></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last></author>
 <author><first>Dietrich</first><last>Klakow</last></author>
 <pages>62-67</pages>
 <abstract>Incorrect labels in training data occur when human annotators make mistakes or when the data is generated via weak or distant supervision. It has been shown that complex noise-handling techniques - by modeling, cleaning or filtering the noisy instances - are required to prevent models from fitting this label noise. However, we show in this work that, for text classification tasks with modern NLP models like BERT, over a variety of noise types, existing noise-handling methods do not always improve its performance, and may even deteriorate it, suggesting the need for further investigation. We also back our observations with a comprehensive analysis.</abstract>

data/xml/2022.naacl.xml

Lines changed: 12 additions & 12 deletions
@@ -3574,8 +3574,8 @@
 </paper>
 <paper id="223">
 <title>A Few Thousand Translations Go a Long Way! Leveraging Pre-trained Models for <fixed-case>A</fixed-case>frican News Translation</title>
-<author><first>David</first><last>Adelani</last></author>
-<author><first>Jesujoba</first><last>Alabi</last></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last></author>
+<author><first>Jesujoba Oluwadara</first><last>Alabi</last></author>
 <author><first>Angela</first><last>Fan</last></author>
 <author><first>Julia</first><last>Kreutzer</last></author>
 <author><first>Xiaoyu</first><last>Shen</last></author>
@@ -3590,27 +3590,27 @@
 <author><first>Chris</first><last>Emezue</last></author>
 <author><first>Colin</first><last>Leong</last></author>
 <author><first>Michael</first><last>Beukman</last></author>
-<author><first>Shamsuddeen</first><last>Muhammad</last></author>
-<author><first>Guyo</first><last>Jarso</last></author>
+<author><first>Shamsuddeen H.</first><last>Muhammad</last></author>
+<author><first>Guyo D.</first><last>Jarso</last></author>
 <author><first>Oreen</first><last>Yousuf</last></author>
-<author><first>Andre</first><last>Niyongabo Rubungo</last></author>
+<author><first>Andre N.</first><last>Niyongabo Rubungo</last></author>
 <author><first>Gilles</first><last>Hacheme</last></author>
 <author><first>Eric Peter</first><last>Wairagala</last></author>
 <author><first>Muhammad Umair</first><last>Nasir</last></author>
-<author><first>Benjamin</first><last>Ajibade</last></author>
-<author><first>Tunde</first><last>Ajayi</last></author>
-<author><first>Yvonne</first><last>Gitau</last></author>
+<author><first>Benjamin A.</first><last>Ajibade</last></author>
+<author><first>Tunde Oluwaseyi</first><last>Ajayi</last></author>
+<author><first>Yvonne Wambui</first><last>Gitau</last></author>
 <author><first>Jade</first><last>Abbott</last></author>
 <author><first>Mohamed</first><last>Ahmed</last></author>
 <author><first>Millicent</first><last>Ochieng</last></author>
 <author><first>Anuoluwapo</first><last>Aremu</last></author>
 <author><first>Perez</first><last>Ogayo</last></author>
 <author><first>Jonathan</first><last>Mukiibi</last></author>
 <author><first>Fatoumata</first><last>Ouoba Kabore</last></author>
-<author><first>Godson</first><last>Kalipe</last></author>
+<author><first>Godson Koffi</first><last>Kalipe</last></author>
 <author><first>Derguene</first><last>Mbaye</last></author>
 <author><first>Allahsera Auguste</first><last>Tapo</last></author>
-<author><first>Victoire</first><last>Memdjokam Koagne</last></author>
+<author><first>Victoire M.</first><last>Memdjokam Koagne</last></author>
 <author><first>Edwin</first><last>Munkoh-Buabeng</last></author>
 <author><first>Valencia</first><last>Wagner</last></author>
 <author><first>Idris</first><last>Abdulmumin</last></author>
@@ -7076,8 +7076,8 @@
 <title><fixed-case>MCSE</fixed-case>: <fixed-case>M</fixed-case>ultimodal Contrastive Learning of Sentence Embeddings</title>
 <author><first>Miaoran</first><last>Zhang</last></author>
 <author><first>Marius</first><last>Mosbach</last></author>
-<author><first>David</first><last>Adelani</last></author>
-<author><first>Michael</first><last>Hedderich</last></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last></author>
+<author><first>Michael A.</first><last>Hedderich</last></author>
 <author><first>Dietrich</first><last>Klakow</last></author>
 <pages>5959-5969</pages>
 <abstract>Learning semantically meaningful sentence embeddings is an open problem in natural language processing. In this work, we propose a sentence embedding learning approach that exploits both visual and textual information via a multimodal contrastive objective. Through experiments on a variety of semantic textual similarity tasks, we demonstrate that our approach consistently improves the performance across various datasets and pre-trained encoders. In particular, combining a small amount of multimodal data with a large text-only corpus, we improve the state-of-the-art average Spearman’s correlation by 1.7%. By analyzing the properties of the textual embedding space, we show that our model excels in aligning semantically similar sentences, providing an explanation for its improved performance.</abstract>

data/xml/2022.wmt.xml

Lines changed: 1 addition & 1 deletion
@@ -982,7 +982,7 @@
 </paper>
 <paper id="72">
 <title>Findings of the <fixed-case>WMT</fixed-case>’22 Shared Task on Large-Scale Machine Translation Evaluation for <fixed-case>A</fixed-case>frican Languages</title>
-<author><first>David</first><last>Adelani</last><affiliation>University College London</affiliation></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last><affiliation>University College London</affiliation></author>
 <author><first>Md Mahfuz Ibn</first><last>Alam</last><affiliation>George Mason University</affiliation></author>
 <author><first>Antonios</first><last>Anastasopoulos</last><affiliation>George Mason University</affiliation></author>
 <author><first>Akshita</first><last>Bhagia</last><affiliation>Ai2</affiliation></author>

data/xml/2023.c3nlp.xml

Lines changed: 3 additions & 3 deletions
@@ -5,7 +5,7 @@
 <booktitle>Proceedings of the First Workshop on Cross-Cultural Considerations in NLP (C3NLP)</booktitle>
 <editor><first>Sunipa</first><last>Dev</last></editor>
 <editor><first>Vinodkumar</first><last>Prabhakaran</last></editor>
-<editor><first>David</first><last>Adelani</last></editor>
+<editor><first>David Ifeoluwa</first><last>Adelani</last></editor>
 <editor><first>Dirk</first><last>Hovy</last></editor>
 <editor><first>Luciana</first><last>Benotti</last></editor>
 <publisher>Association for Computational Linguistics</publisher>
@@ -22,8 +22,8 @@
 <paper id="1">
 <title>Varepsilon kú mask: Integrating <fixed-case>Y</fixed-case>orùbá cultural greetings into machine translation</title>
 <author><first>Idris</first><last>Akinade</last><affiliation>University of Ibadan</affiliation></author>
-<author><first>Jesujoba</first><last>Alabi</last><affiliation>Saarland University</affiliation></author>
-<author><first>David</first><last>Adelani</last><affiliation>University College London</affiliation></author>
+<author><first>Jesujoba O.</first><last>Alabi</last><affiliation>Saarland University</affiliation></author>
+<author><first>David Ifeoluwa</first><last>Adelani</last><affiliation>University College London</affiliation></author>
 <author><first>Clement</first><last>Odoje</last><affiliation>University of Ibadan</affiliation></author>
 <author><first>Dietrich</first><last>Klakow</last><affiliation>Saarland University</affiliation></author>
 <pages>1-7</pages>
