* Process metadata corrections for 2023.ldk-1.55 (closes#5342)
* Process metadata corrections for 2024.arabicnlp-1.6 (closes#5341)
* Process metadata corrections for 2024.paclic-1.29 (closes#5339)
* Process metadata corrections for 2024.fieldmatters-1.4 (closes#5338)
* Process metadata corrections for 2020.vlsp-1.2 (closes#5334)
* Process metadata corrections for 2024.paclic-1.0 (closes#5330)
* Process metadata corrections for 2024.sigdial-1.60 (closes#5323)
* Process metadata corrections for 2023.emnlp-main.1011 (closes#5319)
* Process metadata corrections for 2024.propor-1.35 (closes#5316)
* Process metadata corrections for 2022.emnlp-industry.44 (closes#5315)
* Process metadata corrections for 2025.naacl-long.385 (closes#5314)
* Process metadata corrections for 2020.icon-workshop.3 (closes#5311)
* Process metadata corrections for 2023.arabicnlp-1.6 (closes#5310)
* Process metadata corrections for 2024.findings-acl.967 (closes#5307)
* Process metadata corrections for 2024.emnlp-main.259 (closes#5306)
* Process metadata corrections for 2024.privatenlp-1.7 (closes#5305)
* Process metadata corrections for 2024.wmt-1.11 (closes#5302)
* Process metadata corrections for W19-5410 (closes#5297)
* Process metadata corrections for 2025.coling-main.421 (closes#5296)
* Process metadata corrections for 2024.acl-long.39 (closes#5292)
* Process metadata corrections for 2023.semeval-1.172 (closes#5291)
* Process metadata corrections for 2023.findings-emnlp.98 (closes#5290)
* Process metadata corrections for 2023.arabicnlp-1.25 (closes#5289)
* Process metadata corrections for 2024.arabicnlp-1.13 (closes#5287)
* Process metadata corrections for 2024.findings-naacl.156 (closes#5280)
* Process metadata corrections for 2024.eamt-2.27 (closes#5248)
* Process metadata corrections for W17-0901 (closes#5181)
* Process metadata corrections for 2022.findings-emnlp.75 (closes#5176)
* Process metadata corrections for 2025.dravidianlangtech-1.75 (closes#5075)
* Process metadata corrections for 2022.wanlp-1.6 (closes#4947)
* Process metadata corrections for 2023.nodalida-1.34 (closes#4716)
* Remove tab
<abstract>Word emphasis in textual content aims at conveying the desired intention by changing the size, color, typeface, style (bold, italic, etc.), and other typographical features. The emphasized words are extremely helpful in drawing the readers’ attention to specific information that the authors wish to emphasize. However, performing such emphasis using a soft keyboard for social media interactions is time-consuming and has an associated learning curve. In this paper, we propose a novel approach to automate emphasis word detection on short written texts. To the best of our knowledge, this work presents the first lightweight deep learning approach to emphasis selection for smartphone deployment. Experimental results show that our approach achieves comparable accuracy at a much lower model size than existing models. Our best lightweight model has a memory footprint of 2.82 MB with a matching score of 0.716 on the SemEval-2020 public benchmark dataset.</abstract>
<abstract>Building Agent Assistants that can help improve customer service support requires inputs from industry users and their customers, as well as knowledge about state-of-the-art Natural Language Processing (NLP) technology. We combine expertise from academia and industry to bridge the gap and build task/domain-specific Neural Agent Assistants (NAA) with three high-level components for: (1) Intent Identification, (2) Context Retrieval, and (3) Response Generation. In this paper, we outline the pipeline of the NAA’s core system and also present three case studies in which three industry partners successfully adapt the framework to find solutions to their unique challenges. Our findings suggest that a collaborative process is instrumental in spurring the development of emerging NLP models for Conversational AI tasks in industry. The full reference implementation code and results are available at <url>https://github.com/VectorInstitute/NAA</url>.</abstract>
<abstract>To build a conversational agent that interacts fluently with humans, previous studies blend knowledge or personal profiles into the pre-trained language model. However, models that consider knowledge and persona at the same time are still limited, leading to hallucination and a passive way of using personas. We propose an effective dialogue agent that grounds external knowledge and persona simultaneously. The agent selects the proper knowledge and persona to use for generating the answers with our candidate scoring implemented with a poly-encoder. Then, our model generates the utterance with less hallucination and more engagingness, utilizing retrieval-augmented generation with a knowledge-persona enhanced query. We conduct experiments on the persona-knowledge chat and achieve state-of-the-art performance in grounding and generation tasks on the automatic metrics. Moreover, we validate the answers from the models regarding hallucination and engagingness through human evaluation and qualitative results. We show our retriever’s effectiveness in extracting relevant documents compared to the other previous retrievers, along with a comparison of multiple candidate scoring methods. Code is available at <url>https://github.com/dlawjddn803/INFO</url></abstract>
<abstract>Poetry generation tends to be a complicated task given meter and rhyme constraints. Previous work resorted to exhaustive methods in order to employ poetic elements. In this paper, we leverage pre-trained models, GPT-J and BERTShared, to recognize patterns of meter and rhyme to generate classical Arabic poetry, and present our findings and results on how well both models could pick up on these classical Arabic poetic elements.</abstract>
data/xml/2023.arabicnlp.xml (4 additions, 4 deletions)
@@ -97,10 +97,10 @@
<paper id="6">
<title><fixed-case>TARJAMAT</fixed-case>: Evaluation of Bard and <fixed-case>C</fixed-case>hat<fixed-case>GPT</fixed-case> on Machine Translation of Ten <fixed-case>A</fixed-case>rabic Varieties</title>
<abstract>Traditional NER systems are typically trained to recognize coarse-grained categories of entities, and less attention is given to classifying entities into a hierarchy of fine-grained lower-level sub-types. This article aims to advance Arabic NER with fine-grained entities. We chose to extend Wojood (an open-source Nested Arabic Named Entity Corpus) with sub-types. In particular, four main entity types in Wojood (geopolitical entity (GPE), location (LOC), organization (ORG), and facility (FAC)) are extended with 31 sub-types of entities. To do this, we first revised Wojood’s annotations of GPE, LOC, ORG, and FAC to be compatible with the LDC’s ACE guidelines, which yielded 5,614 changes. Second, all mentions of GPE, LOC, ORG, and FAC (~44K) in Wojood are manually annotated with the LDC’s ACE subtypes. This extended version of Wojood is called WojoodFine. To evaluate our annotations, we measured the inter-annotator agreement (IAA) using both Cohen’s Kappa and F1 score, resulting in 0.9861 and 0.9889, respectively. To compute the baselines of WojoodFine, we fine-tune three pre-trained Arabic BERT encoders in three settings: flat NER, nested NER, and nested NER with sub-types, and achieved F1 scores of 0.920, 0.866, and 0.885, respectively. Our corpus and models are open source and available at https://sina.birzeit.edu/wojood/.</abstract>
<abstract>Continuous learning from free-text human feedback, such as error corrections, new knowledge, or alternative responses, is essential for today’s chatbots and virtual assistants to stay up-to-date, engaging, and socially acceptable. However, for research on methods for learning from such data, annotated data is scarce. To address this, we examine the error and user response types of six popular dialogue datasets of various types, including MultiWoZ, PersonaChat, Wizards-of-Wikipedia, and others, to assess their extendibility with the needed annotations. For this corpus study, we manually annotate a subset of each dataset with error and user response types using an improved version of the Integrated Error Taxonomy and a newly proposed user response type taxonomy. We provide the resulting dataset (EURTAD) to the community. Our findings provide new insights into dataset composition, including error types, user response types, and the relations between them.</abstract>
<abstract>Learning from free-text human feedback is essential for dialog systems, but annotated data is scarce and usually covers only a small fraction of error types known in conversational AI. Instead of collecting and annotating new datasets from scratch, recent advances in synthetic dialog generation could be used to augment existing dialog datasets with the necessary annotations. However, to assess the feasibility of such an effort, it is important to know the types and frequency of free-text human feedback included in these datasets. In this work, we investigate this question for a variety of commonly used dialog datasets, including MultiWoZ, SGD, BABI, PersonaChat, Wizards-of-Wikipedia, and the human-bot split of the Self-Feeding Chatbot. Using our observations, we derive new taxonomies for the annotation of free-text human feedback in dialogs and investigate the impact of including such data in response generation for three SOTA language generation models, including GPT-2, LLAMA, and Flan-T5. Our findings provide new insights into the composition of the datasets examined, including error types, user response types, and the relations between them.</abstract>
<abstract>We present Dolphin, a novel benchmark that addresses the need for a natural language generation (NLG) evaluation framework dedicated to the wide collection of Arabic languages and varieties. The proposed benchmark encompasses a broad range of 13 different NLG tasks, including dialogue generation, question answering, machine translation, summarization, among others. Dolphin comprises a substantial corpus of 40 diverse and representative public datasets across 50 test splits, carefully curated to reflect real-world scenarios and the linguistic richness of Arabic. It sets a new standard for evaluating the performance and generalization capabilities of Arabic and multilingual models, promising to enable researchers to push the boundaries of current methodologies. We provide an extensive analysis of Dolphin, highlighting its diversity and identifying gaps in current Arabic NLG research. We also offer a public leaderboard that is both interactive and modular and evaluate several Arabic and multilingual models on our benchmark, allowing us to set strong baselines against which researchers can compare.</abstract>