salpdikmen/Discrepancies-between-Rhetoric-Reality

 
 


Discrepancies between Rhetoric and Reality

An Analysis of SDGs in UN General Debate Speeches and Their Practical Implementation

Group project — UCD Connected Politics Lab
Alp Dikmen · Kun Dong · Mohamed Moheeb · Moises A. Silva Servin


Overview

This project investigates whether the SDG topics that countries emphasise in their UN General Debate (UNGD) speeches align with how they implement the SDGs in practice. Using a fine-tuned BERT classifier, we scored diplomatic speeches against all 17 UN Sustainable Development Goals and compared those scores against real-world implementation data.

Research question: Which SDG topics are mentioned in UNGD speeches, and how do the topics mentioned differ from the SDGs prioritised in national implementation?
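The comparison behind the research question can be sketched roughly as follows. This is a minimal illustration, not the project's actual gap measure: it assumes each speech segment already carries one predicted label, that `0` stands for the "No SDG" class, and that the gap is the difference between a goal's share of SDG-relevant segments and its share of the summed implementation scores. The names `sdg_shares`, `normalise`, and `emphasis_gap` are hypothetical.

```python
from collections import Counter

GOALS = range(1, 18)  # SDGs 1-17
NO_SDG = 0            # hypothetical id for the residual "No SDG" class


def sdg_shares(segment_labels):
    """Share of SDG-relevant segments assigned to each goal."""
    counts = Counter(label for label in segment_labels if label != NO_SDG)
    total = sum(counts.values())
    return {g: (counts[g] / total if total else 0.0) for g in GOALS}


def normalise(scores):
    """Rescale per-goal implementation scores so they sum to 1."""
    total = sum(scores.get(g, 0.0) for g in GOALS)
    return {g: (scores.get(g, 0.0) / total if total else 0.0) for g in GOALS}


def emphasis_gap(segment_labels, implementation_scores):
    """Speech share minus implementation share, per goal (assumed metric)."""
    spoken = sdg_shares(segment_labels)
    implemented = normalise(implementation_scores)
    return {g: spoken[g] - implemented[g] for g in GOALS}


# Toy inputs, invented purely for illustration.
labels = [13, 13, 0, 1, 0, 13]   # one predicted label per speech segment
scores = {1: 60.0, 13: 20.0}     # per-goal implementation scores
gap = emphasis_gap(labels, scores)
```

Under this assumed metric, a positive value means a goal takes up more of the speech than of the implementation profile, and a negative value means the reverse.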


The Model

Existing SDG classifiers had a critical limitation: the widely used BERT-based baseline covered only SDGs 1–16 and had no residual "No SDG" category, so it classified almost all content as SDG-relevant.

We addressed this by fine-tuning a new classifier on the OSDG Community Dataset, augmented with a synthetic corpus of non-SDG-related speech segments. The synthetic data gave the model negative examples to learn from, allowing it to distinguish genuine SDG discourse from general diplomatic language.

The final model classifies text into 18 categories: SDGs 1–17 plus a "No SDG" residual class.
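The 18-class label space and the mixing of synthetic negatives into the training data might look like the sketch below. It is an assumption-laden illustration rather than the project's code: the label strings, the position of "No SDG" as the last id, and the helper `build_training_set` are all hypothetical, and the input is taken to be simple `(text, goal_number)` pairs.

```python
import random

NO_SDG_LABEL = "No SDG"

# 17 goal labels followed by the residual class: 18 categories in total.
LABELS = [f"SDG {n}" for n in range(1, 18)] + [NO_SDG_LABEL]
label2id = {label: i for i, label in enumerate(LABELS)}
id2label = {i: label for label, i in label2id.items()}


def build_training_set(sdg_examples, negative_texts, seed=0):
    """Combine labelled SDG texts with synthetic non-SDG texts.

    sdg_examples:   list of (text, goal_number) pairs, goal_number in 1..17
    negative_texts: list of non-SDG speech segments (the synthetic corpus)
    """
    rows = [(text, label2id[f"SDG {goal}"]) for text, goal in sdg_examples]
    rows += [(text, label2id[NO_SDG_LABEL]) for text in negative_texts]
    random.Random(seed).shuffle(rows)  # avoid all negatives sitting at the end
    return rows


# Toy rows, invented for illustration only.
rows = build_training_set(
    sdg_examples=[("We must end extreme poverty.", 1),
                  ("Climate action cannot wait.", 13)],
    negative_texts=["I thank the President of the Assembly."],
)
```

Mappings of this kind can be passed to Hugging Face Transformers when loading a sequence-classification model (for example through the `num_labels`, `id2label`, and `label2id` arguments of `AutoModelForSequenceClassification.from_pretrained`), so that predictions come back with readable class names.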

Stack: Python · Hugging Face Transformers · PyTorch · Azure ML


Data

| Dataset | Source | Coverage |
|---|---|---|
| UN General Debate Corpus | Baturo, Dasandi & Mikhaylov (2017) | 1946–2024 |
| SDG Implementation Scores | United Nations | 2015–2023 |
| GDP Rankings (PPP-based) | World Bank | 2023 |

Repository Structure

├── Base papers/          # Reference literature + SDG keyword dictionary
├── Datasets and code/    # Preprocessing, merging, and modelling scripts
├── UN Corpus/            # UN General Debate speeches corpus (2008–2023)
└── Methodology.docx      # Detailed methodology documentation

Presentation

📎 View Presentation


Developed as part of the UCD Connected Politics Lab module, 2024–2025.

About

BERT-based classifier for SDG detection in UN General Debate speeches (2008–2023). Includes synthetic data augmentation, gap analysis, and panel regression.

Languages

  • Jupyter Notebook 81.8%
  • R 18.2%