Multi-label classification for Narrative and Subnarrative labels using a BERT encoder (`bert-base-multilingual-cased`) with:

- Hierarchical conditioning: the subnarrative head uses the narrative logits as additional input
- Hierarchical consistency loss: encourages predicted subnarratives to align with the active narrative
- Focal loss + `pos_weight`: handles class imbalance
- Oversampling with `WeightedRandomSampler`
- Separate scripts for training, inference, and evaluation
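To make the class-imbalance handling concrete, here is a minimal NumPy sketch of focal BCE with a positive-class weight. The repo trains in PyTorch; the function name and the `gamma` default here are illustrative, not taken from the source.

```python
import numpy as np

def focal_bce_with_pos_weight(logits, targets, gamma=2.0, pos_weight=None):
    """Element-wise focal BCE-with-logits; pos_weight up-weights positives.

    logits, targets: arrays of shape (batch, num_labels); targets in {0, 1}.
    Illustrative sketch -- the repo uses a PyTorch equivalent.
    """
    p = 1.0 / (1.0 + np.exp(-logits))  # sigmoid probabilities
    pw = np.ones_like(p) if pos_weight is None else pos_weight
    # Weighted BCE: positive terms scaled by pos_weight
    bce = -(pw * targets * np.log(p + 1e-12)
            + (1 - targets) * np.log(1 - p + 1e-12))
    # Focal modulation: (1 - p_t)^gamma down-weights easy examples
    p_t = targets * p + (1 - targets) * (1 - p)
    return ((1 - p_t) ** gamma * bce).mean()
```

With `gamma=0` and no `pos_weight` this reduces to plain binary cross-entropy; raising `gamma` shifts the loss toward hard (misclassified) examples.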
Data is not included in this repo. Put your local files under `data/` as shown below.
```text
.
├── README.md
├── requirements.txt
├── .gitignore
├── scripts/
│   ├── train.py
│   ├── infer.py
│   └── eval.py
├── src/
│   ├── training.py
│   ├── inference.py
│   └── evaluation.py
├── data/                      # not committed (placeholder folders via .gitkeep)
│   ├── annotations/
│   │   └── annotation.txt
│   ├── articles/
│   │   └── <article_id files...>
│   └── validation/
│       └── <article_id files...>
├── models/                    # created by training (ignored unless using LFS)
│   └── final_model/
│       ├── config.json
│       ├── pytorch_model.bin  # or model.safetensors (optional)
│       ├── tokenizer files...
│       ├── narrative_mapping.json
│       └── subnarrative_mapping.json
└── outputs/                   # predictions + logs (not committed)
    ├── submission.txt
    └── output/                # trainer checkpoints/logs
```
Tab-separated with 3 columns:

`article_id<TAB>narrative_labels<TAB>subnarrative_labels`

Rules:

- Multiple labels are separated by `;`
- Subnarratives follow the `Narrative: Subnarrative` format. Example: `Economy: Inflation`
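The rules above can be sketched as a small parser for one annotation row. The function name is hypothetical; it only encodes the tab/semicolon conventions stated here.

```python
def parse_annotation_line(line: str):
    """Split one tab-separated annotation row into its three fields.

    Hypothetical helper illustrating the format:
    article_id<TAB>narrative_labels<TAB>subnarrative_labels
    """
    article_id, narratives, subnarratives = line.rstrip("\n").split("\t")
    # Multiple labels within a field are separated by ';'
    narr_list = [n.strip() for n in narratives.split(";") if n.strip()]
    sub_list = [s.strip() for s in subnarratives.split(";") if s.strip()]
    return article_id, narr_list, sub_list
```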
```bash
python -m venv .venv
# Windows:
.venv\Scripts\activate
# Linux/Mac:
source .venv/bin/activate

pip install -U pip
pip install -r requirements.txt
```

GPU is optional. The code will automatically use CUDA if available.
Run via the wrapper scripts in `scripts/` (recommended).
```bash
python scripts/train.py
```

This will:

- Read `data/annotations/annotation.txt`
- Load article texts from `data/articles/`
- Train with evaluation each epoch
- Save the final model and label mappings to `models/final_model/`
Outputs:

- `models/final_model/` (model weights + tokenizer + mappings)
- `outputs/output/` (trainer checkpoints/logs)
Put dev/validation articles in `data/validation/`.
Run:
```bash
python scripts/infer.py
```

This will:

- Load the model + tokenizer from `models/final_model/`
- Predict labels for each file in `data/validation/`
- Enforce hierarchical consistency on subnarratives

Output:

- `outputs/submission.txt` (tab-separated: `article_id narrative_labels subnarrative_labels`)
```bash
python scripts/eval.py
```

This evaluates:

- gold: `data/annotations/annotation.txt`
- predictions: `outputs/submission.txt`

Metrics printed:

- Averaged sample F1 for:
  - `(narrative:subnarrative)` pairs
  - narrative-only
  - subnarrative-only
- Macro F1 for:
  - narrative-only
  - subnarrative-only
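For reference, sample-averaged F1 compares the gold and predicted label *sets* per article, then averages. A minimal sketch (the function name and the both-empty convention are illustrative, not taken from `eval.py`):

```python
def sample_f1(gold_sets, pred_sets):
    """Average per-sample F1 over parallel lists of gold/predicted label sets.

    Illustrative sketch of sample-averaged F1; treats an article where both
    sets are empty as a perfect match (convention chosen here, not verified
    against the repo's eval.py).
    """
    scores = []
    for gold, pred in zip(gold_sets, pred_sets):
        if not gold and not pred:
            scores.append(1.0)
            continue
        tp = len(gold & pred)                      # true positives
        prec = tp / len(pred) if pred else 0.0
        rec = tp / len(gold) if gold else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        scores.append(f1)
    return sum(scores) / len(scores)
```

Macro F1, in contrast, is computed per *label* and then averaged over labels, so rare labels count as much as frequent ones.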
The model predicts `narrative_logits` first, then concatenates them with the pooled BERT output to predict `subnarrative_logits`:

- Narrative head: `BERT -> narrative_logits`
- Subnarrative head: `concat(BERT_pooled, narrative_logits) -> subnarrative_logits`
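The two-head wiring can be sketched with plain matrix multiplies. The hidden size matches `bert-base-multilingual-cased` (768); the label counts and weight matrices here are placeholders, since the real heads are learned `nn.Linear` layers in the repo:

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, N_NARR, N_SUB = 768, 10, 30   # label counts are illustrative

# Placeholder head weights (learned linear layers in the actual model)
W_narr = rng.normal(size=(HIDDEN, N_NARR))
W_sub = rng.normal(size=(HIDDEN + N_NARR, N_SUB))

def forward(pooled):
    """pooled: (batch, HIDDEN) pooled BERT output."""
    narrative_logits = pooled @ W_narr                         # narrative head
    sub_in = np.concatenate([pooled, narrative_logits], axis=1)
    subnarrative_logits = sub_in @ W_sub                       # conditioned head
    return narrative_logits, subnarrative_logits
```

Feeding the narrative logits into the subnarrative head is what lets the model condition subnarrative predictions on which narratives look active.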
`inference.py` enforces:

- If the narrative set is empty or contains only `Other` → set the subnarrative to `Other`
- Otherwise, for each predicted narrative, ensure at least one matching subnarrative exists; if not → append `Narrative: Other`
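A minimal sketch of this post-processing step, assuming subnarratives are strings in `Narrative: Subnarrative` form (the function name is hypothetical):

```python
def enforce_consistency(narratives, subnarratives):
    """Apply the two hierarchical-consistency rules described above.

    Hypothetical helper; assumes subnarratives are 'Narrative: Sub' strings.
    """
    # Rule 1: empty or Other-only narrative set -> subnarrative is just Other
    if not narratives or narratives == ["Other"]:
        return narratives, ["Other"]
    fixed = list(subnarratives)
    # Rule 2: every predicted narrative needs at least one subnarrative
    for narr in narratives:
        if narr == "Other":
            continue
        if not any(s.startswith(narr + ":") for s in fixed):
            fixed.append(f"{narr}: Other")
    return narratives, fixed
```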
Inference uses thresholds + fallback:

- Pick labels above the primary threshold
- If none, force the top label and optionally add a 2nd if it is above a fallback threshold
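The threshold-plus-fallback selection can be sketched as follows; the threshold values and function name are illustrative, not the repo's actual defaults:

```python
import numpy as np

def select_labels(probs, labels, primary=0.5, fallback=0.35):
    """Pick labels above `primary`; if none, fall back to the top label(s).

    Illustrative sketch; assumes at least two candidate labels and that
    the 0.5/0.35 thresholds are placeholders, not the repo's defaults.
    """
    picked = [l for l, p in zip(labels, probs) if p >= primary]
    if not picked:
        order = np.argsort(probs)[::-1]   # indices sorted by prob, descending
        picked = [labels[order[0]]]       # always force the top label
        if probs[order[1]] >= fallback:   # optionally add a 2nd label
            picked.append(labels[order[1]])
    return picked
```

The fallback guarantees every article gets at least one label even when no probability clears the primary threshold.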
- File not found: ensure `data/annotations/annotation.txt` and the article text files exist under `data/articles/` and `data/validation/`.
- Mismatch in `article_id` names: `article_id` is used as a file name directly.
- Long texts: the model uses `max_length=512` with truncation.
Implemented end-to-end by Abdul Wahab Madni (training + inference + evaluation).