
Commit d893317

committed: initial commit
1 parent f6278b6 commit d893317

File tree

11 files changed, +3403 −2 lines changed


.gitignore

Lines changed: 3 additions & 0 deletions

@@ -0,0 +1,3 @@
+*~
+__pycache__/*
+.venv

README.md

Lines changed: 135 additions & 2 deletions

@@ -1,2 +1,135 @@
-# Tibert
-End-to-End BERT-Based Coreference System

# Current Status

**Note that this is an early release. Don't hesitate to report bugs and possible improvements! There are surely many.**

# tibert

`tibert` is a transformers-compatible reproduction of the model from [End-to-end Neural Coreference Resolution](https://aclanthology.org/D17-1018/), with several modifications. Among these:

- use of BERT (or any BERT variant) as the encoder, as in [BERT for Coreference Resolution: Baselines and Analysis](https://aclanthology.org/D19-1588/)
- support for batch sizes greater than 1
- support for singletons, as in [Adapted End-to-End Coreference Resolution System for Anaphoric Identities in Dialogues](https://aclanthology.org/2021.codi-sharedtask.6)

It can be installed with `pip install tibert`.

# Documentation

## Simple Prediction Example

Here is an example of using the simple prediction interface:

```python
from tibert import BertForCoreferenceResolution, predict_coref_simple
from tibert.utils import pprint_coreference_document
from transformers import BertTokenizerFast

model = BertForCoreferenceResolution.from_pretrained(
    "compnet-renard/bert-base-cased-literary-coref"
)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")

coref_out = predict_coref_simple(
    "Sli did not want the earpods. He didn't like them.", model, tokenizer
)

pprint_coreference_document(coref_out)
```

results in:

`>>> (0 Sli ) did not want the earpods. (0 He ) didn't like them.`

## Batched Predictions for Performance

A more advanced prediction interface is available:

```python
from transformers import BertTokenizerFast
from tibert import predict_coref, BertForCoreferenceResolution
from tibert.utils import pprint_coreference_document

model = BertForCoreferenceResolution.from_pretrained(
    "compnet-renard/bert-base-cased-literary-coref"
)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")

documents = [
    "Sli did not want the earpods. He didn't like them.",
    "Princess Liana felt sad, because Zarth Arn was gone. The princess went to sleep.",
]

annotated_docs = predict_coref(documents, model, tokenizer, batch_size=2)

for doc in annotated_docs:
    pprint_coreference_document(doc)
```

results in:

`>>> (0 Sli ) did not want the earpods . (0 He ) didn't like them .`

`>>> (0 Princess Liana ) felt sad , because (1 Zarth Arn ) was gone . (0 The princess ) went to sleep .`

## Training a Model

Aside from the `tibert.train.train_coref_model` function, it is possible to train a model from the command line. Training a model requires installing the `sacred` library. Here is the most basic example:

```sh
python -m tibert.run_train with \
    dataset_path=/path/to/litbank/repository \
    out_model_path=/path/to/output/model/directory
```

The following parameters can be set (taken from the config function in `./tibert/run_train.py`):

| Parameter                    | Default Value       |
|------------------------------|---------------------|
| `batch_size`                 | `1`                 |
| `epochs_nb`                  | `30`                |
| `dataset_path`               | `"~/litbank"`       |
| `mentions_per_tokens`        | `0.4`               |
| `antecedents_nb`             | `350`               |
| `max_span_size`              | `10`                |
| `mention_scorer_hidden_size` | `3000`              |
| `sents_per_documents_train`  | `11`                |
| `mention_loss_coeff`         | `0.1`               |
| `bert_lr`                    | `1e-5`              |
| `task_lr`                    | `2e-4`              |
| `dropout`                    | `0.3`               |
| `segment_size`               | `128`               |
| `encoder`                    | `"bert-base-cased"` |
| `out_model_path`             | `"~/tibert/model"`  |

One can monitor training metrics by adding run observers using command-line flags; see the `sacred` documentation for more details.
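
For example, assuming `sacred`'s standard `-F`/`--file_storage` flag (which attaches a `FileStorageObserver` writing each run's config and metrics under the given directory), a monitored run overriding one of the parameters above might look like:

```sh
# Hypothetical invocation: -F stores each run's metrics under /path/to/runs
python -m tibert.run_train -F /path/to/runs with \
    dataset_path=/path/to/litbank/repository \
    out_model_path=/path/to/output/model/directory \
    batch_size=2
```
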
# Method

We reimplemented the model from [Lee et al., 2017](https://aclanthology.org/D17-1018/) from scratch, but used BERT as the encoder, as in [Joshi et al., 2019](https://aclanthology.org/D19-1588/). We do not use higher-order inference as in [Lee et al., 2018](https://aclanthology.org/N18-2108/), since [Xu and Choi, 2020](https://aclanthology.org/2020.emnlp-main.686/) found that it is not necessarily useful.
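
For reference, the pairwise score at the heart of [Lee et al., 2017](https://aclanthology.org/D17-1018/) (reproduced here from the paper) combines two mention scores $s_m$ and an antecedent compatibility score $s_a$:

$$
s(i, j) =
\begin{cases}
0 & \text{if } j = \varepsilon \\
s_m(i) + s_m(j) + s_a(i, j) & \text{otherwise}
\end{cases}
$$

where $\varepsilon$ is the dummy antecedent, selected when span $i$ has no real antecedent.
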
## Singletons

Unfortunately, the framework from [Lee et al., 2017](https://aclanthology.org/D17-1018/) cannot represent singletons, because the authors were working on the OntoNotes dataset, where singletons are not annotated. Since we wanted to work on LitBank, we had to find a way to represent them.

We opted to do as in [Xu and Choi, 2021](https://aclanthology.org/2021.codi-sharedtask.6/): we consider mentions with a high enough mention score to be singletons, even when they belong to no cluster. To force the model to learn proper mention scores, we add an auxiliary loss on the mention score (also as in [Xu and Choi, 2021](https://aclanthology.org/2021.codi-sharedtask.6/)). To counter the dataset imbalance between positive and negative mentions, we compute a weighted loss instead of performing sampling, as sketched below.
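
A minimal sketch of what such a class-weighted mention loss can look like (the tensor names and exact weighting scheme are illustrative assumptions, not `tibert`'s actual implementation):

```python
import torch
import torch.nn.functional as F

def weighted_mention_loss(
    mention_logits: torch.Tensor, mention_labels: torch.Tensor
) -> torch.Tensor:
    """Binary mention loss that up-weights the rare positive class.

    Most candidate spans are not mentions, so an unweighted loss would
    push every mention score toward 0; weighting positives by the
    negative/positive ratio replaces negative sampling.
    """
    pos = mention_labels.sum()
    neg = mention_labels.numel() - pos
    pos_weight = neg / pos.clamp(min=1)  # illustrative weighting choice
    return F.binary_cross_entropy_with_logits(
        mention_logits, mention_labels.float(), pos_weight=pos_weight
    )
```
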
## Additional Features

Several works make use of additional features. For now, only the distance between spans is implemented, as in the sketch below.
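
A common way to turn the raw distance between spans into a learnable feature is the exponential bucketing scheme of [Lee et al., 2017](https://aclanthology.org/D17-1018/); the sketch below follows that scheme, with the bucket boundaries and embedding size as assumptions rather than `tibert`'s actual values:

```python
import torch
from torch import nn

# Boundaries yielding the buckets 0, 1, 2, 3, 4, 5-7, 8-15,
# 16-31, 32-63 and 64+ (Lee et al., 2017 style).
BOUNDARIES = torch.tensor([1, 2, 3, 4, 5, 8, 16, 32, 64])

# One learned vector per bucket; the feature size 20 is arbitrary
distance_embedding = nn.Embedding(len(BOUNDARIES) + 1, 20)

def span_distance_feature(distances: torch.Tensor) -> torch.Tensor:
    """Map raw span distances to learned bucket embeddings."""
    buckets = torch.bucketize(distances, BOUNDARIES, right=True)
    return distance_embedding(buckets)
```
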
# Results

The following table presents the results we obtained by training this model (for now, it has only one entry!). Note that:

- the reported results were obtained with documents truncated to 512 tokens, so they may not reflect performance on full documents
- the reported results cannot be directly compared with those of [the original LitBank paper](https://arxiv.org/abs/1912.01140), since we only compute performance on one split of the data

| Dataset | Base model        | MUC   | B3    | CEAF  | CoNLL F1 |
|---------|-------------------|-------|-------|-------|----------|
| LitBank | `bert-base-cased` | 75.49 | 65.69 | 55.56 | 65.58    |
