Adding BERT for MS-MARCO Passage re-ranking #277

atif93 wants to merge 12 commits into asyml:master
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master     #277      +/-   ##
==========================================
- Coverage   83.07%   83.01%   -0.06%
==========================================
  Files         195      195
  Lines       15323    15338      +15
==========================================
+ Hits        12729    12733       +4
- Misses       2594     2605      +11
```
Continue to review full report at Codecov.
Wanted to get an idea of what you guys think of this design for loading a pre-trained `BERTClassifier` config.
```python
super().__init__(hparams=hparams)
```

```python
self.load_pretrained_config(pretrained_model_name, cache_dir)
```
Will `load_pretrained_config` and `init_pretrained_weights` be called twice (once in `BERTClassifier`, and once in `BERTEncoder`)?
If that is the case, we probably should not load the pre-trained weights in `self._encoder` (`BERTEncoder`).
Discussed offline.
We can pass `pretrained_model_name` as `None` while instantiating the encoder in `BERTClassifier`.
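A minimal runnable sketch of the fix discussed above, using mock classes rather than the real Texar API (`MockBERTEncoder`, `MockBERTClassifier`, and `load_calls` are illustrative names): by passing `pretrained_model_name=None` to the inner encoder, the pre-trained weights are loaded only once, at the classifier level.

```python
# Record of every simulated weight-loading call, so we can verify
# that loading happens exactly once.
load_calls = []

class MockBERTEncoder:
    def __init__(self, pretrained_model_name=None):
        # The encoder only loads weights when given a model name.
        if pretrained_model_name is not None:
            load_calls.append(("encoder", pretrained_model_name))

class MockBERTClassifier:
    def __init__(self, pretrained_model_name=None):
        # Pass None so the inner encoder skips its own weight loading.
        self._encoder = MockBERTEncoder(pretrained_model_name=None)
        # The classifier itself loads the pre-trained weights exactly once.
        if pretrained_model_name is not None:
            load_calls.append(("classifier", pretrained_model_name))

MockBERTClassifier(pretrained_model_name="bert-base-uncased")
# load_calls now holds a single entry, recorded by the classifier only.
```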
If you set both the `pretrained_model_name` argument and `pretrained_model_name` in `hparams` to `None`, `BERTEncoder` won't load the pre-trained weights.
Made both changes.
```python
# BERT for MS-MARCO
'bert-msmarco-base': 512,
'bert-msmarco-large': 512,
```
This won't be the last/best BERT model for MS-MARCO, so we should probably come up with more specific names, say `bert-msmarco-nguyen2019`.
Sure, let me change that.
Adding BERT fine-tuned on MS-MARCO for the passage re-ranking task (https://arxiv.org/abs/1901.04085).
Since this is a pre-trained classifier, we had to add the final linear layer parameters in
`PretrainedBERTMixin`. Based on the `pretrained_model_name`, the weights of the final classifier layer will be loaded if they are present.

Resolves #254
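To illustrate the "loaded if present" behavior, here is a hedged sketch of how a mixin might decide which tensors to expect in a checkpoint, keyed on the model name. The set and function names below (`CLASSIFIER_WEIGHT_MODELS`, `checkpoint_tensors`) are hypothetical, not the actual `PretrainedBERTMixin` implementation.

```python
# Model names whose checkpoints ship a fine-tuned final classifier layer.
CLASSIFIER_WEIGHT_MODELS = {'bert-msmarco-base', 'bert-msmarco-large'}

def checkpoint_tensors(pretrained_model_name):
    """Return the tensor names expected in the downloaded checkpoint."""
    # Every BERT checkpoint carries the encoder and pooler weights.
    tensors = ['bert/encoder', 'bert/pooler']
    if pretrained_model_name in CLASSIFIER_WEIGHT_MODELS:
        # MS-MARCO checkpoints additionally include the output layer
        # fine-tuned for passage re-ranking.
        tensors += ['output_weights', 'output_bias']
    return tensors
```

With this shape, a vanilla model like `bert-base-uncased` loads only the encoder weights, while the MS-MARCO variants also restore the final classifier layer.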