Implement Pooler layer in BertModelLayer #82

Open
mrinaald wants to merge 3 commits into kpe:master from mrinaald:pooler-layer

mrinaald commented Nov 9, 2020

Implement the Pooler layer from the BERT model architecture, which produces a pooled feature vector from the first token of the output sequence. Many online blog posts and examples take this pooled output from BERT directly and add dense (or other) layers on top of it.
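
For readers unfamiliar with the pooler, here is a minimal pure-Python sketch of the computation as the original BERT reference implementation does it: take the hidden state of the first token and pass it through a dense layer with a tanh activation. It is an illustration only, not the code in this PR; the function name `pooler` and the toy values are hypothetical, and the kernel is assumed to have shape [hidden_size, hidden_size], as `bert/pooler/dense/kernel` does in the checkpoint.

```python
import math


def pooler(sequence_output, kernel, bias):
    """Pool a sequence of hidden states into a single vector.

    sequence_output: list of token vectors, each of length hidden_size
    kernel: hidden_size x hidden_size weight matrix, given as a list of rows
    bias: vector of length hidden_size
    """
    # Only the first token (the [CLS] position) contributes to the result.
    first_token = sequence_output[0]
    pooled = []
    for j in range(len(bias)):
        # Dense layer: dot product of the first-token vector with column j.
        z = sum(first_token[i] * kernel[i][j] for i in range(len(first_token)))
        pooled.append(math.tanh(z + bias[j]))
    return pooled


# Toy input: 3 tokens, hidden_size = 2 (hypothetical values).
sequence_output = [[0.5, -1.0], [0.1, 0.2], [0.3, 0.4]]
kernel = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the result is tanh(first token)
bias = [0.0, 0.0]
pooled = pooler(sequence_output, kernel, bias)
```

The result has one value per hidden unit regardless of sequence length, which is why it is convenient to feed into a classification head.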

With this change, the pooler layer weights available in the downloaded checkpoint files of various models can also be loaded into the BertModelLayer object.

  • Original Behaviour:
    Done loading 37 BERT weights from: ~/Downloads/BERT/BERT-Weights/uncased_L-2_H-128_A-2/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7f6a9c64df40> (prefix:bert_orig). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
    Unused weights from checkpoint:
    bert/pooler/dense/bias
    bert/pooler/dense/kernel
    cls/predictions/output_bias
    cls/predictions/transform/LayerNorm/beta
    cls/predictions/transform/LayerNorm/gamma
    cls/predictions/transform/dense/bias
    cls/predictions/transform/dense/kernel
    cls/seq_relationship/output_bias
    cls/seq_relationship/output_weights

  • Modified Behaviour:
    Done loading 39 BERT weights from: ~/Downloads/BERT/BERT-Weights/uncased_L-2_H-128_A-2/bert_model.ckpt into <bert.model.BertModelLayer object at 0x7f6a9d026a30> (prefix:bert_pooled). Count of weights not found in the checkpoint was: [0]. Count of weights with mismatched shape: [0]
    Unused weights from checkpoint:
    cls/predictions/output_bias
    cls/predictions/transform/LayerNorm/beta
    cls/predictions/transform/LayerNorm/gamma
    cls/predictions/transform/dense/bias
    cls/predictions/transform/dense/kernel
    cls/seq_relationship/output_bias
    cls/seq_relationship/output_weights

To get the pooler layer output, we need to initialize the BertModelLayer as follows:
bert_params.return_pooler_output = True
l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

Ahmedn1 commented Aug 31, 2021

Would someone merge this branch, please?
