import torch.nn as nn

class SegmentEmbedding(nn.Embedding):
    def __init__(self, embed_size=512):
        super().__init__(3, embed_size, padding_idx=0)
This is the source code. The embedding table has 3 rows, and the first index (0) is reserved for padding, so only 2 segments are supported. Why does BERT support only 2 segments?
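To make the observation concrete, here is a minimal sketch of how that table behaves. The small `embed_size=8` and the example `segment_ids` are my own illustrative choices, not taken from the original code, and the out-of-range behavior shown is what I would expect on CPU.

```python
import torch
import torch.nn as nn


class SegmentEmbedding(nn.Embedding):
    def __init__(self, embed_size=512):
        # 3 rows in total: index 0 = padding, indices 1 and 2 = the two segments
        super().__init__(3, embed_size, padding_idx=0)


emb = SegmentEmbedding(embed_size=8)  # small size chosen only for illustration

# Hypothetical labels for one sequence: sentence A (1), sentence B (2), padding (0)
segment_ids = torch.tensor([[1, 1, 1, 2, 2, 0, 0]])
out = emb(segment_ids)

out_shape = tuple(out.shape)  # (batch, seq_len, embed_size)
# The padding row is initialized to zeros and receives no gradient updates
padding_is_zero = bool((out[0, -1] == 0).all())

# A third segment would need id 3, which is outside the 3-row table
try:
    emb(torch.tensor([[3]]))
    third_segment_ok = True
except IndexError:
    third_segment_ok = False

print(out_shape, padding_is_zero, third_segment_ok)
```

So with this definition the usable segment ids are just 1 and 2; supporting more segments would mean enlarging the first argument passed to `nn.Embedding`.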