What exactly does the training procedure look like?

I am using your model to train my own model. Could you tell me whether the training procedure is
1. random initialization -> fine-tuning with a fixed learning rate
or
2. freezing the BERT part and training the BiLSTM-CRF part -> fine-tuning the whole network with a small learning rate
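
To make option 2 concrete, here is a minimal PyTorch sketch of the two-stage schedule I have in mind. It is not taken from your repository: `Tagger`, `set_encoder_trainable`, and `train_steps` are hypothetical names, the embedding layer is only a stand-in for BERT, a linear layer stands in for the CRF, and the learning rates (1e-3, then 1e-5) are assumed values.

```python
import torch
from torch import nn


class Tagger(nn.Module):
    """Hypothetical stand-in: `encoder` plays the role of BERT,
    `lstm` + `head` play the role of the BiLSTM-CRF part."""

    def __init__(self, vocab=100, hidden=16, tags=5):
        super().__init__()
        self.encoder = nn.Embedding(vocab, hidden)  # placeholder for BERT
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, tags)  # placeholder for CRF emissions

    def forward(self, ids):
        out, _ = self.lstm(self.encoder(ids))
        return self.head(out)


def set_encoder_trainable(model, trainable):
    # Freeze or unfreeze only the encoder (the "BERT part").
    for p in model.encoder.parameters():
        p.requires_grad = trainable


def train_steps(model, optimizer, ids, labels, steps=2):
    # Token-level cross-entropy is used here instead of a CRF loss for brevity.
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(ids)
        loss = loss_fn(logits.reshape(-1, logits.size(-1)), labels.reshape(-1))
        loss.backward()
        optimizer.step()


torch.manual_seed(0)
model = Tagger()
ids = torch.randint(0, 100, (4, 7))    # toy batch of token ids
labels = torch.randint(0, 5, (4, 7))   # toy tag labels

# Stage 1: encoder frozen, only the BiLSTM and head are trained.
set_encoder_trainable(model, False)
encoder_before = model.encoder.weight.detach().clone()
opt1 = torch.optim.Adam(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)
train_steps(model, opt1, ids, labels)
encoder_unchanged_in_stage1 = torch.equal(encoder_before, model.encoder.weight)

# Stage 2: unfreeze everything and fine-tune with a small learning rate.
set_encoder_trainable(model, True)
opt2 = torch.optim.Adam(model.parameters(), lr=1e-5)
train_steps(model, opt2, ids, labels)
encoder_changed_in_stage2 = not torch.equal(encoder_before, model.encoder.weight)
```

Option 1 would correspond to skipping stage 1 and running a single optimizer over all parameters with one fixed learning rate.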

I ask because, at test time, the code seems to use the original BERT representations.