Do you need to tokenize your data when using a BERT/RoBERTa model?

Considering that these models come with their own tokenizers and subword models (WordPiece for BERT, byte-level BPE for RoBERTa), what format should the input files have when training a QE model with either of these LMs? Should you apply any tokenization or casing model to the data beforehand?

Thanks in advance for your help!