No text is extracted from the following PDF, even though it has a clear text layer. PyPdfium works fine on the same file:
https://www.has-sante.fr/upload/docs/application/pdf/ct031458.pdf
I have seen a few other examples where the text layer comes out garbled when using this project as a backend, so I thought this report might be helpful.
Thanks,
Herman