  1. Initially, I used a pure ViT (6ecc3f4), but the encoder did not perform well: the model produced LaTeX code, but it had nothing to do with the input image. Given that the model is very small, I can confirm the authors' findings.
  2. Well, it is still a 1D embedding. I use the same strategy as the authors, only in a more generalized way.
  3. That is all the data I rendered out for training. The LaTeX file has more entries because it also contains formulas for equations I wasn't able to render (they threw errors). I had to leave them in the file because I was lazy and my dataloader would break without them. I will probably return to that problem and compile a better,…
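
As a rough illustration of point 3, one way to keep the unrenderable formulas in the LaTeX file without breaking line-based indexing is to filter at load time. This is only a sketch, not the repository's actual dataloader: the function name `pair_formulas_with_images`, the zero-padded `<line index>.png` naming scheme, and the directory layout are all assumptions made for the example.

```python
from pathlib import Path


def pair_formulas_with_images(formulas, image_dir, suffix=".png", width=7):
    """Pair each formula with its rendered image, skipping unrendered ones.

    Assumption (hypothetical): the image for the formula on line `idx` of the
    LaTeX file is stored as `<idx zero-padded to `width` digits><suffix>`.
    """
    image_dir = Path(image_dir)
    pairs = []
    for idx, formula in enumerate(formulas):
        image_path = image_dir / f"{idx:0{width}d}{suffix}"
        # Formulas that failed to render have no image file, so they are
        # dropped here while the original line indices stay intact.
        if image_path.is_file():
            pairs.append((idx, formula.strip(), image_path))
    return pairs
```

With something like this, the formula file can stay untouched, and the dataset length is simply `len(pairs)` rather than the number of lines in the file.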

Answer selected by TITC