Dear authors, thanks for your outstanding work. I have a question. When predicting a node, you don't know in advance whether it is a terminal node or a nonterminal node, yet the two kinds of nodes are predicted in different ways (described in the article as two methods: Predicting AST Nodes and Predicting Subtokens). So, in the code implementation, how do you distinguish between these two kinds of nodes in order to choose the right prediction method?