Description
Given a rule:
rule: sub_rule ';' ;
When sub_rule is matched successfully but the parser then encounters end of input (<EOF>) rather than ';', an error message like this is produced:
-end of input-(1),error 1 : Unexpected token, at offset 6226064, at <EOF> : syntax error...
(The offset 6226064 does not look correct to me, but in this case the process does not dump core.)
Under some conditions, however, the process dumps core before the ", at offset 6226064, at <EOF> : syntax error..." part of the message is printed.
Digging into the source code, I found that it crashes in newStr8 (line 587 of antlr3string.c) when strlen is applied to ptr.
newStr8 is called by getText at line 371 of antlr3commontoken.c.
In normal cases newStr8 is called by getText at line 389 of antlr3commontoken.c, because token->textState equals neither ANTLR3_TEXT_STRING nor ANTLR3_TEXT_CHARP.
Digging a little deeper showed that token points to the eofToken of ANTLR3_TOKEN_SOURCE, which is not initialized, so both eofToken->textState and eofToken->tokText.chars contain random values. When eofToken->textState happens to equal ANTLR3_TEXT_CHARP and eofToken->tokText.chars holds an inaccessible address, the process dumps core.
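To illustrate the failure mode, here is a minimal sketch of the kind of initialization that would avoid it. It does not use the real ANTLR3 headers: struct sketch_token, the text_state enum, SKETCH_TOKEN_EOF, sketch_init_eof_token and sketch_text_length are hypothetical stand-ins for the runtime's token type, textState values and getText path, so treat it as an assumption about the shape of a fix rather than the runtime's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the runtime's text states (not the real
 * ANTLR3_TEXT_* constants). */
enum text_state { TEXT_NONE = 0, TEXT_CHARP = 1, TEXT_STRING = 2 };

/* Hypothetical, heavily reduced token; the real token has many more fields. */
struct sketch_token {
    int             type;
    enum text_state text_state;
    const char     *chars;
};

#define SKETCH_TOKEN_EOF (-1)

/* Put the EOF token into a known state so that no field keeps whatever
 * garbage was in the memory it was allocated from. */
static void sketch_init_eof_token(struct sketch_token *tok)
{
    assert(tok != NULL);
    memset(tok, 0, sizeof *tok);
    tok->type       = SKETCH_TOKEN_EOF;
    tok->text_state = TEXT_NONE;
    tok->chars      = NULL;
}

/* Models the dangerous step: strlen is only safe when the state says the
 * token holds a C string AND the pointer is actually set. */
static size_t sketch_text_length(const struct sketch_token *tok)
{
    assert(tok != NULL);
    if (tok->text_state == TEXT_CHARP && tok->chars != NULL) {
        return strlen(tok->chars);
    }
    return 0;
}
```

With the token initialized this way, the text-length lookup never dereferences a random pointer. Without it, the outcome depends on whatever bytes happen to be in the token's memory, which would match the intermittent crash described above.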
Below is the call stack at the point where token points to eofToken:
_tokLT(ANTLR3_TOKEN_STREAM_struct * ts=0x003e6f70, int k=1) Line 377 C_
antlr3RecognitionExceptionNew(ANTLR3_BASE_RECOGNIZER_struct * recognizer=0x00754378) Line 347 + 0x14 bytes C_
_recoverFromMismatchedToken(ANTLR3_BASE_RECOGNIZER_struct * recognizer=0x00754378, unsigned int ttype=64, ANTLR3_BITSET_LIST_struct * follow=0x004c9608) Line 1471 + 0x9 bytes C_
_match(ANTLR3_BASE_RECOGNIZER_struct * recognizer=0x00754378, unsigned int ttype=64, ANTLR3_BITSET_LIST_struct * follow=0x004c9608) Line 478 + 0x16 bytes C_
Next is the call stack just before the process crashes:
_newStr8(ANTLR3_STRING_FACTORY_struct * factory=0x003e6590, unsigned char * ptr=0x00473894) Line 587 C_
_getText(ANTLR3_COMMON_TOKEN_struct * token=0x003e6c50) Line 389 + 0x19 bytes C_
_toString(ANTLR3_COMMON_TOKEN_struct * token=0x003e6c50) Line 538 + 0xe bytes C_
_displayRecognitionError(ANTLR3_BASE_RECOGNIZER_struct * recognizer=0x00754378, unsigned char * * tokenNames=0x004c9460) Line 1066 + 0x11 bytes C_
_reportError(ANTLR3_BASE_RECOGNIZER_struct * recognizer=0x00754378) Line 747 + 0x18 bytes C_