15 Dec 2012, 09:49

Dmitry Maslennikov (3 posts)

In the test grammar T.g4:


grammar T;

def     : command id ;
command    : CMD ;
id    : SYMBOL+;
CMD    : 'cmd';
TEST    : 'test';
SYMBOL    : [a-zA-Z];
WS      : [ |\r|\n]+ -> skip;

The tokens CMD and TEST are not meant to be reserved keywords.
For the test text
cmd test
I get the error
line 1:4 mismatched input 'test' expecting SYMBOL

What am I doing wrong?

17 Dec 2012, 22:23

Terence Parr (33 posts)

Hi. Try this:

SYMBOL : [a-zA-Z]+ ;

and get rid of id.
Ter

18 Dec 2012, 20:50

Dmitry Maslennikov (3 posts)


grammar T;

def     : command ID ;
command    : CMD;
CMD    : 'cmd';
TEST    : 'test';
ID    : [a-zA-Z]+;
WS      : [ |\r|\n]+ -> skip;

This also gives an error:
[@0,0:2='cmd',<1>,1:0]
[@1,4:7='test',<2>,1:4]
[@2,8:7='<EOF>',<-1>,1:8]
line 1:4 mismatched input 'test' expecting ID

21 Dec 2012, 18:51

Bernard Kaiflin (8 posts)

I don’t visit the forum frequently. For questions like this, you’ll get a quicker answer on Stack Overflow: http://stackoverflow.com/questions/tagged/antlr4

Given your grammar, the lexer can match the input t-e-s-t with both rules TEST and ID. When two lexer rules match the same text, ANTLR4 resolves the ambiguity by choosing the rule that is defined first, here TEST. Hence the parser receives a token of type TEST while it is expecting a token of type ID.
You can simplify the grammar like this:


grammar T;

def : 'cmd' ID {System.out.println("found <" + $def.text + ">");};
ID  : [a-zA-Z]+;
WS  : [ \r\n]+ -> skip;

$ grun T def -diagnostics t.txt 
found <cmdtest>


Note that vertical bars are not necessary within a charset [].

If you prefer explicit tokens:


grammar T;

def : CMD ID {System.out.println("found <" + $def.text + ">");}
    ;
CMD : 'cmd' ;
ID  : [a-zA-Z]+ ;
WS  : [ \r\n]+ -> skip ; 


Again, the input c-m-d can be matched by both CMD and ID, so CMD must precede ID.
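
If the TEST token has to stay in the grammar but the word test should still be accepted as an identifier, one common approach is to route identifiers through a parser rule that also accepts the TEST token. The following is only a sketch and has not been run against this exact setup; it reuses the `id` parser rule name from the original question, and which tokens to list as alternatives in `id` is an assumption about what the grammar needs.

```
grammar T;

def     : command id ;
command : CMD ;

// Parser rule: an identifier is either a plain ID token or the
// TEST token, so "cmd test" parses even though the lexer emits TEST.
id      : ID | TEST ;

// Literal tokens come before ID so they win when both rules
// match the same text.
CMD     : 'cmd' ;
TEST    : 'test' ;
ID      : [a-zA-Z]+ ;
WS      : [ \r\n]+ -> skip ;
```

With this layout the lexer still produces a TEST token for the input test, but the parser no longer insists on ID at that position. Any other literal token that should double as an identifier would need to be added to `id` in the same way.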
