java antlr antlr4 antlrworks

Force ANTLR to read only the first comment section from the input file and skip the remaining comments


My input file contains multiple Javadoc-style comments (/** ..... */). I need to read only the very first comment section and skip all the others.

Input.txt

/** 
  @Description("Hi there")
  @Input("String")

 */

/**
 * This is the a
 * commented section that we
 * don't want to read.
 */

/**
 * This is the another
 * commented section that we
 * don't want to read.
 */

My lexer grammar is as follows:

lexer grammar AnnotationLexer;

ANNOTATION_START
 : '/**' -> mode(INSIDE), skip
 ;

IGNORE
 : . -> skip
 ;

mode INSIDE;


KEY : '@' [a-zA-Z]+ ;


STRING: '"' (~'"' | ',')* '"' ;


ANNOTATION_END
 : '*/' -> mode(DEFAULT_MODE), skip
 ;

IGNORE_INSIDE
 : [ \t\r\n] -> skip
 ;

Solution

  • Here is my attempt (untested). I am afraid it will not be satisfactory unless your input really contains only Javadoc comments and nothing else:

    lexer grammar AnnotationLexer;
    
    ANNOTATION_START
     : '/**' -> mode(INSIDE), skip
     ;
    
    IGNORE
     : . -> skip
     ;
    
    mode INSIDE;
    
    
    KEY : '@' [a-zA-Z]+ ;
    
    
    STRING: '"' (~'"' | ',')* '"' ;
    
    
    ANNOTATION_END
     : '*/' -> mode(READ_JAVADOC), skip
     ;
    
    IGNORE_INSIDE
     : [ \t\r\n] -> skip
     ;
    
    mode READ_JAVADOC;
    
    JAVADOC_START_AFTER_FIRST
     : '/**' -> skip
     ;
    
    IGNORE_INSIDE_AFTER_FIRST
     : [ \t\r\n] -> skip
     ;
    
    JAVADOC_END_AFTER_FIRST
     : '*/' -> skip
     ;
    

    In practice, this approach forces you to write every lexer rule twice. In this case it is probably better to use semantic predicates instead of modes, with a mutable member field that tracks how many Javadoc comments have been read.
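
    The semantic-predicate idea could look roughly like the sketch below. It is untested and assumes the Java target. The javadocCount field, and where the predicate and the action sit in ANNOTATION_START, are assumptions made for illustration and are not part of the original grammar. The INSIDE rules are copied from the question unchanged (with the missing semicolon added).

    lexer grammar AnnotationLexer;

    @members {
        // Hypothetical state: number of '/**' openers accepted so far.
        private int javadocCount = 0;
    }

    // The predicate lets this rule match only while no Javadoc comment
    // has been entered yet; the action then bumps the counter.
    ANNOTATION_START
     : '/**' {javadocCount == 0}? {javadocCount++;} -> mode(INSIDE), skip
     ;

    // Everything else in the default mode, including later comments,
    // is skipped one character at a time.
    IGNORE
     : . -> skip
     ;

    mode INSIDE;

    KEY : '@' [a-zA-Z]+ ;

    STRING: '"' (~'"' | ',')* '"' ;

    ANNOTATION_END
     : '*/' -> mode(DEFAULT_MODE), skip
     ;

    IGNORE_INSIDE
     : [ \t\r\n] -> skip
     ;

    Once the first comment has been read, the predicate fails for every later '/**', so those characters fall through to IGNORE and no rule has to be duplicated. Note that the counter lives on the lexer instance, so a lexer that is reused for a second input would need a fresh instance or a manual reset of the field.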