
Use PEG.js generated parser to beautify code


I want to create a formatter/linter for a custom programming language. I have been reading about how to do it, but it seems that I'm missing something.

I have been experimenting with PEG.js, and it seems that it will do the job. I've written a small parser, and when run it correctly returns the syntax tree (AST).

The main question now is: how do I use the generated parser to create (for example) a VSCode/Atom/CodeMirror/etc. extension that will beautify/format the code?

Is this the right approach in general? That is, can a single parser be used, or do I need to write a specific parser for each tool?


Solution

  • Beautifying code is basically just converting the AST back into code, throwing away the original whitespace and replacing it with the desired formatting.

    The following grammar converts a list of "a" characters (matched case-insensitively) into an array:

    Expression = _ array:( a:"a"i _ {return a} )+ _ {return array}
    _ = [ \t\n]*
    

    So given this input:

    aa
    aaaAa
    
    a
    

    You get this output:

    [
       "a",
       "a",
       "a",
       "a",
       "a",
       "A",
       "a",
       "a"
    ]
    

    To "beautify" this list, you'd simply convert the array back into a list, except with a more regular spacing:

    result.join(" ");
    // produces "a a a a a A a a", which parses to the same array as the original input
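
The parse-then-print idea above can be wrapped in a single function that takes text and returns text. The following is only a minimal sketch: `parse` here is a hand-written stand-in that mimics the grammar above so the example runs without PEG.js installed (in practice you would call the `parse` function of your generated parser), and the names `print`, `format` and the `separator` option are made up for illustration.

```javascript
// Hypothetical stand-in for the PEG.js-generated parser: it accepts the same
// input as the grammar above (one or more "a"/"A" separated by optional
// spaces, tabs or newlines) and returns the same array of matched characters.
function parse(source) {
  if (!/^[ \t\n]*(?:a[ \t\n]*)+$/i.test(source)) {
    throw new SyntaxError("expected one or more 'a' characters");
  }
  return source.match(/a/gi);
}

// The "printer": turns the AST back into code. All formatting decisions live
// here; the original whitespace is already gone because the parser dropped it.
function print(ast, { separator = " " } = {}) {
  return ast.join(separator);
}

// The formatter is just parse followed by print: text in, text out.
function format(source, options) {
  return print(parse(source), options);
}

console.log(format("aa\naaaAa\n\na\n")); // prints "a a a a a A a a"
```

A useful sanity check for a formatter built this way is that formatting already formatted code changes nothing, i.e. `format(format(x)) === format(x)`. Because `format` is a plain string-to-string function with no editor dependencies, the same function could in principle be shared by every editor extension, with each extension only passing the document text through it; that editor-specific wiring is not shown here.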