python lisp pretty-print s-expression

Pretty-print Lisp using Python


Is there a way to pretty-print a Lisp-style code string (in other words, a bunch of balanced parentheses with text inside) in Python without reinventing the wheel?


Solution

  • Short answer

    I think a reasonable approach, if you can, is to generate Python lists or custom objects instead of strings and use the pprint module, as suggested by @saulspatz.
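
    As a minimal sketch of that idea, assume each form is held as a nested Python list whose atoms are strings or integers (this representation and the name `form` are illustrative, not taken from the question). The standard `pprint` module can then lay it out:

```python
from pprint import pformat

# Assumed representation: a form is a nested list, atoms are str or int.
form = ["*", ["+", 3, "x"], ["f", "x", "y"]]

# At the default width of 80 the whole structure fits on one line.
print(pformat(form))

# A narrower width forces pprint to break the structure across lines.
print(pformat(form, width=20))
```

    Note that the result is Python list syntax (brackets, commas, quoted strings), not Lisp text, so this mainly helps when you inspect the structure yourself.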

    Long answer

    The whole question looks like an instance of the XY problem. Why? Because you are using Python (why not Lisp?) to manipulate strings (why not data structures?) representing generated Lisp-style code, where Lisp-style is defined as "a bunch of parentheses and text within". To the question "how to pretty-print?", I would thus respond "I wouldn't start from here!". The best way to avoid reinventing the wheel in your case, apart from using existing wheels, is to stick to a simple output format.

    But first of all, why do you need to pretty-print? Who will look at the resulting code?

    Depending on the exact Lisp dialect you are using and the intended usage of the code, you could format your code very differently. Think about newlines, indentation and the maximum width of your text, for example. The Common Lisp pretty-printer is particularly sophisticated, and I doubt you want the same level of configurability. If you used Lisp, a simple call to pprint would solve your problem, but you are using Python, so stick with the most reasonable output for the moment, because pretty-printing is a can of worms.

    If your code is intended for human readers, please follow the usual Lisp formatting conventions.

    This is ugly:

     ( * ( + 3 x )
         (f 
            x
            y
         )
     )
    

    This is better:

    (* (+ 3 x)
       (f x y))
    

    Or simply:

    (* (+ 3 x) (f x y))
    

    See here for more details.

    But before printing, you have to parse your input string and make sure it is well-formed. Maybe you are sure it is well-formed because of how you generate your forms, but I'd argue that the printer should ignore that and not make too many assumptions. If you passed the pretty-printer an AST represented by Python objects instead of just strings, this would be easier, as suggested in the comments. You could build a data structure or custom classes and use Python's pprint module. That, as said above, seems to be the way to go in your case, if you can change how you generate your Lisp-style code.
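
    When the output has to be Lisp text rather than Python syntax, a small printer over such a tree may be enough. The sketch below assumes forms are nested Python lists with string or integer atoms; `flat`, `pretty` and the `width` parameter are hypothetical names, and the layout rule (keep a form on one line if it fits, otherwise align the arguments under the first one) is only one possible choice, far simpler than a real Lisp pretty-printer:

```python
def flat(form):
    """One-line rendering: atoms via str(), lists as (a b c)."""
    if isinstance(form, list):
        return "(" + " ".join(flat(x) for x in form) + ")"
    return str(form)


def pretty(form, col=0, width=40):
    """Break a list across lines only when its one-line form would pass `width`.

    `col` is the column at which the form starts on its line.
    """
    line = flat(form)
    if not isinstance(form, list) or len(form) < 3 or col + len(line) <= width:
        return line
    head = "(" + flat(form[0]) + " "
    arg_col = col + len(head)  # later arguments line up under the first one
    args = [pretty(x, arg_col, width) for x in form[1:]]
    return head + ("\n" + " " * arg_col).join(args) + ")"


# Assumed representation: nested lists, atoms are str or int.
form = ["*", ["+", 3, "x"], ["f", "x", "y"]]
print(pretty(form, width=40))  # fits, so it stays on one line
print(pretty(form, width=12))  # too wide, so the arguments are stacked
```

    Atoms are written with `str()`, so a Python string is printed as a bare symbol; Lisp string literals would need their own node type or explicit quoting.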

    With strings, you are supposed to handle any possible input and reject invalid ones. This means checking that parentheses and quotes are balanced (beware of escape characters), etc. Actually, you don't really need to build an intermediate tree for printing (though it would probably help for other parts of your program), because Lisp-style code is made of nested forms in prefix notation: you can scan your input string from left to right and print as required when you see a parenthesis (open parenthesis: recurse; close parenthesis: return from the recursion). When you encounter an unescaped double quote ", read up to the next unescaped one, and so on. This, coupled with a simple printing method, could be sufficient for your needs.
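
    Here is a rough sketch of such a left-to-right scan. It builds no tree: it recurses on each open parenthesis, returns on the matching close, copies string literals verbatim, and raises `ValueError` on unbalanced input. The name `normalize` is hypothetical, and the sketch assumes backslash escapes inside double-quoted strings and no comments, quote characters or other reader syntax. Its "simple printing method" is just to collapse whitespace and emit each top-level form on one line:

```python
def normalize(text):
    """Validate Lisp-style `text` and return it with whitespace collapsed."""
    out = []            # pieces of the form currently being printed
    pos = 0             # current scan position in `text`
    n = len(text)

    def skip_ws():
        nonlocal pos
        while pos < n and text[pos].isspace():
            pos += 1

    def read_form():
        nonlocal pos
        skip_ws()
        if pos >= n:
            raise ValueError("unexpected end of input")
        ch = text[pos]
        if ch == ")":
            raise ValueError(f"unexpected ')' at offset {pos}")
        if ch == "(":
            pos += 1
            out.append("(")
            first = True
            while True:
                skip_ws()
                if pos >= n:
                    raise ValueError("missing ')'")
                if text[pos] == ")":      # close parenthesis: return
                    pos += 1
                    out.append(")")
                    return
                if not first:
                    out.append(" ")
                first = False
                read_form()               # open parenthesis or atom: recurse
        elif ch == '"':
            start = pos
            pos += 1
            while pos < n and text[pos] != '"':
                # A backslash escapes the next character, including a quote.
                pos += 2 if text[pos] == "\\" else 1
            if pos >= n:
                raise ValueError("unterminated string")
            pos += 1
            out.append(text[start:pos])   # keep the literal untouched
        else:
            start = pos
            while pos < n and not text[pos].isspace() and text[pos] not in '()"':
                pos += 1
            out.append(text[start:pos])

    forms = []
    skip_ws()
    while pos < n:
        out.clear()
        read_form()
        forms.append("".join(out))
        skip_ws()
    return "\n".join(forms)


ugly = "( * ( + 3 x )\n    (f \n       x\n       y\n    )\n)"
print(normalize(ugly))
```

    Because the scan is recursive, very deeply nested input can hit Python's recursion limit; an explicit depth counter or stack would avoid that at the cost of slightly longer code.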