I'm writing my own x86-64 assembler in Common Lisp and it produces correct binary code for a subset of x86-64. I use a custom reader macro to convert assembly code to a syntax tree, and it works as expected.
What I am attempting to accomplish is to allow using Lisp code inside assembly code, that way I could use Lisp as a macro language for my assembler. I use #a
as the macro dispatch character and #e
to signal end for the reader. Inside reader #l
changes to Lisp mode and #a
back to assembly mode, #e
(to signal end for the reader macro) should work in both modes.
What I don't understand is how to output the results of the evaluated code back to the input stream (to be processed before the rest of the code), or otherwise how to get the Lisp code output be read again, so that the output of Lisp code (it would be assembly code) could be processed appropriately (the same way as the rest of the assembly code). How can I reach that goal?
A sidenote: this is my first reader macro, so there may be design flaws. I think my approach to read Lisp code into a string is not necessarily the best way, if there is some shorter and more idiomatic way to do it.
Here's a simplified version of my reader macro:
(eval-when (:compile-toplevel :load-toplevel :execute) (defun get-last-character-string (my-string) "This function returns a string consisting of the last character of the input string." (subseq my-string (1- (length my-string)))) (defun get-string-without-last-character (my-string) "This function returns a string without the last character of the input string." (subseq my-string 0 (1- (length my-string)))) (defun get-string-without-invalid-last-character (my-string invalid-last-characters) "If the last character of the string is invalid, the string is returned without it, otherwise completely." (loop for invalid-last-character in invalid-last-characters do (if (equal (get-last-character-string my-string) invalid-last-character) (setf my-string (get-string-without-last-character my-string)))) my-string) (defun transform-code-to-string (stream sub-char numarg) "This function converts assembly code into a string. #l marks change to Lisp code. #a marks return to asm. #e marks end. Partially based on: http://weitz.de/macros.lisp" (declare (ignore sub-char numarg)) (let* ((invalid-last-characters (list "'" " " "(" ")")) (current-mode "asm") (is-there-code-on-this-line nil) (current-phase "beginning-of-line") (my-string "(list ") (lisp-code-string "")) ;; loop through stream. (loop for my-char = (coerce (list (read-char stream t nil t)) 'string) do (cond ((equal current-mode "asm") (cond ((equal current-phase "hash-sign-read") ;; is character e ? ;; if yes, we're done, fix closing parentheses and return. (cond ((equal my-char "e") (return-from transform-code-to-string (concatenate 'string (get-string-without-invalid-last-character (get-string-without-invalid-last-character my-string invalid-last-characters) invalid-last-characters) "))"))) ;; is character l ? ;; if yes, change to Lisp mode. ((equal my-char "l") ;; could Lisp code could be read and evaluated here ;; without reading it into a string? (progn (setf current-mode "Lisp") (setf is-there-code-on-this-line nil) (setf lisp-code-string "") (setf current-phase "beginning-of-line"))) ;; otherwise, print error. (t (error "in asm mode undefined control character after #")))) ;; is character # ? ;; if yes, mark hash sign read. ((equal my-char "#") (setf current-phase "hash-sign-read")) ;; is character newline? ((equal my-char (coerce (list #\Newline) 'string)) (progn (cond ;; is there _no_ code on this line? ;; if true, do not output anything. ((not is-there-code-on-this-line) (setf current-phase "beginning-of-line")) ;; are we inside instruction or inside a parameter? ;; if true, output ") ((or (equal current-phase "inside-instruction") (equal current-phase "inside-parameters")) (progn (setf current-phase "beginning-of-line") (setf is-there-code-on-this-line nil) (setf my-string (concatenate 'string my-string "\")")))) ;; otherwise output ) (t (progn (setf current-phase "beginning-of-line") (setf is-there-code-on-this-line nil) (setf my-string (concatenate 'string my-string ")"))))))) ;; are we inside a comment? ;; if yes, don't output anything. ((equal current-phase "inside-comment") nil) ;; are we in the beginning of the line? ((equal current-phase "beginning-of-line") (cond ;; is this a space in the beginning of the line? ;; if yes, do not output anything. ((equal my-char " ") nil) ;; is this the first character of instruction and not ( or ) ? ;; if yes, mark there is code on this line, mark first character as printed, output " and current character. ((and (not (equal my-char "(")) (not (equal my-char ")"))) (progn (setf current-phase "inside-instruction") (setf is-there-code-on-this-line t) (setf my-string (concatenate 'string my-string "'(\"" my-char)))) (t nil))) ;; is character ; ? ;; if yes, don't output anything, begin comment. ((equal my-char ";") (setf current-phase "inside-comment")) ;; is character space or comma? ((or (equal my-char " ") (equal my-char ",")) (cond ;; is character space or comma, and last character was _not_ space, comma or opening parenthesis? ;; if yes, output " and space. ((and (not (equal (get-last-character-string my-string) " ")) (not (equal (get-last-character-string my-string) ",")) (not (equal (get-last-character-string my-string) "("))) (progn (setf current-phase "in-space") (setf my-string (concatenate 'string my-string "\" ")))) (t nil))) ;; is instruction printed and this is the 1st character of a parameter? ((and (not (equal current-phase "inside-instruction")) (or (equal (get-last-character-string my-string) " ") (equal (get-last-character-string my-string) ","))) (cond ;; mark we're inside parameters, output " and current character. (t (progn (setf current-phase "inside-parameters") (setf my-string (concatenate 'string my-string "\"" my-char)))))) ;; otherwise output the character. (t (setf my-string (concatenate 'string my-string my-char))))) ((equal current-mode "Lisp") ;; in Lisp mode, read text until #e or #a is reached and eval it. (cond ((equal current-phase "hash-sign-read") (cond ;; is character e ? ;; if yes, we're done, fix closing parentheses and return. ((equal my-char "e") (progn (concatenate 'string "#a" (eval lisp-code-string) "#e") ; this should be something different. (return-from transform-code-to-string (concatenate 'string (get-string-without-invalid-last-character (get-string-without-invalid-last-character my-string invalid-last-characters) invalid-last-characters) "))")))) ;; is character a ? ;; if yes, change to asm mode. ((equal my-char "a") (progn (setf current-mode "asm") (setf is-there-code-on-this-line nil) (setf current-phase "beginning-of-line") (concatenate 'string "#a" (eval lisp-code-string) "#e") ; this should be something different. ;; otherwise, add # and the character to the Lisp code to be evaluated. (t (progn (setf current-phase "") (setf my-string (concatenate 'string lisp-code-string "#" my-char)))))) ;; is character # ? ;; if yes, mark hash sign read. ((equal my-char "#") (setf current-phase "hash-sign-read")) ;; otherwise add the character to the Lisp code to be evaluated. (t (setf my-string (concatenate 'string lisp-code-string my-char))))) (t (error "invalid current mode")))))) ;;; #a is the input which starts the custom reader. (set-dispatch-macro-character #\# #\a #'transform-code-to-string))
Here's some example assembly code without Lisp code inside, works:
(defparameter *example-code-x64* #a inc r10 ; increment register r10. mov r11,r12 ; store value of r12 into r11. #e)
And here's some assembly code with Lisp code inside, fails (see compiling error further below). In this one the Lisp code is after assembly code, but assembly and Lisp code should be allowed to be mixed freely using #a
and #l
as separators.
(defparameter *example-code-x64-with-lisp-fails* #a inc r10 ; increment register r10. mov r11,r12 ; store value of r12 into r11. #l (loop for current-instruction in (list "inc" "dec") do (loop for current-arg in (list "r13" "r14" "r15") do (princ (concatenate 'string current-instruction " " current-arg (coerce (list #\Newline) 'string))))) #e)
The Lisp part of the above code should be evaluated in the custom reader, so that it should produce identical results as the code below:
(defparameter *example-code-x64-with-lisp-fails* #a inc r10 ; increment register r10. mov r11,r12 ; store value of r12 into r11. inc r13 inc r14 inc r15 dec r13 dec r14 dec r15 #e)
But instead the compiling fails:
CL-USER> ; compiling file "/home/user/code/lisp/lisp-asm-reader-for-stackoverflow.lisp" (written 28 MAR 2014 10:11:29 PM): ; ; caught ERROR: ; READ error during COMPILE-FILE: ; ; The value -1 is not of type (MOD 4611686018427387901). ; ; (in form starting at line: 1, column: 0, file-position: 0) ; ; compilation unit aborted ; caught 1 fatal ERROR condition ; caught 1 ERROR condition ; compilation aborted after 0:00:00.004 1 compiler notes: /home/user/code/lisp/lisp-asm-reader-for-stackoverflow.lisp:10487 read-error: READ error during COMPILE-FILE: The value -1 is not of type (MOD 4611686018427387901). (in form starting at line: 1, column: 0, file-position: 0) CL-USER>
The idiomatic way to read lisp code from within a reader macro is to call cl:read. In your example, calling read after consuming #L will return the list whose car is loop, and that list can be passed to eval.
To collect the output created during the eval, you can bind *standard-output*. So an option is to use something akin to the following within your reader macro:
(let ((lisp-printed-string
(with-output-to-string (*standard-output*)
(eval (read stream t t t)))))
;; concatenate the lisp printed string onto your
;; hand parsed string here
)
An alternative is to have the user input a lisp form which returns a string {e.g. (concatenate "bar" "baz")}, and collect eval's return value instead of its printed output.