I've completed the last question of my assignment which asked for the following
"Add a local definition construct “let v=e in f” where v is a variable and e and f are expressions. This means that v has the value e with in f, and this over-rides any value of v in the environment (or an enclosing let, too). As with the previous example, you’ll need to think of a syntax for this that you can easily parse with an extension of the existing parser"
My Issue is that the predefined expr type outlined below used by the parser and compiler in the given codebase is unrecognized in my implementation of the let statement and I receive the following error:
Error: This expression has type expr but an expression was expected of type char
I've added functionality to the parser which translates a let statement of the form "~ var1 = exp1 > exp2" wherein the output of the parser is of the form Bes "(v1, e1, e2)". This is all verified and works. My issues arise where I have added a case to the compiler match statement which recognises the 2nd form just mentioned; upon matching my "preprocessor" function is called also shown below. It should take v1, e1 and e2 matched in the compiler and recursively match on e2 replacing any instance of the variable v1 with the expression e1 before returning this updated form of the expression e2 for further compilation. Instead, I receive the match error above.
EXPRESSION DEFINITION
type expr =
Num of int
| Var of char
| Add of expr*expr
| Mul of expr*expr
| Con of expr*expr*expr
| Bes of expr*expr*expr
INSTRUCTION DEFINITION
type instr =
| Push of int
| Fetch of char
| Add2
| Mul2
| Con2
OTHER TYPES
type program = instr list
type stack = int list
MY PREPROCESSOR FUNCTION
let rec preprocessor eo2 eo1 vo1 : expr =
match eo2 with
| Num n -> eo2
| Var v -> if (v = vo1) then eo1 else eo2
| Add (e1, e2) -> Add ((preprocessor e1 eo1 vo1),(preprocessor e2 eo1 vo1))
| Mul (e1, e2) -> Mul ((preprocessor e1 eo1 vo1),(preprocessor e2 eo1 vo1))
| Con (e1,e2,e3) -> Con ((preprocessor e1 eo1 vo1), (preprocessor e2 eo1 vo1), (preprocessor e3 eo1 vo1))
(* | Bes (v1,e1,e2) -> (preprocessor e2 e1 v1) *)
COMPILER FUNCTION
(*
compile : expr -> instr list
*)
let rec compile e =
match e with
| Num n -> [Push n]
| Var v -> [Fetch v]
| Add (e1,e2) -> compile e2 @ compile e1 @ [Add2]
| Mul (e1,e2) -> compile e2 @ compile e1 @ [Mul2]
| Con (e1,e2,e3) -> compile e3 @ compile e2 @ compile e1 @ [Con2]
| Bes (v1,e1,e2) -> compile (preprocessor e2 e1 v1) @ [] (*ERROR SOURCE LINE*)
I expect the preprocessor function to change any variables in the sub-expression e2 to the expression e1 which can then be continued within the compilation process. Instead, I get the error telling me an expr was provided instead of an expected char. I have tried a few things such as the using let statements to assign e2 e1 and v1 to new variables where I explicitly provided them expr types (e2:expr) etc and I also tried to explicitly return expr from preprocessor but I'm still stuck. The code base seemed much too large to just dump here so If I should have posted the functional parsers let me know. Thanks for any help
The error comes from v1
having a type expr
while you need a char
since you're comparing v1
with each v
or Var v
, which have type char
. The root of the problem is that your Bes
constructor has wrong types, it should be Bes of char * expr * expr
, since v
in the let v = x in y
construction of your language should be a variable, which is represented with the char
type in your implementation.
Using preprocessing to implement let is not a good idea:
2.1. It will explode your code, consider the following example:
let x = <very-big-expr> in
let y = x + x + x in
y + y
will end up in duplicating <very-big-exp>
6 times. In general, it will be an exponential explosion, that will lead to gigabytes of AST
2.2. Duplicating an expression is not always valid, for example in languages where expressions may have side effects, this will break the semantics of a program, consider the following example
let x = read_int () in
let y = read_int () in
x*x + y
provided that the input is "3\n4\n"
, the correct implementation shall return 3*3+4=13, while your implementation will lead to the code,
read_int () * read_int () + read_int ()
which will read two integers multiply them and ask for the third one and fail with the end of channel error.
That means, that you have to keep Bes
as a primitive of your language and properly compile it.
If you will decide to stick to the preprocessing stage, then you shouldn't add Bes
to the set of primitives on the first hand, and perform the AST->AST translation every time your parser sees the let statement. Then you will never see Bes
in the compiler code. This will essentially make let
a syntactic sugar or a macro, which is, again, an incorrect semantics of Let, see (1) and (2) above.