[SOLVED] Why is a semicolon required in this single-line Standard ML code: Int.toString 5?

I have a file foo.sml with a single line of Standard ML code:

Int.toString 5

This runs fine in SML/NJ but not in MLton:

$ cat foo.sml
Int.toString 5

$ sml < foo.sml
Standard ML of New Jersey (64-bit) v110.99.5 [built: Thu Mar 14 17:56:03 2024]
- = [autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
val it = "5" : string

$ mlton foo.sml
Error: foo.sml 3.0.
  Syntax error found at EOF.
Error: foo.sml 3.0-3.0.
  Parse error.

If I add a semicolon, the problem gets resolved:

$ cat foo.sml
Int.toString 5;

$ sml < foo.sml
Standard ML of New Jersey (64-bit) v110.99.5 [built: Thu Mar 14 17:56:03 2024]
- [autoloading]
[library $SMLNJ-BASIS/basis.cm is stable]
[library $SMLNJ-BASIS/(basis.cm):basis-common.cm is stable]
[autoloading done]
val it = "5" : string
- 

$ mlton foo.sml
$ ./foo
$

Why is the semicolon necessary for MLton to compile this single-line program successfully?

Solution

The syntax of SML programs is defined as follows (The Definition of Standard ML, page 64):

𝑝𝑟𝑜𝑔𝑟𝑎𝑚 ::= 𝑡𝑜𝑝𝑑𝑒𝑐 `;` 𝑝𝑟𝑜𝑔𝑟𝑎𝑚?

Hence, a program must be terminated with a semicolon. So MLton is correct, while SML/NJ is more liberal than the standard and allows the final semicolon before the end-of-file to be omitted.

Edit: There is the added complication that a bare expression at the top level is a so-called derived form, i.e., syntactic sugar (`exp;` stands for `val it = exp;`), and that form is again only defined with a semicolon (page 72, Fig. 18). In general, if the semicolon were omitted between two such expressions, the syntax would become ambiguous (because x x is an application). Hence, I suspect that MLton's parser requires the semicolon only around such expressions. (Technically, I suppose it parses a file as either a program, or just a 𝑡𝑜𝑝𝑑𝑒𝑐.)
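
To make the derived form concrete, here is a minimal sketch of foo.sml with the expression written out as the declaration it abbreviates. It assumes, as suspected above, that MLton accepts top-level declarations without a terminating semicolon; the extra print line is my addition and not part of the original file.

```sml
(* foo.sml: the bare expression `Int.toString 5;` is sugar for the
   declaration below. Written as a declaration, it should need no
   trailing semicolon (assumption: MLton only insists on the semicolon
   after bare top-level expressions). *)
val it = Int.toString 5

(* Added for illustration: print the value so that the compiled
   program produces visible output. *)
val () = print (it ^ "\n")
```

This would also explain why ./foo printed nothing in the question: a compiled program only evaluates the expression, and echoing the value of it is a feature of the SML/NJ interactive loop.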