smltype-inferencevalue-restrictionmlton

define type before use


According to the MLton documentation:

Standard ML requires types to be defined before they are used. [link]

Not all implementations enforce this requirement (for example, SML/NJ doesn't), but the above-linked page makes a good case for why it might be needed for soundness (depending on how the implementation handles the value restriction), and it accords with some of the commentary in the Definition:

Although not assumed in our definitions, it is intended that every context C = T, U, E has the property that tynames ET. Thus T may be thought of, loosely, as containing all type names which "have been generated". […] Of course, remarks about what "has been generated" are not precise in terms of the semantic rules. But the following precise result may easily be demonstrated:

Let S be a sentence T, U, EphraseA such that tynames ET, and let S′ be a sentence T′, U′, E′ ⊢ phrase′ ⇒ A′ occurring in a proof of S; then also tynames E′ ⊆ T′.

[page 21]

But I'm doubly confused by this.

Firstly — the above theorem seems backward. If I correctly understand the phrase "occurring in a proof of S", then this seems to mean (by contrapositive) "once you have a context that violates the intention that tynames ET, all subsequent contexts will also violate that intention". Even if that's true, it seems that it would be much more useful and meaningful to assert the converse, namely, "if all contexts so far conform to the intention that tynames ET, then any subsequently inferable context will also conform to that intention". No?

Secondly — neither MLton's statement nor the Definition's statement actually seems to be supported by the inference rules (or the "Further Restrictions" that follow them). A few inference rules have "tynames τT of C" or "tynames VET of C" as a side-condition, but none of those rules is needed for this program (given in the above-linked documentation):

val r = ref NONE
datatype t = A | B
val () = r := SOME A

(Specifically: rule (4) has to do with let, rule (14) with =>, and rule (26) with rec. None of those is used in this program.)

And coming at it from the other direction, rule (17), which covers datatype declarations, requires only that the generated type names not be in T of C; so it doesn't prevent the generation of a type name used in the existing value environment (except insofar as it's already true that tynames VET of C).

I feel like I'm probably missing something pretty basic here, but I have no idea what it could be!


Solution

  • Regarding your first question, I'm not sure why you suggest that reading. The result basically says that if you have a derivation S (think of it as a tree) whose context satisfies the condition, then all its subderivations (think subtrees) will have contexts that also satisfy the condition. In other words, all rules maintain the condition. Think of the condition as the well-formedness requirement for contexts C.

    Regarding your second question, note the use of ⊕ in the sequencing rule (24), which extends T of C as needed. More concretely, if r was assigned type t option ref, then the first declaration would produce an environment E1 with the corresponding ttynames E1. Then, according to the sequencing rule (24), the second declaration would have to be elaborated under the context C' = CE1, which is defined as C + (tynames E1, E1) in Section 4.3. Hence, tT of C', as required for well-formedness, and consequently, rule (17) would not be able to pick the same t as the denotation of t.