I have this code
data Slist a = Empty | Scons (Sexp a) (Slist a)
data Sexp a = AnAtom a | AnSlist (Slist a)
data Fruit = Peach | Apple | Pear | Lemon | Fig deriving (Show,Eq)
sxOccurs oatm sxp =
let slOC Empty = 0
slOC (Scons se sls) = (seOC se) + (slOC sls)
seOC (AnAtom atm) = if (atm == oatm) then 1 else 0
seOC (AnSlist sla) = slOC sla
in seOC sxp
As you can see in sxOccurs
I've got two helper functions inside the let
which are "mutually self-referential" as my The Little MLer calls them: slOC
and seOC
. So in SML you must use the keyword and
to let them know about each other and "cross-reference" each other. BTW, sxOccurs
counts the number of a specific AnAtom
objects in an s-list, in my example atoms are Fruit
variables.
My question is, is this an example of referential transparency? Similarly, in Davie he gives this example
let s0 = emptyStack
s1 = push 12.2 s0
s2 = push 7.1 s1
s3 = push 6.7 s2
s4 = divStack s3
s5 = push 4.3 s4
s6 = subtStack s5
s7 = multStack s6
s8 = push 2.2 s7
s9 = addStack s8
in popStack s9
noting that a stack in Imperative-land is constantly mutating the stack, while Haskell is creating a new si
variable for each stack operation. Then he says that each of these lines could be shuffled into a different order and the outcome would be unchanged. AFAICT this is the same basic idea as my sxOccurs
when it doesn't care what order I present the sub-functions. So, again, is this the deeper meaning of referential transparency? If not, what is this I'm showing here?
Referential transparency means this, and only this: you may replace a variable by its definition without changing the meaning of the program. This is called "referential transparency" because you can "look through" a reference to its definition.
For example, you write:
slOC Empty = 0
slOC (Scons se sls) = (seOC se) + (slOC sls)
seOC (AnAtom atm) = if (atm == oatm) then 1 else 0
seOC (AnSlist sla) = slOC sla
Here's a couple transformations you could make, thanks to referential transparency:
-- replace slOC by its definition
seOC (AnSlist sla) = (\v -> case v of Empty -> 0; SCons se sls -> seOC se + slOC sls) sla
-- replace slOC by its definition *again*, starting from the previous line
seOC (AnSlist sla) = (\v -> case v of
Empty -> 0
SCons se sls -> seOC se + (\v -> case v of
Empty -> 0
SCons se sls -> seOC se + slOC sls
) sls
) sla
-- replace slOC by its definition in another equation
slOC (Scons se sls) = seOC se + (\v -> case v of Empty -> 0; SCons se sls -> seOC se + slOC sls) sls
-- or, you could replace seOC by its definition instead
slOC (SCons se sls) = (\v -> case v of
AnAtom atm -> if atm == oatm then 1 else 0
AnSlist sla -> sLOC sla
) se + slOC sls
-- or you could do both, of course
Well, of course, right? By now you may be thinking, "But Daniel, how could this property ever fail?". I will turn briefly to another language to illustrated: C.
int *x = malloc(sizeof(*x));
x[0] = 42;
printf("%d\n", x[0]);
In case you don't read C well, this creates a new variable named x
, allocates some space for it, writes 42 into that space, then prints out the value stored in that space. (We should probably expect it to print 42
!) But I've defined x = malloc(sizeof(*x))
in the first line; can I replace x
by this definition elsewhere?
No! This is a very different program:
int *x = malloc(sizeof(*x));
malloc(sizeof(*x))[0] = 42;
printf("%d\n", x[0]);
It's still a syntactically valid program, but now x[0]
has not been initialized at the moment we reach the line that prints it -- because we allocated a second, independent chunk of space, and initialized that other space instead.
This turns out to be the main way that other languages violate referential transparency: when you have variables whose value can change, it is not safe to replace references to them by their defined value, either because it may have changed since then or because this will cause it not to change in the way the rest of the program expected it to. Haskell eschews this ability; variables, once assigned a value, are never modified.