Let a tree data structure be defined as such:
A tree has one Node as its root. A Node is either a Leaf or it is an Inner Node which has one or more Nodes as its children.
In some kind of pseudo OO programming language we may define a tree like this:
Node := InnerNode | Leaf
Leaf {
isLeaf() : TRUE
}
InnerNode {
isLeaf() : FALSE
children() : List<Node>
}
Tree {
root() : Node
}
Now we can define two functions, 'bad_code' and 'good_code'. The function 'bad_code' does not compile, the other function does:
function bad_code(Node anyNode) : void {
// this will give a compile time error "type Node does not define method children()"
anyNode.children();
}
function good_code(Node anyNode) : void {
// the compiler understands that all Nodes must have a method called isLeaf() which
// returns a boolean
let b : boolean <- anyNode.isLeaf();
if (b == FALSE) {
// this will not give a compile time error because the compiler can deduce that
// anyNode must be of type InnerNode which has the method children()
anyNode.children();
}
}
Question:
What you're describing is that the compiler uses the control-flow graph to narrow the type of a variable, so that when an if
statement tests a condition which relates to the type of a variable, a more specific type for the same variable can be inferred for the body of the if
statement.
This is called control-flow type narrowing, and it's done in e.g. Typescript. It is purely a static check, done at compile-time with no runtime penalty; in fact, types in Typescript are not available at runtime at all.
type TreeNode = InnerNode | Leaf
interface Leaf {
isLeaf: true
}
interface InnerNode {
isLeaf: false
children: Node[]
}
function bad_code(anyNode: TreeNode): void {
// type error: Property 'children' does not exist on type 'TreeNode'.
console.log(anyNode.children);
}
function good_code(anyNode: TreeNode): void {
if (!anyNode.isLeaf) {
// narrowed type to anyNode: InnerNode
console.log(anyNode.children);
}
}
Note that Typescript requires you to do this in a particular way; we test anyNode.isLeaf
directly rather than storing it in a variable b: boolean
first, because Typescript doesn't keep track of the relationship between the two variables b
and anyNode
:
function bad_in_typescript(anyNode: TreeNode): void {
let b: boolean = anyNode.isLeaf;
if (!b) {
// type error: Property 'children' does not exist on type 'TreeNode'.
console.log(anyNode.children);
}
}
Also, in the above code isLeaf
is a property instead of a method. Typescript does have a related feature called user-defined type guards which allow a method's return type to be something like this is Leaf
, indicating that the method returns true
only when called on something of type Leaf
:
type TreeNode = InnerNode | Leaf
interface BaseNode {
isLeaf(): this is Leaf
isInner(): this is InnerNode
}
interface Leaf extends BaseNode {}
interface InnerNode extends BaseNode {
children(): Node[]
}
However, Typescript is still a bit more limited than your example; we have to test anyNode.isInner()
because !anyNode.isLeaf()
won't necessarily do the same narrowing. (Typescript uses structural types, so in fact this Leaf
is a supertype of InnerNode
, which causes some problems for the union type. If you give Leaf
a property like value: number
which InnerNode
doesn't have, then !anyNode.isLeaf()
works how you would expect.)