algorithmhaskellocamltypechecking# Algorithm for type checking ML-like pattern matching?

How do you determine whether a given pattern is "good", specifically whether it is exhaustive and non-overlapping, for ML-style programming languages?

Suppose you have patterns like:

```
match lst with
x :: y :: [] -> ...
[] -> ...
```

or:

```
match lst with
x :: xs -> ...
x :: [] -> ...
[] -> ...
```

A good type checker would warn that the first is not exhaustive and the second is overlapping. How would the type checker make those kinds of decisions in general, for arbitrary data types?

Solution

Here's a sketch of an algorithm. It's also the basis of Lennart Augustsson's celebrated technique for compiling pattern matching efficiently. (The paper is in that incredible FPCA proceedings (LNCS 201) with oh so many hits.) The idea is to reconstruct an exhaustive, non-redundant analysis by repeatedly splitting the most general pattern into constructor cases.

In general, the problem is that your program has a possibly empty bunch of ‘actual’ patterns {p_{1}, .., p_{n}}, and you want to know if they cover a given ‘ideal’ pattern q. To kick off, take q to be a variable x. The invariant, initially satisfied and subsequently maintained, is that each p_{i} is σ_{i}q for some substitution σ_{i} mapping variables to patterns.

How to proceed. If n=0, the bunch is empty, so you have a possible case q that isn't covered by a pattern. Complain that the ps are not exhaustive. If σ_{1} is an injective *renaming* of variables, then p_{1} catches every case that matches q, so we're warm: if n=1, we win; if n>1 then oops, there's no way p_{2} can ever be needed. Otherwise, we have that for some variable x, σ_{1}x is a constructor pattern. In that case split the problem into multiple subproblems, one for each constructor c_{j} of x's type. That is, split the original q into multiple ideal patterns q_{j} = [x:=c_{j} y_{1} .. y_{arity(cj)}]q, and refine the patterns accordingly for each q_{j} to maintain the invariant, dropping those that don't match.

Let's take the example with `{[], x :: y :: zs}`

(using `::`

for `cons`

). We start with

```
xs covering {[], x :: y :: zs}
```

and we have [xs := []] making the first pattern an instance of the ideal. So we split xs, getting

```
[] covering {[]}
x :: ys covering {x :: y :: zs}
```

The first of these is justified by the empty injective renaming, so is ok. The second takes [x := x, ys := y :: zs], so we're away again, splitting ys, getting.

```
x :: [] covering {}
x :: y :: zs covering {x :: y :: zs}
```

and we can see from the first subproblem that we're banjaxed.

The overlap case is more subtle and allows for variations, depending on whether you want to flag up any overlap, or just patterns which are completely redundant in a top-to-bottom priority order. Your basic rock'n'roll is the same. E.g., start with

```
xs covering {[], ys}
```

with [xs := []] justifying the first of those, so split. Note that we have to refine ys with constructor cases to maintain the invariant.

```
[] covering {[], []}
x :: xs covering {y :: ys}
```

Clearly, the first case is strictly an overlap. On the other hand, when we notice that refining an actual program pattern is needed to maintain the invariant, we can filter out those strict refinements that become redundant and check that at least one survives (as happens in the `::`

case here).

So, the algorithm builds a set of ideal exhaustive overlapping patterns q in a way that's motivated by the actual program patterns p. You split the ideal patterns into constructor cases whenever the actual patterns demand more detail of a particular variable. If you're lucky, each actual pattern is covered by disjoint nonempty sets of ideal patterns and each ideal pattern is covered by just one actual pattern. The tree of case splits which yield the ideal patterns gives you the efficient jump-table driven compilation of the actual patterns.

The algorithm I've presented is clearly terminating, but if there are datatypes with no constructors, it can fail to accept that the empty set of patterns is exhaustive. This is a serious issue in dependently typed languages, where exhaustiveness of conventional patterns is undecidable: the sensible approach is to allow "refutations" as well as equations. In Agda, you can write (), pronounced "my Aunt Fanny", in any place where no constructor refinement is possible, and that absolves you from the requirement to complete the equation with a return value. Every exhaustive set of patterns can be made *recognizably* exhaustive by adding in enough refutations.

Anyhow, that's the basic picture.

- Product of Array Except Self Javascript
- `std::list<>::sort()` - why the sudden switch to top-down strategy?
- There is a lock of 4-digit PIN. The digits to be chosen from 0-5 for the keys. Find all possible combinations of keys
- Finding all the subsets of a set
- Dynamic programming problem for minimum cost tower placement
- How to modify a string while preserving its original index as reference?
- Generate list of pattern contained in dictionary values Python
- How to make rounded percentages add up to 100%
- Find min cost to carry load
- Find all permutations of an array - Where am I going wrong?
- Java Conversion issue - Int to string and again
- Finding number of postorder BSTs permutations
- generate all n bit binary numbers in a fastest way possible
- Angular 16 - can't bind to 'formGroup' since it isn't a known property of 'form'
- Second highest number ArrayList
- Sort (hex) colors to match rainbow
- Implementing a Harris corner detector
- algorithm for find lowest subsequence of items in 2 sets
- Problem in thinking recursively. How to take recursive leap of faith?
- Syracuse sequence using recursive python functions
- Is there a name for this "fill in the blank" algorithm?
- Bitwise operation without using any predefined C functions
- How to get the water level in each bar of a histogram after raining?
- How to combine date ranges in PHP
- How to space neighbouring wireless transmissions temporally?
- Generating a graph with certain degree distribution?
- Find the coin change (Greedy Algorithm) when coins are in decimals and returned amount in coins is larger then original return value
- Find a specific word algorithm in c++
- Minimum number of inversions to make one array identical to another
- flattening a list of contracts based on priority