I have an set numeric ranges that I would like to optimize.
Here's a simple example of initial values:
Start End
9 12
1 2
60 88
10 11
79 80
What I'd expect as output after optimization:
Start End
1 2
9 12
60 88
These are the left
and right
values from Modified Preorder Tree Traversal (Nested Set) data stored in a MySQL database. I use them to exclude inactive branches from the result, and am not currently optimizing the ranges at all. I thought I might get a performance gain from optimizing the ranges before use.
MORE INFO
The values are passed into a query for exclusion of the inactive branches in the tree using a NOT BETWEEN
clause. I thought that I could optimize the performance of that query by using a minimal set of ranges.
Here is an SQL that will return what you want
mysql> CREATE TABLE sample (Start INT, End INT);
mysql> INSERT sample VALUES (9,12),(1,2),(60,88),(10,11),(79,80);
mysql> SELECT *
-> FROM sample s
-> WHERE NOT EXISTS (SELECT 1
-> FROM sample
-> WHERE s.Start > Start AND s.Start < End);
+-------+------+
| Start | End |
+-------+------+
| 9 | 12 |
| 1 | 2 |
| 60 | 88 |
+-------+------+
You can, of course, create VIEW, move the data to another table or delete rows using the above SQL.
NOTE: I am not really sure why are you doing this 'optimization'.
EDIT:
The query can be rewritten as
SELECT s.*
FROM sample s LEFT JOIN
sample s2 ON s.Start > s2.Start AND s.Start < s2.End
WHERE s2.start IS NULL;
Which will create different execution plan (2xsimple select vs primary/dependent subquery for EXISTS), so performance might be different. Both queries will use an index on (Start, End) if it exists.