I have a table called Lives:
create table Lives(
animal varchar(10) not null,
year int not null,
zoo varchar(10),
primary key(animal,year)
);
With the given data:
ANIMAL YEAR ZOO
---------- ----------- ----------
joe 2000 a
joe 2001 a
joe 2002 d
joe 2004 c
fred 2002 b
fred 2004 c
jane 2000 a
jane 2001 b
jane 2002 b
jack 2000 a
jack 2001 a
jack 2002 d
jack 2004 c
judy 2004 b
ruby 2003 d
alfred 2006 a
It consists of the name of the animal, a year and the zoo it was in that year.
I need a query that finds the pairs (a, b) of animals that have always been in the same zoo throughout all years, and such that a is lexicographically smaller than b (i.e., a < b). More precisely, such pairs (a, b) satisfy the following condition: if animal a lives in zoo z during year y then b also lives in zoo z during year y, and vice versa.
So the output for my example data would be:
Animal Animal
------- -------
jack joe
Up until now I've constructed this query:
SELECT l1.animal, l2.animal
FROM Lives as l1, Lives as l2
WHERE l2.year = l1.year and l1.animal > l2.animal
It gives me animals that have been in the zoo for 1 year. I don't now how to continue from this.
I'm going to use this query in my sqlj program. Is it possible to construct a query that satisfies my desired result, or should I go on from my current query and implement the rest in sqlj?
I think what you want is an exact relational division, that returns all pairs such that one pair can not have any zoo or year that the other does not have.
A common way to do this is to use a double negation through two correlated sub-queries. It's a bit hard to follow but it should give you the correct result.
-- select all distinct pairs such that...
SELECT * FROM (
SELECT a.animal AS animal1, b.animal AS animal2
FROM lives a
INNER JOIN lives b ON a.zoo = b.zoo AND a.year = b.year AND a.animal < b.animal
) animals
WHERE NOT EXISTS (
-- there does not exist any animal that is not...
SELECT * FROM lives b
WHERE b.animal = animals.animal2
AND NOT EXISTS (
-- in the set of animals that share year and zoo
SELECT * FROM lives c
WHERE c.animal = animals.animal1
AND c.zoo = b.zoo AND c.year = c.year
)
)
GROUP BY animals.animal1, animals.animal2
Using a simple count to determine equality won't work as you'll get a match when the count is the same even though one animal has more zoos than the other. To verify this add this row:
ANIMAL YEAR ZOO
jane 2004 b
And the result you get from the accepted answer will be:
animal animal
jane jack
joe jack
joe jane
Whereas my solution gives:
animal1 animal2
jack joe