I'm studying graph traversal using recursive CTEs (learnSQL course). The question asks about visiting every city from a starting city. They use a "path" string built up from appended city names and then using where position(city.name IN travel.path) = 0
to see if the city has already been visited. That works but I often look for alternatives and I could use an array of visited cities.
As an experienced dev (e.g. Java but any imperative language) I'd represent visited
as a SET for efficient containment testing, but postgresql only seems to offer ARRAY.
The example data is, of course, tiny, and isn't a problem but I'd like to know what's most efficient/best practise. String lookup using position
isn't terribly semantic IMO.
Thanks!
with recursive
travel(path, visited, visited_count) AS (
select
name::text,
array[name::text],
1
from city
where
name = 'Bristol'
union all
select
travel.path || '->' || city.name,
visited || name::text,
visited_count + 1
from city, travel
where
city.name <> all(visited)
--position(city.name IN travel.path) = 0
)
select
path,
visited
from travel
where
visited_count = 6
Use the ARRAY
approach for semantic clarity, as both array checks (<> ALL
) and string searches (position()) have similar O(N) lookup performance within the recursive step.
WITH RECURSIVE travel(path, visited, visited_count) AS (
-- base case
SELECT name::text, ARRAY[name::text], 1
FROM city WHERE name = 'Bristol'
UNION ALL
-- recursive step
SELECT
t.path || '->' || c.name, -- build path string
t.visited || c.name::text, -- append to visited array (o(n) copy + append)
t.visited_count + 1
FROM city c, travel t
WHERE c.name <> ALL(t.visited) -- check if visited (o(n) array scan)
)
SELECT path, visited
FROM travel
WHERE visited_count = (SELECT count(*) FROM city); -- condition to stop (e.g., all cities visited)