sqlsql-servert-sqlpermutation

The most elegant way to generate permutations in SQL server


Given a the following table:

Index | Element
---------------
  1   |    A
  2   |    B
  3   |    C
  4   |    D

We want to generate all the possible permutations (without repetitions) using the elements. the final result (skipping some rows) will look like this:

  Results
----------
   ABCD
   ABDC
   ACBD
   ACDB
   ADAC
   ADCA

   ...

   DABC
   DACB
   DBCA
   DBAC
   DCAB
   DCBA

  (24 Rows)

How would you do it?


Solution

  • After making some perhaps snarky comments, this problem stuck in my brain all evening, and I eventually came up with the following set-based approach. I believe it definitely qualifies as "elegant", but then I also think it qualifies as "kinda dumb". You make the call.

    First, set up some tables:

    --  For testing purposes
    DROP TABLE Source
    DROP TABLE Numbers
    DROP TABLE Results
    
    
    --  Add as many rows as need be processed--though note that you get N! (number of rows, factorial) results,
    --  and that gets big fast. The Identity column must start at 1, or the algorithm will have to be adjusted.
    --  Element could be more than char(1), though the algorithm would have to be adjusted again, and each element
    --  must be the same length.
    CREATE TABLE Source
     (
       SourceId  int      not null  identity(1,1)
      ,Element   char(1)  not null
     )
    
    INSERT Source (Element) values ('A')
    INSERT Source (Element) values ('B')
    INSERT Source (Element) values ('C')
    INSERT Source (Element) values ('D')
    --INSERT Source (Element) values ('E')
    --INSERT Source (Element) values ('F')
    
    
    --  This is a standard Tally table (or "table of numbers")
    --  It only needs to be as long as there are elements in table Source
    CREATE TABLE Numbers (Number int not null)
    INSERT Numbers (Number) values (1)
    INSERT Numbers (Number) values (2)
    INSERT Numbers (Number) values (3)
    INSERT Numbers (Number) values (4)
    INSERT Numbers (Number) values (5)
    INSERT Numbers (Number) values (6)
    INSERT Numbers (Number) values (7)
    INSERT Numbers (Number) values (8)
    INSERT Numbers (Number) values (9)
    INSERT Numbers (Number) values (10)
    
    
    --  Results are iteratively built here. This could be a temp table. An index on "Length" might make runs
    --  faster for large sets.  Combo must be at least as long as there are characters to be permuted.
    CREATE TABLE Results
     (
       Combo   varchar(10)  not null
      ,Length  int          not null
     )
    

    Here's the routine:

    SET NOCOUNT on
    
    DECLARE
      @Loop     int
     ,@MaxLoop  int
    
    
    --  How many elements there are to process
    SELECT @MaxLoop = max(SourceId)
     from Source
    
    
    --  Initialize first value
    TRUNCATE TABLE Results
    INSERT Results (Combo, Length)
     select Element, 1
      from Source
      where SourceId = 1
    
    SET @Loop = 2
    
    --  Iterate to add each element after the first
    WHILE @Loop <= @MaxLoop
     BEGIN
    
        --  See comments below. Note that the "distinct" remove duplicates, if a given value
        --  is to be included more than once
        INSERT Results (Combo, Length)
         select distinct
            left(re.Combo, @Loop - nm.Number)
            + so.Element
            + right(re.Combo, nm.Number - 1)
           ,@Loop
          from Results re
           inner join Numbers nm
            on nm.Number <= @Loop
           inner join Source so
            on so.SourceId = @Loop
          where re.Length = @Loop - 1
    
        --  For performance, add this in if sets will be large
        --DELETE Results
        -- where Length <> @Loop
    
        SET @Loop = @Loop + 1
     END
    
    --  Show results
    SELECT *
     from Results
     where Length = @MaxLoop
     order by Combo
    

    The general idea is: when adding a new element (say "B") to any string (say, "A"), to catch all permutations you would add B to all possible positions (Ba, aB), resulting in a new set of strings. Then iterate: Add a new element (C) to each position in a string (AB becomes Cab, aCb, abC), for all strings (Cba, bCa, baC), and you have the set of permutations. Iterate over each result set with the next character until you run out of characters... or resources. 10 elements is 3.6 million permutations, roughly 48MB with the above algorithm, and 14 (unique) elements would hit 87 billion permutations and 1.163 terabytes.

    I'm sure it could eventually be wedged into a CTE, but in the end all that would be is a glorified loop. The logic is clearer this way, and I can't help but think the CTE execution plan would be a nightmare.