Filter generated permutations in python

I want to generate permutations of elements in a list, but only keep a set where each element is on each position only once.

For example [1, 2, 3, 4, 5, 6] could be a user list and I want 3 permutations. A good set would be:

[1,2,3,5,4,6]
[2,1,4,6,5,3]
[3,4,5,1,6,2]

However, one could not add, for example, [1,3,2,6,5,4] to the above, as there are two permutations in which 1 is on the first position twice, also 5 would be on the 5th position twice, however other elements are only present on those positions once.

My code so far is :

# this simply generates a number of permutations specified by number_of_samples

def generate_perms(player_list, number_of_samples):
    myset = set()
    while len(myset) < number_of_samples:
         random.shuffle(player_list)
         myset.add(tuple(player_list))
    return [list(x) for x in myset]

# And this is my function that takes the stratified samples for permutations.
def generate_stratified_perms(player_list, number_of_samples):
    user_idx_dict = {}
    i = 0

    while(i < number_of_samples):
        perm = generate_perms(player_list, 1)
        for elem in perm:
            if not user_idx_dict[elem]:
                user_idx_dict[elem] = [perm.index(elem)]
            else:
                user_idx_dict[elem] += [perm.index(elem)]
        [...]
    return total_perms

but I don't know how to finish the second function.

So in short, I want to give my function a number of permutations to generate, and the function should give me that number of permutations, in which no element appears on the same position more than the others (once, if all appear there once, twice, if all appear there twice, etc).

Solution

Let's starting by solving the case of generating n or fewer rows first. In that case, your output must be a Latin rectangle or a Latin square. These are easy to generate: start by constructing a Latin square, shuffle the rows, shuffle the columns, and then keep just the first r rows. The following always works for constructing a Latin square to start with:

1 2 3 ... n
2 3 4 ... 1
3 4 5 ... 2
... ... ...
n 1 2 3 ...

Shuffling rows is a lot easier than shuffling columns, so we'll shuffle the rows, then take the transpose, then shuffle the rows again. Here's an implementation in Python:

from random import shuffle

def latin_rectangle(n, r):
    square = [
        [1 + (i + j) % n for i in range(n)]
        for j in range(n)
    ]
    shuffle(square)
    square = list(zip(*square)) # transpose
    shuffle(square)
    return square[:r]

Example:

>>> latin_rectangle(5, 4)
[(2, 4, 3, 5, 1),
 (5, 2, 1, 3, 4),
 (1, 3, 2, 4, 5),
 (3, 5, 4, 1, 2)]

Note that this algorithm can't generate all possible Latin squares; by construction, the rows are cyclic permutations of each other, so you won't get Latin squares in other equivalence classes. I'm assuming that's OK since generating a uniform probability distribution over all possible outputs isn't one of the question requirements.

The upside is that this is guaranteed to work, and consistently in O(n^2) time, because it doesn't use rejection sampling or backtracking.

Now let's solve the case where r > n, i.e. we need more rows. Each column can't have equal frequencies for each number unless r % n == 0, but it's simple enough to guarantee that the frequencies in each column will differ by at most 1. Generate enough Latin squares, put them on top of each other, and then slice r rows from it. For additional randomness, it's safe to shuffle those r rows, but only after taking the slice.

def generate_permutations(n, r):
    rows = []
    while len(rows) < r:
        rows.extend(latin_rectangle(n, n))
    rows = rows[:r]
    shuffle(rows)
    return rows

Example:

>>> generate_permutations(5, 12)
[(4, 3, 5, 2, 1),
 (3, 4, 1, 5, 2),
 (3, 1, 2, 4, 5),
 (5, 3, 4, 1, 2),
 (5, 1, 3, 2, 4),
 (2, 5, 1, 3, 4),
 (1, 5, 2, 4, 3),
 (5, 4, 1, 3, 2),
 (3, 2, 4, 1, 5),
 (2, 1, 3, 5, 4),
 (4, 2, 3, 5, 1),
 (1, 4, 5, 2, 3)]

This uses the numbers 1 to n because of the formula 1 + (i + j) % n in the first list comprehension. If you want to use something other than the numbers 1 to n, you can take it as a list (e.g. players) and change this part of the list comprehension to players[(i + j) % n], where n = len(players).