algorithmsortingstandard-deviationdeviation

Place different items evenly in array


I'm wondering if it is possible to somehow "sort" items in an array to place them in "equal" spacings.

An example is more than hundreds of words so:

Apple - 1
Banana - 2
Pineapple - 3
Orange - 4

And this is an array:

[ 'Apple', 'Apple', 'Banana', 'Pineapple', 'Pineapple', 'Pineapple', 'Orange' ]
[ 1, 1, 2, 3, 3, 3, 4 ]

What I want to achieve is something similar to this:

[ 'Apple', 'Pineapple', 'Banana', 'Apple', 'Pineapple', 'Orange', 'Pineapple' ]
[ 1, 3, 2, 1, 3, 4, 3 ] 

With this transformation Pineapple has 1 item offset between other 'Pineapple' and Apple is placed in [0] and [3] placement.

Before I start an implementation I'm looking for an already invented solution - it should be something related with standard deviation?


Solution

  • Start off by ordering your words by number of occurences. Then iterate over them, first filling up all even indices, then all odd indices.
    The first word can at most fill up all even indices. In most modern arrays there should always be at least as many slots with an even index as there are with an odd one. If your language doesn't qualify for that (i.e. one-based arrays), pick even or odd based on the number of available slots.
    The second most common word can only occur at most as many times as the most common word, so there's no possibility this way that the same word winds up in two adjacent slots that way.
    A simple python-implementation would look like this:

    import math
    
    def spaced_ordering(words):
        words = sorted(words, key=words.count, reverse=True)
        output = [None] * len(words)
    
        for i in range(0, math.ceil(len(words) / 2)):
            output[i * 2] = words[i]
    
        for i in range(0, math.floor(len(words) / 2)):
            output[i * 2 + 1] = words[math.ceil(len(words) / 2) + i]
    
        return output
    

    Note: The above implementation is neither exactly performant, nor exactly fancy, nor does it include checking for valid inputs (e.g. what happens if a word occurs more than math.ceil(len(words) / 2) times). It only serves to demonstrate the basic principle.