arrays swift nscountedset

Most common array elements


I need to find the most common (modal) elements in an array.

The simplest approach I could think of was to declare a counter variable for each unique element and increment it whenever that element turns up in a for loop over the array.

Unfortunately, the size of the array is unknown and it will be very large, so this method is impractical.

I have come across a similar question for Objective-C that uses NSCountedSet to rank the array elements. Unfortunately, I am very new to programming and could only translate the first line into Swift.

The suggested method is as follows:

    var yourArray: NSArray! // My swift translation

    NSCountedSet *set = [[NSCountedSet alloc] initWithArray:yourArray];

    NSMutableDictionary *dict=[NSMutableDictionary new];

    for (id obj in set) {
        [dict setObject:[NSNumber numberWithInteger:[set countForObject:obj]]
            forKey:obj]; //key is date
    }

    NSLog(@"Dict : %@", dict);

    NSMutableArray *top3=[[NSMutableArray alloc]initWithCapacity:3];

    //which dict obj is = max
    if (dict.count>=3) {

        while (top3.count<3) {
            NSInteger max = [[[dict allValues] valueForKeyPath:@"@max.intValue"] intValue];

            for (id obj in set) {
                if (max == [dict[obj] integerValue]) {
                    NSLog(@"--> %@",obj);
                    [top3 addObject:obj];
                    [dict removeObjectForKey:obj];
                }
            }
        }
    }

    NSLog(@"top 3 = %@", top3);

In my program I will need to find the top five place names in an array.


Solution

  • edit: Swift 2.0 and 3.0 versions below

    Not the most efficient solution, but a simple one (Swift 1.x syntax):

    let a = [1,1,2,3,1,7,4,6,7,2]
    
    var frequency: [Int:Int] = [:]
    
    for x in a {
        // set frequency to the current count of this element + 1
        frequency[x] = (frequency[x] ?? 0) + 1
    }
    
    let descending = sorted(frequency) { $0.1 > $1.1 }
    

    descending is now an array of (value, frequency) pairs, sorted most frequent first, so the “top 5” are simply the first 5 entries (or fewer, if there are fewer than 5 distinct values). The size of the source array shouldn't be a problem: counting takes a single pass over it, and only the distinct values get sorted.
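As a rough sketch of that “top 5” step applied to place names, as in the question: the snippet below uses current Swift syntax (Swift 4 or later is assumed) rather than the Swift 1.x syntax above. The `places` array is made-up sample data, and the alphabetical tie-break is an addition of this sketch, not something the question asks for.

```swift
// Hypothetical sample data; the question's real array of place names is not shown.
let places = ["Paris", "Rome", "Paris", "Oslo", "Rome", "Paris",
              "Lima", "Cairo", "Oslo", "Quito", "Paris", "Rome"]

// Count occurrences with the same loop as above.
var counts: [String: Int] = [:]
for place in places {
    counts[place] = (counts[place] ?? 0) + 1
}

// Sort most frequent first. Ties are broken alphabetically (an assumption of
// this sketch) so the result does not depend on dictionary iteration order.
let ranked = counts.sorted { lhs, rhs in
    lhs.value != rhs.value ? lhs.value > rhs.value : lhs.key < rhs.key
}

// prefix(5) simply returns fewer entries if there are fewer than 5 distinct values.
let topFive = ranked.prefix(5).map { $0.key }
print(topFive)
```

Without an explicit tie-break, elements with equal counts come back in no particular order, so which ones land in the top five could differ from run to run.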

    Here's a generic function version that would work on any sequence:

    func frequencies
      <S: SequenceType where S.Generator.Element: Hashable>
      (source: S) -> [(S.Generator.Element,Int)] {
    
        var frequency: [S.Generator.Element:Int] = [:]
    
        for x in source {
            frequency[x] = (frequency[x] ?? 0) + 1
        }
    
        return sorted(frequency) { $0.1 > $1.1 }
    }
    
    frequencies(a)
    

    For Swift 2.0, you can adapt the function to be a protocol extension:

    extension SequenceType where Generator.Element: Hashable {
        func frequencies() -> [(Generator.Element,Int)] {
    
            var frequency: [Generator.Element:Int] = [:]
    
            for x in self {
                frequency[x] = (frequency[x] ?? 0) + 1
            }
    
            return frequency.sort { $0.1 > $1.1 }
        }
    }
    
    a.frequencies()
    

    For Swift 3.0:

    extension Sequence where Self.Iterator.Element: Hashable {
        func frequencies() -> [(Self.Iterator.Element,Int)] {
    
            var frequency: [Self.Iterator.Element:Int] = [:]
    
            for x in self {
                frequency[x] = (frequency[x] ?? 0) + 1
            }
    
            return frequency.sorted { $0.1 > $1.1 }
        }
    }
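
For later Swift versions the same extension can probably be shortened a little. The following is only a sketch, assuming Swift 4 or later, where `Sequence` exposes `Element` directly and dictionaries have a `default:` subscript; the tuple labels `element` and `count` are names chosen here for readability, not part of the versions above.

```swift
extension Sequence where Element: Hashable {
    /// Returns each distinct element paired with its count, most frequent first.
    func frequencies() -> [(element: Element, count: Int)] {
        // Build the counts in a single pass; the `default:` subscript
        // replaces the `(frequency[x] ?? 0) + 1` idiom.
        let counts = reduce(into: [Element: Int]()) { result, x in
            result[x, default: 0] += 1
        }
        // Elements with equal counts come out in unspecified order.
        return counts
            .sorted { $0.value > $1.value }
            .map { (element: $0.key, count: $0.value) }
    }
}

let a = [1,1,2,3,1,7,4,6,7,2]
let result = a.frequencies()
print(result)
```

The algorithm is unchanged: one pass to count, then a sort of the distinct values by descending count.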