algorithmpseudocodevoting

Seeking Pseudo-code for Calculating the Smith and Schwartz Set


I've read Wikipedia on the Smith Set, Schwartz Set, Kosaraju's Algorithm, Tarjan's Algorithm, and the path-based strongly component algorithms; however, my experience with such algorithms is…lacking. Wikipedia also says you can use a version of Kosaraju's algorithm to generate the Schwartz set—and that these algorithms can calculate the Smith set.

Wikpedia also has some pseudo-code for Tarjan's algorithm, but not the others; and it's not specific to this relatively-sensitive application. I'm also not 100% certain which is the simplest to implement—which has the feature of least likelihood of errors in implementation.

I'd like some more-direct pseudocode to cover computing the Smith and Schwartz set from one of these algorithms, given a set of ranked ballots. I find it easier to grasp concepts when I have a practical process I can walk. I'll turn it into actual code myself.

Consider the following data structure:

Type Ballot {
  Array Votes[] {
    Candidate Candidate; // We do this in C#
    int Rank;
  }
}

For a collection of Ballots, each individual Ballot will contain an array Votes such as the following:

Ballot b.Votes[] =
  [ Vote(Alex.ID, 1),
    Vote(Chris.ID, 3),
    Vote(Sam.ID, 2) ];

This corresponds to a voter casting Alex>Sam>Chris, and there may be further candidates who are all equally less-preferred than Chris.

I assume the first step would be to tally the individual votes and graph the victory. For example: if 100 voters rank Alex above Sam (Alex = 1, Sam >= 2) and 50 voters rank Sam above Alex, Alex defeats Sam. Thus I guess there'd be a data structure as such:

Type GraphNode {
  Candidate Candidate;
  GraphNode Array Defeats[];
  GraphNode Array Ties[];
  GraphNode Array DefeatedBy[];
}

Whereby a GraphNode for Alex would have an element in Defeats[] pointing to a GraphNode for Sam, and vice versa.

Given these GraphNodes, what do I do with it to identify the Smith and Schwartz sets?

Thanks in advance.


Solution

  • I guess python is close enough to pseudocode.

    Let's say we have n candidates numbered from 0 to n - 1.

    First you can compute a matrix beats[i][j] equal to True if candidate i beats candidate j and False otherwise.

    Now compute the transitive closure of the matrix using the Floyd-Warshall algorithm:

    for k in range(n):
        for i in range(n):
            for j in range(n):
                beats[i][j] = beats[i][j] or (beats[i][k] and beats[k][j])
    

    After that the matrix has a slightly different meaning: beats[i][j] means there is a "beating path" i -> c1 -> c2 -> ... -> j such that i beats c1, c1 beats c2 and so on up to j.

    The Schwartz components are the ones in which all pairs i, j have beating paths running both ways, and there's no other candidate that beats any of them (see the Wikipedia section on properties mentioning a top cycle).

    Basically for each candidate i try building a component around it, like this:

    schwartz_components = []
    
    for i in range(n):
        schwartz_component = {i}
        is_schwartz = True
        for j in range(n):
            if beats[j][i]:
                if beats[i][j]:
                    schwartz_component.add(j)
                else:
                    is_schwartz = False
        if is_schwartz:
             schwartz_components.append(schwartz_component)
    
    schwartz_set = set.union(*schwartz_components)
    print(schwartz_set)
    

    For the Smith set it would be a bit different, you'd need to start with cannot_beat[i][j] = not beats[i][j], use Floyd-Warshall on that and then build the set around each i by adding all the candidates with a path to it through the cannot_beat relation.

    I guess it would go something like this (after the Floyd-Warshall step):

    smith_candidates = []
    
    for i in range(n):
        smith_candidate = {i}
        for j in range(n):
            if cannot_beat[i][j]:
                smith_candidate.add(j)
        smith_candidates.append(smith_candidate)
    
    # Take the smallest candidate
    smith_set = min(smith_candidates, key=len)
    

    There's likely a bug somewhere, but that's the idea, roughly.