algorithm, satisfiability, reference-implementation

Converter from SAT to 3-SAT


Does anyone know of a good program to convert CNF files with any number of literals per clause to CNF files with exactly 3 literals per clause (3-CNF)? I've seen this algorithm in computer science books but can't find an implementation anywhere, and I would hate to waste time implementing it myself if others have already done it. Thanks!


Solution

  • I don't know of any program that does this either, but the algorithm is really simple, so I just wrote the following Python script. It reads a general CNF in DIMACS format and writes, also in DIMACS format, the CNF of an equisatisfiable 3-SAT problem (every output clause has at most three literals; the new instance is satisfiable exactly when the original one is):

    from __future__ import print_function
    import fileinput
    
    cnf = list()
    cnf.append(list())
    maxvar = 0
    
    for line in fileinput.input():
        tokens = line.split()
        if len(tokens) == 0 or tokens[0] == "p" or tokens[0] == "c":
            continue
        for tok in tokens:
            lit = int(tok)
            maxvar = max(maxvar, abs(lit))
            if lit == 0:
                cnf.append(list())
            else:
                cnf[-1].append(lit)
    
    assert len(cnf[-1]) == 0
    cnf.pop()
    
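    new_cnf = list()
    for clause in cnf:
        # Repeatedly pair up literals until at most three remain.
        while len(clause) > 3:
            new_clause = list()
            for i in range(0, len(clause), 2):
                if i+1 < len(clause):
                    # Emit (a or b or ~x) with a fresh variable x;
                    # x stands in for the pair in the shortened clause.
                    new_cnf.append(list())
                    new_cnf[-1].append(clause[i])
                    new_cnf[-1].append(clause[i+1])
                    maxvar += 1
                    new_cnf[-1].append(-maxvar)
                    new_clause.append(maxvar)
                else:
                    # Odd literal left over: carry it over unchanged.
                    new_clause.append(clause[i])
            clause = new_clause
        # At most three literals remain: emit the clause as is.
        new_cnf.append(clause)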
    
    print("p cnf %d %d" % (maxvar, len(new_cnf)))
    for clause in new_cnf:
        print(" ".join([ "%d" % lit for lit in clause ]) + " 0")
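
    For readers who have not seen DIMACS CNF before, here is a minimal sketch of the input format and of the parsing step used by the script. The function name parse_dimacs and the sample string are made up for this illustration and are not part of the script; the sketch reads from a string instead of from files.

```python
def parse_dimacs(text):
    """Parse DIMACS CNF text into a list of clauses (lists of nonzero ints).

    Hypothetical helper for illustration; it mirrors the parsing rules of
    the script above but works on an in-memory string.
    """
    clauses, current = [], []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank lines, the "p cnf ..." header, and "c ..." comments.
        if not tokens or tokens[0] in ("p", "c"):
            continue
        for tok in tokens:
            lit = int(tok)
            if lit == 0:
                # A literal 0 terminates the current clause.
                clauses.append(current)
                current = []
            else:
                current.append(lit)
    return clauses


# Made-up sample: (x1 or not x2 or x3 or x4) and (not x1 or x2)
sample = """c a tiny example
p cnf 4 2
1 -2 3 4 0
-1 2 0
"""

print(parse_dimacs(sample))  # [[1, -2, 3, 4], [-1, 2]]
```

    Positive integers are variables, negative integers are negated variables, and each clause ends with 0.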
    

    The interesting bit is of course the "for clause in cnf:" loop, which takes the general SAT problem stored in cnf and transforms it into a 3-SAT instance stored in new_cnf. It does this by translating a clause such as

    (A[1] or A[2] or A[3] or A[4] or A[5] or A[6] or A[7])
    

    into the following set of clauses, where X[1], X[2] and X[3] are fresh auxiliary variables:

    (A[1] or A[2] or ~X[1])
    (A[3] or A[4] or ~X[2])
    (A[5] or A[6] or ~X[3])
    
    (X[1] or X[2] or X[3] or A[7])
    

    The first three clauses are added to new_cnf. The last clause still has four literals, so the algorithm is re-run on it, yielding the following new clauses:

    (X[1] or X[2] or ~Y[1])
    (X[3] or A[7] or ~Y[2])
    
    (Y[1] or Y[2])
    

    These clauses all have at most three literals, so they are added to new_cnf and the algorithm continues with the next clause from cnf. (If the last clause still had more than three literals, the algorithm would keep working on it until only clauses with at most three literals are left. The length of the last clause approximately halves with each iteration.)
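
    As a sanity check, the transformation can be wrapped in a function and compared against a brute-force satisfiability test on small inputs. This is only a sketch: the names to_3sat and satisfiable are hypothetical, and the brute-force check is exponential in the number of variables, so it is only useful for toy instances like the one in the walkthrough.

```python
from itertools import product


def to_3sat(cnf, maxvar):
    """Apply the pairing scheme described above; return (new_cnf, maxvar)."""
    new_cnf = []
    for clause in cnf:
        clause = list(clause)
        while len(clause) > 3:
            reduced = []
            for i in range(0, len(clause), 2):
                if i + 1 < len(clause):
                    maxvar += 1  # fresh auxiliary variable for this pair
                    new_cnf.append([clause[i], clause[i + 1], -maxvar])
                    reduced.append(maxvar)
                else:
                    reduced.append(clause[i])  # odd literal carried over
            clause = reduced
        new_cnf.append(clause)
    return new_cnf, maxvar


def satisfiable(cnf, nvars):
    """Brute force: try every assignment of variables 1..nvars."""
    for bits in product([False, True], repeat=nvars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in cnf):
            return True
    return False


# The seven-literal clause from the walkthrough, with A[i] as variable i.
original = [[1, 2, 3, 4, 5, 6, 7]]
converted, nvars = to_3sat(original, 7)
print(converted)
# [[1, 2, -8], [3, 4, -9], [5, 6, -10], [8, 9, -11], [10, 7, -12], [11, 12]]
print(satisfiable(original, 7), satisfiable(converted, nvars))  # True True

# An unsatisfiable instance stays unsatisfiable after conversion.
unsat = [[1, 2, 3, 4], [-1], [-2], [-3], [-4]]
unsat3, n3 = to_3sat(unsat, 4)
print(satisfiable(unsat, 4), satisfiable(unsat3, n3))  # False False
```

    Here variables 8 to 10 play the role of X[1] to X[3] and variables 11 and 12 the role of Y[1] and Y[2], so the seven-literal clause costs five auxiliary variables and becomes six clauses.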