cfilteringnested-loopsparipari-gp

Identify loops over a vector of indices; make list, minus rotations


I work with vectors of natural numbers of length N and sum-of-entries S, for instance, with (N,S)=(4,7) one example vector E=[1,2,1,3] where all entries in the vector are assumed > 0.

I want to list all vectors with the same configuration (N,S)=(4,7), but rotations should be filtered out.

Question: what is the best algorithm?
(my practical implementation is in Pari/GP which provides a nested for-loop using a vector of bounds for lower and upper index value)

I've tried first a "brute force" solution, in that I generate a list using the nested for-loop, but storing the vector concatenated twofold concat(E,E) say in the list EE[], and then, for each new vector E=[e_1,e_2,e_3,e_4] checking whether this vector occurs in the concatenated vectors in the already checked list EE[], and if not then append it as new valid entry
EE[k]=E||E = [e_1,e_2,e_3,e_4,e_1,e_2,e_3,e_4]. The comparision here is like string comparision - a match is always found if it begins at positions 1 or up to N.

This method works, but looks to me a bit like brute force and due to its combinatorical structure explodes with increasing N and S.

Does a better method exist?

Note: target language is Pari/GP
Note: a pseudoalgorithm would suffice - but perhaps the tools in Pari/GP allow some more compact solutions/notation.


Example, (N,S)=(4,7)
The following is a very simple example.

Assume by a nested loop I get the vectors E in the following way:

[1,1,1,4] --- first vector: store as [1,1,1,4;1,1,1,4] in EE
[1,1,2,3] --- 2nd vector: not in EE so far, append as [1,1,2,3;1,1,2,3] to EE
[1,2,1,3] ...
[2,1,1,3] ...
[1,1,3,2] --- ignore!! This is a rotation of an earlier entry (is in EE)
[1,2,2,2] 
...

This builds successively the list of vectors EE:

[1,1,1,4;1,1,1,4]
[1,1,2,3;1,1,2,3]
[1,2,1,3;1,2,1,3]
[2,1,1,3;2,1,1,3]
[1,2,2,2;1,2,2,2]

and this list, with the first N columns only, is the list of vectors I want to work with later.

Additional sanity check: for (N,S)=(4,16) I get for the unfiltered list length_unfiltered = 455 , and length_filtered=116 after rotations are deleted.


Solution

  • There is a well known algorithm for generating strings with rotations deleted (usually called necklaces in combinatorics). This algorithm works in constant amortized time meaning that rotated equivalents can be removed in constant time.

    Frank Rusky calls this algorithm the FKM algorithm. It is described in https://people.engr.ncsu.edu/savage/AVAILABLE_FOR_MAILING/necklace_fkm.pdf. (Also several other papers and also chapter 7.2 of Rusky's book 'Combinatorial Generation').

    The implementation is straight forward (but it would take me a couple of hours to code in PARI, so for now I am leaving). The additional requirement of a given sum can be incorporated into the algorithm without difficulty.

    A less efficient alternative would be to generate all the (N, S) words and then filter out those that are not necklaces. For the generation part, there are built in PARI functions forpart and forperm. The filtering can be done using a simplified adaption of the FKM algorithm. Since only a test is required, backtracking and recursion can be avoided in this test.

    Some PARI code follows: The following PARI can be used to generate all vectors of length n and sum s. This method avoids recursion and calls act for each solution.

    forsumset(n, s, act)={
      my(w=vector(n), p=1);
      while(p > 0,
        if(s>n-p, w[p]++; s--; if(p==n-1, w[n]=s;act(w), p++), s+=w[p]; w[p]=0; p--);
       );
    }
    

    Example use:

    local(total=0); forsumset(4, 16, w->total++); total
    

    Gives 455 as expected.

    The following function is a test for the necklace property using the FKM algortithm:

    isnecklace(w)={
      my(d=1); 
      for(p=2, #w, my(e=w[p]-w[p-d]); if(e<0, return(0)); if(e>0, d=p));
      #w%d==0
    }
    

    Example use:

    local(total=0); forsumset(4,16,w->if(isnecklace(w), total++)); total
    

    Gives 116 as expected.

    Update
    The following combines the two ideas into a single algorithm that computes each new term in amortized constant time. Since the number of results is exponentially dependent on s, the overall time is still exponential. If it is only required to count the total number of entries then there are faster methods.

    forneck(n, s, act)={
      my(w=vector(n), ds=vector(n), p=1, d=1);
      while(p > 0,
        if(w[p], 
          w[p]++; s--; d=p,
          my(e=if(p==1,1,w[p-d])); w[p]=e; s-=e);
        if(s>=n-p, 
           if(p==n, if(s, w[n]+=s; s=0; d=p); if(p%d==0, act(w)), p++; ds[p]=d), 
           d=ds[p]; s+=w[p]; w[p]=0; p--);
      );
    }
    

    Example use:

    local(total=0); forneck(4,16,w->total++); total
    

    Gives 116 as expected.