pythonmathsubsequence

How many sub-sequences of unique elements can be possible?


I'v a sequence of integer number [A1, A2, A3.....AN] I'm trying to count all sub-sequences which contain at most K unique numbers.

For Example:

Given sequence:: [2, 3, 3, 7, 5] and K = 3

All sub-sequences are:

[],
[2],[3],[3],[7],[5],
[2, 3],[2, 3],[2, 7],[2, 5],[3, 3],[3, 7],[3, 5],[3, 7],[3, 5],[7, 5],
[2, 3, 3],[2, 3, 7],[2, 3, 5],[2, 3, 7],[2, 3, 5],[2, 7, 5],[3, 3, 7],[3, 3, 5],[3, 7, 5],[3, 7, 5],
[2, 3, 3, 7],[2, 3, 3, 5],[2, 3, 7, 5],[2, 3, 7, 5],[3, 3, 7, 5],
[2, 3, 3, 7, 5]

I need all sub-sequences (only for counting) that have unique elements.

Counted sub-sequences are:

[],
[2],[3],[3],[7],[5],
[2, 3],[2, 3],[2, 7],[2, 5],[3, 7],[3, 5],[3, 7],[3, 5],[7, 5],
[2, 3, 7],[2, 3, 5],[2, 3, 7],[2, 3, 5],[2, 7, 5],[3, 7, 5],[3, 7, 5]

Total = 22

Not counted sub-sequences are:

[3, 3],
[2, 3, 3],[3, 3, 7],[3, 3, 5]

Ignored for higher length: length > K (If k = 5 then ignored for duplicate elements):

[2, 3, 3, 7],[2, 3, 3, 5],[3, 3, 7, 5],
[2, 3, 3, 7, 5]

[ N.B: All sub-sequences not be unique, but their elements (number) should be identical. Ex- [2, 3],[2, 3] both are counted but [3, 3] ignored ]

::CODE::

import itertools as it
import sys, math

n, m = map(int, sys.stdin.readline().split())
a = list(map(int, sys.stdin.readline().split()))

cnt = 0
for i in range(2, m+1): 
  for val in it.combinations(a, i):
    if len(set(val)) == i:
      cnt += 1
print(1+n+cnt)

Input:

5 3
2 3 3 7 5

Output: 22

BUT I need a efficient solution, maybe mathematical solution using nCretc or programmatic solution.

Constraints:

1 <= K <= N <= 1,000,00
1 <= Ai <= 9,000

Time: 1 sec


Solution

  • Try this:

    import numpy as np
    mod = 1000000007
    n, k = map(int, input().split())
    a = list(map(int, input().split()))
    fre = [0]*10000
    A = []
    for i in range(0, n):
        fre[a[i]] += 1
    for i in range(0, 9001):
        if fre[i] > 0:
            A.append(fre[i])   
    kk = min( len( A ), k ) + 1
    S = np.zeros( kk, dtype=int );   S[0] = 1
    for a in A:
       S[1:kk] = (S[1:kk] + (a * S[0:kk-1])% mod) % mod
    ans = 0
    for s in S:
        ans = ((ans + s) % mod)
    print(ans)
    

    This program return all sub-sequences (only for count) that have unique elements.