[SOLVED] How many sub-sequences of unique elements can be possible?

How many sub-sequences of unique elements can be possible?

I'v a sequence of integer number [A1, A2, A3.....AN] I'm trying to count all sub-sequences which contain at most K unique numbers.

For Example:

Given sequence:: [2, 3, 3, 7, 5] and K = 3

All sub-sequences are:

[],
[2],[3],[3],[7],[5],
[2, 3],[2, 3],[2, 7],[2, 5],[3, 3],[3, 7],[3, 5],[3, 7],[3, 5],[7, 5],
[2, 3, 3],[2, 3, 7],[2, 3, 5],[2, 3, 7],[2, 3, 5],[2, 7, 5],[3, 3, 7],[3, 3, 5],[3, 7, 5],[3, 7, 5],
[2, 3, 3, 7],[2, 3, 3, 5],[2, 3, 7, 5],[2, 3, 7, 5],[3, 3, 7, 5],
[2, 3, 3, 7, 5]

I need all sub-sequences (only for counting) that have unique elements.

Counted sub-sequences are:

[],
[2],[3],[3],[7],[5],
[2, 3],[2, 3],[2, 7],[2, 5],[3, 7],[3, 5],[3, 7],[3, 5],[7, 5],
[2, 3, 7],[2, 3, 5],[2, 3, 7],[2, 3, 5],[2, 7, 5],[3, 7, 5],[3, 7, 5]

Total = 22

Not counted sub-sequences are:

[3, 3],
[2, 3, 3],[3, 3, 7],[3, 3, 5]

Ignored for higher length: length > K (If k = 5 then ignored for duplicate elements):

[2, 3, 3, 7],[2, 3, 3, 5],[3, 3, 7, 5],
[2, 3, 3, 7, 5]

[ N.B: All sub-sequences not be unique, but their elements (number) should be identical. Ex- [2, 3],[2, 3] both are counted but [3, 3] ignored ]

::CODE::

import itertools as it
import sys, math

n, m = map(int, sys.stdin.readline().split())
a = list(map(int, sys.stdin.readline().split()))

cnt = 0
for i in range(2, m+1): 
  for val in it.combinations(a, i):
    if len(set(val)) == i:
      cnt += 1
print(1+n+cnt)

Input:

5 3
2 3 3 7 5

Output: 22

BUT I need a efficient solution, maybe mathematical solution using nCretc or programmatic solution.

Constraints:

1 <= K <= N <= 1,000,00
1 <= Ai <= 9,000

Time: 1 sec

Solution

Try this:

import numpy as np
mod = 1000000007
n, k = map(int, input().split())
a = list(map(int, input().split()))
fre = [0]*10000
A = []
for i in range(0, n):
    fre[a[i]] += 1
for i in range(0, 9001):
    if fre[i] > 0:
        A.append(fre[i])   
kk = min( len( A ), k ) + 1
S = np.zeros( kk, dtype=int );   S[0] = 1
for a in A:
   S[1:kk] = (S[1:kk] + (a * S[0:kk-1])% mod) % mod
ans = 0
for s in S:
    ans = ((ans + s) % mod)
print(ans)

This program return all sub-sequences (only for count) that have unique elements.