I'v a sequence of integer number [A1, A2, A3.....AN]
I'm trying to count all sub-sequences which contain at most K unique numbers.
Given sequence:: [2, 3, 3, 7, 5]
and K = 3
All sub-sequences are:
[],
[2],[3],[3],[7],[5],
[2, 3],[2, 3],[2, 7],[2, 5],[3, 3],[3, 7],[3, 5],[3, 7],[3, 5],[7, 5],
[2, 3, 3],[2, 3, 7],[2, 3, 5],[2, 3, 7],[2, 3, 5],[2, 7, 5],[3, 3, 7],[3, 3, 5],[3, 7, 5],[3, 7, 5],
[2, 3, 3, 7],[2, 3, 3, 5],[2, 3, 7, 5],[2, 3, 7, 5],[3, 3, 7, 5],
[2, 3, 3, 7, 5]
I need all sub-sequences (only for counting) that have unique elements.
[],
[2],[3],[3],[7],[5],
[2, 3],[2, 3],[2, 7],[2, 5],[3, 7],[3, 5],[3, 7],[3, 5],[7, 5],
[2, 3, 7],[2, 3, 5],[2, 3, 7],[2, 3, 5],[2, 7, 5],[3, 7, 5],[3, 7, 5]
Total = 22
[3, 3],
[2, 3, 3],[3, 3, 7],[3, 3, 5]
Ignored for higher length: length > K
(If k = 5
then ignored for duplicate elements):
[2, 3, 3, 7],[2, 3, 3, 5],[3, 3, 7, 5],
[2, 3, 3, 7, 5]
[ N.B: All sub-sequences not be unique, but their elements (number) should be identical. Ex- [2, 3],[2, 3] both are counted but [3, 3] ignored ]
import itertools as it
import sys, math
n, m = map(int, sys.stdin.readline().split())
a = list(map(int, sys.stdin.readline().split()))
cnt = 0
for i in range(2, m+1):
for val in it.combinations(a, i):
if len(set(val)) == i:
cnt += 1
print(1+n+cnt)
Input:
5 3
2 3 3 7 5
Output: 22
BUT I need a efficient solution, maybe mathematical solution using nCr
etc or programmatic solution.
1 <= K <= N <= 1,000,00
1 <= Ai <= 9,000
Time: 1 sec
Try this:
import numpy as np
mod = 1000000007
n, k = map(int, input().split())
a = list(map(int, input().split()))
fre = [0]*10000
A = []
for i in range(0, n):
fre[a[i]] += 1
for i in range(0, 9001):
if fre[i] > 0:
A.append(fre[i])
kk = min( len( A ), k ) + 1
S = np.zeros( kk, dtype=int ); S[0] = 1
for a in A:
S[1:kk] = (S[1:kk] + (a * S[0:kk-1])% mod) % mod
ans = 0
for s in S:
ans = ((ans + s) % mod)
print(ans)
This program return all sub-sequences (only for count) that have unique elements.