I am working on a Python map/reduce job in multiple parts.
My first map prints to stdout so that the first reduce can pick it up from stdin.
The result of the map looks like this:
frozenset([4]) 14
The reduce reads in frozenset([4]) as the key and 14 as the value.
How can I extract just the [4] from the key to pass to the output of the reduce?
The map looks like this:
import sys
data = sys.stdin.read()
dataset = []
for line in data.splitlines():
    dataset.append(map(int, line.strip().split(" ")))
c1 = []
for transaction in dataset:
    for item in transaction:
        if not [item] in c1:
            c1.append([item])
candidates = map(frozenset, c1)
sscnt = {}
for tid in dataset:
    for can in candidates:
        if can.issubset(tid):
            sscnt.setdefault(can, 0)
            sscnt[can] += 1
for key, val in sscnt.items():
    print key, val
The reduce looks like this:
import sys
min_support = 12
sscnt = {}
for input_line in sys.stdin:
    input_line = input_line.strip()
    key, value = input_line.split(" ")
    # the key is a string such as 'frozenset([4])', so int(key) would raise ValueError
    sscnt[key] = int(value)
retlist = []
for key in sscnt:
    support = sscnt[key]
    # compare the stored count, not the leftover loop variable `value`
    if support >= min_support:
        retlist.insert(0, key)
print retlist
The output from the reduce looks like this:
['frozenset([1])', 'frozenset([4])', 'frozenset([2])']
The input data looks like this:
1 2 3 5 8
2 3 4 7
1 2 4 5 7
1 2 4 6 7
1 2 3 4 5
1 2 4 5 6
1 2 4 6 9
1 2 4 8
3 5 6 8
1 2 4 7
1 2 4 5
1 2 4 9
3 5 6 9
1 2 4 7
3 5 6
1 2 4 8
1 5 6
3 5 9
1 2 4 6
4 5 6 7
Does eval work for you?
Signature: eval(source, globals=None, locals=None, /)
Docstring: Evaluate the given source in the context of globals and locals.
list(eval('frozenset([1])'))
Returns:
[1]
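If you would rather not run eval on text arriving over stdin, here is a minimal sketch of an alternative. It assumes the keys always look like the Python 2 repr shown in the question (frozenset([...])); the helper names parse_frozenset_key and split_line are made up for illustration and are not part of any library.

```python
import ast
import re

# Hypothetical helpers, not from the original post.
# Assumes keys use the Python 2 repr form, e.g. "frozenset([4])".
_KEY_PATTERN = re.compile(r"^frozenset\((\[.*\])\)$")


def parse_frozenset_key(key):
    """Return the list inside a key such as 'frozenset([4])'."""
    match = _KEY_PATTERN.match(key.strip())
    if match is None:
        raise ValueError("unexpected key format: %r" % key)
    # literal_eval accepts only Python literals, so "[4]" becomes [4]
    # and anything that is not a plain literal is rejected.
    return ast.literal_eval(match.group(1))


def split_line(line):
    """Split one mapper output line into (items, count)."""
    # Split on the last space so a key like "frozenset([1, 2])" stays whole.
    key, value = line.strip().rsplit(" ", 1)
    return parse_frozenset_key(key), int(value)
```

In the reduce you could then call split_line(input_line) instead of splitting on " " directly. Since lists are not hashable, convert the items with tuple(...) or frozenset(...) before using them as a dictionary key.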