I have a question about the methods for doing the climbing hill algorithm with a specific problem I have.
I have 2 metrics: Sum_x and Sum_y, and I have about a hundred rows in this format:
CategoryA | CategoryB | x | y |
---|---|---|---|
A | 10 | 1 | 4 |
A | 12 | 3 | 1 |
B | 10 | 7 | 4 |
C | 17 | 3 | 3 |
where the combination of CategoryA and CategoryB are unique. I need to find which rows are included in Sum_x and Sum_y.
Do you know how I can do this? My current approach is to code each row in binary, starting with all rows as 0, and change each value from 0 to 1 to sum the x and y column, plotting the sums of x and y and finding which matches.
Is this the right approach? I'm trying to write this in Q for now, but I think even Python will work for this.
here's a solution using dynamic programming:
find_subsets_summing_to_value: { [p_list; p_sum]
len:count p_list;
// dp[x][y] = 1b iff any of the first x elements of p_list sum to y
dp: #[1+len; enlist (1b , p_sum#0b)] {: .[y; z; :; y[z[0]-1][z[1]] | y[z[0]-1][z[1] - x[z[0]-1]]]; }[p_list]/ cross[1+ til count p_list; 1 + til p_sum];
backtrack: { [p_dp; p_list; p_subsets; p_current_subset; p_idx; p_target]
if[ (p_idx=0)&p_target=0; : p_subsets , enlist p_current_subset; ];
if[ (p_idx=0)| p_target<0; : p_subsets; ];
if[ p_dp[p_idx][p_target];
p_subsets: .z.s[p_dp; p_list; p_subsets; p_current_subset , p_list[p_idx-1]; p_idx - 1; p_target - p_list[p_idx - 1]];
];
: .z.s[p_dp; p_list; p_subsets; p_current_subset; p_idx - 1; p_target]
};
: backtrack[dp; p_list; (); (); len; p_sum];
};
find_matches: { [p_tbl; p_sumX; p_sumY]
: {distinct x where {y=sum x `y}[;y] each x}[;p_sumY] raze { {`CategoryA`CategoryB xasc y $[1=count x; enlist x; x]}[;y] each (cross/)group[y `x] x }[;p_tbl] each distinct find_subsets_summing_to_value[p_tbl `x; p_sumX];
};
demo_find_matches: { [p_tbl; p_sumX; p_sumY]
show "***************************************";
show p_tbl;
-1 "SUM_X: " , string[p_sumX];
-1 "SUM_Y: " , string[p_sumY];
show "***************************************";
show each find_matches[p_tbl; p_sumX; p_sumY];
-1 "\n\n";
};
q)demo_find_matches[([] CategoryA:`A`A`B`C; CategoryB:10 12 10 17; x:1 3 7 3; y:4 1 4 3); 8; 8]
"***************************************"
CategoryA CategoryB x y
-----------------------
A 10 1 4
A 12 3 1
B 10 7 4
C 17 3 3
SUM_X: 8
SUM_Y: 8
"***************************************"
CategoryA CategoryB x y
-----------------------
A 10 1 4
B 10 7 4
q)demo_find_matches[([] CategoryA:`A`A`B`B`C`C; CategoryB:1 2 1 2 1 2; x:1 2 2 3 4 5; y:7 5 4 2 9 5); 4; 9]
"***************************************"
CategoryA CategoryB x y
-----------------------
A 1 1 7
A 2 2 5
B 1 2 4
B 2 3 2
C 1 4 9
C 2 5 5
SUM_X: 4
SUM_Y: 9
"***************************************"
CategoryA CategoryB x y
-----------------------
C 1 4 9
CategoryA CategoryB x y
-----------------------
A 1 1 7
B 2 3 2
CategoryA CategoryB x y
-----------------------
A 2 2 5
B 1 2 4