Given a knapsack capacity W and a set of n items, each with a value value_i and a weight w_i, we need to calculate the maximum value we can obtain from the items with total weight ≤ W. This differs from the classical knapsack problem in that we are allowed to use an unlimited number of instances of each item (adapted from [1]).
There is a standard dynamic-programming solution for this problem: instead of creating a function f: CurrentWeight -> MaxRemainingValue, we create a function g: (CurrentWeight, CurrentItem) -> MaxRemainingValue that iterates through the items. This way we avoid cycles in the implicit graph formed by g, so we can apply dynamic programming. One standard implementation of such a function, in C++:
#include <iostream>
#include <vector>
#include <chrono>
#define INF (int)1e8

using namespace std;
using namespace std::chrono;

const int MAX_WEIGHT = 10000;
const int N_ITEMS = 10;

int memo[N_ITEMS][MAX_WEIGHT + 1];
vector<int> weights;
vector<int> values;

void initializeMemo(){
    for(int i = 0; i < N_ITEMS; i++)
        for(int j = 0; j < MAX_WEIGHT + 1; j++)
            memo[i][j] = -1;
}

// return max possible remaining value
int dpUnboundedKnapsack(int currentItem, int currentWeight){
    if(currentItem == N_ITEMS)
        return 0;

    int& ans = memo[currentItem][currentWeight];
    if(ans != -1)
        return ans;

    // try taking 0, 1, 2, ... copies of the current item
    int n_taken = 0;
    while(true){
        int newWeight = currentWeight + n_taken * weights[currentItem];
        if(newWeight > MAX_WEIGHT)
            break;
        int value = n_taken * values[currentItem];
        ans = max(ans, dpUnboundedKnapsack(currentItem + 1, newWeight) + value);
        n_taken++;
    }
    return ans;
}

int main(){
    initializeMemo();

    // weights all equal 1
    weights = vector<int>(N_ITEMS, 1);

    // values equal 1, 2, 3, ..., N_ITEMS
    values = vector<int>(N_ITEMS);
    for(int i = 0; i < N_ITEMS; i++)
        values[i] = i + 1;

    auto start = high_resolution_clock::now();
    cout << dpUnboundedKnapsack(0, 0) << endl;
    auto stop = high_resolution_clock::now();

    auto duration = duration_cast<microseconds>(stop - start);
    cout << duration.count() << endl;

    return 0;
}
This solution traverses a DAG (we never go back to a previous item, so there are no cycles; all edges are of the form (item, weight) -> (item+1, new_weight)). Each edge of the DAG is visited at most once, so the time complexity of this algorithm is O(E), where E is the number of edges of the graph. In the worst case, every weight equals 1, so each vertex (item, weight) connects to W/2 other vertices on average. We therefore have O(E) = O(W·#vertices) = O(W·W·n) = O(W^2·n). The problem is, everywhere I look on the internet it is said that the runtime complexity of this algorithm is O(W·n), because every vertex is computed only once [1][2][3]. This explanation does not seem to make sense. Also, if that were the case, the algorithm above should not run so slowly. Here is a table of the algorithm's runtime vs. MAX_WEIGHT (try it for yourself; just run the code above):
MAX_WEIGHT   time (microseconds)
       100         1349
      1000        45773
     10000      5490555
     20000     21249396
     40000     80694646
We can clearly see an O(W^2) trend for large values of W, as suspected.
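To make the W^2·n edge count concrete: when every weight equals 1, the inner while loop at vertex (item, w) runs for n_taken = 0, 1, ..., W − w, i.e. (W − w) + 1 iterations, one per outgoing edge. Summing over all cells gives (just the argument above written out, as a rough tally):

$$
E \;=\; \sum_{i=1}^{n} \sum_{w=0}^{W} \bigl((W - w) + 1\bigr) \;=\; n \cdot \frac{(W+1)(W+2)}{2} \;=\; \Theta(n W^2).
$$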
Finally, one may ask: this worst case scenario is pretty dull, as you can just keep the greatest value among items with the same weight. Indeed, with this simple pre-processing the worst case now becomes weights = [1, 2, 3, 4, ..., n]. In this case there are around O(W^2·log(n) + W·n) edges (see the image below; I tried my best, hope you understand). So shouldn't the runtime complexity of the algorithm be O(W^2·log(n) + W·n), instead of O(W·n) as suggested pretty much everywhere?
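For what it's worth, here is the back-of-the-envelope count behind that estimate: with weights = [1, 2, ..., n], vertex (item i, weight w) has ⌊(W − w)/i⌋ + 1 outgoing edges, so

$$
E \;=\; \sum_{i=1}^{n} \sum_{w=0}^{W} \left( \left\lfloor \frac{W-w}{i} \right\rfloor + 1 \right) \;\approx\; \sum_{i=1}^{n} \left( \frac{W^2}{2i} + W \right) \;=\; \frac{W^2}{2} H_n + nW \;=\; O\!\bigl(W^2 \log n + W n\bigr),
$$

where H_n ≈ ln n is the n-th harmonic number.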
Btw, here is the runtime I obtained for the case weights = [1, 2, 3, 4, ..., n]:
MAX_WEIGHT   time (microseconds)
       100          964
       200         1340
       400         2541
       800         6878
      1000        10407
     10000      1202582
     20000      5181070
     40000     18761689
[1] https://www.geeksforgeeks.org/unbounded-knapsack-repetition-items-allowed/
[2] https://en.wikipedia.org/wiki/Knapsack_problem#Dynamic_programming_in-advance_algorithm
You are correct that the runtime of your algorithm is indeed O(nW^2). Here’s an easy way to see this: there are O(nW) grid cells to fill in, and each cell requires O(W) time to process because you can take at most W copies of any one item.
However, there’s a faster way to compute the answer than what you’re proposing here. Specifically, we can eliminate the inner while loop in your code by using the following insight. Suppose you want to make the largest possible value using only the first n items of the list and with W available capacity. There are two options:

- You take no further copies of the n-th item. Then the best you can do is the best value achievable using only the first n−1 items with capacity W.
- You take one more copy of the n-th item (provided it fits). You gain its value, your remaining capacity drops by its weight, and you are still free to take additional copies of it afterwards, so the best you can do is its value plus the best value achievable using the first n items with capacity W minus its weight.

The answer is the larger of these two options.
Stated differently, instead of asking at each point “how many copies of this item do I want?”, you ask “am I done taking copies of this item, or do I want one more copy?”
This reduces the work done per element in the table to O(1), which is where the O(nW) runtime comes from.
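For concreteness, here is a minimal sketch of that recurrence in the same memoized, “remaining value” style as the code in the question (not the only way to write it). It assumes the question’s globals memo, weights, values, N_ITEMS, and MAX_WEIGHT, and simply replaces the recursive function:

// Returns the max value still obtainable, given that items before
// currentItem are settled and currentWeight capacity is already used.
// Work per memo cell is O(1), giving O(N_ITEMS * MAX_WEIGHT) overall.
int dpUnboundedKnapsackFast(int currentItem, int currentWeight){
    if(currentItem == N_ITEMS)
        return 0;

    int& ans = memo[currentItem][currentWeight];
    if(ans != -1)
        return ans;

    // Option 1: take no further copies of this item, move to the next one.
    ans = dpUnboundedKnapsackFast(currentItem + 1, currentWeight);

    // Option 2: take one more copy of this item (if it still fits) and
    // stay on the same item, so further copies remain possible.
    if(currentWeight + weights[currentItem] <= MAX_WEIGHT)
        ans = max(ans, values[currentItem]
                       + dpUnboundedKnapsackFast(currentItem, currentWeight + weights[currentItem]));

    return ans;
}

Note that option 2 recurses on the same item while advancing the weight by only weights[currentItem], so the recursion can go roughly W / min_i w_i levels deep; for very large W, a bottom-up loop over the same recurrence avoids any stack-depth concern.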