(python) 0/1 Knapsack Problem - Topic

Member

Posts: 4,841

Joined: Jan 16 2008

Gold: 0.00

Oct 22 2013 09:17pm

Hey. I was tasked to write a recursive solution to the 0/1 knapsack problem using dynamic programming. My solution seems to work and is considerably faster than the same algorithm without a memo-dictionary.

Code

def knapsack(itemlist, capacity, table = {}):
""" given a list of items and a maximum capacity, returns globally optimal
solution to 0/1 knapsack problem as (total_value, (item1, item2, ...)).
table stores intermediary solutions by mapping (itemlist, capacity) to
(best_value, best_items) tuples. """
if not itemlist or capacity == 0:
return (0, ())

# table lookup:
if (tuple(itemlist), capacity) in table:
return table[(tuple(itemlist), capacity)]

# discard too heavy items
consider = itemlist[0]
if consider.getWeight() > capacity:
return knapsack(itemlist[1:], capacity, table)

# explore: take item:
takeVal, takeList = knapsack(itemlist[1:], capacity - consider.getWeight(), table)
takeVal += consider.getValue()

# explore: don't take item:
dontVal, dontList = knapsack(itemlist[1:], capacity, table)

# update table, return better choice
if takeVal > dontVal:
choice = (takeVal, takeList + (consider,))
else:
choice = (dontVal, dontList)
table[(tuple(itemlist), capacity)] = choice
return choice

Now, after looking at the sample solution, I have two questions:
1. In the sample solution, the memo-dictionary is also updated for the case "if consider.getWeight() > capacity:". Does this even improve the efficency of the algorithm?
2. In the sample solution, the dictionary keys are of the form (len(itemlist), capacity). Why does this work? I'd figure you'd need to know which items you can choose from for the stored subsolutions.

Thanks!

tt_toby

Member

Posts: 4,841

Joined: Jan 16 2008

Gold: 0.00

Oct 25 2013 01:24pm

Since nobody is answering I assume the code is not clear?

Azrad

Member

Posts: 10,812

Joined: Oct 15 2009

Gold: Locked

Warn: 20%

Oct 25 2013 01:55pm

Quote (tt_toby @ Oct 25 2013 12:24pm)

Since nobody is answering I assume the code is not clear?

i am interested, and read it, so your not being ignored

. I just don't have anything useful to contribute at this time.

I'm sure others are in the same position. Perhaps could you provide an example what an itemlist might look like? I assume it must be sorted heaviest to lightest? Also could you expand on the methods getWeight() and getValue()

This post was edited by Azrad on Oct 25 2013 02:15pm

Azrad

Member

Posts: 10,812

Joined: Oct 15 2009

Gold: Locked

Warn: 20%

Oct 25 2013 02:19pm

Quote (tt_toby @ Oct 22 2013 08:17pm)

1. In the sample solution, the memo-dictionary is also updated for the case "if consider.getWeight() > capacity:". Does this even improve the efficency of the algorithm?

Well if you got a million items in the list heavier than the capacity of the knapsack (so they clearly can't be part of the solution for that knapsack), I imagine removing them from the list will dramatically increase performance.

This post was edited by Azrad on Oct 25 2013 02:20pm

tt_toby

Member

Posts: 4,841

Joined: Jan 16 2008

Gold: 0.00

Oct 25 2013 05:04pm

Hi Azrad. An Item is a simple object with a weight and a value. Here is the code:

Code

class Item(object):
def __init__(self, name, value, weight):
self.name = name
self.value = value
self.weight = weight

def getWeight(self):
return self.weight

def getValue(self):
return self.value

def __str__(self):
return self.name

def __repr__(self):
return str((self.name, self.value, self.weight))

An itemlist parameter contains an arbitrary number of Item objects. The knapsack function is supposed to find out the maximum value of items you can put into a sack with a maximum carrying weight of 'capacity'.

Quote (Azrad @ Oct 25 2013 08:55pm)

I assume it must be sorted heaviest to lightest?

Sorting the list is not required, but you can speed up the algorithm by sorting the list once on the first call of the function. I have done this in the improved code (see below).

Also, I have found out the answer to question 2: Once given a list parameter, any sublist is completely defined by its length, since the list is not shuffled throughout the function calls, and each function call (unless it hits the base case) cuts off the first element from the list.

Since 1 and 2 are answered, my new question would be how to speed up the improved algorithm even further. Here's the code:

Code

def knapsack(itemlist, capacity, table = {}, totals = None, presorted = False, VAL = 0, WEIGHT = 1):
""" Given a list of items and a maximum capacity, returns globally optimal solution
to 0/1 knapsack problem as (total value, items to take).

Default parameters:
table: stores subsolutions by mapping (length of sublist, capacity left) tuples
to (total value, items to take) tuples.
totals: list keeping track of total value and weight of items yet to consider.
presorted: specifies whether itemlist is sorted by weight.
VAL, WEIGHT: constants for readability. """
# sort itemlist from lightest to heaviest item (once)
if not presorted:
itemlist = sorted(itemlist, key = lambda item: item.getWeight())

# base cases
consider = itemlist[0]
if not itemlist or capacity == 0 or capacity < consider.getWeight():
return (0, ())

# table lookup
if (len(itemlist), capacity) in table:
return table[(len(itemlist), capacity)]

# init totals (once)
if totals == None:
totalVal, totalWeight = 0, 0
for item in itemlist:
totalVal += item.getValue()
totalWeight += item.getWeight()
totals = [totalVal, totalWeight]

# try to take everything
if capacity >= totals[WEIGHT]:
result = (totals[VAL], tuple(itemlist))
else:
# update totals for itemlist[1:]
totals = totals[:]
totals[VAL] -= consider.getValue()
totals[WEIGHT] -= consider.getWeight()

# explore: take item
capacityLeft = capacity - consider.getWeight()
takeVal, takeList = knapsack(itemlist[1:], capacityLeft, table, totals, True)
takeVal += consider.getValue()

# explore: don't take item
dontVal, dontList = knapsack(itemlist[1:], capacity, table, totals, True)

# choose better result
if takeVal > dontVal:
result = (takeVal, takeList + (consider,))
else:
result = (dontVal, dontList)

# update table and return
table[(len(itemlist), capacity)] = result
return result

Azrad

Member

Posts: 10,812

Joined: Oct 15 2009

Gold: Locked

Warn: 20%

Oct 25 2013 05:43pm

Quote (tt_toby @ Oct 25 2013 04:04pm)

my new question would be how to speed up the improved algorithm even further.

Build with Cython?

Quote (tt_toby @ Oct 25 2013 04:04pm)

Sorting the list is not required, but you can speed up the algorithm by sorting the list once on the first call of the function. I have done this in the improved code (see below).

You're clearly better at this than me, I'm just an amateur... But I don't see how the following snippet (from your original post, I see your new code sorts the list) will accomplish anything if the list is not sorted (or perhaps that was your original complaint?):

Code

consider = itemlist[0]
if consider.getWeight() > capacity:
return knapsack(itemlist[1:], capacity, table)

if the 0th object is very light, but the 1st object is ridiculously heavy (heavier than capacity); that if statement will never evaluate as true, and the ridiculously heavy objects will not be removed from the list. However if the item list is sorted heaviest to lightest, then it will quickly eliminate all the ridiculously heavy objects, hopefully drastically reducing the solution space. Now I have to admit, I don't understand exactly how you are exploring the solution space yet (approx the 2nd half of your code), so maybe I'm just talking out of my ass.

/e Oh along the same lines, perhaps you could attack the sorted list from the back looking for items with values of 0, and remove those.

This post was edited by Azrad on Oct 25 2013 05:54pm

tt_toby

Member

Posts: 4,841

Joined: Jan 16 2008

Gold: 0.00

Oct 25 2013 06:10pm

Quote

But I don't see how the following snippet (from your original post, I see your new code sorts the list) will accomplish anything if the list is not sorted

It reduces the size of the problem. Let's make an example, we have a list of items [A, B] which is not sorted from heaviest to lightest (in fact it does not matter how it is sorted or if it is sorted at all).

A = Item('A', 5, 12)
B = Item('B', 8, 19)

knapsack([A,B], 20) will first consider to take item A. In the take branch, knapsack([B], 20 - 12) is called. Now, consider == B and because 20 - 12 < 19 (weight of B ) this function call will now execute the code snippet you were refering to and call knapsack([], 8) which is the base case and returns (0, ()). It's the same with longer list. As soon as an item too heavy for the current capacity left is encountered, it is discarded by calling the function without that item. Remember the considered item ist always the first on the list, which gets cut off in the next call by passing itemlist[1:].
Of course, ultimately the 'dont take A' branch will evaluate as better in this case.

This post was edited by tt_toby on Oct 25 2013 06:27pm

Go Back To Programming & Development Topic List