sqlsql-servercombinationscombinatoricspurchase-order

Optimal selection for ordering multiple items (parts) from multiple suppliers (vendors)


The task here is to define the optimal (as detailed below) way of ordering items (parts) from suppliers.

The relevant parts of the table schema (with some sample data) are

Items

ID  NUMBER
1   Item0001
2   Item0002
3   Item0003

Suppliers

ID  NAME         DELIVERY DISCOUNT
1   Supplier0001 0        0
2   Supplier0002 0        0.025
3   Supplier0003 20       0

DELIVERY is the delivery charge (in dollars) levied by that supplier on each delivery. DISCOUNT is the settlement discount (as a percentage i.e. 2.5% for ID=2 above) allowed by that supplier for on time payment.

SupplierItems

SUPPLIER_ID ITEM_ID PRICE
1           2       21.67
1           5       45.54
1           7       32.97

This is the many-to-many join between suppliers and items with the price that supplier charges for that item (in dollars). Every item has at least 1 supplier but some have more than one. A supplier may have no items.

PartsRequests

ID ITEM_ID QUANTITY LOCATION_ID ORDER_ID
1  59      4        2           (null)
2  89      5        2           (null)
3  42      4        2           (null)

This table is a request from a field site for parts to be ordered and delivered by the supplier to that site. A delivery of any number of items to a site attracts a delivery charge. When the parts are ordered, the ORDER_ID is inserted into the table so we are only concerned with those where ORDER_ID IS NULL

The question is, what is the optimal way to order these parts for each `LOCATION' where there are 3 optimal solutions that need to be presented to the user for selection.

  1. The combination of orders with the least number of suppliers
  2. The combination of orders with the lowest total cost i.e. The sum of QUANTITY*PRICE for each item plus the DELIVERY for each order summed over all orders ignoring DISCOUNT
  3. As item 2 but accounting for DISCOUNT

Clearly I need to determine the combinations of orders that are available and then determining the optimal ones becomes trivial but I am a bit stuck on an efficient way to deal with building the combinations.

I have built some SQL fiddles in SQL Server 2008 with random data. This one has 100 items, 10 suppliers and 100 requests. This one has 1000 items, 50 suppliers and 250 requests. The table schema is the same.

Update

I reasoned that the solution had to be recursive and I built a nice table valued function to get but I ran into the 32 hard limit on recursion in SQL Server. I was uncomfortable with it anyway because it hinted more of a procedural language solution than a RDMS.

So I am now playing with CTE recursion.

The root query is:

SELECT DISTINCT
       '' SOLUTION_ID
      ,LOCATION_ID
      ,SUPPLIER_ID
      ,(subquery I haven't quite worked out) SOLE_SUPPLIER
FROM PartsRequests pr
     INNER JOIN
     SupplierItems si ON pr.ITEM_ID=si.ITEM_ID
WHERE pr.ORDER_ID IS NULL

This gets all the suppliers that can supply the required items and is certainly a solution, probably not optimal. The subquery sets a flag if the supplier is the sole supplier of any product required for that location; if so they must be part of any solution.

The recursive part is to remove suppliers one by one by means of CTE.SUPPLIER_ID<>CTE.SUPPLIER_ID and add them if they still cover all the items. The SOLUTION_ID will be a CSV list of the suppliers removed, partly to uniquely identify each solution and partly to check against so I get combinations instead of permutations.

Still working on the details, the purpose of this update was to allow the Community to say "Yay, looks like that will work" or, alternatively "You moron, that won't work because ..."

Thanks


Solution

  • This is a more general answer (as in, not sql) as I think solving this problem will require something more powerful. Your first scenario is to select a minimum number of suppliers. This problem can be seen as a set cover problem as you are trying to cover all demands per site with the suppliers. This problem is already NP-complete.

    Your third scenario seems to be basically the same as the second. You just have to take the discount into account in the prices, assuming you pay on time for every order.

    The second scenario is at least NP-hard as I see a lot of resemblance with the facility location problem. You are trying to decide which suppliers (facilities) to use (open) to cover your orders (demands) based on their prices and delivery costs (opening costs).

    Enumerating your possible solutions seems infeasible as with 10 suppliers, you have 2^10 possibilities of using them, further complicated by the distribution of demands internally.

    I would suggest some dynamic programming to first select the suppliers that you have to use (=they are the only ones that deliver a specific thing), eliminating some possibilities (if the cost for supplier A +delivery cost A< cost for supplier B) and then trying to expand your set of possible solutions. Linear programming is also a valid train of thought.