Taking an existing customer product inventory list and creating bundles out of the products

59 views Asked by At

I'm working on SQL Server 2019 and have a problem I am trying to solve which I can best describe as "bundling" - the end result is a significantly shrunken dataset and fewer inventory items to keep track of. Here is a quick example of the dataset:

Account ID Product ID
1000 10
1000 20
1000 30
1001 10
1001 20
1001 30
1001 50
1001 60
1001 70
1002 50
1002 60
1002 70
1003 80
1004 10
1004 20
1004 80

I created a new table which the "bundles" reside in as a reference table:

Bundle ID Product ID
1 10
1 20
1 30
2 50
2 60
2 70
3 80

Quick note: In order for a bundle to be present on a given Account ID, ALL Product IDs within the bundle must be present. My goal is to have this sort of output:

Account ID Bundle ID # of Products outside Bundles
1000 1 0
1001 1 0
1001 2 0
1002 2 0
1003 3 0
1004 3 2

I expect to be eliminating/reviewing any outliers in the result set, i.e. Account IDs which have a number in the third column =/ 0

Initially I was just trying to use cases before making the actual table that the Bundles reside in, and then opted to put the product selection inside of a CTE. My second approach was to string together every product name (another column part of the first table, names are distinct) and associate the extended string listing every single product the account owns as a new column within the CTE. Then, if the Account ID hits every single qualifier (i.e. PRODUCTSTRING LIKE 'Product1' AND PRODUCTSTRING LIKE 'Product2' . . . ) then it fulfills the case. This approach feels shaky at best and at worst like it will just give incorrect results for me unknowingly.

The only other approach that I can think of thus far is to count the number of unique products an Account ID has and match it to the count of unique products of a bundle, and then run cases based off of that, but even that feels like it will become extremely bulky very quickly.

The Account ID/Product ID view I'm working off of has around 1300 rows. Could anyone point me in the direction of either an approach I have not considered or a tool I am missing?

1

There are 1 answers

0
Charlieface On

This starts off as a kind of Relational Division With Remainder problem. You are missing the Account and Bundle tables, which store the unique values respectively, I have assumed these exist.

However, this is going to be extremely difficult to get what you want, especially given how poorly defined the problem is.

If you wanted to know just how many ProductsRemaining there are after applying a single Bundle, you can do as follows:

  • Cross-join Account with Bundle
  • Then, within an APPLY...
  • ... take the AccountProduct for that Account...
  • ... and the BundleProduct for that Bundle...
  • ... and full-join them.
  • Remove anything which has a missing AccountProduct, ie there exists a BundleProduct with no matching AccountProduct.
  • Take the count of AccountProduct with no BundleProduct.
SELECT
  a.Id AS AccountId,
  b.Id AS BundleId,
  bp.*
FROM Account a
CROSS JOIN Bundle b
CROSS APPLY (
    SELECT
      COUNT(*) - COUNT(bp.ProductId) AS ProductsRemaining, COUNT(*) - COUNT(ap.ProductId) ap, count(*) c
    FROM (
        SELECT *
        FROM AccountProduct ap
        WHERE ap.AccountId = a.Id
    ) ap
    FULL JOIN (
        SELECT *
        FROM BundleProduct bp
        WHERE bp.BundleId = b.Id
    ) bp ON bp.ProductId = ap.ProductId
    HAVING COUNT(*) = COUNT(ap.ProductId)  -- no missing products
) bp;

db<>fiddle

The problems start when, judging by your comments, you want to be able to take any possible combination of Bundle to make up the products, but we don't necessarily know which combination. There may be multiple possible combinations, some may be contradictory. You may need a combinatorics approach in some procedural language, trying every possible combnation of bundles.

For example, what happens if you put together one bundle which contains products 1, 2 and another containing 2, 3, is that a valid combination for an account which has products 1, 2, 3?