Collaborative Filtering with SQL

66 views Asked by At

i'm trying to create a recommendation system with sql. Im using the following code:

CREATE TABLE TestTable (
  UserId INT,
  ProductId INT,
  Rating DECIMAL(3, 1),
  PRIMARY KEY (UserId, ProductId)
);


INSERT INTO TestTable (UserId, ProductId, Rating)
VALUES
  (1, 1001, 4.5),
  (1, 1002, 3.8),
  (1, 1003, 4.2),
  (1, 1004, 3.0),
  (2, 1002, 4.0),
  (2, 1003, 3.5),
  (2, 1005, 4.1),
  (3, 1001, 3.2),
  (3, 1003, 4.5),
  (3, 1004, 4.0),
  (4, 1001, 4.1),
  (4, 1002, 3.9),
  (4, 1004, 3.7),
  (4, 1005, 4.2);
  
  SELECT t2.ProductId
FROM TestTable t1 
INNER JOIN TestTable t2 ON t1.UserId = 1 AND t2.UserId IN (2, 3, 4)
     AND t2.ProductId NOT IN (
         SELECT ProductId
         FROM TestTable
         WHERE UserId = 1
     )
     AND t1.ProductId = t2.ProductId
GROUP BY t2.ProductId
HAVING AVG(t2.Rating) >= 3 
     AND COUNT(DISTINCT t2.UserId) >= 3; 

The code returns zero rows. Maybe anybody here has an idea :) Thanks in advance!

1

There are 1 answers

0
lemon On BEST ANSWER

Here's the approach I would take:

  • gather similar users, according to items they both liked (matching product ids, similar avg rating, at least three products)
  • select products of those similar users, whose products are not yet liked by the original user
WITH cte AS (
    SELECT t2.UserId
    FROM       TestTable t1 
    INNER JOIN TestTable t2 
            ON t1.UserId = 1 AND t2.UserId IN (2, 3, 4)
           AND t1.ProductId = t2.ProductId
    GROUP BY t1.UserId, t2.UserId
    HAVING COUNT(t2.UserId) >= 3
       AND ABS(AVG(t2.Rating - t1.Rating))<1
)
SELECT t1.ProductId
FROM TestTable t1
WHERE EXISTS(SELECT 1 FROM cte t2 WHERE t1.userid = t2.userid)
  AND NOT EXISTS(SELECT 1 FROM TestTable t2 WHERE userid = 1 AND t1.ProductID = t2.ProductId)

Output:

productid
1005

Here's a PostgreSQL demo, although this is likely to be working on the most common DBMSs.