I have customers, jobs, orders and dates as columns. Customers can have multiple orders per job and of course multiple jobs given a certain date range.
What I am trying to do is figure out how many "new" orders there are within a certain date range. Meaning, if a job has 5 orders, it should only return 1 order.
Here's a short example of a database table, desired result, and the query I have been trying to work with thus far.
+-------+-------+---------+------------+
| CusID | JobID | OrderNo | OrderDate |
+-------+-------+---------+------------+
| 1 | 10 | 25 | 2021-12-22 |
| 1 | 10 | 26 | 2022-02-09 |
| 3 | 5 | 28 | 2022-01-10 |
| 3 | 6 | 29 | 2022-01-11 |
+-------+-------+---------+------------+
There are 4 rows of orders with their associated JobID. The query should return 2 rows since order 25 and order 26 are both associated with JobID 10, only 1 can be counted as "new", while OrderNo 25 falls outside of the outer queries date range.
Here is the query I am trying to use, based on the date being >= to the first of the year. I want to be able to use the OrderDate from the outer query, to compare with the OrderDate from the inner query. If there is a JobID with an OrderDate that is less than the current rows OrderDate that order is NOT new and should return the JobID, thus making the inner query false and filtering out the row.
SELECT * FROM orders
WHERE OrderDate >= '2022-01-01'
AND JobID NOT IN
(SELECT JobID FROM orders WHERE inner.OrderDate < outer.OrderDate)
Expected result
| 3 | 5 | 28 | 2022-01-10 |
| 3 | 6 | 29 | 2022-01-11 |
Use
row_numberto rank the orders for each job. Put that in a subquery and only accept thenewestoldest? order for each job.Demonstration