I'm working on a system report that retrieves the last purchase cost of each item as of a cut-off date: for every location and item, find the latest purchase date (MAX(date)), and when that latest purchase has multiple costs, take the higher cost. The results are grouped by location and item. Please see the details below:
Query:
```sql
SELECT t1.location, t1.item, MAX(t1.cost) AS cost, t1.date
FROM MyTable AS t1
WHERE t1.date = (SELECT MAX(t2.date)
                 FROM MyTable AS t2
                 WHERE t1.location = t2.location
                   AND t1.item = t2.item
                   AND t1.cost <> 0
                   AND t2.date <= '2021/10/31')
GROUP BY t1.location, t1.item, t1.date;
```
MyTable Data:
| LOCATION | ITEM | COST | DATE |
|---|---|---|---|
| Loc1 | Item#1 | 100.00 | 02/01/2021 |
| Loc1 | Item#1 | 105.00 | 02/15/2021 |
| Loc1 | Item#1 | 103.00 | 02/19/2021 |
| Loc2 | Item#1 | 107.00 | 02/01/2021 |
| Loc2 | Item#1 | 105.00 | 02/13/2021 |
| Loc2 | Item#1 | 100.00 | 02/13/2021 |
Expected Result:
| LOCATION | ITEM | COST | DATE |
|---|---|---|---|
| Loc1 | Item#1 | 103.00 | 02/19/2021 |
| Loc2 | Item#1 | 105.00 | 02/13/2021 |
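For reference, the intended logic can be checked end-to-end with a small, self-contained sketch. It uses Python's built-in sqlite3 as a stand-in for the real MySQL/MSSQL tables (an assumption, purely for portability) and stores dates as ISO strings so that MAX() compares them correctly:

```python
import sqlite3

# Minimal sketch of the intended logic on the sample rows above.
# sqlite3 is a stand-in for MySQL/MSSQL; dates are ISO strings so
# that MAX(date) compares correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (location TEXT, item TEXT, cost REAL, date TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        ("Loc1", "Item#1", 100.00, "2021-02-01"),
        ("Loc1", "Item#1", 105.00, "2021-02-15"),
        ("Loc1", "Item#1", 103.00, "2021-02-19"),
        ("Loc2", "Item#1", 107.00, "2021-02-01"),
        ("Loc2", "Item#1", 105.00, "2021-02-13"),
        ("Loc2", "Item#1", 100.00, "2021-02-13"),
    ],
)

# Same shape as the question's query, with the cut-off date in ISO form.
rows = conn.execute("""
    SELECT t1.location, t1.item, MAX(t1.cost) AS cost, t1.date
    FROM MyTable AS t1
    WHERE t1.date = (SELECT MAX(t2.date)
                     FROM MyTable AS t2
                     WHERE t1.location = t2.location
                       AND t1.item = t2.item
                       AND t1.cost <> 0
                       AND t2.date <= '2021-10-31')
    GROUP BY t1.location, t1.item, t1.date
""").fetchall()
print(sorted(rows))
# -> [('Loc1', 'Item#1', 103.0, '2021-02-19'), ('Loc2', 'Item#1', 105.0, '2021-02-13')]
```

On the sample data, this reproduces exactly the two rows in the expected result table.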
Can somebody explain why the query runs fine on MSSQL, while on MySQL it hangs and takes far too long to execute and return the data?
Also, please note that MyTable holds the same number of rows on both MSSQL and MySQL, around 500K. While looking into the issue, some people said the query contains a correlated subquery, which is why it is inefficient. Can somebody help me rewrite the query so that it runs efficiently on MySQL? Please help!
I also tried executing it on MySQL against a table with only 1K rows, and there the query ran fine.
I hope you can help me; I need this badly. Thanks in advance :)
Let's assume your table structure is like this:
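(The original answer's CREATE TABLE block did not survive; the structure below is an assumption that merely matches the columns the query uses. Names and types are guesses based on the sample data.)

```sql
-- Assumed structure; column types are guesses based on the sample data.
CREATE TABLE MyTable (
    location VARCHAR(50),
    item     VARCHAR(50),
    cost     DECIMAL(10, 2),
    date     DATE
);
```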
As you can see, there's no index on the table. I created a fake table in my local database and inserted 500K rows of data. With your query, I stopped it after it had run for 10+ minutes because, honestly, that's more than enough to call the query inefficient. Then I changed it to a `JOIN` approach instead, keeping the same table structure as above (still no indexes) and the same 500K rows of data.
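(The answer's `JOIN` query block did not survive; the following is a reconstruction consistent with the approach described: pre-aggregate the latest date per location/item up to the cut-off in a derived table, then join back to pick the highest cost on that date. Treat it as a sketch, not the answerer's exact code.)

```sql
-- Reconstruction (sketch): the derived table finds each group's latest
-- date up to the cut-off; the outer query joins back and picks the
-- highest cost recorded on that date.
SELECT t1.location, t1.item, MAX(t1.cost) AS cost, t1.date
FROM MyTable AS t1
JOIN (SELECT location, item, MAX(date) AS max_date
      FROM MyTable
      WHERE date <= '2021-10-31'
      GROUP BY location, item) AS t2
  ON  t1.location = t2.location
  AND t1.item     = t2.item
  AND t1.date     = t2.max_date
GROUP BY t1.location, t1.item, t1.date;
```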
The `JOIN` version returned a total of 303 rows out of 500K in less than 1 second. But then, 303 rows is tiny compared to the total 500K, right? So I updated most of the `date` column values in the table to dates before '2021/10/31' and tested the query again. This time the query returned 416,507 rows out of 500K in about 16 seconds. Not fast by any means, but remember that this is on a table without any indexes. Let's try it with indexes. Now, I'm not going to add an index to one column at a time and test each in turn; instead, I'll give each column its own index in one go and then test. Syntax to add the indexes:
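(The original statement block was also lost; below is a sketch consistent with "each column gets its own index in one go". The index names are illustrative.)

```sql
-- One index per column, added in a single statement; names are illustrative.
ALTER TABLE MyTable
    ADD INDEX idx_location (location),
    ADD INDEX idx_item     (item),
    ADD INDEX idx_cost     (cost),
    ADD INDEX idx_date     (date);
```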
And the query result after the indexes were created:

*(result screenshot from the original answer, not reproduced here)*

The time it took to return is about half of what it took before the columns were indexed. The execution plan of the query looks like this:

*(execution-plan screenshot not reproduced here)*
Finally, let's go back to your original query and see how significantly the indexes affect its performance. With the indexes in place, your original query returned this:

*(result screenshot not reproduced here)*

And the execution plan:

*(execution-plan screenshot not reproduced here)*
As a conclusion: without indexes on your table, your original query will definitely take forever to run, whereas turning it into a `JOIN` significantly reduces the duration. With the indexes set, the `JOIN` query is around 2-3 seconds faster than your original query; however, the most important thing to note is that your original query struggled to return even a mere 303 rows before the table had indexes. You haven't posted your table structure yet, and I did post a (now deleted) comment asking for it; but after running the test myself, I'm pretty sure your current table doesn't have any indexes.

P/S: This test data is too big for fiddles, so I've put the test code in Pastebin instead. You can run it on your local server. The insert-data syntax is only compatible with MySQL v8+ and MariaDB 10.2.2 and above.
https://pastebin.com/GdXhQ5mw