Why SQL queries hang on MySQL while MSSQL is OK

51 views Asked by At

I'm working on a system report that will retrieve last purchase cost of item with a condition of cut-off date and retrieve the max(date) "latest purchase" and when a latest purchase has multiple cost it will retrieve the higher cost. Also it is being group by location and item, please see details below:

Query:

SELECT t1.location, t1.item, MAX(t1.cost) AS cost, 
       t1.date 
  FROM MyTable AS t1 
     WHERE t1.date = (SELECT MAX(t2.date) 
                        FROM MyTable AS t2 
                             WHERE t1.location = t2.location 
                             AND t1.item = t2.item 
                             AND t1.cost <> 0 
                             AND t2.date <= '2021/10/31') 
 GROUP BY t1.location, t1.item, t1.date;

MyTable Data:

LOCATION ITEM COST DATE
Loc1 Item#1 100.00 02/01/2021
Loc1 Item#1 105.00 02/15/2021
Loc1 Item#1 103.00 02/19/2021
Loc2 Item#1 107.00 02/01/2021
Loc2 Item#1 105.00 02/13/2021
Loc2 Item#1 100.00 02/13/2021

Expected Result:

LOCATION ITEM COST DATE
Loc1 Item#1 103.00 02/19/2021
Loc2 Item#1 105.00 02/13/2021

Can somebody help me to explain why the queries is OK on MSSQL while on MySQL it hang-up and take too long to execute and retrieve the data???

Also please take note that the number of ROWS/DATA on the MyTable is the same on MSSQL and MySQL around 500K ROWS. I tried to find the issues some say that the queries are have uncorrelated subquery, that's why is is inefficient query. Can somebody help me to re-code the the query in able to make it efficient and run on MySQL??? Please help!

I also tried to execute it on MySQL on table with 1K ROWS it seems the query is OK.

I hope you can help me and I needed it badly. Thanks in advance :)

1

There are 1 answers

0
FanoFN On

Let's assume your table structure is like this:

CREATE TABLE `mytable` (
  `Location` varchar(50) DEFAULT NULL,
  `item` varchar(50) DEFAULT NULL,
  `cost` decimal(14,2) DEFAULT NULL,
  `date` date DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

As you can see, there's no index being set on the table. I've created a fake table in my local database and insert 500K rows of data. With your query, I stopped after 10+ minutes it ran because honestly, that's more than enough to say the query is inefficient. Then I changed it to JOIN approach instead, so the modified query is like this:

SELECT t1.location, t1.item, MAX(t1.cost) AS cost, 
       t1.date 
  FROM MyTable AS t1 
     JOIN (SELECT MAX(t2.date) t2d,  t2.location, t2.item
                        FROM MyTable AS t2 
                             WHERE t2.date <= '2021/10/31'
                             GROUP BY  t2.location, t2.item) ref
   ON t1.date = ref.t2d 
   AND t1.location = ref.location 
   AND t1.item = ref.item 
 WHERE t1.cost <> 0 
 GROUP BY t1.location, t1.item, t1.date;

Still with the same table structure as above (with no index) and 500K of data, this query returns me:

1 queries executed, 1 success, 0 errors, 0 warnings

303 row(s) affected

Execution Time : 0.531 sec
Transfer Time  : 0 sec
Total Time     : 0.532 sec

A total of 303 rows out of 500K data in less than 1 second. But then, 303 rows is just a little compared to the total 500K, right? So I've updated most of the date column in the table all to before '2021/10/31' and test the query again and here's the result:

1 queries executed, 1 success, 0 errors, 0 warnings

416507 row(s) affected

Execution Time : 16.413 sec
Transfer Time  : 0.208 sec
Total Time     : 16.621 sec

The query returns 416,507 of data out of 500K in about 16 seconds. Not by all means fast but remember that this is on a table without indexes. Let's try this with indexes. Now, I'm not going to do index on one by one of the column then test it one by one too, rather I'm just going to assign each of the column with its own index in one go and do the testing. Syntax to add indexes:

ALTER TABLE mytable   
  ADD INDEX (Location),
  ADD INDEX (item),
  ADD INDEX (cost),
  ADD INDEX (date)

And the query result after the indexes been created:

1 queries executed, 1 success, 0 errors, 0 warnings

416507 row(s) affected

Execution Time : 7.753 sec
Transfer Time  : 0.220 sec
Total Time     : 7.973 sec

The time it took to return is half from before it has indexes on the columns. The execution plan of the query looks like this:

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY ALL NULL NULL NULL NULL 248559 Using where; Using temporary; Using filesort
1 PRIMARY t1 ref Location,item,cost,date Location 53 ref.location 1 Using where
2 DERIVED t2 ALL Location,item,date NULL NULL NULL 497119 Using where; Using temporary; Using filesort

Finally, let's go back to your original query and see how significant the indexes affecting it performance. With the index in place, your original query returns this:

1 queries executed, 1 success, 0 errors, 0 warnings

Query: SELECT t1.location, t1.item, MAX(t1.cost) AS cost, 
       t1.date 
  FROM MyTable AS t1 
     WHERE t1.date = (SELECT MAX(t2.date) 
                        FROM MyTable AS t2 
                             WHERE t1.location = t2.location 
                             AND t1.item = t2.item 
                             AND t1.cost <> 0 
                             AND t2.date <= '2021/10/31') 
 GROUP BY t1.location, t1.item, t1.date;
  
416507 row(s) affected

Execution Time : 9.840 sec
Transfer Time  : 0.202 sec
Total Time     : 10.042 sec

And the execution plan:

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 ALL NULL NULL NULL NULL 497119 Using where; Using temporary; Using filesort
2 DEPENDENT SUBQUERY t2 ref Location,item,date Location 53 test.t1.Location 1 Using where

As a conclusion, without indexes on your table, your original query definitely take forever to run whereas turning it into a JOIN significantly reduces the duration. With indexes being set, the JOIN query is around 2-3 seconds faster than your original query however the most important thing to note is that your original query struggle to even return a mere 303 rows of data result before the table have indexes. You haven't post your table structure yet and I did post a (now deleted) comment asking for it however after doing the test myself, I'm pretty sure that your current table don't have any index.

P/S: This test data is too big for fiddles so I have the test code in Pastebin instead. You can test this on your local server. The insert data syntax is only compatible with MySQL v8+ and MariaDB 10.2.2 and above.

https://pastebin.com/GdXhQ5mw