I'm working on a system report that retrieves the last purchase cost of each item as of a cut-off date: for every location and item, find the latest purchase date (MAX(date)), and when that latest purchase has multiple costs, take the higher cost. The results are grouped by location and item. Please see the details below:
Query:
```sql
SELECT t1.location, t1.item, MAX(t1.cost) AS cost, t1.date
FROM MyTable AS t1
WHERE t1.date = (SELECT MAX(t2.date)
                 FROM MyTable AS t2
                 WHERE t1.location = t2.location
                   AND t1.item = t2.item
                   AND t1.cost <> 0
                   AND t2.date <= '2021/10/31')
GROUP BY t1.location, t1.item, t1.date;
```
MyTable Data:
| LOCATION | ITEM | COST | DATE |
|---|---|---|---|
| Loc1 | Item#1 | 100.00 | 02/01/2021 |
| Loc1 | Item#1 | 105.00 | 02/15/2021 |
| Loc1 | Item#1 | 103.00 | 02/19/2021 |
| Loc2 | Item#1 | 107.00 | 02/01/2021 |
| Loc2 | Item#1 | 105.00 | 02/13/2021 |
| Loc2 | Item#1 | 100.00 | 02/13/2021 |
Expected Result:
| LOCATION | ITEM | COST | DATE |
|---|---|---|---|
| Loc1 | Item#1 | 103.00 | 02/19/2021 |
| Loc2 | Item#1 | 105.00 | 02/13/2021 |
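For reference, the intended logic can be checked end-to-end with a small, self-contained sketch. It uses Python's built-in sqlite3 as a stand-in for the real MySQL/MSSQL tables (an assumption, purely for portability) and stores dates as ISO strings so that MAX() compares them correctly:

```python
import sqlite3

# Minimal sketch of the intended logic on the sample rows above.
# sqlite3 is a stand-in for MySQL/MSSQL; dates are ISO strings so
# that MAX(date) compares correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (location TEXT, item TEXT, cost REAL, date TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?, ?)",
    [
        ("Loc1", "Item#1", 100.00, "2021-02-01"),
        ("Loc1", "Item#1", 105.00, "2021-02-15"),
        ("Loc1", "Item#1", 103.00, "2021-02-19"),
        ("Loc2", "Item#1", 107.00, "2021-02-01"),
        ("Loc2", "Item#1", 105.00, "2021-02-13"),
        ("Loc2", "Item#1", 100.00, "2021-02-13"),
    ],
)

# Same shape as the question's query, with the cut-off date in ISO form.
rows = conn.execute("""
    SELECT t1.location, t1.item, MAX(t1.cost) AS cost, t1.date
    FROM MyTable AS t1
    WHERE t1.date = (SELECT MAX(t2.date)
                     FROM MyTable AS t2
                     WHERE t1.location = t2.location
                       AND t1.item = t2.item
                       AND t1.cost <> 0
                       AND t2.date <= '2021-10-31')
    GROUP BY t1.location, t1.item, t1.date
""").fetchall()
print(sorted(rows))
# -> [('Loc1', 'Item#1', 103.0, '2021-02-19'), ('Loc2', 'Item#1', 105.0, '2021-02-13')]
```

On the sample data, this reproduces exactly the two rows in the expected result table.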
Can somebody explain why the query runs fine on MSSQL, while on MySQL it hangs and takes far too long to execute and return the data?
Also, please note that MyTable holds the same number of rows on both MSSQL and MySQL, around 500K. While looking into the issue, some people said the query contains a correlated subquery, which is why it is inefficient. Can somebody help me rewrite the query so that it runs efficiently on MySQL? Please help!
I also tried executing it on MySQL against a table with only 1K rows, and there the query ran fine.
I hope you can help me; I need this badly. Thanks in advance :)
Let's assume your table structure is like this:
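(The original answer's CREATE TABLE block did not survive; the structure below is an assumption that merely matches the columns the query uses. Names and types are guesses based on the sample data.)

```sql
-- Assumed structure; column types are guesses based on the sample data.
CREATE TABLE MyTable (
    location VARCHAR(50),
    item     VARCHAR(50),
    cost     DECIMAL(10, 2),
    date     DATE
);
```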
As you can see, there's no index on the table. I created a fake table in my local database and inserted 500K rows of data. With your query, I stopped it after it had run for 10+ minutes because, honestly, that's more than enough to call the query inefficient. Then I changed it to a `JOIN` approach instead, keeping the same table structure as above (still no indexes) and the same 500K rows of data.
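(The answer's `JOIN` query block did not survive; the following is a reconstruction consistent with the approach described: pre-aggregate the latest date per location/item up to the cut-off in a derived table, then join back to pick the highest cost on that date. Treat it as a sketch, not the answerer's exact code.)

```sql
-- Reconstruction (sketch): the derived table finds each group's latest
-- date up to the cut-off; the outer query joins back and picks the
-- highest cost recorded on that date.
SELECT t1.location, t1.item, MAX(t1.cost) AS cost, t1.date
FROM MyTable AS t1
JOIN (SELECT location, item, MAX(date) AS max_date
      FROM MyTable
      WHERE date <= '2021-10-31'
      GROUP BY location, item) AS t2
  ON  t1.location = t2.location
  AND t1.item     = t2.item
  AND t1.date     = t2.max_date
GROUP BY t1.location, t1.item, t1.date;
```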
The `JOIN` version returned a total of 303 rows out of 500K in less than 1 second. But then, 303 rows is tiny compared to the total 500K, right? So I updated most of the `date` column values in the table to dates before '2021/10/31' and tested the query again. This time the query returned 416,507 rows out of 500K in about 16 seconds. Not fast by any means, but remember that this is on a table without any indexes. Let's try it with indexes. Now, I'm not going to add an index to one column at a time and test each in turn; instead, I'll give each column its own index in one go and then test. Syntax to add the indexes:
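(The original statement block was also lost; below is a sketch consistent with "each column gets its own index in one go". The index names are illustrative.)

```sql
-- One index per column, added in a single statement; names are illustrative.
ALTER TABLE MyTable
    ADD INDEX idx_location (location),
    ADD INDEX idx_item     (item),
    ADD INDEX idx_cost     (cost),
    ADD INDEX idx_date     (date);
```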
And the query result after the indexes were created:

*(result screenshot from the original answer, not reproduced here)*

The time it took to return is about half of what it took before the columns were indexed. The execution plan of the query looks like this:

*(execution-plan screenshot not reproduced here)*
Finally, let's go back to your original query and see how significantly the indexes affect its performance. With the indexes in place, your original query returned this:

*(result screenshot not reproduced here)*

And the execution plan:

*(execution-plan screenshot not reproduced here)*
As a conclusion: without indexes on your table, your original query will definitely take forever to run, whereas turning it into a `JOIN` significantly reduces the duration. With the indexes set, the `JOIN` query is around 2-3 seconds faster than your original query; however, the most important thing to note is that your original query struggled to return even a mere 303 rows before the table had indexes. You haven't posted your table structure yet, and I did post a (now deleted) comment asking for it; but after running the test myself, I'm pretty sure your current table doesn't have any indexes.

P/S: This test data is too big for fiddles, so I've put the test code in Pastebin instead. You can run it on your local server. The insert-data syntax is only compatible with MySQL v8+ and MariaDB 10.2.2 and above.
https://pastebin.com/GdXhQ5mw