When MySQL 5.7 DMR3 was released, I couldn’t wait to try out the new UNION ALL
optimization that no longer requires data to be materialized in a temporary table.
This new optimization can be seen via EXPLAIN
, but it is also nice to run it through SHOW PROFILES
, which breaks down query execution step by step. However, this feature is now deprecated (since it overlaps with performance_schema
), and will be removed in a future version.
So today I wanted to show you a combination of:
* What a UNION ALL
statement looks like in MySQL 5.6 (EXPLAIN
, SHOW PROFILES
).
* How it is improved in MySQL 5.7 (EXPLAIN
, SHOW PROFILES
).
* How you can easily emulate the SHOW PROFILES
feature with performance_schema
+ ps_helper 🙂
Initial Setup
The UNION ALL
query I am going to demonstrate uses the following fake sample data:
use test; CREATE TABLE `table_a` ( `col1` int(11) NOT NULL AUTO_INCREMENT, `col2` char(255) DEFAULT NULL, PRIMARY KEY (`col1`) ) ENGINE=InnoDB; CREATE TABLE `table_b` ( `col1` int(11) NOT NULL AUTO_INCREMENT, `col2` char(255) DEFAULT NULL, PRIMARY KEY (`col1`) ) ENGINE=InnoDB; INSERT INTO table_a (col2) VALUES ('A'), ('AA'), ('AAA'); INSERT INTO table_b (col2) VALUES ('B'), ('BB'), ('BBB');
The query that I want to use is this one:
mysql> SELECT * FROM table_a UNION ALL SELECT * FROM table_b; +------+------+ | col1 | col2 | +------+------+ | 1 | A | | 2 | AA | | 3 | AAA | | 1 | B | | 2 | BB | | 3 | BBB | +------+------+ 6 rows in set (0.00 sec)
MySQL 5.6
We can see from EXPLAIN
that MySQL 5.6 requires a step to insert the rows into a temporary table, from which the results can be returned:
mysql> EXPLAIN SELECT * FROM table_a UNION ALL SELECT * FROM table_b\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: table_a type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3 Extra: NULL *************************** 2. row *************************** id: 2 select_type: UNION table: table_b type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3 Extra: NULL *************************** 3. row *************************** id: NULL select_type: UNION RESULT table: <union1,2> type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: NULL Extra: Using temporary 3 rows in set (0.00 sec)
Via SHOW PROFILES
, we can also see there are multiple stages of Sending data
. I believe the best way to describe this state as ‘shipping rows around in the server and between storage engines’:
mysql> set profiling=1; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> show profiles; +----------+------------+-------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+-------------------------------------------------------+ | 1 | 0.00039200 | SELECT * FROM table_a UNION ALL SELECT * FROM table_b | +----------+------------+-------------------------------------------------------+ 2 rows in set, 1 warning (0.01 sec) mysql> show profile for query 1; +----------------------+----------+ | Status | Duration | +----------------------+----------+ | starting | 0.000059 | | checking permissions | 0.000005 | | checking permissions | 0.000005 | | Opening tables | 0.000091 | | System lock | 0.000011 | | optimizing | 0.000004 | | statistics | 0.000012 | | preparing | 0.000009 | | optimizing | 0.000002 | | statistics | 0.000005 | | preparing | 0.000004 | | executing | 0.000003 | | Sending data | 0.000053 | | executing | 0.000002 | | Sending data | 0.000029 | | optimizing | 0.000006 | | statistics | 0.000006 | | preparing | 0.000004 | | executing | 0.000002 | | Sending data | 0.000019 | | removing tmp table | 0.000007 | | Sending data | 0.000002 | | query end | 0.000007 | | closing tables | 0.000010 | | freeing items | 0.000019 | | cleaning up | 0.000016 | +----------------------+----------+ 26 rows in set, 1 warning (0.00 sec)
MySQL 5.7
The output from EXPLAIN
now shows the query executing in two steps instead of three. The warning here is benign. It is caused by EXPLAIN EXTENDED
now being permanently enabled:
mysql> EXPLAIN SELECT * FROM table_a UNION ALL SELECT * FROM table_b\G *************************** 1. row *************************** id: 1 select_type: PRIMARY table: table_a partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3 filtered: 100.00 Extra: NULL *************************** 2. row *************************** id: 2 select_type: UNION table: table_b partitions: NULL type: ALL possible_keys: NULL key: NULL key_len: NULL ref: NULL rows: 3 filtered: 100.00 Extra: NULL 2 rows in set, 1 warning (0.00 sec)
And using SHOW PROFILES
we can see that there are now only two Sending data
steps:
mysql> set profiling=1; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> show profiles; +----------+------------+-------------------------------------------------------+ | Query_ID | Duration | Query | +----------+------------+-------------------------------------------------------+ | 1 | 0.00038900 | SELECT * FROM table_a UNION ALL SELECT * FROM table_b | +----------+------------+-------------------------------------------------------+ 1 row in set, 1 warning (0.00 sec) mysql> show profile for query 1; +----------------------+----------+ | Status | Duration | +----------------------+----------+ | starting | 0.000094 | | checking permissions | 0.000013 | | checking permissions | 0.000008 | | Opening tables | 0.000068 | | System lock | 0.000012 | | optimizing | 0.000015 | | statistics | 0.000012 | | preparing | 0.000010 | | optimizing | 0.000003 | | statistics | 0.000005 | | preparing | 0.000006 | | executing | 0.000002 | | Sending data | 0.000046 | | executing | 0.000003 | | Sending data | 0.000014 | | query end | 0.000010 | | removing tmp table | 0.000012 | | query end | 0.000003 | | closing tables | 0.000010 | | freeing items | 0.000027 | | cleaning up | 0.000016 | +----------------------+----------+ 21 rows in set, 1 warning (0.00 sec)
You will notice that there is in fact still a temporary table, denoted by the step removing tmp table
. The release notes explain this point very carefully:
“The server no longer uses a temporary table for UNION statements that
meet certain qualifications. Instead, it retains from temporary table
creation only the data structures necessary to perform result column
typecasting. The table is not fully instantiated and no rows are
written to or read from it; rows are sent directly to the client. As
a result, The result is reduced memory and disk requirements, and
smaller delay before the first row is sent to the client because the
server need not wait until the last query block is executed. EXPLAIN
and optimizer trace output will change: The UNION RESULT query block
will not be present because that block is the part that reads from
the temporary table.”
Example using ps_helper
And now to demonstrate how to replace SHOW PROFILES
with performance_schema
! To do this, I decided to base my scripts on ps_helper
by Mark Leith.
There is no specific reason you need to do this, but I admire the way ps_helper works and its something I’ve wanted to try extending for a while. It also includes useful helper functions to intelligently truncate/format SQL, and convert times to a human readable format.
So the steps are:
- Install ps_helper.
- Install my sql script ps-show-profiles.sql.
Not specifically a limitation of performance_schema, but one small difference with my script is that it is enabled globally as follows:
call ps_helper.enable_profiling();
In any new session, you can then use the profiling feature as follows:
mysql> SELECT * FROM table_a UNION ALL SELECT * FROM table_b; +------+------+ | col1 | col2 | +------+------+ | 1 | A | | 2 | AA | | 3 | AAA | | 1 | B | | 2 | BB | | 3 | BBB | +------+------+ 6 rows in set (0.00 sec) mysql> call ps_helper.show_profiles(); +----------+-----------+-------------------------------------------------------+ | Event_ID | Duration | Query | +----------+-----------+-------------------------------------------------------+ | 58 | 58.57 us | select @@version_comment limit 1 | | 70 | 73.19 us | select USER() | | 82 | 297.47 us | SELECT * FROM table_a UNION ALL SELECT * FROM table_b | +----------+-----------+-------------------------------------------------------+ 3 rows in set (0.04 sec) Query OK, 0 rows affected (0.04 sec) mysql> call ps_helper.show_profile_for_event_id(82); +----------------------+-----------+ | Status | Duration | +----------------------+-----------+ | init | 55.55 us | | checking permissions | 2.15 us | | checking permissions | 3.29 us | | Opening tables | 68.30 us | | System lock | 7.43 us | | optimizing | 2.34 us | | statistics | 9.97 us | | preparing | 7.53 us | | optimizing | 606.00 ns | | statistics | 2.99 us | | preparing | 2.62 us | | executing | 591.00 ns | | Sending data | 73.23 us | | executing | 484.00 ns | | Sending data | 23.93 us | | query end | 7.46 us | | removing tmp table | 3.37 us | | closing tables | 7.68 us | | freeing items | 15.27 us | | cleaning up | 804.00 ns | +----------------------+-----------+ 20 rows in set (0.00 sec) Query OK, 0 rows affected (0.00 sec)
I renamed the command SHOW PROFILE FOR QUERY x
to show_profile_for_event_id
, since this better maps to performance_schema
naming, but otherwise it behaves pretty much the same.
For some reason, there is a second query end
state in the SHOW PROFILES
version after removing the temporary table. I’m not sure why it is, and if not including it will be an issue.
Conclusion
It’s great to see these optimizations introduced into MySQL – I can think of a number of users who can benefit from a better UNION ALL
.
I implemented my own SHOW PROFILES
feature really as an educational step for myself to make sure I knew performance_schema
, and it was actually quite a quick and joyful experience. I plan to work with the maintainer of ps_helper and see if this can be included in future versions.