EXPLAIN tbl_name
or EXPLAIN SELECT select_options
EXPLAIN tbl_name
is a synonym for DESCRIBE tbl_name
or
SHOW COLUMNS FROM tbl_name
.
When you precede a SELECT
statement with the keyword EXPLAIN
,
MySQL explains how it would process the SELECT
, providing
information about how tables are joined and in which order.
With the help of EXPLAIN
, you can see when you must add indexes
to tables to get a faster SELECT
that uses indexes to find the
records. You can also see if the optimizer joins the tables in an optimal
order. To force the optimizer to use a specific join order for a
SELECT
statement, add a STRAIGHT_JOIN
clause.
For non-simple joins, EXPLAIN
returns a row of information for each
table used in the SELECT
statement. The tables are listed in the order
they would be read. MySQL resolves all joins using a single-sweep
multi-join method. This means that MySQL reads a row from the first
table, then finds a matching row in the second table, then in the third table
and so on. When all tables are processed, it outputs the selected columns and
backtracks through the table list until a table is found for which there are
more matching rows. The next row is read from this table and the process
continues with the next table.
Output from EXPLAIN
includes the following columns:
table
-
The table to which the row of output refers.
type
-
The join type. Information about the various types is given below.
possible_keys
-
The
possible_keys
column indicates which indexes MySQL could
use to find the rows in the table. If this column is empty, there are no
relevant indexes. In this case, you may be able to improve the performance
of your query by examining the WHERE
clause to see if it refers to
some column or columns that would be suitable for indexing. If so, create an
appropriate index and check the query with EXPLAIN
again.
To see what indexes a table has, use SHOW INDEX FROM tbl_name
.
key
-
The
key
column indicates the key that MySQL actually
decided to use. The key is NULL
if no index was chosen.
key_len
-
The
key_len
column indicates the length of the key that
MySQL decided to use. The length is NULL
if the key
is NULL
.
ref
-
The
ref
column shows which columns or constants are used with the
key
to select rows from the table.
rows
-
The
rows
column indicates the number of rows MySQL must
examine to execute the query.
Extra
-
If the
Extra
column includes the text Only index
, this means
that information is retrieved from the table using only information in the
index tree. Normally, this is much faster than scanning the entire table.
If the Extra
column includes the text where used
, it means that
a WHERE
clause will be used to restrict which rows will be matched
against the next table or sent to the client.
The different join types are listed below, ordered from best to worst type:
system
-
The table has only one row (= system table). This is a special case of
the
const
join type.
const
-
The table has at most one matching row, which will be read at the start of
the query. Since there is only one row, values from the column in this row
can be regarded as constants by the rest of the optimizer.
const
tables are very fast as they are read only once!
eq_ref
-
One row will be read from this table for each combination of rows from
the previous tables. This the best possible join type, other than the
const
types. It is used when all parts of an index are used by
the join and the index is UNIQUE
or a PRIMARY KEY
.
ref
-
All rows with matching index values will be read from this table for each
combination of rows from the previous tables.
ref
is used if the join
uses only a leftmost prefix of the key, or if the key is not UNIQUE
or a PRIMARY KEY
(in other words, if the join cannot select a single
row based on the key value). If the key that is used matches only a few rows,
this join type is good.
range
-
Only rows that are in a given range will be retrieved, using an index to
select the rows. The
ref
column indicates which index is used.
index
-
This is the same as
ALL
, except that only the index tree is
scanned. This is usually faster than ALL
, as the index file is usually
smaller than the data file.
ALL
-
A full table scan will be done for each combination of rows from the
previous tables. This is normally not good if the table is the first
table not marked
const
, and usually very bad in all other
cases. You normally can avoid ALL
by adding more indexes, so that
the row can be retrieved based on constant values or column values from
earlier tables.
You can get a good indication of how good a join is by multiplying all values
in the rows
column of the EXPLAIN
output. This should tell you
roughly how many rows MySQL must examine to execute the query. This
number is also used when you restrict queries with the max_join_size
variable.
10.1 Tuning server parameters.
The following example shows how a JOIN
can be optimized progressively
using the information provided by EXPLAIN
.
Suppose you have the SELECT
statement shown below, that you examine
using EXPLAIN
:
EXPLAIN SELECT tt.TicketNumber, tt.TimeIn,
tt.ProjectReference, tt.EstimatedShipDate,
tt.ActualShipDate, tt.ClientID,
tt.ServiceCodes, tt.RepetitiveID,
tt.CurrentProcess, tt.CurrentDPPerson,
tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
et_1.COUNTRY, do.CUSTNAME
FROM tt, et, et AS et_1, do
WHERE tt.SubmitTime IS NULL
AND tt.ActualPC = et.EMPLOYID
AND tt.AssignedPC = et_1.EMPLOYID
AND tt.ClientID = do.CUSTNMBR;
For this example, assume that:
-
The columns being compared have been declared as follows:
Table | Column | Column type
|
tt | ActualPC | CHAR(10)
|
tt | AssignedPC | CHAR(10)
|
tt | ClientID | CHAR(10)
|
et | EMPLOYID | CHAR(15)
|
do | CUSTNMBR | CHAR(15)
|
-
The tables have the indexes shown below:
Table | Index
|
tt | ActualPC
|
tt | AssignedPC
|
tt | ClientID
|
et | EMPLOYID (primary key)
|
do | CUSTNMBR (primary key)
|
-
The
tt.ActualPC
values aren't evenly distributed.
Initially, before any optimizations have been performed, the EXPLAIN
statement produces the following information:
table type possible_keys key key_len ref rows Extra
et ALL PRIMARY NULL NULL NULL 74
do ALL PRIMARY NULL NULL NULL 2135
et_1 ALL PRIMARY NULL NULL NULL 74
tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872
range checked for each record (key map: 35)
Since type
is ALL
for each table, this output indicates that
MySQL is doing a full join for all tables! This will take quite a
long time, as the product of the number of rows in each table must be
examined! For the case at hand, this is 74 * 2135 * 74 * 3872 =
45,268,558,720
rows. If the tables were bigger, you can only imagine how
long it would take...
One problem here is that MySQL can't (yet) use indexes on columns
efficiently if they are declared differently. In this context,
VARCHAR
and CHAR
are the same unless they are declared as
different lengths. Since tt.ActualPC
is declared as CHAR(10)
and et.EMPLOYID
is declared as CHAR(15)
, there is a length
mismatch.
To fix this disparity between column lengths, use ALTER TABLE
to
lengthen ActualPC
from 10 characters to 15 characters:
mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);
Now tt.ActualPC
and et.EMPLOYID
are both VARCHAR(15)
.
Executing the EXPLAIN
statement again produces this result:
table type possible_keys key key_len ref rows Extra
tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872 where used
do ALL PRIMARY NULL NULL NULL 2135
range checked for each record (key map: 1)
et_1 ALL PRIMARY NULL NULL NULL 74
range checked for each record (key map: 1)
et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1
This is not perfect, but is much better (the product of the rows
values is now less by a factor of 74). This version is executed in a couple
of seconds.
A second alteration can be made to eliminate the column length mismatches
for the tt.AssignedPC = et_1.EMPLOYID
and tt.ClientID =
do.CUSTNMBR
comparisons:
mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),
MODIFY ClientID VARCHAR(15);
Now EXPLAIN
produces the output shown below:
table type possible_keys key key_len ref rows Extra
et ALL PRIMARY NULL NULL NULL 74
tt ref AssignedPC,ClientID,ActualPC ActualPC 15 et.EMPLOYID 52 where used
et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1
do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1
This is ``almost'' as good as it can get.
The remaining problem is that, by default, MySQL assumes that values
in the tt.ActualPC
column are evenly distributed, and that isn't the
case for the tt
table. Fortunately, it is easy to tell MySQL
about this:
shell> isamchk --analyze PATH_TO_MYSQL_DATABASE/tt
shell> mysqladmin refresh
Now the join is ``perfect'', and EXPLAIN
produces this result:
table type possible_keys key key_len ref rows Extra
tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872 where used
et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1
et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1
do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1
Note that the rows
column in the output from EXPLAIN
is an
``educated guess'' from the MySQL join optimizer; To optimize a
query, you should check if the numbers are even close to the truth. If not,
you may get better performance by using STRAIGHT_JOIN
in your
SELECT
statement and trying to list the tables in a different order in
the FROM
clause.