MySQL InnoDB Indexes for Developers

This article aims to be a high-level reference for developers who just want to use MySQL InnoDB indexes effectively. Although you can read it from top to bottom, I’d encourage you to save time by skipping to the sections that interest you using the below table of contents.

Contents

  • Quick Reference
  • Index Structure
  • Primary Index Structure & Paging
  • Secondary Index Structure
  • Composite Index Structure
  • Index Navigation
  • ref-type queries
  • range-type queries
  • Post-Index Computations
  • Join Algorithms
  • Nested-Loop Join
  • Hash Join
  • Compound Cursor Iteration

Quick Reference

This is a quick list of the practical guidelines you can use to make your applications more performant at the database layer. In the later sections of the article we’ll dive into why each of these is true, and build mental models so that you can arrive at these guidelines yourself without memorization.

  • Choose your primary key with data locality in mind: a composite primary key like [blog_id, id] keeps commonly co-queried rows on the same disk pages.
  • A query can use an index efficiently only when it filters on a leading prefix of the index’s columns, so order index columns to match your access patterns.
  • Put columns filtered by equality before columns filtered by a range, and avoid stacking two range filters on one index.
  • SELECT only columns contained in a secondary index where you can, to skip the per-row lookup into the primary index.
  • ORDER BY the natural ordering of the index you expect to be used, to avoid a filesort.
  • For hash joins, index the independent filters on each table; indexes on the join condition will not be used.
  • For paginating through a range query, prefer a compound cursor (eg created_at, id) over a plain ID cursor.


Index Structure

Indexes are constructed from a B+tree structure. Understanding this structure is key to building an intuition for how queries will perform using different indexes. Both the PRIMARY index and secondary indexes have a similar tree structure, differing only in what data is stored in the leaves of the tree.

Primary Index Structure & Paging

Looking at the structure of a PRIMARY index first, we see that the full row is stored in each leaf of the tree, and each leaf is ordered sequentially on the disk:

note: page size greatly reduced for illustration

If possible, it is best to select the primary index for a table with data locality in mind, to reduce how many page loads need to take place to satisfy a query. For example, imagine a blogging application that almost always queries for posts on a specific blog: SELECT * FROM posts WHERE blog_id=10. Ideally we’d like this table to have all the posts where blog_id = 10 stored sequentially in the primary index. However, with a standard primary index on [id] we get a structure where rows are stored in disk pages in the order they were created. The diagram below illustrates this (integer IDs, colour coded by blog):

By instead using a composite primary key, this table can be made much more efficient given its access pattern. The diagram below shows the same data, but stored in a composite primary index of [blog_id, id]. This means rows are first sorted by blog_id, and then by id within each blog. When the data is stored in this manner, our query for SELECT * FROM posts WHERE blog_id=10 is likely to load fewer pages from disk.
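As a rough sketch, such a schema might be declared like this (the payload columns are hypothetical; note that InnoDB requires an AUTO_INCREMENT column to be the first column of some index, hence the extra key):

CREATE TABLE posts (
  blog_id BIGINT UNSIGNED NOT NULL,
  id      BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
  title   VARCHAR(255) NOT NULL, -- hypothetical payload column
  body    TEXT,                  -- hypothetical payload column
  -- Cluster by blog first, so each blog's posts sit on contiguous pages
  PRIMARY KEY (blog_id, id),
  -- Satisfies the requirement that AUTO_INCREMENT leads some index
  KEY (id)
) ENGINE=InnoDB;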

Secondary Index Structure

Secondary indexes have a similar structure to that of primary indexes. However, instead of storing the full row in the leaves of the tree, only the primary index value is stored:

This structure means that if additional columns are required (eg a SELECT *), then after traversing the secondary index, MySQL must also traverse the primary index for each selected row to actually load the required data. By selecting only columns available in the secondary index, MySQL can skip this extra lookup step.
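For example, assuming a posts table whose primary key is id, with a secondary index on [user_id, created_at] (names hypothetical):

-- Covering: user_id, created_at and the implied id all live in the
-- secondary index leaves, so no primary-index traversal is needed.
SELECT id, created_at FROM posts WHERE user_id = 42;

-- Not covering: SELECT * needs columns stored only with the full row,
-- so MySQL follows each matching leaf's primary key into the primary index.
SELECT * FROM posts WHERE user_id = 42;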

Composite Index Structure

When multiple columns are specified in a composite index, we get a structure composed of nested B+trees. This is where the ordering of columns within an index definition becomes relevant. The top level of the tree is ordered by the first column in the index definition, the nested trees below it by the second column, and so on.

The below diagram is the structure of a composite index [blog_id, created_at] (with the implied id column on the end):
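Creating such an index is a one-liner; InnoDB appends the primary key to the leaves implicitly (the index name here is arbitrary):

-- Effective leaf ordering is (blog_id, created_at, id)
ALTER TABLE posts ADD INDEX idx_blog_created (blog_id, created_at);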

Index Navigation

We’ll now look at how MySQL navigates these indexes given various query structures, which should help build an intuitive understanding of why MySQL performance will vary.

ref-type queries

With this tree-based model in place, we can see that to use an index like this efficiently we must navigate the tree in the order that the index is defined. If our query does not specify a value for every column in an index, that is fine as long as the values we do provide form a leading prefix of the index (ie they sit at the top of the tree).

I’ve illustrated below two examples of the “happy path” for navigating a [blog_id, user_id] index. In the first, we provide a filter on both blog_id and user_id, which allows MySQL to fully navigate the tree to the leaves. In the second, we provide a filter only on blog_id, but because this is the first column in the index MySQL is still able to efficiently select all the rows that meet this criterion (it selects all leaves from the sub-tree).

💡 Notice how the ordering of the rows appears to change depending on how far MySQL navigates the tree. Rows in an index are always stored in the same ordering scheme as the index itself. That is to say, a [blog_id, user_id] index will naturally store rows in blog_id, user_id, id ordering. However, as we navigate further into the tree the ordering appears to “simplify”. If we are selecting rows for a specific blog_id=10 and user_id=50 then the rows appear to be ordered by id, because they all share the same values for both blog_id and user_id!
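To make this concrete, here is a sketch (using the assumed posts table from earlier) of which filters can and cannot use a [blog_id, user_id] index:

-- Both of these can use the index; EXPLAIN reports access type "ref".
EXPLAIN SELECT * FROM posts WHERE blog_id = 10 AND user_id = 50;
EXPLAIN SELECT * FROM posts WHERE blog_id = 10;

-- user_id alone is not a leading prefix of the index, so MySQL cannot
-- navigate the tree and falls back to scanning.
EXPLAIN SELECT * FROM posts WHERE user_id = 50;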

range-type queries

The range-type lookup happens when a query selects multiple values for an indexed column. There are many ways a query could end up in this scenario:

  • WHERE updated_at > X
  • WHERE user_id != X
  • WHERE blog_id IN (1, 5, 34, 200)
  • WHERE user_id=10 OR user_id=50

In such cases, MySQL will need to navigate multiple branches of the index. Consider the below query, where we search for all recently created posts for a particular blog:
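(This uses the same blog_posts schema as the compound cursor section later in the article.)

-- Equality on blog_id, then a range over created_at: MySQL navigates a
-- [blog_id, created_at] index down to one blog, then walks every
-- created_at sub-tree within the selected range.
SELECT * FROM blog_posts
WHERE blog_id = 10 AND created_at > "2023-01-01";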

There are two things I’d like to point out about this pattern:

  1. The ordering of the results is created_at, id. Sorting by id (for example) is not possible without a filesort operation.
  2. There is no efficient way to navigate the deeper sections of the tree (ie the id portion of the index above) without iterating through all the selected branches of created_at. That is, if we were to add an additional filter of id > 5, for example, then there is no single tree to navigate to provide these values. MySQL would instead need to iterate through every sub-tree (one per value of created_at) to complete this filtering operation (see the sketch and diagram below).
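For example, a second range stacked on the same index:

-- created_at is already consumed as a range, so the id > 5 condition
-- cannot be navigated in the tree; MySQL must check it within every
-- created_at sub-tree (or filter rows after the index).
SELECT * FROM blog_posts
WHERE blog_id = 10 AND created_at > "2023-01-01" AND id > 5;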

A common application pattern that can suffer performance issues due to both these points is ID-based pagination (ie if your pagination system needs to filter on WHERE id > x ORDER BY id). See the section on compound cursor iteration for a potential solution.

Post-Index Computations

Although MySQL will always start off by using an index where possible to reduce the amount of post-processing required, not every query will be fully solved by an available index. There are a couple of post-index computations whose relative performance we should consider.

First, if the query requests an ORDER which differs from how the index sorts its rows, a filesort will be required. This is an expensive operation (O(n log n)) if the number of rows to be sorted is large. In addition, sorting requires MySQL to temporarily buffer the entire dataset (eg you can’t stream a sorting operation). See ref-type queries for a reminder of how indexes naturally sort rows.
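A sketch of such a query, reusing the [blog_id, created_at] index from earlier:

-- The range over created_at yields rows in created_at, id order, so
-- ordering by id alone forces a filesort (EXPLAIN shows "Using filesort").
EXPLAIN SELECT * FROM blog_posts
WHERE blog_id = 10 AND created_at > "2023-01-01"
ORDER BY id;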

If the WHERE clause of the query was not fully solved by the index, MySQL will also need to scan through the rows one after another to apply the remaining filtering. While this is faster than a filesort, it is still relatively slow for large datasets (O(n)). If the index reduces the dataset down to a reasonable size upfront, this may not be a big deal.

There is one exception here: if the query also specifies a LIMIT x, MySQL need not scan the entire dataset. MySQL will “short circuit” the row scan as soon as it has enough rows to satisfy the query condition. The performance of this pattern depends on the “hit ratio” of rows: if relatively few rows satisfy the filter condition compared to rows that must be dropped, MySQL will need to scan for longer than if almost all the rows are valid. In addition, you may find that the “density” of interesting rows is not constant throughout the range being scanned! If your application is paginating through a large dataset, you may find that different pages have different query times, depending on how many rows need to be filtered out on each page.
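For illustration (the status column is hypothetical, and only blog_id is assumed to be indexed):

-- MySQL walks the blog_id index and applies the status filter row by row,
-- but can stop scanning as soon as 10 matching rows have been found. How
-- far it scans depends on how densely matching rows appear.
SELECT * FROM posts
WHERE blog_id = 10 AND status = 'published'
LIMIT 10;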

Join Algorithms

Nested-Loop Join

The primary algorithm used by MySQL to join tables is a couple of variations on “nested-loop”. This is a very straightforward algorithm:

for each row in tableA:
  for each matching row in tableB:
    send to output stream

Because MySQL is often capable of streaming joins, when the join query includes a LIMIT x the full join may not have to be executed:

  1. MySQL executes the query on the outer table
  2. For the first row in the range, look up matching rows in the inner table
  3. For each matching row, emit to the output stream
  4. Stop execution if enough rows have been found; otherwise move to the next row and go to 2

Note that, similar to un-indexed row scans, this does not apply in all conditions (for example, if sorting is required then the full dataset is needed in memory).
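A sketch of a join that can stream, assuming tableB.shared_id is indexed (these table names reappear in the hash-join example below):

-- Nested-loop: for each tableA row, an index lookup into tableB.
-- With LIMIT, MySQL can stop after emitting 10 joined rows.
SELECT * FROM tableA t1 JOIN tableB t2
ON t1.shared_id = t2.shared_id
LIMIT 10;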

Hash Join

The hash-join algorithm avoids the N+1 B-tree traversals by taking some time upfront (O(n)) to build an in-memory hash table. The reason this strategy can give a performance improvement over nested-loop in some cases is that a hash table lookup has an average profile of O(1), compared to B-tree traversal at O(log n). Given we may have to do this lookup many times (m times), the overall performance profile of hash-join is O(n + m), whereas nested-loop’s profile is O(n log m).

The below plot shows the region (shaded blue) where the function n + m has a lower value than n log m:

The tradeoff is of course that this (potentially large) hash table must fit into memory. For this reason one strategy to improve hash-join performance is to limit the size of this temporary table by SELECT-ing only those columns which you really need (to reduce width), and pre-filtering as many rows as possible (to reduce height).
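For example (column choices hypothetical), keeping the build side of the join narrow and short:

-- Only two columns enter the temporary hash table, and tableA rows are
-- pre-filtered, keeping the in-memory build side small.
SELECT t1.shared_id, t2.colB
FROM tableA t1 JOIN tableB t2 ON t1.shared_id = t2.shared_id
WHERE t1.colA = "foo";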

The second thing to keep in mind for the performance of hash-joins is that indexes on the join condition will not be used. Rather, the place to focus is on indexing any independent filters, if possible. For example, consider the following query:

SELECT * FROM tableA t1 JOIN tableB t2
ON t1.shared_id = t2.shared_id
WHERE t1.colA = "foo"
AND t2.colB = "bar"

The best indexes for this query, given a hash join, would be one on t1.colA and a second on t2.colB. Execution of this join would then look like the following steps, with the supporting index definitions sketched after the list:

  1. Use the index on colA to find all rows in t1 where colA = "foo".
  2. For each row, build a hash table where the key is t1.shared_id and the value is the row.
  3. Use the index on colB to find all rows in t2 where colB = "bar".
  4. For each row, look up the matching t1 rows from the hash table by t2.shared_id.
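The supporting indexes might be declared like this (index names arbitrary):

-- Index the independent filters, not the join columns.
ALTER TABLE tableA ADD INDEX idx_colA (colA);
ALTER TABLE tableB ADD INDEX idx_colB (colB);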

Compound Cursor Iteration

In the range-type queries section I briefly introduced the problem of using an ID-based cursor to paginate through a dataset queried via a range-type query. One potential solution to this problem is to use a compound cursor. To work, this cursor must be constructed from the column being filtered by the range. For example, let’s consider the query SELECT * FROM blog_posts WHERE blog_id=10 AND created_at>"2023-01-01". The natural index to use for this range query would be [blog_id, created_at].

Because the range query filter is performed on the created_at column, I will propose a compound cursor of created_at, id. Let’s take a look at how the query structure changes. The first query of our iteration will look something like this:

SELECT * FROM blog_posts
WHERE blog_id = 10 AND created_at > "2023-01-01"
ORDER BY created_at, id ASC
LIMIT 100

This will get the first “page” of 100 results. Notice that we are ordering by created_at, id, which is the natural ordering of the index. This allows MySQL to efficiently grab the first 100 rows it sees without scanning the full time range.

Now let’s look at how we get the next “page” of results. The “cursor value” is defined as the created_at and id values from the last row in the previously selected batch of 100. We’ll call these values x and y for simplicity:

  x = batch.last.created_at,
  y = batch.last.id

The structure of the query that loads the second page of results has this form:

SELECT * FROM blog_posts
WHERE blog_id = 10
AND (
  created_at = x AND id > y
  OR created_at > x
)
ORDER BY created_at, id ASC
LIMIT 100

Notice in this query that we have an OR clause, meaning MySQL is going to navigate the index twice, once per “side” of the clause. Let’s consider these as two separate queries for the sake of a performance analysis.

-- Query A: the "left" side of the OR clause
SELECT * FROM blog_posts
WHERE blog_id = 10 AND created_at = x AND id > y
ORDER BY created_at, id ASC
LIMIT 100

-- Query B: the "right" side of the OR clause
SELECT * FROM blog_posts
WHERE blog_id = 10 AND created_at > x
ORDER BY created_at, id ASC
LIMIT 100

By analysing the OR clause piece-wise like this, we can now see that each side has only one range filter. Combine this with the fact that the ORDER BY matches the natural ordering of the index, and we have solved both of the original problems! No more filesort, no more unsolvable double-range filters.

🟡 A note of warning: this pattern may only work as you expect for columns that have immutable values. Like any data structure iteration, if a value changes during your iteration your process may either process a row twice or not at all! For example, consider if instead of iterating on created_at, id we were iterating on updated_at, id. If a row is updated (and therefore updated_at increases) while a process is already iterating through the dataset, we may see that updated row twice: once with its lower updated_at value, and again with the higher value.
