How alter table locks tables and handles transactions

I’ve talked to several people that have questions about how alter table works under the hood. They want to know how it handles locking tables why they can sometimes use a table during alter table and other times they can’t. Also why it’s so slow :)

First let’s look at the basic process alter table typically goes through.

  1. If a transaction is open on this thread, commit it.
  2. Acquire a read lock for the table.
  3. Make a temporary table with new structure
  4. Copy the old table to the temporary table row by row changing the structure of the rows on the fly.
  5. Rename the original table out of the way
  6. Rename the temporary table to the original table name.
  7. Drop the original table.
  8. Release the read lock.

The slowest part of the process is copying rows from the original table to the temporary table. For large tables this can take minutes to hours. There are a few optimizations built into this process. If the alter table query only renames the table then mysql doesn’t bother copying all the rows to a temporary table and just renames it. For most other things such as renaming columns, adding/dropping indexes, making columns nullable, changing the column default all require copying the entire table.

During the first 4 steps MySQL allows other clients to read from the table being altered. When alter table is done copying rows to the temporary table and is ready to rename it it changes the table lock. MySQL instructs all other clients currently reading from the table to close the table when they are done. While alter table is waiting for existing clients to finish reading from the table it prevents other clients from starting to read from the table. During this time selects will be blocked on “Waiting for tables”. When the last client is done reading from the table alter table continues renaming the table.

This has some interesting implications for transactions and repeatable reads. Internally innodb keeps track of when rows are created. When a transaction is started it can only see rows that were created before the transaction was started (using repeatable read). Any rows created after are not returned. Since alter table copied rows from the old table to the new table rows get a new version number as they are inserted into the temporary table. An alter table can cause a dirty read in transactions that span an alter table. Transactions started before alter table will get no rows back from the table after alter table is finished. If your application is sensitive to dirty reads or getting no rows back from a table (really dirty read :) ) then don’t run alter table on a server when clients are running.

Here is an example.

mysql a> alter ignore table t add unique index (t);

mysql b> begin;
Query OK, 0 rows affected (0.00 sec)

#This select is from the original table while alter table is copying rows
mysql> select * from t limit 10;
+——+
| t |
+——+
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
| 10 |
+——+
10 rows in set (0.00 sec)
#alter table finishes
#Rows created in the temporary table before we issued begin
mysql> select * from t limit 10;
+———-+
| t |
+———-+
| 10 |
| 6920631 |
| 27998430 |
| 41865298 |
| 49403894 |
+———-+
5 rows in set (0.00 sec)

#Get a new view of the table
mysql> commit;
Query OK, 0 rows affected (0.00 sec)

#This select returns all the rows.
mysql> select * from t limit 10;
+———–+
| t |
+———–+
| 10 |
| 6920631 |
| 27998430 |
| 41865298 |
| 49403894 |
| 50522347 |
| 84441015 |
| 109401269 |
| 110202688 |
| 123590778 |
+———–+
10 rows in set (0.00 sec)

If we start the transaction before alter table the select after alter table has finished will return no rows even though there are rows in the table. I’m not sure why innodb allows rename of a table when transactions still have a few of that table open. It seems like a bug to me.

[Updated 2007-05-14 Bug #28432

5 Comments

  1. Heikki Tuuri says:

    Eric,

    InnoDB, in an ALTER TABLE, commits the row-copying transaction at every 10,000 copied rows. That is to save space in the undo log.

    What I do not understand is why you in the second result above only get 5 rows back. How big was the table that you ALTERed?

    Since an ALTER destroys the history information for rows, there is no way InnoDB’s consistent read could work in a consistent way over an ALTER TABLE. I hope this is mentioned in the MySQL manual.

    “I’m not sure why innodb allows rename of a table when transactions still have a few of that table open. It seems like a bug to me.”

    We really cannot block ALTER or RENAME whenever someone has a consistent read snapshot open anywhere in the database. It would cause too long waits.

    Regards,

    Heikki

  2. Heikki Tuuri says:

    Eric,

    actually, the right solution would be to return from the SELECT an error like “snapshot too old”, at least in the second SELECT above.

    We should compare the table timestamp to the consistent read snapshot timestamp.

    Please file an InnoDB bug report to bugs.mysql.com about this.

    Regards,

    Heikki

  3. [...] table during alter table and other times they can’t. Also why it’s so slow.” He explains how alter table locks tables and handles transactions with a little help from Heikki [...]

  4. Lukas says:

    a LOCK TABLE call is another alternative, but if you have users that have to wait then, its probably best to take the server down, especially if you have other servers that can take over ..

  5. [...] alter table and other times they can’t. Also why it’s so slow.” He explains how alter table locks tables and handles transactions with a little help from Heikki [...]

Leave a Reply