Archive for March 2007

Don’t reference auto increment ids outside of mysql.

It’s a common practice to use MySQL’s auto increment feature for things like product ids or invoice numbers. While it always works when the system is first setup it can cause major headaches over time. The problems usually fall into one of three categories, dual master, test/staging setups, and reconciliation of ids.

MySQL in a dual master configuration requires that each master not be able to assign the other masters id. Replication events that use auto increment are tagged with the id that was allocated for that transaction. If you run mysqlbinlog on a bin log it will show something like SET INSERT_ID=123; This ensures that auto increment events are replicated to slaves correctly.

In a dual master configuration each master will try to allocate the same id, replicate the event to the other master where that id is already taken. This causes replication to stop on a duplicate key error. In MySQL 5 there are the auto_increment_increment and auto_increment_offset options that attempt to alleviate this situation by letting masters allocate unique ids. More details on how they work can be found in the manual. Using these options wastes key space so you will end up using larger data types for increment ids making MySQL less efficient. if you really need dual master (talk to me if you _really_ think you do, chances are you don’t) it works ok. A nasty problem rears it’s head when you start referencing these ids outside of MySQL. My favorite example is using auto_increment for invoice ids. Since two masters are allocating ids at offsets they aren’t contiguous like most accountants like. Instead of invoice numbers like 1001, 1002, 1003 you end up with 1011, 1021, 1022. depending on which master allocated which id when. Some might argue that this is a handy way to artificially inflate invoice ids but I think it’s a waste.

The testing/staging scenario usually causes reconciliation of ids. It’s more painful than dual master because fixing it often involves contacting partners with new ids, throwing away existing data or writing reconciliation scripts. What happens is that during staging data entry occurs. The ids generated from auto increment are associated with specific data entries and handed out. When the data is ready, it’s deployed to production. If production has also allocated new ids from previous data entry a collision occurs. It’s true that this can be avoided by good communication with product teams about how to use the ids but it’s better to just not reference auto increment ids outside of mysql in the first place. It also works the other way. Records from production need to be imported into a staging environment for testing purposes but staging has allocated those ids from previous tasks.

Reconciliation of ids, not counting the scenario above usually involves someone somewhere in the company that doesn’t like that the ids start with 1 instead of 0 or that when an item is deleted it’s id isn’t used again. Maybe ids should have started with 1000 instead of 100. Most of these can be fixed with simple updates, others require costly migrations away from auto_increment still others require dumping all the data, fixing the ids, and importing it again.

I’ve found it very difficult to convince people not to reference auto increment ids outside of their application. It seems so simple and clean cut on the surface. This entry was spawned from two different issues I helped people with today involving reference auto increment ids outside of MySQL and how to recover from the situation it caused. I’ve ran into this problem many many times in the past. Not referencing auto increment values outside of MySQL is a good best practices bullet point for any general MySQL presentation.

[Updated: 2007-03-21 I just talked to another person that is setting a default value of 1000 for auto increment so they can use it for ID card numbers.]

Client auto vertical output

Do you hate it when you execute a query in the mysql cli only to have the output wrap three times making it almost impossible to read? Do you end up re-executing the query with \G waiting once again for the query to run? Well wait no more! My very first submitted patch to mysql is a client patch that adds –auto-vertical-output. This option compares the width of your terminal to the width of the result set and enables vertical output if it’s going to wrap. The patch link and details are Bug #26780