Default log file name changes and replication breakage.

In a Great Magnet moment, Trent Lloyd posted an excellent write-up on how to recover from relay log name changes on the same day I was going to write up a procedure to send to a client who had a similar issue. Thanks, Trent! The problem goes a bit deeper than server hostname changes, because there have been a few changes to how MySQL handles default log file names in 5.0.

Prior to 5.0.38 the default log file name started with the hostname. The problem, as Trent points out, is that if the hostname of the server changes, the default log file names MySQL generates change with it and no longer match the files already on disk. The error message, though, is something like:

090825 18:54:53 [ERROR] Failed to open the relay log '/mysql/old_hostname-relay-bin.000015' (relay_log_pos 107657)

There are a few strange things going on here. First, if the default name of the relay log index file changed and MySQL didn’t know how to open the file, then how did it know to open file 15, which is the file it was processing before? The answer is simple: it gets that information from the file which records where the SQL thread is in its processing (relay-log.info by default). But if it was processing that file, the file exists, and it knows the correct file name, then why can’t it open it?

A quick scan of the source shows two problems. One is that the error messages are very vague; the other is that they are printed in a different order than they happened. This message:

[ERROR] Could not find target log during relay log initialization

actually happens before the “Failed to open the relay log” message. It’s just saved in a variable and printed out later. The message is also rather cryptic. It should be something like:

[ERROR] Could not find log file 'old_hostname-relay-bin.000015' in relay log index file '/mysql/new_hostname-relay-bin.index'

The issue isn’t that MySQL couldn’t open the relay log file. The file is there and perfectly healthy. The issue is that MySQL couldn’t match the relay log file name it had recorded for the SQL thread to any entry in the new_hostname-relay-bin.index file.
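To make the failure concrete, here is a minimal Python sketch of the lookup described above. It is not MySQL’s actual code: the function name `find_in_index`, the sample index contents, and the choice to compare base names only are all assumptions made for illustration.

```python
import posixpath


def find_in_index(index_lines, recorded_path):
    """Return True if the recorded relay log name has an entry in the index.

    Simplification (an assumption, not MySQL's real logic): entries are
    compared by base name, ignoring the directory part.
    """
    wanted = posixpath.basename(recorded_path)
    for line in index_lines:
        entry = line.strip()
        if entry and posixpath.basename(entry) == wanted:
            return True
    return False


# Name the SQL thread had recorded before the hostname changed.
recorded = "/mysql/old_hostname-relay-bin.000015"

# Hypothetical contents of the old and new index files.
old_index = ["./old_hostname-relay-bin.000014", "./old_hostname-relay-bin.000015"]
new_index = ["./new_hostname-relay-bin.000001"]

found_in_old = find_in_index(old_index, recorded)  # entry is listed
found_in_new = find_in_index(new_index, recorded)  # entry is missing

print("old index:", found_in_old)
print("new index:", found_in_new)
```

The relay log itself never enters the picture: the lookup fails purely because the freshly named index has no entry for the old name.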

The “fix” that was put in place to solve the problem of default files being named after the hostname was to generate the default log file names from the pid-file variable, minus its extension. This doesn’t really fix the problem; it just moves it around. If you don’t set the relay-log and relay-log-index variables and you change the pid-file variable, then it’s exactly the same as if you had changed the hostname under the old method. If the names of the other replication state files are fixed by default, why can’t the default log file name be something simple like ‘mysql’ instead of being based on something else that can change and has nothing to do with replication?
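A rough Python sketch of the two naming schemes shows why the change only moves the problem. The function `default_relay_base`, its parameters, and the sample hostnames and paths are hypothetical; the real server logic has more detail than this.

```python
import posixpath


def default_relay_base(hostname, pid_file, use_pid_file):
    """Build a default relay log base name under the old or new scheme.

    Simplified illustration only: the old scheme uses the hostname, the
    new scheme uses the pid-file base name without its extension.
    """
    if use_pid_file:
        stem = posixpath.splitext(posixpath.basename(pid_file))[0]
    else:
        stem = hostname
    return stem + "-relay-bin"


# Old scheme: renaming the host changes the default name.
before_rename = default_relay_base("old_hostname", "/mysql/old_hostname.pid", False)
after_rename = default_relay_base("new_hostname", "/mysql/old_hostname.pid", False)

# New scheme: the hostname no longer matters, but changing pid-file does.
before_pid_change = default_relay_base("new_hostname", "/mysql/old_hostname.pid", True)
after_pid_change = default_relay_base("new_hostname", "/var/run/mysqld.pid", True)

print(before_rename, "->", after_rename)
print(before_pid_change, "->", after_pid_change)
```

Either way the default depends on a setting that can change for reasons unrelated to replication, which is why setting relay-log and relay-log-index explicitly is the safer choice.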
