Replication tutorial notes – part 1

I’m attempting to live blog corrections and notes while sitting in the replication tutorial. Lars is covering available options in MySQL replication. I’m going to attempt to cover some recommended best practices and things that are possible to do in MySQL but should be avoided. Please keep in mind that I’m writing this during the presentation. If anything is confusing post a comment and I will clean it up.

When designing a MySQL architecture that are several possible configurations. Two that should be avoided are dual master where you write to both masters. Configuring replication in a dual master dual writer setup means there is no single authority on the data. There is also no need to write to both masters as this doesn’t give you any performance improvement. Each master has to process the same sql statements. One step further is circular replication that wasn’t mentioned in the talk but has been in other publications. When using three or more masters if one dies there is no way to restore it without brining down the other masters.

There are several configuration options for filtering data on the slaves. You can filter but the current database or by tables. My recommendation is to keep the data on the slave the same as the master as much as possible. When the slave is different you’re much more likely to have queries succeed on the master but fail on the slave. The worst of these options is replicate-do-db. This option filters queries based on “USE db”. If the use db is set differently on the master the query will succeed but can be passed through the slave without executing. This is a silent failure and won’t cause replication to stop.

A quick note about show binlog events. This is similar to mysqlbinlog in that it dumps events in the binary log. The difference is that show binlog events doesn’t include set variables.

When using relay slaves it’s very important to always configure them in pairs. If a relay slave dies there is no good way to to connect it’s slaves to the relay slave’s master. Using blackhole can cause silent failures to be passed through as well as issues with auto increment and storage engine differences. Blackhole should always be avoided.

Reset slave should function the way it does on the slide in that it should delete everything and leave the slave in a blank state. It actually deletes the logs and resets the host, user, and password. The functionality has been changed a few times in different version so be sure to check the manual. The differences are in that mysql may keep the old host information in memory or it may forget.

Purge master logs was covered briefly. It should be noted that purge master logs won’t purge logs that a connected slave hasn’t downloaded. If a slave is disconnected the master will happily delete logs that a slave hasn’t downloaded yet. I recommend keeping at least 7 days of logs on the master for situations where slave gets disconnected.

Now we’re into the section of specialized slaves. These slaves have a subset of data on the master. Avoid this as much as possible see my row based replication post about half way down for the reasons.

The slide for HA + Scale out has dual master with all of the slaves hanging off of one master. It’s better to have slaves hanging off each master because if the active master dies hard enough that you can’t get binlog offsets from it then there isn’t a good way to change slaves over to the existing master. Balancing your slaves will make it easier to failover. Ah excellent someone up front just pointed this out.

2 Comments

  1. Anand says:

    Hi Eric,
    This is a very interesting blog as it covers lot of gotcha points in replication. We are thinking of have several slaves (for reads) for a master. In this scenario, will the master be overwhelmed by trying to replicate to many slaves?

    In our case the master get several writes a second and this causes the cache data for reads to be invalidated often.
    The idea is to off-load the master of the reads and improve the browse experience of the users. But if the master is getting lot of writes (probably several a second), then will wouldn’t the slaves too need to be updated and still have the cache invalidate problem?

    Thanks,
    Anand

  2. Randy Taylor says:

    I am trying to set up replication I have followed several tutorials I see the master and slave communicating
    the relay log is growing by the slave is not reflecting the changes to the tables.

    what am I doing

Leave a Reply