Archive for the ‘MySQL’ Category.
4/20/2009, 4:59 pm
This week throngs of MySQL developers, users, and enthusiasts descended on silicon valley. Apparently the valley’s cooling system can’t keep up because as they arrived the outside temperature went up into the 90s (32s for those of you who choose to use a sane temperature measurement system). I’m not attending the conference this year but I almost wish I was to get some of the air conditioning. It’s supposed to cool off and rain on Saturday. As the brain power leaves I suspect the valley is going to cool and moisture in the air will condense into rain, or tears.
I bet you thought this post was going to be about Oracle and Sun, sorry. I think the weather is more interesting.
4/11/2009, 9:50 pm
I was reading through the manual and noticed that myisamchk parallel recover option is still listed as beta code. The feature was added in 4.0.2 which was released in july 2002. This means it’s been in beta longer than gmail
4/10/2009, 12:29 pm
While updating the mirror last week I was surprised to see that the newest MRU MySQL release is numbered 5.0.79. Previously enterprise releases had even numbers and community releases had odd numbers. I posted the question in #mysql-dev and HarrisonF was kind enough to explain it all.
MySQL 5.0 is running out of version numbers. There are limitations in mysql_get_version(), the executable comment syntax, and other places that mean MySQL can only have two digit release version numbers. MySQL Enterprise has started using odd and even version numbers to extend the life of 5.0.
This raises a few questions. What will happen to 5.0 when it runs out of release numbers? Is community going to be sacrificed to give enterprise more versions to use? Are the version restrictions going to be fixed in the future? For example if a feature is implemented in a community release the executable comment version syntax isn’t suitable for preventing it from being executed in a newer enterprise release because the version scheme doesn’t differentiate between enterprise and community.
I think this is now rock solid proof that there were too many features packed into 5.0 and it was released too early. I hope there will be more major releases in the future with fewer features so these problems are prevented. By the way the advanced vs pro enterprise binaries add a whole new layer to the MySQL version issues.
3/30/2009, 5:27 pm
The #mysql-dev irc channel on freenode was created with the idea of getting the community people more involved in active discussion about mysql internals and development. When the channel was first created this happened for a few weeks and I was pretty happy to be able to observe and participate in the discussion. Now it’s mostly idle.
It seems that some people at Sun think there is still active discussion or internal developers paying attention to the channel because the GSoC web page directs people to #mysql-dev as a point of contact. The problem is there isn’t anyone there answering questions. I’ve seen quite a few people over the past few weeks ask questions that have gone unanswered. I think it’s time to restart the movement to open development and using #mysql-dev for discussion.
2/21/2009, 5:07 pm
A few months ago I got a strange email from one of my clients that contained two very simple looking select queries. The only difference between the two queries is that one included the distinct keyword and the other didn’t. The strange part is that the query that used distinct returned zero rows. I spent a few days narrowing down the clients data into a small test case then created a generic test case from that. I also traced the problem to the code that decides which index to use for a group by loose index scan which can be used to resolve queries using distinct.
The example can be found in the bug and in this sql file. My patch was a step in the right direction but not complete enough to solve all the issues. Since this isn’t a crashing but I was tempted to make this blog post into one of those sql quiz questions but decided to be nice instead. Feel free to use the sql file to fool your friends though.
12/17/2008, 2:09 am
When MySQL 5.1 first went GA I had the same knee jerk reaction as most of the community, “It’s not ready! There are still bugs!”. After thinking about it for a week or so I don’t think this matters. It’s true that MySQL isn’t really ready for GA but it doesn’t matter since most MySQL users I know wait several releases before even trying out a new GA release anyway. This varies wildly of course. Some users love the bleeding edge while others are still back on 4.1. This isn’t MySQL specific either. We do the same thing with most software. I don’t like that MySQL changed their requirements for a RC candidate to move 5.1 along but for most users it doesn’t matter. We will sit quietly and wait for 5.1 to stabilize before even thinking about deploying it. I’ll give it another six months.
5/19/2008, 8:21 pm
Last week I was working with a client that rediscovered a bug where setting expire_logs_days and issuing a flush logs causes the server to crash. It’s MySQL Bug #17733 if you want to have a look. Seeing MySQL crash was enough inspiration to fix something that I and others have wanted to fix in MySQL for years.
Currently a flush logs command tries to flush all of the following logs in order:
- General Log
- Slow Query Log
- Binary Log
- Relay Log
- Store Engine Logs (If available)
- Error Log
The reason I wanted to fix this is because my client was issuing a flush logs to rotate the error log on a server with no replication. The crash was caused by replication. With individual flush logs it’s less likely for this to happen again in the future. People can simply issue a query for the log they want to flush. The new commands flush logs named in the command. They are:
- flush general log;
- flush slow log;
- flush binary log;
- flush relay log;
- flush engine logs;
- flush error log;
The words log and logs are interchangeable. The query “flush general log” is just as valid as “flush general logs” even though there is only one log. I submitted the patch as a fix for MySQL Bug #14104.
The patch, flush_logs.patch was diffed against 6.0.4 but also applies on 5.1.24.
Rotation for different log files isn’t uniform. Rotating the slow log simply closes and opens it. I’m planning to write a second patch that rotates log files using the same numbered scheme as binary logs. This fixes the rotation for slow and general log as well as eliminating the annoying issue of error logs being destroyed after they are rotated to foo.log-old.
This patch hasn’t been accepted or committed yet so if you have any suggestions on how to make it better please let me know.
4/21/2008, 11:41 am
Have you ever executed a query from the MySQL command line client only to find that the output wrapped and the result is unreadable? In the past you have to run the query again with \G instead of ; or \g to get it to display the output in a vertical mode. My feature in MySQL 6.0.4 fixes that. The auto-vertical-output option tells the command line client to display the results in vertical format if the results are going to be too wide to display horizontally. It does this without re-executing the query because MySQL passes the length of each column in the result set. If the client isn’t able to determine the width of the screen it will default to 80 chars.
4/14/2008, 1:16 pm
This is a continuation of the MySQL User Conference replication notes part one.
The session is opening up talking about failover. The shared disk in this case is drbd. DRBD is a fine product for replicating block devices of single disk systems. It’s made redundant by raid and doesn’t provide as much protection as binary log failover. You can find my notes on why I don’t recomment DRBD for MySQL in drbd in the real world.
Lars went a bit quick through the other two configurations. I’ll try to review the slides and post comments.
The next configuration is using federated. The federated storage engine has many problems that make it almost useless for any production deployment. Mats says, “Federated isn’t the fastest engine in the world”. That’s an understatement. Join on two tables as they describe it is almost impossible. Aside from the performance issues this is my favorite limitation of the federated engine, “There is no way for the FEDERATED engine to know if the remote table has changed. The reason for this is that this table must work like a data file that would never be written to by anything other than the database system. The integrity of the data in the local table could be breached if there was any change to the remote database.”
Lars mentions Jeremy’s failover design that is recommended by Proven Scaling. Thanks Lars!
There is a lot of confusion about the difference between row based and statement based replication. Hopefully my row based replication post can clear up some of the confusion.
For keeping things simple and easy to debug I recommend sticking with either row or statement based replication. Teaching the implications of each to application developers is going to be more difficult than sticking with one model. The exception is things that can’t use row based replication like DML statements.
When using a SAN make sure you have redundant SANs, your backups aren’t on the same SAN as mysql and that if you have to use a shared san you have good control over the resources. MySQL is very sensitive to i/o latency increases. If someone else does a large operation on the SAN it can increase the i/o latency in mysql causing operations to take longer which can bring down your application.
I don’t recommend using a hardware load balancer to manage write load between masters. There is a risk of sending writes to both masters at the same time. In a properly configured dual master setup half the writes will be rejected on a read only error. The worst case is that writes go to both masters causing replication error and inconsistent data.
The tutorial is wrapping up now. I look forward to using row based replication 5.1. Thanks Lars and Mats.
4/14/2008, 11:29 am
I’m attempting to live blog corrections and notes while sitting in the replication tutorial. Lars is covering available options in MySQL replication. I’m going to attempt to cover some recommended best practices and things that are possible to do in MySQL but should be avoided. Please keep in mind that I’m writing this during the presentation. If anything is confusing post a comment and I will clean it up.
When designing a MySQL architecture that are several possible configurations. Two that should be avoided are dual master where you write to both masters. Configuring replication in a dual master dual writer setup means there is no single authority on the data. There is also no need to write to both masters as this doesn’t give you any performance improvement. Each master has to process the same sql statements. One step further is circular replication that wasn’t mentioned in the talk but has been in other publications. When using three or more masters if one dies there is no way to restore it without brining down the other masters.
There are several configuration options for filtering data on the slaves. You can filter but the current database or by tables. My recommendation is to keep the data on the slave the same as the master as much as possible. When the slave is different you’re much more likely to have queries succeed on the master but fail on the slave. The worst of these options is replicate-do-db. This option filters queries based on “USE db”. If the use db is set differently on the master the query will succeed but can be passed through the slave without executing. This is a silent failure and won’t cause replication to stop.
A quick note about show binlog events. This is similar to mysqlbinlog in that it dumps events in the binary log. The difference is that show binlog events doesn’t include set variables.
When using relay slaves it’s very important to always configure them in pairs. If a relay slave dies there is no good way to to connect it’s slaves to the relay slave’s master. Using blackhole can cause silent failures to be passed through as well as issues with auto increment and storage engine differences. Blackhole should always be avoided.
Reset slave should function the way it does on the slide in that it should delete everything and leave the slave in a blank state. It actually deletes the logs and resets the host, user, and password. The functionality has been changed a few times in different version so be sure to check the manual. The differences are in that mysql may keep the old host information in memory or it may forget.
Purge master logs was covered briefly. It should be noted that purge master logs won’t purge logs that a connected slave hasn’t downloaded. If a slave is disconnected the master will happily delete logs that a slave hasn’t downloaded yet. I recommend keeping at least 7 days of logs on the master for situations where slave gets disconnected.
Now we’re into the section of specialized slaves. These slaves have a subset of data on the master. Avoid this as much as possible see my row based replication post about half way down for the reasons.
The slide for HA + Scale out has dual master with all of the slaves hanging off of one master. It’s better to have slaves hanging off each master because if the active master dies hard enough that you can’t get binlog offsets from it then there isn’t a good way to change slaves over to the existing master. Balancing your slaves will make it easier to failover. Ah excellent someone up front just pointed this out.