Gmail Themes

January 5th, 2009

I’ve been using Gmail since the name gold rush days. The interface is intuitive and the conversation threading makes reading mailing lists much easier. One thing has always bothered me, and that’s the bright white color. It makes reading email painful in low light. Gmail recently added themes that changed most of that. The message bodies are still white but the chrome around the outside is much darker. My favorite theme is the mountains. The color scheme works well to remove some of the brightness while not contrasting too much with the message bodies. If they manage to incorporate some rivers and fishing into the theme it will be perfect.

On MySQL 5.1 going GA

December 17th, 2008

When MySQL 5.1 first went GA I had the same knee-jerk reaction as most of the community: “It’s not ready! There are still bugs!” After thinking about it for a week or so I don’t think this matters. It’s true that MySQL 5.1 isn’t really ready for GA, but most MySQL users I know wait several releases before even trying out a new GA release anyway. This varies wildly of course. Some users love the bleeding edge while others are still back on 4.1. This isn’t MySQL specific either. We do the same thing with most software. I don’t like that MySQL changed their requirements for a release candidate to move 5.1 along, but for most users it doesn’t matter. We will sit quietly and wait for 5.1 to stabilize before even thinking about deploying it. I’ll give it another six months.

Sun’s official support for Drizzle means more than just code.

October 2nd, 2008

As Jeremy wrote a few months ago, one of the main issues with forking MySQL is that the documentation is not open source. Today Jay Pipes announced that he is leaving the MySQL Community Team to be a staff engineer for Sun on the Drizzle project. Monty Taylor released a similar announcement saying that he will also be working on Drizzle full time.

This means that Drizzle is no longer a side project hacked on by a few engineers outside of their work time. It’s an official project. With Sun employing Drizzle engineers and Sun owning the copyright to the MySQL documentation, it’s now likely that Sun will create Drizzle docs from the existing MySQL docs to support their new product. The question now is how the docs will be licensed. Will Drizzle enjoy the fork protection that MySQL has had for so many years by keeping its documentation hostage? Probably. If Sun creates docs based on the MySQL docs they will own the copyright and effectively own Drizzle.

Mac OS A2DP and FreePulse Logitech Headphones

September 8th, 2008

Apple is slowly fixing the many issues with its A2DP driver. In 10.4 the driver would kernel panic as often as it would work. In 10.5 it’s functional, but only with a special trick to get it to change the audio channel correctly. Simply turning on the headphones and clicking “Use Headphones” on the dialog rarely works.

The trick is to run through this sequence twice. Turn on the headphones by holding the button on the right ear. After you hear the ping, Mac OS will prompt you to use the headphones. Choose the option to use the headphones and the Bluetooth icon will switch to connected. Now turn on the headphones again, and when you hear them ping the Bluetooth icon should switch back to disconnected. Under the Bluetooth menu choose “Use Headphones” under the FreePulse device. The icon will stay disconnected, but when you play audio it will switch to connected and start functioning.

I couldn’t find a place on Apple’s site to report the bug with a repeatable test case. If any of you know how to report bugs in Mac OS please tell me and I’ll file it, so hopefully this behavior is fixed in future releases.

Solving the final issue with electric cars.

August 16th, 2008

For the past few years hybrids have been all the rage. Now electric cars are coming on to the scene. I realized a while ago that neither of these vehicles is good for the great American tradition of the road trip. Before gas prices started to increase it was fairly common to pack up the family car and head across the country. This is still my preferred way to travel. I hate the uncertainty and the TSA interaction of flying. Driving is nice and relaxing once you get out of traffic.

Hybrids are fine for city driving but they offer no improvement over your typical four-banger once you get out on to the highway or back in the woods. Let’s face it, most hybrids have no business off pavement. In a few years I’m sure decent hybrid trucks will become available. I’m not holding my breath.

The other alternative is electric cars. They’re great for the daily commute. You drive to work and back, then plug them in when you get home at night. Electric cars are great because we have so many different clean options for generating power. The typical range is 200 miles, so they have plenty of juice to go to work and grab groceries on the way home. If you need to drive outside that range you’re effectively screwed. After the juice runs out the car has to be plugged in for hours before you can go another 200 miles.

In order to get people to switch over to all electric cars we need to have the range of a gas fueled engine plus the ease of refueling. Since we can’t recharge batteries as fast as we can fill a tank with gas, the only other option is to change the batteries out. I think we need to standardize on battery size and have them removed from the bottom of the car. Then instead of gas stations we can have automated robots that will drop all the batteries out of the bottom of a car and place charged batteries in. Think of it like driving through an automated car wash, but instead of cleaning your car it swaps the batteries. This eliminates two problems with electric cars. First, it means we can recharge electric cars as fast as or faster than we can refuel gas cars, so as long as there are battery change out stations we can continue to go on road trips. Second, there is no longer an issue with the huge cost of replacing batteries after a few years, because part of the recharge cost will cover maintenance on the batteries. It’s like the propane exchange at your local grocery store.

Ptrace on threads and linux signal handling issues

June 25th, 2008

At Proven Scaling it’s not always all about scaling databases. Sometimes we get to solve other problems not related to scaling at all. We have a client that has been using jmap (unsupported) to grab memory statistics from Java. They found that after they ran jmap they were unable to shut down the JVM without it hanging.

After working on the problem for a bit I found that after jmap ran, ps showed the java process as stopped. This is strange since java was still able to process requests. In Linux threads are treated as processes: they get a pid just like any other process. To be POSIX compliant Linux has the notion of thread groups and a thread group leader so signals can be delivered to an entire thread group.

jmap gets memory statistics by using ptrace on the pid of the thread group leader. When ptracing a thread group leader only that thread is stopped and analyzed. Other threads are free to continue processing. jmap is a bit buggy in that it attaches to a thread but never detaches. Linux has a safeguard: if the parent of a traced process quits, Linux changes the traced process’s state from traced to stopped, because traced processes can’t be killed.

When jmap quits, ps shows the process in state T, which means stopped. What it doesn’t say is that only the thread group leader is stopped. To get around the limitations of ps and process states I went directly to proc to get the process state. The example below was done using mysqld instead of java. It shows the process state of all the threads, including the leader, during a simulated jmap run.

In this example 20924 is the pid of mysqld. I substituted mysqld for java because I had it handy on my dev server. It reacts the same way java does. The bad_trace app simulates jmap by doing a ptrace attach and exiting before a detach. It sleeps for a bit in the middle so I can get the process state of a normal traced process. Here is the source for bad_trace if you want to follow along at home.


#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    pid_t pid = 0;

    if (argc < 2)
    {
        printf("First arg should be a pid\n");
        return 1;
    }

    pid = atoi(argv[1]);
    printf("Attaching to pid (%d)\n", pid);

    /* Attach to the target; this stops the traced thread. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
    {
        perror("ptrace");
        return 1;
    }

    printf("Sleeping for ten seconds\n");
    sleep(10);

    /* Exit without PTRACE_DETACH on purpose to mimic jmap. */
    return 0;
}

The bad_trace app does a ptrace attach and exits without detaching.

This is the normal state on an idle server: all threads are sleeping.

ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: S (sleeping)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)

I execute bad_trace which puts the thread group leader into traced mode waiting for bad_trace to examine it.


ebergen@kamet:(/proc/20924/task) ~/bad_trace 20924
Attaching to pid (20924)
Sleeping for ten seconds
[1]+ Stopped ~/bad_trace 20924

I suspend bad_trace to grab the stats. You can see that the thread group leader has stopped in tracing mode waiting for bad_trace to examine it and tell it to continue.


ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: T (tracing stop)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)
ebergen@kamet:(/proc/20924/task) fg
~/bad_trace 20924

bad_trace resumes and exits without detaching from the traced process. To enable an admin to kill the traced process Linux does some cleanup work by changing its state from traced to stopped. Now we can send signals to the thread group and they will be able to respond.


ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: T (stopped)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)

Naturally we don’t want the thread group leader stopped if everything is OK, so I send it a continue signal to resume operation. This is where things get weird.


ebergen@kamet:(/proc/20924/task) kill -CONT 20924

Instead of the thread group leader resuming operation, Linux decides that it’s a good idea to stop all threads instead.


ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: T (stopped)
20926/status:State: T (stopped)
20927/status:State: T (stopped)
20928/status:State: T (stopped)
20929/status:State: T (stopped)
20931/status:State: T (stopped)
20932/status:State: T (stopped)
20933/status:State: T (stopped)
20934/status:State: T (stopped)

Sending a second continue signal does the right thing and the process resumes.


ebergen@kamet:(/proc/20924/task) kill -CONT 20924
ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: S (sleeping)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)

I did some digging in the kernel and didn’t see any specific reason for this behavior. I suspect it’s a kernel bug that has never been uncovered because tracing a single thread of a threaded process isn’t a very common operation.

Splitting flush logs command

May 19th, 2008

Last week I was working with a client that rediscovered a bug where setting expire_logs_days and issuing a flush logs causes the server to crash. It’s MySQL Bug #17733 if you want to have a look. Seeing MySQL crash was enough inspiration to fix something that I and others have wanted to fix in MySQL for years.

Currently a flush logs command tries to flush all of the following logs in order:

  • General Log
  • Slow Query Log
  • Binary Log
  • Relay Log
  • Storage Engine Logs (If available)
  • Error Log

The reason I wanted to fix this is that my client was issuing a flush logs to rotate the error log on a server with no replication, yet the crash was caused by replication. With individual flush logs commands this is less likely to happen again in the future. People can simply issue a query for the log they want to flush. Each new command flushes only the log it names. They are:

  • flush general log;
  • flush slow log;
  • flush binary log;
  • flush relay log;
  • flush engine logs;
  • flush error log;

The words log and logs are interchangeable. The query “flush general log” is just as valid as “flush general logs” even though there is only one log. I submitted the patch as a fix for MySQL Bug #14104.

The patch, flush_logs.patch, was diffed against 6.0.4 but also applies on 5.1.24.

Rotation for different log files isn’t uniform. Rotating the slow log simply closes and reopens it. I’m planning to write a second patch that rotates log files using the same numbered scheme as binary logs. This would fix the rotation for the slow and general logs and eliminate the annoying issue of error logs being destroyed after they are rotated to foo.log-old.

This patch hasn’t been accepted or committed yet so if you have any suggestions on how to make it better please let me know.

Auto vertical output lands in MySQL 6.0.4

April 21st, 2008

Have you ever executed a query from the MySQL command line client only to find that the output wrapped and the result is unreadable? In the past you had to run the query again with \G instead of ; or \g to get it to display the output in vertical mode. My feature in MySQL 6.0.4 fixes that. The auto-vertical-output option tells the command line client to display the results in vertical format if the results are going to be too wide to display horizontally. It does this without re-executing the query because MySQL passes the length of each column in the result set. If the client isn’t able to determine the width of the screen it will default to 80 chars.

Replication tutorial notes - part 2

April 14th, 2008

This is a continuation of the MySQL User Conference replication notes part one.

The session is opening with a discussion of failover. The shared disk in this case is DRBD. DRBD is a fine product for replicating block devices of single disk systems. It’s made redundant by RAID and doesn’t provide as much protection as binary log failover. You can find my notes on why I don’t recommend DRBD for MySQL in drbd in the real world.

Lars went a bit quick through the other two configurations. I’ll try to review the slides and post comments.

The next configuration is using federated. The federated storage engine has many problems that make it almost useless for any production deployment. Mats says, “Federated isn’t the fastest engine in the world”. That’s an understatement. Join on two tables as they describe it is almost impossible. Aside from the performance issues this is my favorite limitation of the federated engine, “There is no way for the FEDERATED engine to know if the remote table has changed. The reason for this is that this table must work like a data file that would never be written to by anything other than the database system. The integrity of the data in the local table could be breached if there was any change to the remote database.”

Lars mentions Jeremy’s failover design that is recommended by Proven Scaling. Thanks Lars!

There is a lot of confusion about the difference between row based and statement based replication. Hopefully my row based replication post can clear up some of the confusion.

For keeping things simple and easy to debug I recommend sticking with either row or statement based replication rather than mixing the two. Teaching the implications of each to application developers is going to be more difficult than sticking with one model. The exception is things that can’t use row based replication, like DDL statements.

When using a SAN make sure you have redundant SANs, that your backups aren’t on the same SAN as MySQL, and that if you have to use a shared SAN you have good control over the resources. MySQL is very sensitive to i/o latency increases. If someone else does a large operation on the SAN it can increase the i/o latency in MySQL, causing operations to take longer, which can bring down your application.

I don’t recommend using a hardware load balancer to manage write load between masters. There is a risk of sending writes to both masters at the same time. In a properly configured dual master setup half the writes will be rejected with a read only error. The worst case is that writes go to both masters, causing replication errors and inconsistent data.

The tutorial is wrapping up now. I look forward to using row based replication in 5.1. Thanks Lars and Mats.

Replication tutorial notes - part 1

April 14th, 2008

I’m attempting to live blog corrections and notes while sitting in the replication tutorial. Lars is covering available options in MySQL replication. I’m going to attempt to cover some recommended best practices and things that are possible to do in MySQL but should be avoided. Please keep in mind that I’m writing this during the presentation. If anything is confusing post a comment and I will clean it up.

When designing a MySQL architecture there are several possible configurations. Two should be avoided. The first is dual master where you write to both masters. Configuring replication in a dual master dual writer setup means there is no single authority on the data. There is also no need to write to both masters as this doesn’t give you any performance improvement. Each master has to process the same sql statements. The second, one step further, is circular replication, which wasn’t mentioned in the talk but has been in other publications. When using three or more masters, if one dies there is no way to restore it without bringing down the other masters.

There are several configuration options for filtering data on the slaves. You can filter by the current database or by tables. My recommendation is to keep the data on the slave the same as the master as much as possible. When the slave is different you’re much more likely to have queries succeed on the master but fail on the slave. The worst of these options is replicate-do-db. This option filters queries based on “USE db”. If the use db is set differently on the master the query will succeed there but can be passed through the slave without executing. This is a silent failure and won’t cause replication to stop.

A quick note about show binlog events. This is similar to mysqlbinlog in that it dumps events in the binary log. The difference is that show binlog events doesn’t include set variables.

When using relay slaves it’s very important to always configure them in pairs. If a relay slave dies there is no good way to connect its slaves to the relay slave’s master. Using blackhole can cause silent failures to be passed through as well as issues with auto increment and storage engine differences. Blackhole should always be avoided.

Reset slave should function the way it does on the slide: delete everything and leave the slave in a blank state. It actually deletes the logs and resets the host, user, and password. The functionality has been changed a few times in different versions, so be sure to check the manual. The difference is whether mysql keeps the old host information in memory or forgets it.

Purge master logs was covered briefly. It should be noted that purge master logs won’t purge logs that a connected slave hasn’t downloaded. If a slave is disconnected the master will happily delete logs that the slave hasn’t downloaded yet. I recommend keeping at least 7 days of logs on the master for situations where a slave gets disconnected.

Now we’re into the section on specialized slaves. These slaves have a subset of the data on the master. Avoid this as much as possible. See my row based replication post, about half way down, for the reasons.

The slide for HA + Scale out has dual master with all of the slaves hanging off of one master. It’s better to have slaves hanging off each master, because if the active master dies hard enough that you can’t get binlog offsets from it then there isn’t a good way to change slaves over to the surviving master. Balancing your slaves will make it easier to fail over. Ah excellent, someone up front just pointed this out.