Archive for the ‘Geek’ Category.

Where did 5.0.79 enterprise come from?

While updating the mirror last week I was surprised to see that the newest MRU MySQL release is numbered 5.0.79. Previously enterprise releases had even numbers and community releases had odd numbers. I posted the question in #mysql-dev and HarrisonF was kind enough to explain it all.

MySQL 5.0 is running out of version numbers. There are limitations in mysql_get_server_version(), the executable comment syntax, and other places that mean MySQL can only have two-digit release version numbers. To extend the life of 5.0, MySQL Enterprise has started using both odd and even version numbers.

This raises a few questions. What will happen to 5.0 when it runs out of release numbers? Is community going to be sacrificed to give enterprise more versions to use? Are the version restrictions going to be fixed in the future? For example if a feature is implemented in a community release the executable comment version syntax isn’t suitable for preventing it from being executed in a newer enterprise release because the version scheme doesn’t differentiate between enterprise and community.

I think this is now rock solid proof that there were too many features packed into 5.0 and it was released too early. I hope there will be more major releases in the future with fewer features so these problems are prevented. By the way the advanced vs pro enterprise binaries add a whole new layer to the MySQL version issues.

WTF is EMT?

EMT provides an easy way to gather common system performance metrics, as well as providing a simple plugin-based interface to collect custom application-specific metrics. This data can be viewed on the servers that are collecting it or, through the output handler interface, be sent to centralized servers.

I started building EMT because it was very difficult to do ad hoc comparisons of performance metrics when trying to diagnose why systems are or were overloaded. Oftentimes the existing monitoring was only set up to gather data from the operating system, such as CPU or network usage. The solution was always to build something quickly by hand, run it for a while, and hope it gathered something meaningful. That’s OK for solving one problem but what about the next time?

EMT runs a set of very lightweight scripts out of cron that in turn execute commands and parse data from them. The data can be stored locally (highly recommended) or shipped off to be aggregated on central servers, or both. The power of EMT is really in the local storage. The emt_view command can be used to compare any or all metrics from the system to look for patterns. Example usage can be found in the manual. I highly recommend checking it out.

The installation instructions say to grab the source from svn and use the included script to build an rpm. That’s still the recommended way but I’ve created a snapshot of the source and a starter rpm for this blog entry. In the future I’ll set up a more defined release process.

Now for the bad news. EMT is still under heavy development. The view code and how instances work are going to change a lot before I release it. Things like viewing data from different instances of running programs don’t work yet. That being said, it shouldn’t destroy your servers or steal your children if you decide to run it in production. Proven Scaling and a few of our clients have been using it for quite some time with very few issues. If there is enough feedback I’ll bump the version to 0.3 and start stabilizing it for a real release.

If you want to become involved with the project there is a google code page and a google group for discussion. I’ll post development updates to the group page. If you find any issues please report them on the google code issue page.

Google Summer of Code and #mysql-dev, who is supposed to answer the questions?

The #mysql-dev irc channel on freenode was created with the idea of getting the community people more involved in active discussion about mysql internals and development. When the channel was first created this happened for a few weeks and I was pretty happy to be able to observe and participate in the discussion. Now it’s mostly idle.

It seems that some people at Sun think there is still active discussion or internal developers paying attention to the channel because the GSoC web page directs people to #mysql-dev as a point of contact. The problem is there isn’t anyone there answering questions. I’ve seen quite a few people over the past few weeks ask questions that have gone unanswered. I think it’s time to restart the movement to open development and to use #mysql-dev for discussion.

MacBook Pro Hinges

About a year ago I upgraded from a MacBook to a MacBook Pro 17″ to get the additional screen space. I used to work a lot from my couch before the upgrade. It gets tiring working from a desk after about 8 hours. The upgrade caused some problems though. I opened up my shiny new MBP, sat on the couch, leaned back, put it up on my knees and bam! the screen crashed down on my fingers.

I took it to the apple store, where they checked it and told me to call apple care. I called apple care from outside the apple store and after explaining the situation they agreed to service the computer to tighten the hinges. I got my MBP back with slightly tighter hinges but nowhere near what I was used to on my MacBook. For the past year or so I haven’t been able to work from the couch like I want.

When apple revved the 15″ MBP I checked to see if the 17″ style hinges had made it into the 15″ MBP. They had. I was crushed and kept my eyes out for a new brand of laptop. I hate running linux on a laptop and windows is out of the question, so I waited. When Apple released the new 17″ MBP I read the reviews. There was little mention of the hinges except that it opens further than the old 17″. I started watching the apple forums for reports of loose hinges and found a lot of reports from 15″ MBP owners. Feeling hopeful I went to the Apple store to see if the hinges on the new unibody 15″ were as bad as they are on my 17″. They were.

That was about a week ago and the new unibody 17″ wasn’t in the stores yet. I noticed apple changed the shipping time on their website down to 3-5 business days. I thought if the shipping time is down they must have some in stores. I went back about a week later and they had some on display. The hinges on the unibody 17″ are significantly tighter than those on the unibody 15″ and the old style 17″. So I bought one, and it’s great. I typed this from my couch without smashing my fingers.

Thank you Apple for learning from your mistakes (although you should have learned from the old 17″ and not pissed off all the new 15″ unibody owners). The hinges on this new MBP are almost perfect and the extended battery life isn’t too shabby either.

Select distinct fail

A few months ago I got a strange email from one of my clients that contained two very simple looking select queries. The only difference between the two queries was that one included the distinct keyword and the other didn’t. The strange part is that the query that used distinct returned zero rows. I spent a few days narrowing down the client’s data into a small test case, then created a generic test case from that. I also traced the problem to the code that decides which index to use for a group by loose index scan, which can be used to resolve queries using distinct.

The example can be found in the bug and in this sql file. My patch was a step in the right direction but not complete enough to solve all the issues. Since this isn’t a crashing bug I was tempted to make this blog post into one of those sql quiz questions but decided to be nice instead. Feel free to use the sql file to fool your friends though.

How to force Mail.app to show messages in plain text

If you read a lot of emails from exchange users this is a must have. It fixes the super small fonts that exchange seems to like to send by default. Simply run this in a terminal:

defaults write com.apple.mail PreferPlainText -bool TRUE

I’m not sure why apple made fixed width fonts a preference pane option and decided to leave plain text out. Having both seems important to me.

Gmail Themes

I’ve been using gmail since the name gold rush days. The interface is intuitive and the conversation threading makes reading mailing lists much easier. One thing has always bothered me and that’s the bright white color. It makes reading email painful in low light. Gmail recently added themes that changed most of that. The message bodies are still white but the chrome around the outside is much darker. My favorite theme is the mountains. The color scheme works well to remove some of the brightness while not contrasting too much with the message bodies. If they manage to incorporate some rivers and fishing into the theme it will be perfect.

On MySQL 5.1 going GA

When MySQL 5.1 first went GA I had the same knee jerk reaction as most of the community, “It’s not ready! There are still bugs!”. After thinking about it for a week or so I don’t think this matters. It’s true that MySQL 5.1 isn’t really ready for GA but it doesn’t matter since most MySQL users I know wait several releases before even trying out a new GA release anyway. This varies wildly of course. Some users love the bleeding edge while others are still back on 4.1. This isn’t MySQL specific either. We do the same thing with most software. I don’t like that MySQL changed their requirements for a release candidate to move 5.1 along but for most users it doesn’t matter. We will sit quietly and wait for 5.1 to stabilize before even thinking about deploying it. I’ll give it another six months.

Ptrace on threads and linux signal handling issues

At Proven Scaling it’s not always all about scaling databases. Sometimes we get to solve other problems not related to scaling at all. We have a client that has been using jmap (unsupported) to grab memory statistics from java. They found that after they ran jmap they were unable to shut down the jvm without it hanging.

After working on the problem for a bit I found that after jmap ran, ps showed the java process as stopped. This is strange since java was still able to process requests. In linux threads are treated as processes; they get a pid just like any other process. To be POSIX compliant linux has the notion of thread groups and a thread group leader so signals can be delivered to an entire thread group.

[Update: 2009-05-19 The version of jmap that ships with jdk-1.6 detaches correctly]

jmap gets memory statistics by using ptrace on the pid of the thread group leader. When ptracing a thread group leader only that thread is stopped and analyzed. Other threads are free to continue processing. Jmap is a bit buggy in that it attaches to a thread but never detaches. Linux has a safeguard: if the parent of a traced process quits, linux changes the traced process’s state from traced to stopped because traced processes can’t be killed.

When jmap quits, ps shows the process in state T which means stopped. What it doesn’t say is that only the thread group leader is stopped. To get around the limitations of ps and process states I went directly to proc to get the process state. The example below was done using mysqld instead of java. It shows the process state of all the threads, including the leader, during a simulated jmap run.

In this example 20924 is the pid of mysqld. I substituted mysqld for java because I had it handy on my dev server, and it reacts the same way java does. The bad_trace app simulates jmap by doing a ptrace attach and exiting before a detach. It sleeps for a bit in the middle so I can get the process state of a normal traced process. Here is the source for bad_trace if you want to follow along at home.


#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    pid_t pid;

    if (argc < 2)
    {
        printf("First arg should be a pid\n");
        return 1;
    }

    pid = atoi(argv[1]);
    printf("Attaching to pid (%d)\n", pid);
    /* Attach to the target; the kernel stops it until we resume or detach. */
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1)
    {
        perror("ptrace");
        return 1;
    }
    printf("Sleeping for ten seconds\n");
    sleep(10);

    /* Deliberately exit without PTRACE_DETACH to mimic jmap. */
    return 0;
}

The bad_trace app does a ptrace attach and exits without detaching.

This is the normal state on an idle server: all threads are sleeping.

ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: S (sleeping)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)

I execute bad_trace which puts the thread group leader into traced mode waiting for bad_trace to examine it.


ebergen@kamet:(/proc/20924/task) ~/bad_trace 20924
Attaching to pid (20924)
Sleeping for ten seconds
[1]+ Stopped ~/bad_trace 20924

I suspend bad_trace to grab the stats. You can see that the thread group leader has stopped in tracing mode waiting for bad_trace to examine it and tell it to continue.


ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: T (tracing stop)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)
ebergen@kamet:(/proc/20924/task) fg
~/bad_trace 20924

bad_trace resumes and exits without detaching from the traced process. To enable an admin to kill the traced process linux does some cleanup work by changing its state from traced to stopped. Now we can send signals to the thread group and they will be able to respond.


ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: T (stopped)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)

Naturally we don’t want the thread group leader stopped if everything is OK so I send it a continue signal to resume operation. This is where things get weird.


ebergen@kamet:(/proc/20924/task) kill -CONT 20924

Instead of the thread group leader resuming operation linux decides that it’s a good idea to stop all threads instead.


ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: T (stopped)
20926/status:State: T (stopped)
20927/status:State: T (stopped)
20928/status:State: T (stopped)
20929/status:State: T (stopped)
20931/status:State: T (stopped)
20932/status:State: T (stopped)
20933/status:State: T (stopped)
20934/status:State: T (stopped)

Sending a second continue signal does the right thing and the process resumes.


ebergen@kamet:(/proc/20924/task) kill -CONT 20924
ebergen@kamet:(/proc/20924/task) grep State */status
20924/status:State: S (sleeping)
20926/status:State: S (sleeping)
20927/status:State: S (sleeping)
20928/status:State: S (sleeping)
20929/status:State: S (sleeping)
20931/status:State: S (sleeping)
20932/status:State: S (sleeping)
20933/status:State: S (sleeping)
20934/status:State: S (sleeping)

I did some digging in the kernel and didn’t see any specific reason for this behavior. I suspect it’s a kernel bug that has never been uncovered because tracing a single thread of a threaded process isn’t a very common operation.

Splitting flush logs command

Last week I was working with a client that rediscovered a bug where setting expire_logs_days and issuing a flush logs causes the server to crash. It’s MySQL Bug #17733 if you want to have a look. Seeing MySQL crash was enough inspiration to fix something that I and others have wanted to fix in MySQL for years.

Currently a flush logs command tries to flush all of the following logs in order:

  • General Log
  • Slow Query Log
  • Binary Log
  • Relay Log
  • Storage Engine Logs (If available)
  • Error Log

The reason I wanted to fix this is that my client was issuing a flush logs to rotate the error log on a server with no replication, yet the crash was caused by replication code. With individual flush logs commands it’s less likely for this to happen again in the future. People can simply issue a query for the log they want to flush. Each new command flushes only the log named in the command. They are:

  • flush general log;
  • flush slow log;
  • flush binary log;
  • flush relay log;
  • flush engine logs;
  • flush error log;

The words log and logs are interchangeable. The query “flush general log” is just as valid as “flush general logs” even though there is only one log. I submitted the patch as a fix for MySQL Bug #14104.

The patch, flush_logs.patch, was diffed against 6.0.4 but also applies on 5.1.24.

Rotation for different log files isn’t uniform. Rotating the slow log simply closes and opens it. I’m planning to write a second patch that rotates log files using the same numbered scheme as binary logs. This fixes the rotation for slow and general log as well as eliminating the annoying issue of error logs being destroyed after they are rotated to foo.log-old.

This patch hasn’t been accepted or committed yet so if you have any suggestions on how to make it better please let me know.