Archive for the 'MySQL' Category

BACKUP/RESTORE TABLE lose auto increment

Friday, January 5th, 2007

One small fact about backup table and restore table that isn’t listed in the manual is that these commands lose the auto increment value if rows at the head of the table are deleted. For example if you have auto increment values of 1,2,3,4 in a table the auto increment value is 5. If you delete row 4 the next auto increment will still be 5. If you backup/restore the table the auto increment will be reset to 4. The auto increment value in myisam is stored in the MYI file. Since this file isn’t backed up myisam restores the auto increment value from the highest existing value in the table. This value may or may not be the actual value of auto increment. As the manual says these commands should not be used. Instead use mysqlhotcopy.

Innochecksum

Sunday, November 26th, 2006

Innochecksum is a small undocumented (at least in the manual) utility that verifies the checksum of ibdata file pages. I stumbled across it the other day while wandering through the source tree. It has a small bug that prevents it from being able to check files larger than 4G. This patch fixes it in linux. Someone that knows more about large file support in different OSs please comment on the patch.

It lives in the extras folder in the MySQL source. Here is the credits comment:

/*
InnoDB offline file checksum utility. 85% of the code in this file
was taken wholesale fron the InnoDB codebase.

The final 15% was originally written by Mark Smith of Danga
Interactive, Inc.

Published with a permission.
*/

I think as a community we need to make sure that tools like this are kept in the forge and actively developed. More on this when I have had some sleep.

Proven Scaling

Friday, November 10th, 2006

When Jeremy left yahoo a lot of people were left wondering “who would be the next Jeremy?” some people thought it would be me. Since then I have been asked several times why I’m not going to take a turn in the ivory tower. Now that things are in place the secret can be let out. Jeremy’s startup Proven Scaling is not just Jeremy’s startup but our startup. We have decided to take our MySQL skills and apply them to the problems of several companies.

Eric Bergen
MySQL Geek / Owner
Proven Scaling L.L.C.
eric@provenscaling.com

You bring yourself, we’ll bring the beer.

Wednesday, November 8th, 2006

At the first ever MySQL Camp Proven Scaling is holding a session/BOF on replication. We want to hear what you like and don’t like about replication. What better to get conversation going but FREE BEER!

There is one tiny problem. We don’t know how much to buy or what kind. If you’re going to the camp help us out by putting your name and your favorite brew (brand and type) on the MySQL Replibeertion page.

Where was 5.1?

Tuesday, October 17th, 2006

If you can remember back to mid ‘05 when MySQL 5.0 was being released there was something missing. I have been thinking about writing about MySQL 5.0 being released too soon but I don’t think that was the case. Was 5.1 released too late? Maybe. Was something wrong with the 5.0 -> 5.1 schedule and feature set? Oh yeah.

Looking back at the change logs for 4.1, 5.0, and 5.1 I noticed something that I hadn’t though about before. I knew there was a difference in the release dates for 5.0->5.1 from 4.1->5.0 but I didn’t realize how drastic the difference was until I drew it out (on paper, sorry) today. Here is a summary of the major milestones from 4.1 Alpha to today:

  • Apr 03 - 4.1 Alpha
  • Dec 03 - 5.0 Alpha
  • Jul 04 - 4.1 Beta
  • Aug 04 - 4.1 Gamma
  • Oct 04 - 4.1 Release
  • Mar 05 - 5.0 Beta (Where is 5.1 Alpha?)
  • Sep 05 - 5.0 Gamma.
  • Nov 05 - 5.1 Alpha (Oh! Here it is!)

There is a 6 month period where 4.1 and 5.0 are simultaneously in alpha. 4.1 was in beta for 2 months and gamma for 2 months. 5.0 was in beta for 6 months and gamma for 1 month. 5.0 was in beta for 4 months longer than 4.1 but was in gamma for only a month. Does this mean that 5.0 was in beta too long or wasn’t in gamma for long enough. I think it was the latter and Jeremy seems to agree.

Why am I writing about this now? This post has been braincrack for quite a while. What brought it back to the surface is Kaj’s blog entry about MySQL Community/Enterprise, “This way, contributors don’t have to wait until the next major release for their improvements to get into use, and enterprise users can continue using 5.0 without seeing any destabilisation of the code base due to new functionality being introduced.” I think we have to wait because 5.1 wasn’t developed in parallel with 5.0 like 4.1 and 5.0 were. I can’t think of a valid reason for 5.1 not to follow the same pattern as 5.0.

I think the short gamma period for 5.0 was caused by Oracle’s acquisition of Innobase. 5.0 was in beta for 4 months longer than 4.1 and yet only in gamma for 1 month. It was released a few weeks after the announcement. On a side note, I’m happy that Oracle presented a few new features at the user conference showing that they aren’t going to kill innodb and are continuing development of it.

Show @&!# status again!

Friday, September 22nd, 2006

I’m not the first person to run into this and I certainly won’t be the last. This is also already covered in: Bug #19093 I just want to say again how annoying it is that the default for show status is session instead of global. It bit me again yesterday. I ran show status and was confused because most of Com_* is 0 yet uptime was a few days. Then it hit like a brick to the face that I was working on 5.0 instead of 4.1 and the default for show status is session. I know the default has changed and it still bites me. This is going to cause people hours of confusion the first time they do what I just did.

If you want the default changed please go to Bug #19093 and add a comment.

Useless use of if award

Friday, July 21st, 2006

Similar to the useless use of cat award. The useless use of if award highlights code examples where people use the if function or ternary operator when the return of the expression does the exact same thing. I first noticed this with returning boolean values from php functions. To protect the innocent the winner of today’s award will remain anonymous.

<anonymous> to my knowledge, youll have to SUM(IF(your_field <> “”, 1, 0)) as total_non_empty

Ignoring that the whole query should be using where your_field != ” and group by the non if() way to write this is:

SUM(your_field <> ”)

These examples aren’t an award to a specific person since I’m digging them up from my memory. This pretty much applies to both C and PHP.

< ?php

function foo()

{

$str = 'foo';

return $str == 'foo' ? TRUE : FALSE;
}

?>

Can be written as

< ?php

function foo()

{

$str = 'foo';

return $str == 'foo';
}

?>

I hope this helps you save a few keystrokes when writing code in the future.

Why uptime is bad

Wednesday, June 28th, 2006

Growing up in the world of linux uptime was always considered a good thing. On IRC every once in a while someone would post an uptime. Everyone else in that channel would then check their uptime and if it was greater or close they would post it in the channel. Most of these systems were home linux boxes used for compiling random programs or maybe hosting a webserver for experimenting. It was fun to see how long we could keep them running for. Since those days I have come to realize that high uptimes are a bad thing.
Keeping a server up for months or even years means that you aren’t maintaining it. It hasn’t been kept up to date with new kernels that have fixes for security holes. It doesn’t have new packages or new tools that can help it run more efficiently and have features that can make using it easier. It’s also not up to date with new servers that are being deployed which means that people logging into your server with a high uptime have to adjust themselves to the older software and possible missing tools.

Hardware fails, colos lose power, network connections, and sometimes catch on fire. If you’re entire system depends on a single server, say a mysql master. It’s going to fail. I know there are mysql servers out there that have been up for years. Those are going to fail. It’s inevitable. If you’re system is not designed to withstand the failure of a master it should be fixed. Jeremy Cole and I gave a tutorial at the 2006 MySQL User Conference about MySQL replication and failover. See Jeremy’s blog for links to the presentation and photos.

“But I can’t take down my master to fix it?” It’s much better to do a planned downtime than it is to get paged at 3am because the master died and the whole site is down. Take some time. Plan to take down the master and fix the system. It will be worth it in the end. If your manager says no to a planned downtime to make your website fault tolerant. Find a new job. Preferably at Yahoo! :)

By building a system that can handle the failure of a master it’s much easier to upgrade MySQL so you can take advantage of all the new nifty features.

The wrong way to upgrade MySQL

Friday, May 19th, 2006

Expect a longer post in the near future on upgrade procedures. For now enjoy this quote from a gentoo user illustrating the worst way to upgrade.

linolium: is there a way to wipe every single table and start over from scratch?
linolium: ( the upgrade from mysql 4 to 5 didn’t go as smoothly as planned
me: did you read the upgrade notes?
linolium: no, I just hoped that portage would be kind enough to do those things for me

MySQL User Conference 2006

Monday, May 1st, 2006

Every year the user conference gets better and better. I’m not sure if it’s the actual conference or just that I know so many more people than I did the year before so I’m that much more excited to see them all again. I was a speaker this year which is something like being a C celebrity. The attendees at the conference were split into a few very distinct groups. High order geeks, geeks with questions, and business people. The sessions seemed to be setup to appeal to one of these three groups. Of the sessions I attended my favorites were the row based replication session (which inspired Row based replication and application developers) and Timour Katchaounov’s session on new features of the 5.0 optimizer.

It was interesting walking around the conference meeting different people from various backgrounds and professions. The conference sessions were well suited to the crowd. Some where internals oriented, some howto, and some for the business people. I would like to see more internals presentations next year but that is because I’m more of an internals guy. Overall the sessions were of great quality and the ones I was able to attend were very informative. In between sessions people broke off into little conversation groups. This conversation settled around either constructive critisism about mysql or mysql ab (bugs.mysql ect) or thinking through a problem and coming up with a solution. One of the projects I want to do because of this conversation is run some stats on the bugs database vs mysql releases and the frequency of those releases. I have a feeling there is a sharp increase in the number of bugs due to premature releases in the 5.1 tree. The bugs site has statistics on the number of bugs by month but no correlation to the patch version of mysql at the time. I think it’s important to collect these metrics to see how accurate the dot-twenty theory is. I have a feeling it’s pretty close but that a more exact version can be found based on the time between releases and the number of bugs filed against that release. That’s for another blog entry.

The geeks questions were mostly centered around issues of scalability and replication. I noticed that most people don’t do any sort of capacity planning before deploying a system. In the storage BoF I talked to a person that was launching a new site and had no idea what their expected traffic is or what their current system can handle. She was looking for advice on optimizing mysql but didn’t know if her current system was capable of handling traffic or not. When optimizing any system it’s important to have a target number. If not you never know when to stop optimizing. I’m going to try to do a capacity planning and general scalability session next year. Other questions were about storage capacity and using replication. It seems like most people understood the basic concept behind replication but weren’t fully aware of how to design a system to use it for offloading reads and taking backups. I have a feeling this is something that will need to be taught at every user conference for years to come. Most people seem to understand it after a presentation but a lot of tricks simply aren’t listed in the manual (I’m not sure if they belong there).

This leaves us with the business people. I talked to a few business people with good ideas and a few with bad one. I met a few people from pivot3. They are creating storage technology to use iSCSI to build raids over multiple disks in a single box and multiple boxes. Very cool stuff and I hope it works great with mysql. It has a very good chance of solving the ever expanding storage needs problem. Instead of buying something expensive like a netapp and adding shelves one can add more commodity boxes in a cluster and tell the storage layer to extend on to them. The strangest (and possibly worst) idea was a company that is trying to build an embedded version of mysql with a special storage engine that will peek into the optimizer to figure out exactly which columns mysql is requesting and fetch them instead of fetching the entire row. Strange very strange. I can understand why he wants to know the columns being fetched but the method he was taking to resolve the isssue just seems backwards.
Since I have already covered row based replication I won’t go into it much here. I’m sure I will have much more to talk about when I actually get to try to put it into production. Timour’s session on the 5.0 optimizer was very impressive. In 5.0 MySQL will have the ability to use multiple indexes on the same table by essentially joining them (much in the same way that it joins tables). While this doesn’t completely eliminate the need for multi part keys it certainly helps reduce them.

I was very excited to see that an internal QA team has been formed over the past year. I am looking forward to spending some of my spare time spamming Omer BarNir with questions and ideas for MySQL’s internal QA process. I know QA is boring to most people but it’s closely related to ops so I am very interested to learn about the process and see progress from inside MySQL AB. It would be nice to see some more QA related posts on planetmysql.org as well as routine updates on metrics related to the stability of major versions. It’s common knowledge that most people wait until .10 or even .20 to start trusting a major release. It would be nice to be able to use a more epirical process to pick when to switch to a new version rather than a superstisious number

After a quick scan of planetmysql I can see that a lot of people are just getting home. I hope everyone had a safe trip. I know I did, all eight miles of it :-P