Every year the user conference gets better and better. I’m not sure if it’s the actual conference or just that I know so many more people than I did the year before, so I’m that much more excited to see them all again. I was a speaker this year, which is something like being a C-list celebrity. The attendees at the conference were split into a few very distinct groups: high order geeks, geeks with questions, and business people. Each session seemed to be set up to appeal to one of these three groups. Of the sessions I attended, my favorites were the row-based replication session (which inspired Row based replication and application developers) and Timour Katchaounov’s session on new features of the 5.0 optimizer.
It was interesting walking around the conference meeting different people from various backgrounds and professions. The conference sessions were well suited to the crowd. Some were internals oriented, some were how-tos, and some were for the business people. I would like to see more internals presentations next year, but that is because I’m more of an internals guy. Overall the sessions were of great quality and the ones I was able to attend were very informative. In between sessions people broke off into little conversation groups. These conversations settled on either constructive criticism of MySQL or MySQL AB (bugs.mysql, etc.) or thinking through a problem and coming up with a solution. One of the projects I want to do because of these conversations is to run some stats on the bugs database against MySQL releases and the frequency of those releases. I have a feeling there is a sharp increase in the number of bugs due to premature releases in the 5.1 tree. The bugs site has statistics on the number of bugs by month, but no correlation to the patch version of MySQL that was current at the time. I think it’s important to collect these metrics to see how accurate the dot-twenty theory is. I have a feeling it’s pretty close, but that a more exact version can be found based on the time between releases and the number of bugs filed against each release. That’s for another blog entry.
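As a rough starting point for that project, here is a minimal sketch of the correlation I have in mind. It assumes each bug can be attributed to whichever release was the newest on the day the bug was filed, which is a simplification. The function names are made up for illustration and nothing here reads from the real bugs database.

```python
from bisect import bisect_right
from datetime import date


def bugs_per_release(releases, bug_dates):
    """Count bugs per release.

    releases:  list of (version, release_date) tuples, sorted by release_date.
    bug_dates: iterable of dates on which bugs were filed.
    Each bug is attributed to the latest release on or before its filing date;
    bugs filed before the first release are ignored.
    """
    release_dates = [released for _, released in releases]
    counts = {version: 0 for version, _ in releases}
    for filed in bug_dates:
        position = bisect_right(release_dates, filed) - 1
        if position >= 0:
            counts[releases[position][0]] += 1
    return counts


def bugs_per_day(releases, counts, end_date):
    """Normalize the counts by how long each release was the current one."""
    rates = {}
    for i, (version, released) in enumerate(releases):
        next_date = releases[i + 1][1] if i + 1 < len(releases) else end_date
        days = (next_date - released).days
        rates[version] = counts[version] / days if days > 0 else None
    return rates
```

Feeding it placeholder data such as `[("r1", date(2006, 1, 1)), ("r2", date(2006, 1, 11))]` plus a list of filing dates gives a bugs-per-day figure for each release, which is the number I would want to plot against the patch version.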
The geeks’ questions were mostly centered around issues of scalability and replication. I noticed that most people don’t do any sort of capacity planning before deploying a system. In the storage BoF I talked to a person who was launching a new site and had no idea what her expected traffic would be or what her current system could handle. She was looking for advice on optimizing MySQL but didn’t know whether her current system was capable of handling the traffic or not. When optimizing any system it’s important to have a target number. Without one you never know when to stop optimizing. I’m going to try to do a capacity planning and general scalability session next year. Other questions were about storage capacity and using replication. It seems like most people understood the basic concept behind replication but weren’t fully aware of how to design a system to use it for offloading reads and taking backups. I have a feeling this is something that will need to be taught at every user conference for years to come. Most people seem to understand it after a presentation, but a lot of tricks simply aren’t listed in the manual (I’m not sure if they belong there).
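To show what I mean by a target number, here is a back-of-the-envelope sketch. The inputs (page views per day, queries per page, a peak-to-average ratio, and the queries per second a single server sustains in a benchmark) are all things you have to estimate or measure yourself, and the defaults below are assumptions for illustration, not recommendations.

```python
import math


def peak_qps(daily_page_views, queries_per_page, peak_to_average=3.0):
    """Estimate peak queries per second from daily traffic.

    peak_to_average is an assumed ratio between the busiest moment
    and the daily average; measure it if you can.
    """
    average_qps = daily_page_views * queries_per_page / 86400  # seconds per day
    return average_qps * peak_to_average


def servers_needed(target_qps, qps_per_server, headroom=0.3):
    """Servers required to serve target_qps while keeping a share of capacity idle.

    qps_per_server comes from benchmarking your own workload.
    headroom is the fraction of each server deliberately left unused.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")
    usable_qps = qps_per_server * (1 - headroom)
    return math.ceil(target_qps / usable_qps)
```

Once `peak_qps(...)` gives you a target and a benchmark tells you what the current box sustains, you know whether you are done optimizing or how far you still have to go.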
This leaves us with the business people. I talked to a few business people with good ideas and a few with bad ones. I met a few people from Pivot3. They are creating storage technology that uses iSCSI to build RAID arrays over multiple disks in a single box and across multiple boxes. Very cool stuff, and I hope it works great with MySQL. It has a very good chance of solving the ever-expanding storage needs problem. Instead of buying something expensive like a NetApp and adding shelves, one can add more commodity boxes to a cluster and tell the storage layer to extend onto them. The strangest (and possibly worst) idea came from a company that is trying to build an embedded version of MySQL with a special storage engine that will peek into the optimizer to figure out exactly which columns MySQL is requesting and fetch only those instead of fetching the entire row. Strange, very strange. I can understand why he wants to know which columns are being fetched, but the method he was taking to resolve the issue just seems backwards.
Since I have already covered row-based replication I won’t go into it much here. I’m sure I will have much more to talk about when I actually get to try to put it into production. Timour’s session on the 5.0 optimizer was very impressive. In 5.0 MySQL will have the ability to use multiple indexes on the same table by essentially joining them (much in the same way that it joins tables). While this doesn’t completely eliminate the need for multi-part keys, it certainly helps reduce them.
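To make the idea concrete, here is a toy model of answering a two-column AND condition from two single-column indexes. This is only an illustration of the concept and not how the server implements it. The "indexes" are plain dictionaries mapping a column value to a set of row positions, and all names are made up.

```python
from collections import defaultdict


def build_index(rows, column):
    """Build a toy single-column index: value -> set of row positions."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        index[row[column]].add(row_id)
    return index


def lookup_with_two_indexes(rows, index_a, value_a, index_b, value_b):
    """Find rows matching both conditions by intersecting two index lookups.

    Each index is consulted on its own and the resulting row-id sets are
    combined, so no multi-part key covering both columns is required.
    """
    matching_ids = index_a.get(value_a, set()) & index_b.get(value_b, set())
    return [rows[row_id] for row_id in sorted(matching_ids)]
```

With a multi-part key the same lookup would be a single index read, which is why the feature reduces the need for such keys rather than replacing them.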
I was very excited to see that an internal QA team has been formed over the past year. I am looking forward to spending some of my spare time spamming Omer BarNir with questions and ideas for MySQL’s internal QA process. I know QA is boring to most people, but it’s closely related to ops, so I am very interested to learn about the process and see progress from inside MySQL AB. It would be nice to see some more QA-related posts on planetmysql.org as well as routine updates on metrics related to the stability of major versions. It’s common knowledge that most people wait until .10 or even .20 to start trusting a major release. It would be nice to be able to use a more empirical process to pick when to switch to a new version rather than a superstitious number.
After a quick scan of planetmysql I can see that a lot of people are just getting home. I hope everyone had a safe trip. I know I did, all eight miles of it.