The lost art of the join.

One of the things I always like about growing up with MySQL is that I had to learn joins. There was no sub select to fall back on if I couldn’t get a join to work correctly. With MySQL 4.1 the norm and 5.0 quickly on it’s heels I see more and more problems that people are solving with sub selects that could as easily be solved with joins if they knew how to write them. In my experience joins are faster and easier to optimize than sub selects but I seem to be in the minority now with that idea.

Don’t worry join syntax. I won’t forget you!

5 Comments

  1. Lukas says:

    The only real disadvantage for subqueries are that they force a nested loop join. Aside from that the RDBMS more or less as the same chances of finding a high quality query plan for a sub select just as for the equivalent join. However the fact of the matter is that most RDBMS got joins before they got sub queries, so they just had more optimizations in place for joins. Over time sub select implementations will improve, but since they came second they are less popular and by that same token of less interest for optimization. Its a chicken and egg problem of sorts.

    On the plus side I think sub selects are much more maintainable and in some cases a sub select can break out earlier if you just have an EXISTS.

    /me hugs sub selects

  2. lenbust says:

    dude, you should totally write up a comparative study on sub-selects and joins. I’d read it (no fingers crossed).

  3. Lukas says:

    I have some slides on the topic:
    http://pooteeweet.org/files/phpconf05/fast_portable_SQL_II.pdf

    page 11 and 12 compare joins with sub selects

  4. Inspired by your blog post I just started a little initiative to “save the art of the join”: See my post at http://www.futhark.ch/mysql/122.html – anybody who wants to JOIN me?

  5. Ok, optimization is one point, but I maintain that the “find none-matchin rows problem” and even the “find at least one matching row” problem is best solved using a NOT EXISTS and EXISTS subqueries respectively.

    The EXISTS keyword clearly expresses what you intend to do. Not something that can be said of a (OUTER) JOIN for these cases. Odd when you think of it, JOIN, that is put together, to find out that nothing’s found on the other end to put together.

    In mysql, i do tend to use the LEFT JOIN syntax to solve the “find none-matching rows” problem, but only because the LEFT JOIN performs signiicantly faster there. However, on Oracle, I find comparable performance for the subquery and the JOIN, and often, the EXISTS syntax is faster than a LEFT JOIN.

    Now, I’ll just go on waiting for the perfect computing world that boldygoes where no RDBMS has gone before, and where optimizers are so smart that performance will be equal in these functionally comparable cases.

Leave a Reply