Table statistics draft 2, the slow query log

I’ve posted a new table statistics patch, which is the next version of the session table/index statistics patch. This version adds slow query log output: if a query is logged to the slow query log, row count statistics are added to its entry.

I’m not sure about the format of the log, which is why I’m posting this so early. The first format I tried was:

# Time: 100119 19:24:37
# User@Host: [ebergen] @ localhost []
# Query_time: 10 Lock_time: 0 Rows_sent: 7 Rows_examined: 3
# Rows_read: sbtest.foo:3, sbtest.bar:3,
select * from foo a, bar b where sleep(1) = 0;

Where there would be an additional line for each of rows_changed, rows_changed_x_indexes, and index_rows_read. This seemed verbose, so I tried a different format:

# Time: 100119 20:27:16
# User@Host: [ebergen] @ localhost []
# Query_time: 6 Lock_time: 0 Rows_sent: 6 Rows_examined: 14
# Rows Stats: sbtest.foo 18 0 0, sbtest.bar 15 3 3,
# Index Stats: sbtest.bar.u 6,
select * from foo a, bar b where b.u=4 order by sleep(1);

Where the row stats line has three columns per table: rows_read, rows_changed, and rows_changed_x_index. I’m leaning towards the second format, but I’m open to ideas. What do you think?
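
For anyone who wants to script against it, here is a minimal Python sketch of how the second format might be parsed (the field order is as described above; the function name and return shape are mine):

import re

# Parse the "# Rows Stats:" and "# Index Stats:" lines from the second
# format. Each table entry is "db.table R C X," where the three numbers
# are rows_read, rows_changed, and rows_changed_x_index.
def parse_stats(lines):
    tables, indexes = {}, {}
    for line in lines:
        if line.startswith("# Rows Stats:"):
            for m in re.finditer(r"([\w.]+) (\d+) (\d+) (\d+),", line):
                tables[m.group(1)] = {
                    "rows_read": int(m.group(2)),
                    "rows_changed": int(m.group(3)),
                    "rows_changed_x_index": int(m.group(4)),
                }
        elif line.startswith("# Index Stats:"):
            for m in re.finditer(r"([\w.]+) (\d+),", line):
                indexes[m.group(1)] = int(m.group(2))
    return tables, indexes

tables, indexes = parse_stats([
    "# Rows Stats: sbtest.foo 18 0 0, sbtest.bar 15 3 3,",
    "# Index Stats: sbtest.bar.u 6,",
])
# tables == {'sbtest.foo': {'rows_read': 18, ...}, 'sbtest.bar': {...}}
# indexes == {'sbtest.bar.u': 6}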

The new patch is here.

10 Comments

  1. Eric,

    Great stuff. Can I suggest a different format for the slow query log? Parsing slow query logs is already far too complex (see the test suite for Maatkit…). If we keep the format parseable by Maatkit’s existing parser it’ll be great, although I appreciate that you’re putting a lot of stuff into the log here and it’s a little complex.

    The slow query log format is key:value entries separated by spaces. This is both the easiest and the most efficient to parse. So I would suggest something like this:

    # Time: 100119 20:27:16
    # User@Host: [ebergen] @ localhost []
    # Query_time: 6 Lock_time: 0 Rows_sent: 6 Rows_examined: 14
    # Row_stats/sbtest.foo: 18,0,0 Row_stats/sbtest.bar: 15,3,3
    # Index_stats/sbtest.bar.u: 6
    select * from foo a, bar b where b.u=4 order by sleep(1);

    The :value part of the key:value pair is not going to be an atomic value and will need to be decomposed for analysis, but I think that’s OK. At least the general line format remains the same here.
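
    To illustrate the decomposition, a rough Python sketch (the counter order is assumed from the second format in the post; the function name is mine):

    import re

    # Split the comment line into key: value pairs, then decompose the
    # Row_stats/<db.table> values. The counter order (rows_read,
    # rows_changed, rows_changed_x_index) is assumed from Eric's format.
    def parse_row_stats_line(line):
        stats = {}
        for key, value in re.findall(r"(\S+): (\S+)", line):
            if key.startswith("Row_stats/"):
                table = key.split("/", 1)[1]
                stats[table] = tuple(int(n) for n in value.split(","))
        return stats

    line = "# Row_stats/sbtest.foo: 18,0,0 Row_stats/sbtest.bar: 15,3,3"
    print(parse_row_stats_line(line))
    # {'sbtest.foo': (18, 0, 0), 'sbtest.bar': (15, 3, 3)}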

    Alternatively, I could suggest this:

    # Time: 100119 20:27:16
    # User@Host: [ebergen] @ localhost []
    # Query_time: 6 Lock_time: 0 Rows_sent: 6 Rows_examined: 14
    # Row_stats: sbtest.foo=18,0,0;sbtest.bar=15,3,3
    # Index_stats: sbtest.bar.u=6
    select * from foo a, bar b where b.u=4 order by sleep(1);

    If you want, before you bake the format fully we can write test cases for the Maatkit parser. I would be very interested in adding analytical support for this to mk-query-digest. The code that does the aggregation is pretty complex (for speed) and it would be nice if it’s as easy as possible to support.

  2. Actually, now that I think about it, the second format I showed (where the name: part of the name:value pair does NOT vary) is going to be far more efficient for Maatkit. It self-generates parsing code for speed once it has seen enough samples, and that code would fail to recognize arbitrarily varying keys like Row_stats/variable.variable. So consider this a vote for Row_stats: as a prefix, and something parseable as a value :-)
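
    For what it’s worth, a rough Python sketch of decomposing that fixed-key value (counter order assumed as before; the function name is mine):

    # Decompose the value of a fixed-key "Row_stats:" attribute from the
    # second suggestion: db.table=R,C,X entries separated by semicolons.
    def parse_row_stats(value):
        tables = {}
        for entry in value.split(";"):
            table, counters = entry.split("=")
            tables[table] = tuple(int(n) for n in counters.split(","))
        return tables

    print(parse_row_stats("sbtest.foo=18,0,0;sbtest.bar=15,3,3"))
    # {'sbtest.foo': (18, 0, 0), 'sbtest.bar': (15, 3, 3)}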

  3. Eric Bergen says:

    Thanks, Baron. I’ll update the patch to use the second format and repost it.

  4. Ben Smith says:

    Reducing the log output to “Rows Stats: sbtest.foo 18 0 0” or “# Row_stats: sbtest.foo=18,0,0” makes it hard for someone who doesn’t know what those stats are to pick them up and parse them, and harder for someone reading the logs by hand to understand them. The cost of storing some extra text is minimal, especially given how cheap storage is these days. I would suggest being more verbose; any extra text can be stripped out by whatever script/tool you use to parse it. Being more verbose also lets a tool that isn’t familiar with the logs (e.g. Splunk) understand them without any configuration. To that end, doing something like this instead would be more effective:

    # Time: 100119 20:27:16
    # User@Host: [ebergen] @ localhost []
    # Query_time: 6 Lock_time: 0 Rows_sent: 6 Rows_examined: 14
    # Rows Stats: database=sbtest table=foo rows_read=18 rows_changed=0 rows_changed_x_index=0
    # Rows Stats: database=sbtest table=bar rows_read=15 rows_changed=3 rows_changed_x_index=3
    # Index Stats: database=sbtest table=bar index=u index_rows_read=6
    select * from foo a, bar b where b.u=4 order by sleep(1);
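
    One nice property of this layout is that a completely generic key=value tokenizer covers it; a quick Python sketch (the function name is mine):

    # Every attribute on the line is a key=value token, so one generic
    # split covers all of the stats lines with no per-format code.
    def parse_verbose(line):
        _, _, rest = line.partition(":")
        return dict(token.split("=", 1) for token in rest.split())

    line = ("# Rows Stats: database=sbtest table=foo rows_read=18 "
            "rows_changed=0 rows_changed_x_index=0")
    print(parse_verbose(line))
    # {'database': 'sbtest', 'table': 'foo', 'rows_read': '18', ...}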

  5. Eric Bergen says:

    The database.table notation is common enough that I think people will recognize it without separate database= and table= identifiers. That said, I realize a format like database=foo table=bar is easier to parse with tools like Splunk. I’m leaning towards something like:
    Row Stats: sbtest.foo rows_read=18 rows_changed=0 rows_changed_x_index=0

  6. Splunk isn’t really a tool I’d aim at the slow query log; I’d point it at the error log. The slow query log format is pretty regular except for the insane exceptions it already has, and I’d suggest trying to stick to its basic format if at all possible. Again, it’s already far harder to parse than it seems, and building a FAST parser is really hard.

    As a more verbose alternative, I could suggest this:

    # Time: 100119 20:27:16
    # User@Host: [ebergen] @ localhost []
    # Query_time: 6 Lock_time: 0 Rows_sent: 6 Rows_examined: 14
    # Row_stats: sbtest.foo:rows_read=18,rows_changed=0…;sbtest.bar:rows_read=…
    # Index_stats: sbtest.bar.u:rows_read=6
    select * from foo a, bar b where b.u=4 order by sleep(1);

  7. Rick James says:

    Well, why not make it JSON, with some extra whitespace? Lump all the key:value stuff in the JSON, perhaps as a single line.

    Further, skip any values that are “0” or “no”, since they just waste space. This would also help distinguish UPDATEs from SELECTs: different Rows_* counters apply to the two query types.
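
    As a rough sketch of what such a line could look like (Python; the key names and Row_stats prefix are made up for illustration, with zero counters dropped as suggested):

    import json

    # Hypothetical JSON stats line; zero-valued counters are dropped so
    # SELECTs and UPDATEs only carry the Rows_* fields that apply.
    stats = {
        "sbtest.foo": {"rows_read": 18, "rows_changed": 0,
                       "rows_changed_x_index": 0},
        "sbtest.bar": {"rows_read": 15, "rows_changed": 3,
                       "rows_changed_x_index": 3},
    }
    compact = {t: {k: v for k, v in c.items() if v}
               for t, c in stats.items()}
    print("# Row_stats: " + json.dumps(compact))
    # # Row_stats: {"sbtest.foo": {"rows_read": 18}, "sbtest.bar": ...}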

  8. Rick James says:

    How can Rows_read be smaller than Rows_examined?

    # Time: 110725 6:35:04
    # User@Host: …[...] @ ….yahoo.com [...]
    # Thread_id: 34392211 Schema: …
    # Query_time: 2.002966 Lock_time: 0.000137 Rows_sent: 100 Rows_examined: 200 Rows_affected: 0 Rows_read: 1
    # Bytes_sent: 161592 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0
    SET timestamp=1311600904;
    SELECT * FROM … w INNER JOIN … c ON w.docid = c.docid WHERE w.docid IN ( … );

    (There were 100 items in the IN clause.)

  9. Eric Bergen says:

    JSON isn’t a bad idea. I left the zero values in to make parsing easier, but you’re right that it’s harder on humans. Which version of MySQL are you using that has Rows_read? I know Rows_examined gets reset in a bunch of different places, including on every subquery execution. See https://bugs.launchpad.net/maria/+bug/807198

  10. Rick James says:

    Rows_read (and a bunch of other things) is probably from the Percona XtraDB extensions to 5.1.
