Sunday, April 22, 2012

Replication Is Bad for MySQL Temp Tables

Experienced MySQL DBAs know that temp tables cause major problems for MySQL replication.  It turns out the converse is also true:  replication can cause major problems for temporary tables.

In a recent customer engagement we enabled Tungsten Replicator with a MySQL application that originally ran on a server that did not use replication.  QA promptly discovered reports that previously ran in 10 seconds were now running in as many minutes.  It turned out that the reports used temp tables to assemble data, and these were being written into the master binlog.  This created bloated binlogs and extremely slow reports.  We fixed the problem by enabling row replication (i.e., binlog-format=row in my.cnf).

A common DBA response to temp table problems is to try to eliminate them completely, as suggested in the excellent High Performance MySQL, 3rd Edition. (See p. 502.)  Elimination is a good philosophy when applications use temp tables to generate updates.  However, it does not work for reporting.  Temp tables allow you to stage data for complex reports across a series of transactions, then pull the final results into a report writer like JasperReports.  This modular approach is easy to implement and maintain afterwards.  Eliminating temp tables in such cases can create an unmaintainable mess.

The real solution with report temp tables is to keep them out of the master binlog.  Here is a list of common ways to do so.  Let me know if you know others.

* Turn off binlog updates.  Issue 'SET SESSION SQL_LOG_BIN=0' when generating reports.  The downside is that it requires SUPER privilege to set.  Also, if you make a code mistake and update normal tables with this setting, your changes will not be replicated.

* Use a non-replicated database.  Configure the master my.cnf with binlog-ignore-db as follows to ignore any update (including on temp tables) that is issued when database 'scratch' is the default database:

binlog_ignore_db = scratch

This approach does not require special privileges.  However coding errors or connection pool misconfigurations are obvious liabilities.  Your application must either connect to the scratch database or issue an explicit use command. Otherwise, temp table operations will be logged, as in the following example:

use not_scratch;
create temporary table scratch.report1_temp(name varchar(256), entry_time date, exit_time date);

* Use a slave with the binlog disabled.  Remove the log-bin option from my.cnf.  This works well if you have extra reporting slaves that are caught up.  However, it may not work if the reports must be fully up-to-date or you need the ability to promote the slave quickly to a master, in which case the binlog must be enabled.  

* Use row replication.  You can set row replication at the session level using 'SET SESSION binlog_format=row', which requires SUPER privilege, or overall by setting binlog-format in my.cnf.  In this case CREATE TEMPORARY TABLE and updates on temp tables do not appear in the binlog at all.  The downside of enabling row replication fully is that it can lead to bloated logs and blocked servers if you have very large transactions.  SQL operations like DELETE that affect multiple rows are stored far more compactly in statement replication.  Also, reloading mysqldump files can be very slow in row replication compared to statement replication, which can handle block inserts generated by the --extended-insert option.

The proper solution to keep replication from hurting your use of temp tables will vary depending on your application as well as the way you run your site.  For my money, though, this is a good example of where row replication really helps and deserves a closer look.  

MySQL could use some feature improvements in the area of temp tables and replication.  I find it surprising that mixed mode replication does not fully suppress temp table binlog updates.  Only row replication does so.   Second, it would be great to have a CREATE TABLE option to suppress logging particular tables to the binlog.  This would allow applications to make the logging decision at schema design time.  Finally, global options to suppress binlogging of specific table types, such as temp tables and MEMORY tables would be useful.  Perhaps we will see some of these in future MySQL releases.  

15 comments:

Anonymous said...

Are you perfectly sure that "'SET SESSION SQL_LOG_BIN=0'" requires SUPER privilege?

I have not checked this particular case but SESSION variables can normally be SET by any user.

-- Peter Laursen

XL said...

Honestly I don't see the problem with temp tables and the master. In SBR only few statements go to the binlog for creation/population of the temp table. And in RBR nothing is binlogged at all.

What you certainly don't want, is the slave to process such scratch tables. For that you can use a replicate-[wild]-ignore-table clause together with a distinctive name for such tables (i.e. table name starting with "scratch_")

But the first question would be: why do you generate reports on the master anyway? Why not using a slave?

Ben said...

The SUPER privilege enables an account to enable or disable logging. It makes sense as otherwise an administrator might not be able to guarantee the consistency of a slave.

Check the MySQL docs for confirmation..

http://dev.mysql.com/doc/refman/5.5/en/privileges-provided.html#priv_super

or to see it mentioned even more explicitly see here:

http://dev.mysql.com/doc/refman/5.5/en/binary-log.html

"A client that has the SUPER privilege can disable binary logging of its own statements by using a SET sql_log_bin=0 statement."

barryhunter said...

re your "non-replicated database" option. Use

replicate-wild-ignore-table=scratch.%

That will stop the tmp tables replicating, even without the use of the "USE" keyword. Although we leave in the binlog_ignore_db as well - just incase.

(this is how we solved the issue with replication and temp tables)

Robert Hodges said...

@Anonymous, as pointed out by Ben, only SUPER can manipulate the binlog:

mysql> SET SESSION SQL_LOG_BIN=0;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER privilege(s) for this operation

Robert Hodges said...

@barryhunter, the option you suggest, replicate-wild-ignore-table, is a slave-side option. It definitely helps with the problem of keeping certain tables from ever applying to the slave. However, it does not keep them out of the binlog in the first place, which is the problem we are trying to avoid.

Robert Hodges said...

@XL, Many if not most MySQL sites run statement or mixed replication. Imagine you run a lot of reports and they each generate 10K rows in temp tables. That could easily result in 10Mb of binlog written *per report* if each row has 1000 bytes of data. This slows down reports, can block the binlog, which is a serialized resource especially if sync-binlog=1, and is a waste of storage.

Using a slave is in fact option #3. However, it is not a solution in all cases, as noted above.

barryhunter said...

Doh! Didnt notice that!

I guess using statement replication, never really noticed the bloat in the binlog. (as only the statement is logged anyway)

So logging the statement in your case was a significant cause of slowdown. Interesting.

Robert Hodges said...

@barryhunter, , The problem becomes quite noticeable if your storage is slow for any reason, for example excessive fragmentation or storing binlogs and data files on the same underlying disk.

barryhunter said...

Wouldn't row based replication, generally result in bigger binlogs anyway? So if disks slow, will be slower across the board.

(yes definitly seperate physical disks for data and logs :)

Robert Hodges said...

@barryhunter, Depends on your transaction workload. For most workloads it is somewhat bigger due to the fact that RBR stores before and after images of updated rows and extra columns beside keys for deletes. The real problems come when you do something like 'UPDATE emp SET salary = salary * 1.1' when table emp has 50M rows. We have seen cases where people end up putting multiple GB of data into the binlog for single transactions without being aware of it, then wondering why the slaves are suddenly lagging.

The real lesson is to look at what's going into the binlog.

shlominoach said...

@Peter:

root@mysql-5.1.51> grant select, insert, update on world.* to 'world_user';

quit


bash$ ./use -u world_user

world_user@mysql-5.1.51> set sql_log_bin=0;
ERROR 1227 (42000): Access denied; you need the SUPER privilege for this operation

Aftab Khan said...

sql_log_bin is a session variable, so it can be used to disable/enable binary logging for the current session only (and it requires SUPER grant)

>But the first question would be: why do you generate reports on the master anyway?

e.g. Aggregate queries that store data into summary tables.

> Filtered MySQL Replication
Make sure you aren't using combination of database and table options. This can sometimes lead to results that seem counterintuitive. So, avoid mixing replicate-ignore-db=global and replicate-wild-ignore-table=global.% as unexpected results may occur

http://dev.mysql.com/doc/refman/5.0/en/replication-rules.html

Similary, if the sql that create temporary table is issued from stored procedure then replicate-[wild]-ignore-table won't work:

"This option applies to tables, views, and triggers. It does not apply to stored procedures and functions."
http://dev.mysql.com/doc/refman/5.0/en/replication-options-slave.html#option_mysqld_replicate-wild-do-table


>Honestly I don't see the problem with temp tables and the master.

There is a problem with temp tables and replication slave (when using SBR). On replication slave, temporary tables that are open for use in updates that have not yet been executed on the slave. If you stop the slave server, the temporary tables needed by those updates are no longer available when the slave is restarted.. so replication would break!

Anonymous said...

Where we need to set binlog-format=row on slave or master or both?

Robert Hodges said...

@Anonymous, you must set binlog=row on the master. If you plan to failover to the slave and allow updates there you must set it there as well.

Scaling Databases Using Commodity Hardware and Shared-Nothing Design