Statement Based Logging and Row Based Logging
It is widely known that MySQL now supports two kinds of replication: statement based replication (SBR) and row based replication (RBR). And the corresponding logging methods used in replication are called statement based logging (SBL) and row based logging (RBL). In this post, I am going to find out a few more details inside of different logging methods.
Before that, make sure you have read the basics in MySQL official document. Please keep in mind that binlog is used for recovering data and replication is nothing but an online recovery process. That will make it easier to understand.
1. Binary Logging Formats
2. Advantages and Disadvantages of Statement-Based and Row-Based Replication
OK, let’s begin our journey to binlog’s design. The first question, what will you do if you want to save all the data changes in log?
The easiest way may be to record every commands related to data changes into log. That is SBL and here is a log example for insertion:
# at 463 #120618 6:01:57 server id 1 end_log_pos 577 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1339999317/*!*/; insert into test(idx,data) values (2,3),(4,5),(6,7)
SBL seems to be good enough until we find some records are not completely same after recovery. Why is datetime filed different from it was before? We are talking about nondeterministic functions here. Try to imagine the followings:
1. Insert one record with one datetime filed by execute “insert into test values (sysdate())”
2. Data in MySQL will be “2012-06-16 08:31:47″ and the command in binlog (SBL format) will be “insert into test values (sysdate())”
3. One month later, recover data from binlog. Then we will find that record’s value change to “2012-07-16 08:31:47″
The problem here is that nondeterministic functions cannot return same value all the time. The only way to make SBL reliable is to record the nondeterministic commands accompanied by all the decisive factors. For example, sysdate() will be followed by a timestamp, and rand() will be followed by a seed. Umm, it seems to be fine. But soon or later, inconsistency appears again! This time, try to take trigger and UDF into account.
Because trigger and UDF are extremely agile, MySQL has no way to collect all the decisive factors in such a situation. It is even easier to record the changed data itself than to record the command and decisive factors. Here comes RBL and the log example for insertion:
# at 480 #120618 6:03:02 server id 1 end_log_pos 480 Table_map: `test`.`test` mapped to number 33 #120618 6:03:02 server id 1 end_log_pos 548 Write_rows: table id 33 flags: STMT_END_F ### INSERT INTO test.test ### SET ### @1=1 ### @2=2 ### @3=3 ### INSERT INTO test.test ### SET ### @1=2 ### @2=4 ### @3=5 ### INSERT INTO test.test ### SET ### @1=3 ### @2=6 ### @3=7
(Some people are wondering the difference between binlog and redo log. Actually, they have nothing in common. Binlog, SBL and RBL both, is a logic log applying to mostly all of the MySQL storage engines. On the contrary, redo is a physical log and only works for InnoDB.)
RBL is much safer than SBL to use. But the problem behind safety is the significantly increased binlog size.
However RBL is not too laborious to record any changes row by row. For example, RBL will record administrative commands directly into binlog as same as SBL does.
# at 617 #120618 6:03:02 server id 1 end_log_pos 718 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1339999382/*!*/; alter table test add index idx_2(data) /*!*/; # at 718 #120618 6:03:02 server id 1 end_log_pos 800 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1339999382/*!*/; truncate table test /*!*/; # at 800 #120618 6:03:02 server id 1 end_log_pos 881 Query thread_id=1 exec_time=0 error_code=0 SET TIMESTAMP=1339999382/*!*/; drop database test /*!*/;
Mixed Logging is very similar to RBL. I am going to skip it. Maybe you are not very satisfied with my post and try to dig much deeper. There is an interesting post on source forge, I think you will enjoy the details:
MySQL Internals Binary Log