1. Queue ourselves for flushing.
2. Grab the log lock, which might result in blocking if the mutex is
already held by another thread.
3. If we were not committed while waiting for the lock
1. Fetch the queue
2. For each thread in the queue:
a. Attach to it
b. Flush the caches, saving any error code
3. Flush and sync (depending on the value of sync_binlog).
4. Signal that the binary log was updated
4. Release the log lock
5. Grab the commit lock
1. For each thread in the queue:
a. If there was no error when flushing and the transaction shall be committed:
- Commit the transaction, saving the result of executing the commit.
6. Release the commit lock
7. Call purge, if any of the committed threads requested a purge.
8. Return with the saved error code
@todo The use of @c skip_commit is a hack that we use since the @c
TC_LOG Interface does not contain functions to handle
savepoints. Once the binary log is eliminated as a handlerton and
the @c TC_LOG interface is extended with savepoint handling, this
parameter can be removed.
@param thd Session to commit transaction for
@param all This is @c true if this is a real transaction commit, and
@c false otherwise.
@param skip_commit
This is @c true if the call to @c ha_commit_low should
be skipped (it is handled by the caller somehow) and @c
false otherwise (the normal case).
*/
int MYSQL_BIN_LOG::ordered_commit(THD *thd, bool all, bool skip_commit)
{
Handle the flush queue. The redo log is flushed before the binlog is flushed, and each thread in the queue is then processed in a loop. What is flushed is each thread's cache — shouldn't this be the binlog cache? (It is: each session's binlog cache lives in its THD, so flushing the per-thread caches means writing each session's binlog cache into the binlog file.)
flush_error= process_flush_stage_queue(&total_bytes, &do_rotate,
&wait_queue);
/*
Fetch the entire flush queue and empty it, so that the next batch
has a leader. We must do this before invoking ha_flush_logs(...)
for guaranteeing to flush prepared records of transactions before
flushing them to binary log, which is required by crash recovery.
*/
THD *first_seen= stage_manager.fetch_queue_for(Stage_manager::FLUSH_STAGE);
assert(first_seen != NULL);
/*
We flush prepared records of transactions to the log of storage
engine (for example, InnoDB redo log) in a group right before
flushing them to binary log.
*/
ha_flush_logs(NULL, true);
DBUG_EXECUTE_IF("crash_after_flush_engine_log", DBUG_SUICIDE(););
assign_automatic_gtids_to_flush_group(first_seen);
/* Flush thread caches to binary log. */
for (THD *head= first_seen ; head ; head = head->next_to_commit)
{
std::pair<int,my_off_t> result= flush_thread_caches(head);
total_bytes+= result.second;
if (flush_error == 1)
flush_error= result.first;
#ifndef NDEBUG
no_flushes++;
#endif
}
Depending on the sync_binlog setting, decide whether the binlog end position is updated after the sync stage.
update_binlog_end_pos_after_sync= (get_sync_period() == 1);
if (!update_binlog_end_pos_after_sync)
update_binlog_end_pos();
If it is not updated after sync, it is updated right after flush instead.
/*
Stage #2: Syncing binary log file to disk
*/
/*
Shall introduce a delay only if it is going to do sync
in this ongoing SYNC stage. The "+1" used below in the
if condition is to count the ongoing sync stage.
When sync_binlog=0 (where we never do sync in BGC group),
it is considered as a special case and delay will be executed
for every group just like how it is done when sync_binlog= 1.
*/
if (!flush_error && (sync_counter + 1 >= get_sync_period()))
stage_manager.wait_count_or_timeout(opt_binlog_group_commit_sync_no_delay_count,
opt_binlog_group_commit_sync_delay,
Stage_manager::SYNC_STAGE);
final_queue= stage_manager.fetch_queue_for(Stage_manager::SYNC_STAGE);
if (flush_error == 0 && total_bytes > 0)
{
DEBUG_SYNC(thd, "before_sync_binlog_file");
std::pair<bool, bool> result= sync_binlog_file(false);
sync_error= result.first;
}
if (update_binlog_end_pos_after_sync)
{
THD *tmp_thd= final_queue;
const char *binlog_file= NULL;
my_off_t pos= 0;
while (tmp_thd->next_to_commit != NULL)
tmp_thd= tmp_thd->next_to_commit;
if (flush_error == 0 && sync_error == 0)
{
tmp_thd->get_trans_fixed_pos(&binlog_file, &pos);
update_binlog_end_pos(binlog_file, pos);
}
}
Commit stage: check whether ordered commit (binlog_order_commits) is enabled.