Backup-safe binlog information

Registered by Alexey Kopytov

One inefficiency of the backup locks feature is that, even though LOCK
TABLES FOR BACKUP, as a light-weight alternative to FLUSH TABLES WITH
READ LOCK (FTWRL), does not affect DML statements updating InnoDB
tables, LOCK BINLOG FOR BACKUP does affect them by blocking commits.
This blueprint defines the server-side changes needed to let XtraBackup
avoid LOCK BINLOG FOR BACKUP under some circumstances.

Blueprint information

Status: Complete
Approver: Alexey Kopytov
Priority: Undefined
Drafter: Alexey Kopytov
Direction: Needs approval
Assignee: Alexey Kopytov
Definition: Approved
Series goal: Accepted for 5.6
Implementation: Implemented
Milestone target: 5.6.26-74.0
Started by: Alexey Kopytov
Completed by: Alexey Kopytov

Whiteboard

https://github.com/percona/percona-server/pull/122

XtraBackup uses LOCK BINLOG FOR BACKUP to:

1. retrieve consistent binary log coordinates with SHOW MASTER
   STATUS. More precisely, binary log coordinates must be consistent
   with the REDO log copy and non-transactional tables. Therefore, no
   updates can be done to non-transactional tables (this is achieved by
   an active LOCK TABLES FOR BACKUP lock), and no commits can be
   performed between SHOW MASTER STATUS and finalizing the redo log
   copy, which is achieved by LOCK BINLOG FOR BACKUP.

2. retrieve consistent master connection information for a replication
   slave. More precisely, the binary log coordinates on the master as
   reported by SHOW SLAVE STATUS must be consistent with the REDO log
   copy, so LOCK BINLOG FOR BACKUP also blocks the I/O replication thread.

3. For a GTID-enabled PXC node, the last binary log file must be
   included in an SST snapshot. This is a rather artificial limitation
   on the WSREP side, but XtraBackup still obeys it by blocking commits
   with LOCK BINLOG FOR BACKUP to ensure the integrity of the binary
   log file copy.
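
The ordering constraints described in the list above can be sketched as
a small Python helper that builds the statement sequence for the locked
stage of a backup. This is only an illustration: the function name and
its parameter are hypothetical, the exact order XtraBackup uses may
differ, and lines starting with "--" stand for non-SQL work done by the
backup tool.

```python
def locked_backup_steps(slave_info=False):
    """Return an ordered list of steps for the locked backup stage.

    Hypothetical helper for illustration; not XtraBackup code.
    """
    steps = [
        # Blocks updates to non-transactional tables.
        "LOCK TABLES FOR BACKUP",
        # Blocks commits (and the replication I/O thread on a slave).
        "LOCK BINLOG FOR BACKUP",
        # Coordinates read here stay valid until the lock is released.
        "SHOW MASTER STATUS",
    ]
    if slave_info:
        # Case #2: master connection information for a replication slave.
        steps.append("SHOW SLAVE STATUS")
    steps += [
        "-- finalize the REDO log copy",
        "UNLOCK BINLOG",
        "UNLOCK TABLES",
    ]
    return steps


if __name__ == "__main__":
    for step in locked_backup_steps(slave_info=True):
        print(step)
```

Commits stay blocked for every step between LOCK BINLOG FOR BACKUP and
UNLOCK BINLOG, which is why the duration of the REDO log finalization
matters.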

Depending on the write rate on the server, finalizing the REDO log copy
may take a long time, so blocking commits for that duration may still
affect server availability considerably.

This task is to make the server-side change that allows XtraBackup to
avoid LOCK BINLOG FOR BACKUP in case #1 when cases #2 and #3 do not
apply, i.e. when --slave-info is not requested in the XtraBackup options
and the server is not a GTID-enabled PXC node.
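
A minimal sketch of that decision, written as a Python predicate. The
function and parameter names are hypothetical; the actual check lives in
XtraBackup and may be structured differently.

```python
def needs_binlog_lock(server_supports_safe_binlog_info,
                      slave_info_requested,
                      gtid_enabled_pxc_node):
    """Decide whether LOCK BINLOG FOR BACKUP is still required.

    Hypothetical helper for illustration; not XtraBackup code.
    """
    if not server_supports_safe_binlog_info:
        # Without the server-side change, case #1 still needs the lock.
        return True
    # Cases #2 and #3 are out of scope for this task and keep the lock.
    return slave_info_requested or gtid_enabled_pxc_node


if __name__ == "__main__":
    # Plain backup against a server with the new functionality.
    print(needs_binlog_lock(True, False, False))
    # Same server, but --slave-info was requested.
    print(needs_binlog_lock(True, True, False))
```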

Lifting limitations for cases #2 and #3 is also possible, but that is
outside the scope of this task.

The optimization builds on the fact that InnoDB provides transactional
storage for the binary log information (i.e. the current file name and
offset). However, XtraBackup cannot fully trust that information as is,
because it is only updated on an InnoDB commit operation. This means
that if the last operation before LOCK TABLES FOR BACKUP was an update
to a non-transactional storage engine, and no InnoDB commits occur
before the backup is finalized by XtraBackup, the InnoDB system header
will contain stale binary log coordinates.

One way to fix that would be to force a binlog coordinates update in the
InnoDB system header on each update, regardless of the storage engine(s)
involved. This is what a Galera node does to keep the XID consistent.
The XID is stored in the same way as binary log coordinates, and the
node forces an XID update in the InnoDB system header on each TOI
operation, in particular on each non-transactional update.

Another approach is less invasive. XtraBackup blocks all
non-transactional updates with LOCK TABLES FOR BACKUP (LTFB) anyway, so
instead of having every non-transactional update flush binlog
coordinates to InnoDB unconditionally, LTFB can be modified to flush
(and redo-log) the current binlog coordinates to InnoDB. In that case,
the binlog coordinates provided by InnoDB will be consistent with the
REDO log under any circumstances.

The patch for this blueprint implements the latter approach.
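
The stale-coordinates problem and the effect of the flush can be
illustrated with a toy Python model. Everything here is an assumption
made for illustration: the class, its attributes, and the event sizes
are invented, and the model ignores binary log file names and redo
logging entirely.

```python
class ToyServer:
    """Toy model of binlog coordinates vs. the InnoDB system header."""

    def __init__(self):
        self.binlog_pos = 4          # current binary log offset (toy value)
        self.innodb_header_pos = 4   # offset stored in the InnoDB header

    def innodb_commit(self, event_size):
        # An InnoDB commit advances the binlog and updates the header.
        self.binlog_pos += event_size
        self.innodb_header_pos = self.binlog_pos

    def nontransactional_update(self, event_size):
        # A non-transactional update advances the binlog only.
        self.binlog_pos += event_size

    def lock_tables_for_backup(self, flush_binlog_info):
        # With the change modeled here, taking the lock also stores the
        # current coordinates in the InnoDB header.
        if flush_binlog_info:
            self.innodb_header_pos = self.binlog_pos


def header_is_current(flush_binlog_info):
    server = ToyServer()
    server.innodb_commit(100)
    server.nontransactional_update(50)  # last write before the lock
    server.lock_tables_for_backup(flush_binlog_info)
    return server.innodb_header_pos == server.binlog_pos


if __name__ == "__main__":
    print(header_is_current(flush_binlog_info=False))
    print(header_is_current(flush_binlog_info=True))
```

Without the flush, the header still points at the position of the last
InnoDB commit. With the flush, it matches the current binlog position
for as long as the lock keeps non-transactional updates blocked.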

This feature does not introduce any changes in the SQL syntax.

New server and status variables
-------------------------------

have_backup_safe_binlog_info

  This is a server variable implemented to help other utilities decide
  if LOCK BINLOG FOR BACKUP can be avoided in some cases. When the
  necessary server-side functionality is available, this server system
  variable exists and its value is always YES.
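
A utility could probe for the variable roughly as follows. This is a
sketch under assumptions: the helper name is hypothetical, and the rows
are expected to be (Variable_name, Value) pairs as returned by a SHOW
VARIABLES query through whatever client library the utility uses.

```python
# Query a utility could send to the server to probe for the feature.
PROBE_SQL = "SHOW VARIABLES LIKE 'have_backup_safe_binlog_info'"


def server_has_backup_safe_binlog_info(rows):
    """Interpret the result rows of PROBE_SQL.

    rows: iterable of (variable_name, value) pairs. An empty result
    means the variable does not exist on this server.
    Hypothetical helper for illustration.
    """
    for name, value in rows:
        if name.lower() == "have_backup_safe_binlog_info":
            return value.upper() == "YES"
    return False


if __name__ == "__main__":
    # Server with the functionality.
    print(server_has_backup_safe_binlog_info(
        [("have_backup_safe_binlog_info", "YES")]))
    # Older server: the query returns no rows.
    print(server_has_backup_safe_binlog_info([]))
```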

The XtraBackup-side functionality is tracked in
https://blueprints.launchpad.net/percona-xtrabackup/+spec/lockless-binlog-info
