Log Archiving for xtrabackup: restoring

Registered by Nickolay Ihalainen

Implement the ability to apply archived logs.

The following options should be implemented; both are valid only together with the --prepare option:
  --archived-logs-dir - the path to the archived logs directory
  --to-archived-lsn - the LSN up to which logs should be applied

An important requirement is the ability to apply log files incrementally.

Blueprint information

Alexey Kopytov
Vlad Lesin
Series goal: Accepted for 2.1
Milestone target: 2.1.5
Started by: Alexey Kopytov
Completed by: Alexey Kopytov


Applying archived logs takes two inputs to xtrabackup: the path to the base backup and the path to the directory with the archived logs. The output is the base backup directory with the data files. The following new command line options are involved:
  --archived-logs-dir - the path where the log archives are stored
  --archived-max-lsn - the maximum LSN up to which logs must be applied
These options can be used only with --prepare. For example:
  xtrabackup --prepare --target-dir=/data/backups/mysql/ --archived-logs-dir=/data/backups/archived-logs/

Logs are applied in the call chain innodb_init()-->innobase_start_or_create_for_mysql()-->recv_recovery_from_checkpoint_start()-->recv_group_scan_log_recs(). recv_recovery_from_checkpoint_start() applies logs from all groups in the log_sys->log_groups linked list. The main idea is to form a log group for the archived logs and to append it after the main log group.
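The idea of appending an archived log group after the main one can be illustrated with a toy linked list. This is only a sketch: struct log_group, struct log_list, log_list_append() and scan_all_groups() are hypothetical stand-ins, not the InnoDB types or functions, and the real log_sys->log_groups list is managed with InnoDB's own list macros.

```c
#include <stddef.h>

/* Toy stand-in for an InnoDB log group; only what this sketch needs. */
struct log_group {
    const char       *name;
    struct log_group *next;
};

/* Toy stand-in for the log_sys->log_groups list. */
struct log_list {
    struct log_group *first;
    struct log_group *last;
};

/* Append a group at the tail, so it is scanned after every existing group. */
static void log_list_append(struct log_list *list, struct log_group *group)
{
    group->next = NULL;
    if (list->last != NULL) {
        list->last->next = group;
    } else {
        list->first = group;
    }
    list->last = group;
}

/* Visit the groups in list order, as a recovery loop over all groups would.
   Stores up to `cap` group names in `order` and returns the group count. */
static size_t scan_all_groups(const struct log_list *list,
                              const char **order, size_t cap)
{
    size_t n = 0;
    for (const struct log_group *g = list->first; g != NULL; g = g->next) {
        if (n < cap) {
            order[n] = g->name;
        }
        n++;
    }
    return n;
}
```

Appending the main group first and the archived group second means a loop over the list scans the archived logs only after the main log files.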

The call chain for log group initialization is innobase_start_or_create_for_mysql()-->open_or_create_log_file()-->log_group_init(). open_or_create_log_file() tries to open or create each log file. If the operation succeeds, it invokes fil_space_create() with the file name to create the tablespace (only for the first file in a group), and fil_node_create() to add the file to the tablespace. The tablespace id is then passed to log_group_init() to initialize the log group.

The main difficulty is that log_sys is initialized inside innobase_start_or_create_for_mysql(), so we cannot modify it before InnoDB initialization and log recovery run. A possible solution is to patch the server code so that it invokes callbacks instead of recv_recovery_from_checkpoint_start() and recv_recovery_from_checkpoint_finish() when the corresponding callback pointers are not null. The custom recovery function could then create a new log group and invoke the original recovery function. If this trick does not work for some reason, recv_recovery_from_archive_start() is a good example of a custom recovery function.
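The callback idea can be sketched as follows. All names here are hypothetical: recovery_start_hook, default_recovery_start(), archived_recovery_start() and server_startup_recovery() only stand in for the patched call site and for recv_recovery_from_checkpoint_start(), whose real signature takes arguments that are omitted in this sketch.

```c
#include <stddef.h>

/* Hypothetical hook type; the real recovery functions take arguments. */
typedef int (*recovery_start_fn)(void);

/* NULL by default, so an unpatched caller keeps the stock behaviour. */
static recovery_start_fn recovery_start_hook = NULL;

/* Counters that make the call order observable in this sketch. */
static int default_calls = 0;
static int archived_groups_added = 0;

/* Stand-in for recv_recovery_from_checkpoint_start(). */
static int default_recovery_start(void)
{
    default_calls++;
    return 0;
}

/* Custom recovery: set up the archived log group, then run stock recovery. */
static int archived_recovery_start(void)
{
    archived_groups_added++;  /* stand-in for creating the extra log group */
    return default_recovery_start();
}

/* Stand-in for the patched call site inside the server startup code. */
static int server_startup_recovery(void)
{
    if (recovery_start_hook != NULL) {
        return recovery_start_hook();
    }
    return default_recovery_start();
}
```

With the hook left at NULL the server behaves as before. xtrabackup would set recovery_start_hook to its custom function before starting InnoDB, so the extra log group is created at a point where log_sys already exists.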

An open question is how the archived logs and data files must be modified to make recovery work.

Vlad Lesin UPD:
Several changes were made during feature implementation and code exploration.

1) Most importantly, we should separate backup preparation from archived log application. These are logically different operations. Backup preparation restores data consistency. Applying archived logs is a kind of incremental backup: it applies a delta to existing (and consistent) data. Archived logs can be applied to the backed-up data several times, for example to reduce the backup size or the time needed for each application.

2) We don't need to work with log groups at all. It is enough to read the log data into a buffer and invoke recv_scan_log_recs() on that buffer. We should also check that the minimum checkpoint number in the archived logs is less than or equal to the maximum checkpoint number in the current log file minus 1. The latter is decremented because the InnoDB engine makes a checkpoint before shutdown, and a shutdown happens during backup preparation.
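The two points above can be illustrated with a toy model. It is a sketch under stated assumptions, not the xtrabackup implementation: struct log_rec, struct backup, archived_logs_applicable() and apply_archived() are hypothetical, a single counter stands in for the data files, and a plain loop stands in for passing a buffer to recv_scan_log_recs().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t lsn_t;

/* Toy log record: adds `delta` to a counter that stands in for the data. */
struct log_rec {
    lsn_t lsn;
    long  delta;
};

/* Toy backup: the data value and the LSN up to which logs are applied. */
struct backup {
    long  value;
    lsn_t applied_lsn;
};

/* Checkpoint check from point 2: archived minimum <= current maximum - 1.
   Written as a strict "<" so that a current maximum of 0 cannot underflow. */
static bool archived_logs_applicable(uint64_t archived_min_checkpoint_no,
                                     uint64_t current_max_checkpoint_no)
{
    return archived_min_checkpoint_no < current_max_checkpoint_no;
}

/* Apply every record with applied_lsn < lsn <= to_lsn, in order.
   `recs` must be sorted by LSN. Returns the number of records applied. */
static size_t apply_archived(struct backup *b, const struct log_rec *recs,
                             size_t n, lsn_t to_lsn)
{
    size_t applied = 0;
    for (size_t i = 0; i < n; i++) {
        if (recs[i].lsn <= b->applied_lsn) {
            continue;  /* already reflected in the data */
        }
        if (recs[i].lsn > to_lsn) {
            break;     /* beyond the requested limit */
        }
        b->value += recs[i].delta;
        b->applied_lsn = recs[i].lsn;
        applied++;
    }
    return applied;
}
```

Because the backup remembers the LSN it has reached, applying the logs up to one LSN and later up to a higher one gives the same data as applying everything in a single run. That property is what makes repeated, incremental application safe in this model.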

