Add incremental backup support for existing backup API

Registered by Murali Balcha

The Cinder backup API does not support incremental backups. As volumes grow while the changes between backups tend to stay small, performing full backups of a volume may become resource intensive. This proposal adds the following flags to the existing backup API to support incremental backup functionality.

Enhance the current backup API to include an --incr option:
cinder backup-create <volumeid> --incr <full backup container>

The --incr option requests an incremental backup. To keep the API implementation simple, only incrementals from the last full backup are supported. To keep the deltas at a manageable size, the user can periodically take a full backup and then continue with incrementals.

The other option is snapshot-based backups. To support them, the backup API is changed as follows:

cinder backup-create <volumeid> --snapshot
This command takes a snapshot of the volume and backs up the snapshot. The volume can remain online and in use for the duration of the operation. At the end of the backup, the snapshot is deleted.

cinder backup-create <volumeid> --snapshot --incr <full backup container>
This command takes a snapshot of the volume and uploads only the changes since the last backup to Swift. Again, the snapshot is deleted at the end of the backup.

Blueprint information

Status:
Complete
Approver:
John Griffith
Priority:
Medium
Drafter:
Murali Balcha
Direction:
Approved
Assignee:
Murali Balcha
Definition:
Approved
Series goal:
Accepted for kilo
Implementation:
Implemented
Milestone target:
2015.1.0
Started by
John Griffith
Completed by
Mike Perez

Whiteboard

Gerrit topic: https://review.openstack.org/#q,topic:incremental-backup,n,z

Addressed by: https://review.openstack.org/99995
    Proposal for incremental backup functionality

Gerrit topic: https://review.openstack.org/#q,topic:bp/incremental-backup,n,z

Addressed by: https://review.openstack.org/110068
    Add support to incremental backup in cinder

Addressed by: https://review.openstack.org/118298
    Add support to backups in xiv_ds8k driver

<jdg>
There are a number of concerns from core team members about the design here and about the ability to extend it to incremental backups in the future (currently it only handles differentials). Personally I think this is "ok", but I do agree that rushing reviews at the end of the cycle is probably not worth the trouble it may cause down the road. I'm going to propose we retarget this to Kilo in tomorrow's weekly meeting.

<jdg>
Hi Murali,
This is a great start and you've got some really good work here. At this stage of the release, however, I think it might be best if we step back and spend some more time going through the details of the design and getting more input on the implementation. I'd like to move this to Kilo, and ideally would like to see it worked on VERY early in the Kilo release. Please let me know if you have any questions.

<mdb>
The current implementation of differential backup expects a parent-id argument. The differential changes are computed against the full backup identified by the parent-id. The full backup process is quite simple. Every time a full backup is taken, the backup process divides the volume data into fixed blocks of size CONF.backup_swift_block_size and calculates the SHA-256 of each block. The contents of the volume, together with the shafile that holds the SHA-256 of each block, are uploaded to Swift. When a differential backup is requested, the backup process calculates a new set of SHA-256 hashes for the current data. It then compares each block's SHA-256 with the corresponding SHA-256 from the full backup. If the two match, the block data has not changed. Otherwise the block has changed and needs to be backed up. The backup process calculates the largest modified extent and uploads that extent to Swift.
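The block comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code under review: BLOCK_SIZE stands in for CONF.backup_swift_block_size, the function names block_hashes and changed_extents are hypothetical, and merging adjacent changed blocks into (offset, length) extents is one possible reading of "largest modified extent". It also assumes the volume size is a multiple of the block size.

```python
import hashlib

# Stand-in for CONF.backup_swift_block_size; kept tiny so the example is readable.
BLOCK_SIZE = 4


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return the SHA-256 hex digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[off:off + block_size]).hexdigest()
        for off in range(0, len(data), block_size)
    ]


def changed_extents(parent_hashes, current_hashes, block_size=BLOCK_SIZE):
    """Return (offset, length) byte extents covering runs of changed blocks.

    A block counts as changed when its hash differs from the parent's hash
    at the same index, or when the parent has no block at that index.
    """
    extents = []
    start = None  # index of the first block in the current run of changes
    for i, digest in enumerate(current_hashes):
        changed = i >= len(parent_hashes) or parent_hashes[i] != digest
        if changed and start is None:
            start = i
        elif not changed and start is not None:
            extents.append((start * block_size, (i - start) * block_size))
            start = None
    if start is not None:
        # The run of changed blocks extends to the end of the volume.
        length = (len(current_hashes) - start) * block_size
        extents.append((start * block_size, length))
    return extents


full = b"aaaabbbbccccdddd"
current = b"aaaaXXXXYYYYdddd"
print(changed_extents(block_hashes(full), block_hashes(current)))
```

Only the hash lists are compared, so the parent's volume data never has to be read back; the shafile of the parent backup is enough to decide which extents to upload.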

To restore a differential backup, the restore process first restores the full backup identified by the differential backup's parent-id and then overlays it with the data from the differential backup to complete the restore.
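A rough sketch of that overlay step, assuming a differential backup can be represented as a list of (offset, data) extents. The representation and the name restore_differential are illustrative assumptions, not the format actually stored in Swift.

```python
def restore_differential(full_image, extents):
    """Overlay the changed extents of a differential backup on the full backup.

    full_image: bytes of the restored full (parent) backup.
    extents: iterable of (offset, data) pairs from the differential backup.
    """
    volume = bytearray(full_image)
    for offset, data in extents:
        # Overwrite the bytes that changed since the full backup.
        volume[offset:offset + len(data)] = data
    return bytes(volume)


restored = restore_differential(b"aaaabbbbccccdddd", [(4, b"XXXXYYYY")])
print(restored)
```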

Extending the existing differential algorithm to implement incremental backups is straightforward. Differential backups are calculated with respect to a full backup, whereas incremental backups are calculated with respect to the last backup. An incremental backup therefore usually has a chain of parents, whereas a differential backup usually refers directly to a full backup.

For example, an incremental backup chain may look like: Full <- incr1 <- incr2 <- incr3

To take another incremental for this existing chain, the user can specify:
cinder backup-create --parent-id "incr3" volume_id

The backup process computes a new shafile for the current data. It then compares the SHA-256 of each block of current data with the SHA-256 of the corresponding block in incr3. The data identified as changed is uploaded to Swift. The backup process is no different from the differential backup process.

When restoring an incremental backup, the restore process needs to walk the whole incremental chain until it finds the full backup. This is a slight change from differential backup, where restoring a differential backup only requires traversing to the full backup.
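The chain walk can be sketched as follows, under the assumption that each backup record carries a parent_id (None for a full backup) and that incrementals hold (offset, data) extents. The dictionary layout and the names backup_chain and restore are hypothetical and only illustrate the traversal order.

```python
def backup_chain(backups, backup_id):
    """Return backup records ordered from the full backup up to backup_id."""
    chain = []
    current = backup_id
    while current is not None:
        record = backups[current]
        chain.append(record)
        current = record["parent_id"]  # None marks the full backup
    chain.reverse()
    return chain


def restore(backups, backup_id):
    """Restore the full backup, then apply each incremental oldest first."""
    chain = backup_chain(backups, backup_id)
    volume = bytearray(chain[0]["data"])
    for record in chain[1:]:
        for offset, data in record["extents"]:
            volume[offset:offset + len(data)] = data
    return bytes(volume)


# Full <- incr1 <- incr2 <- incr3
backups = {
    "full": {"parent_id": None, "data": b"aaaabbbbcccc"},
    "incr1": {"parent_id": "full", "extents": [(4, b"XXXX")]},
    "incr2": {"parent_id": "incr1", "extents": [(8, b"YYYY")]},
    "incr3": {"parent_id": "incr2", "extents": [(4, b"ZZZZ")]},
}
print(restore(backups, "incr3"))
```

Applying the incrementals oldest first matters: a block rewritten in both incr1 and incr3 must end up with the incr3 contents. A differential backup is the special case where the chain has exactly two entries.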
________________


