Add block based backup (space/bandwidth efficient)

Registered by Fausto Marzi on 2014-10-14

Currently Freezer checks only the ctime and mtime inode information to determine whether files have changed.
While this approach is fast (time efficient), it is not bandwidth or storage efficient. Freezer needs to support both the librsync and libarchive approaches to execute incremental backups and restores.

- Freezerc has to be able to compute changes for blocks of files

- The block size needs to be configurable. The default will be 65K

- Signatures for file blocks need to be generated for level 0 full backups

- The comparison needs to be done against:
    - Available files (signature against local files)
    - Available signatures (signature against signature)
    - Available streams (signature against streams)

- The restore of the files can be done under the following conditions:
    - Downloading valid incremental backups (i.e. level 0 to level n) and applying
        the differences to the file incrementally, block by block and in stream

- Implementation has to be memory efficient (operations are executed against blocks and not against whole files or whole streams)

- Implementation has to be disk I/O and disk space efficient (operations are executed from and to streams of data, rather than downloading files/streams locally and then executing actions against the local files)
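
The requirements above can be illustrated with a minimal sketch. It is not the Freezer implementation: it compares fixed-offset blocks with SHA-256 digests instead of using the librsync rolling checksum, and the names gen_signature, changed_blocks and diff_signatures, the list-of-digests signature format, and the 65536 byte reading of "65K" are all assumptions made for illustration. Only one block is held in memory at a time, so it works on files and buffered streams alike.

```python
import hashlib
import io

BLOCK_SIZE = 65536  # assumption: "65K" read as 64 KiB


def _read_blocks(stream, block_size):
    """Yield successive blocks from a buffered binary stream until EOF."""
    return iter(lambda: stream.read(block_size), b"")


def gen_signature(stream, block_size=BLOCK_SIZE):
    """Level 0: return one SHA-256 hex digest per block (hypothetical format)."""
    return [hashlib.sha256(block).hexdigest()
            for block in _read_blocks(stream, block_size)]


def changed_blocks(stream, old_signature, block_size=BLOCK_SIZE):
    """Level > 0: yield (index, data) for blocks that are new or differ."""
    for index, block in enumerate(_read_blocks(stream, block_size)):
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_signature) or old_signature[index] != digest:
            yield index, block


def diff_signatures(old_signature, new_signature):
    """Signature against signature: indices that are new or differ."""
    return [index for index, digest in enumerate(new_signature)
            if index >= len(old_signature) or old_signature[index] != digest]


# Small block size so the example data stays readable.
level0 = io.BytesIO(b"a" * 10 + b"b" * 10)
signature = gen_signature(level0, block_size=10)

level1 = io.BytesIO(b"a" * 10 + b"c" * 10 + b"ddd")
delta = list(changed_blocks(level1, signature, block_size=10))
# delta now holds the second block (changed) and the third block (appended)
```

Because blocks are matched by offset, an insertion near the start of a file marks every following block as changed. Handling that case is what the rolling checksum in librsync is for.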

Implementation thoughts:

Methods/Functions:

1) gen_sign_from_file(fs_path, ex_signature=None): Level 0 signature generation from file
    returns a new full signature data struct (dict)
2) gen_sign_from_stream(read_pipe, ex_signature=None): Level 0 signature generation from stream
    returns a new full signature data struct (dict)
3) gen_sign_from_file(fs_path, ex_signature=ex_signature): Comparison between signature and local file (for backup level > 0)

4) gen_sign_from_stream(read_pipe, ex_signature=ex_signature): Comparison between signatures and blocks of streams (for backup level > 0)

5) patch_from_file(fs_path_source, fs_path_dst, read_pipe=None): File patching for restore from files to files (for restore).

6) patch_from_stream(read_pipe, fs_path_dst): File patching for restore from stream to files (for restore)
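
As a rough sketch of what the restore side of 5) and 6) might look like, the function below applies one incremental level to an already restored file, block by block. The name apply_delta, the (index, data) delta format, and the final_size argument are hypothetical and not taken from the blueprint. In a real patch_from_stream the delta would be parsed from read_pipe rather than passed in as a Python iterable.

```python
import os
import tempfile


def apply_delta(fs_path_dst, delta, final_size, block_size=65536):
    """Write each changed block at its offset, then trim the file.

    delta is an iterable of (block_index, data) pairs (hypothetical format).
    final_size is the file length at this backup level, so a file that
    shrank between levels is truncated after the writes.
    """
    with open(fs_path_dst, "r+b") as dst:
        for index, data in delta:
            dst.seek(index * block_size)
            dst.write(data)
        dst.truncate(final_size)


def demo():
    """Restore level 0 content, apply a level 1 delta, return the result."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as restored:
            restored.write(b"a" * 10 + b"b" * 10)  # level 0 content
        level1_delta = [(1, b"c" * 10), (2, b"ddd")]
        apply_delta(path, level1_delta, final_size=23, block_size=10)
        with open(path, "rb") as result:
            return result.read()
    finally:
        os.remove(path)
```

Applying level 1 to level n in order, each with its own delta and final size, would bring the file to the state of the last level without holding more than one block in memory.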

Blueprint information

Status:
Started
Approver:
Fabrizio Fresco
Priority:
High
Drafter:
Fausto Marzi
Direction:
Approved
Assignee:
Ruslan Aliev
Definition:
Approved
Series goal:
Accepted for pike
Implementation:
Beta Available
Milestone target:
pike-2
Started by:
Fausto Marzi on 2015-02-05

Whiteboard

Code review:

- https://review.openstack.org/#/c/159804/

Gerrit topic: https://review.openstack.org/#q,topic:bp/block-based-backup-restore,n,z

Addressed by: https://review.openstack.org/159804
    Support for rsync block based backups

Gerrit topic: https://review.openstack.org/#q,topic:bp/are,n,z

Gerrit topic: https://review.openstack.org/#q,topic:(detached,n,z

Addressed by: https://review.openstack.org/408223
    [WIP] Block based incremental support - Rsync

Addressed by: https://review.openstack.org/409796
    [WIP] - Block based incremental support - rsync

Done by: https://review.openstack.org/#/c/409796/
Another, enhanced version exists here: https://review.openstack.org/#/c/442420/
