Add ability to encrypt backup stream

Registered by George Ormond Lorch III

The primary purpose of this functionality is to allow a parallel, streamed backup to be encrypted where the encryption may also be executed in parallel.

Encryption is done through the libgcrypt library, which is documented here: http://www.gnupg.org/documentation/manuals/gcrypt

Adding libgcrypt means that the libgcrypt and libgpg-error packages must now be installed.

Many package repositories still contain older versions of libgcrypt. The current stable version, 1.5.0, is recommended and will take advantage of the AES-NI instruction set if available.

Similar to compression, encryption is not supported on streamed tar backups.

New options:
  --encrypt=<algorithm> : algorithm may be 'NONE', 'AES128', 'AES192' or 'AES256'. 'NONE' is used mainly for testing and is simply pass-through. Please refer to the libgcrypt manual for more information on these ciphers.

  --encrypt_key=<key> : an encryption key of the proper length for the chosen algorithm. This option is not recommended where there is uncontrolled access to the machine, because the command line, and thus the key, can be viewed as part of the process info. See option --encrypt_key_file.

  --encrypt_key_file=<keyfile> : the name of a file from which the raw key of the appropriate length is read. The file must be a simple binary (or text) file that contains exactly the key to be used.

  --encrypt_threads=<threadcount> : number of threads used to encrypt in parallel (default=1).

  --encrypt_chunk_size=<size> : size (in bytes) of the working encryption buffer for each encryption thread (default=64K).

  --compress_chunk_size=<size> : size (in bytes) of the working compression buffer for each compression thread (default=64K).
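
The interaction between a thread count and a chunk size can be pictured with a minimal Python sketch. It is not the xtrabackup implementation: the function names are hypothetical, and the transform is a plain pass-through, comparable to what 'NONE' is described as doing. It only shows a stream being cut into fixed-size working buffers that a pool of threads processes while the output order is preserved.

```python
from concurrent.futures import ThreadPoolExecutor

DEFAULT_CHUNK_SIZE = 64 * 1024  # 64K, the documented default


def split_chunks(data, chunk_size=DEFAULT_CHUNK_SIZE):
    """Yield consecutive chunk_size pieces of data; the last one may be shorter."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def passthrough(chunk):
    """Stand-in transform (hypothetical): returns the chunk unchanged."""
    return chunk


def process_in_parallel(data, threads=1, chunk_size=DEFAULT_CHUNK_SIZE,
                        transform=passthrough):
    """Apply transform to every chunk on a thread pool, keeping chunk order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() yields results in submission order, whichever thread finishes first.
        return list(pool.map(transform, split_chunks(data, chunk_size)))
```

For example, process_in_parallel(data, threads=4) returns the processed chunks as a list, in their original order.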

--encrypt_key and --encrypt_key_file are mutually exclusive. If --encrypt is specified, exactly one of them MUST also be specified; otherwise an error occurs.
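
The option rules above can be summarized in a short Python sketch. This is an illustration only, not the validation code in xtrabackup or innobackupex: the function name and error messages are hypothetical. The key lengths are the standard AES key sizes (16, 24 and 32 bytes). Skipping the length check for 'NONE' is an assumption, since the text does not say what key length that setting expects.

```python
KEY_LENGTHS = {"AES128": 16, "AES192": 24, "AES256": 32}  # AES key sizes in bytes


def validate_encrypt_options(algorithm=None, key=None, key_file=None):
    """Return (algorithm, key_bytes), or None when --encrypt is not given."""
    if algorithm is None:
        return None
    if algorithm != "NONE" and algorithm not in KEY_LENGTHS:
        raise ValueError("unknown algorithm: %s" % algorithm)
    if key is not None and key_file is not None:
        raise ValueError("--encrypt_key and --encrypt_key_file are mutually exclusive")
    if key is None and key_file is None:
        raise ValueError("--encrypt requires --encrypt_key or --encrypt_key_file")
    if key_file is not None:
        # The file holds exactly the raw key, so it is read as-is in binary mode.
        with open(key_file, "rb") as handle:
            key_bytes = handle.read()
    else:
        key_bytes = key.encode() if isinstance(key, str) else key
    # Assumption: 'NONE' is pass-through, so no key length is enforced for it.
    expected = KEY_LENGTHS.get(algorithm)
    if expected is not None and len(key_bytes) != expected:
        raise ValueError("%s needs a %d-byte key" % (algorithm, expected))
    return algorithm, key_bytes
```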

The coding task was fairly straightforward, with some restructuring/refactoring approved by Alexey K:
  - Update/set copyright notice in new and touched files to Copyright (c) 2011 Percona Ireland Ltd.
  - Implement new ds_stdout data sink.
  - Split the ds_stream data sink into two new, more specific data sinks, ds_archive and ds_xbstream.
  - Implement write callback model for ds_xbstream similar to libarchive.
  - Change ds_archive and ds_xbstream over to using callback write models.
  - Add new option to compression --compress_chunk_size to allow user specified chunk size management.
  - Implement xbcrypt format reader/writer. Format encapsulated as follows:
      8 bytes - magic string "XBCRYP01"
      8 bytes - reserved
      8 bytes - original size
      8 bytes - encrypted size
      4 bytes - checksum
      'encrypted size' bytes - encrypted data
  - Implement a new ds_encrypt data sink, modeled after the existing compression data sink, which will use libgcrypt for the actual encryption task and write callbacks.
  - Add new options and option validation, and pass the options through innobackupex.
  - Implement new utility xbcrypt modeled after xbstream to perform encryption outside of xtrabackup for metadata and to offer a means of decrypting an encrypted backup.
  - Implement new tests and refactor test script structure a little to make adding new, similar test cases a little easier.
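
The xbcrypt chunk layout listed above can be exercised with a small Python sketch. Only the field order and sizes and the magic string come from the description. The byte order (little-endian), the checksum algorithm (CRC32) and the data the checksum covers (the encrypted payload) are assumptions made for illustration, and the function names are hypothetical. The real xbcrypt reader/writer may differ on all of these points.

```python
import struct
import zlib

MAGIC = b"XBCRYP01"
# Field order from the description: magic, reserved, original size,
# encrypted size, checksum. Little-endian without padding is an assumption.
HEADER = struct.Struct("<8sQQQI")  # 8 + 8 + 8 + 8 + 4 = 36 bytes


def pack_chunk(original_size, encrypted):
    """Build one chunk: header followed by the encrypted payload."""
    checksum = zlib.crc32(encrypted) & 0xFFFFFFFF  # assumed checksum: CRC32 of payload
    header = HEADER.pack(MAGIC, 0, original_size, len(encrypted), checksum)
    return header + encrypted


def unpack_chunk(buf, offset=0):
    """Parse one chunk at offset; return (original_size, payload, next_offset)."""
    magic, _reserved, original_size, encrypted_size, checksum = \
        HEADER.unpack_from(buf, offset)
    if magic != MAGIC:
        raise ValueError("bad magic string")
    start = offset + HEADER.size
    payload = buf[start:start + encrypted_size]
    if len(payload) != encrypted_size:
        raise ValueError("truncated chunk")
    if zlib.crc32(payload) & 0xFFFFFFFF != checksum:
        raise ValueError("checksum mismatch")
    return original_size, payload, start + encrypted_size
```

Because each chunk carries its own sizes, a reader can walk a stream chunk by chunk by feeding the returned next_offset back into unpack_chunk.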

Blueprint information

Status:
Complete
Approver:
Alexey Kopytov
Priority:
Essential
Drafter:
George Ormond Lorch III
Direction:
Approved
Assignee:
George Ormond Lorch III
Definition:
Approved
Series goal:
Accepted for 2.1
Implementation:
Implemented
Milestone target:
2.1.0-alpha1
Started by
Alexey Kopytov
Completed by
Alexey Kopytov
