Encrypt separate files in xbstream container

Registered by Sergei Glushchenko

The current xbcrypt implementation encrypts the entire xbstream archive and puts it in a custom container. To extract a single file from an unencrypted xbstream archive, one can filter the chunks using the information contained in each chunk header. For an encrypted xbstream archive, one must decrypt the entire archive before this filtering is possible.
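To illustrate why per-chunk headers make single-file extraction cheap, here is a minimal sketch of header-based filtering. The framing used (2-byte path length, path, 4-byte payload length, payload) and the helper names `write_chunk`, `iter_chunks` and `extract_file` are assumptions made for this example; this is not the real xbstream on-disk layout.

```python
import io
import struct

# Toy framing for illustration only, NOT the real xbstream layout:
# 2-byte path length, path bytes, 4-byte payload length, payload bytes.
PATH_LEN = struct.Struct("<H")
DATA_LEN = struct.Struct("<I")


def write_chunk(stream, path, payload):
    """Append one chunk (readable header + payload) to the stream."""
    raw_path = path.encode("utf-8")
    stream.write(PATH_LEN.pack(len(raw_path)))
    stream.write(raw_path)
    stream.write(DATA_LEN.pack(len(payload)))
    stream.write(payload)


def iter_chunks(stream):
    """Yield (path, payload) for every chunk until the stream ends."""
    while True:
        head = stream.read(PATH_LEN.size)
        if not head:
            return
        (path_len,) = PATH_LEN.unpack(head)
        path = stream.read(path_len).decode("utf-8")
        (data_len,) = DATA_LEN.unpack(stream.read(DATA_LEN.size))
        yield path, stream.read(data_len)


def extract_file(stream, wanted):
    """Reassemble one file by keeping only chunks whose header names it."""
    return b"".join(data for path, data in iter_chunks(stream) if path == wanted)


if __name__ == "__main__":
    archive = io.BytesIO()
    write_chunk(archive, "ibdata1", b"aa")
    write_chunk(archive, "test/t1.ibd", b"bb")
    write_chunk(archive, "ibdata1", b"cc")
    archive.seek(0)
    print(extract_file(archive, "ibdata1"))
```

When the whole archive is wrapped in one encrypted container, these headers are not readable until everything in front of them has been decrypted, which is the limitation described above.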

This blueprint is to implement the following:

1. Encrypt every file separately (this is already done for local backups)
2. Pack the encrypted files into a regular xbstream archive
3. Enhance the xbstream format to hide file names from chunk headers
   a) put a hash of the file name into the chunk header
   b) write (hash, filename) tuples into a separate encrypted file and put it into the xbstream archive
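Items 3a and 3b can be sketched roughly as follows. This is a minimal illustration under assumptions: the blueprint does not name a hash function or a file format for the mapping, so SHA-256 and JSON are chosen here arbitrarily, and `encrypt` / `decrypt` are hypothetical callables standing in for whatever cipher the backup uses.

```python
import hashlib
import json


def name_hash(path):
    """Value stored in the chunk header in place of the file name (assumed SHA-256)."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()


def build_mapping(paths):
    """Produce the (hash, filename) tuples described in item 3b."""
    return [(name_hash(p), p) for p in paths]


def serialize_mapping(mapping, encrypt):
    """Encode the tuples and pass them through the caller's cipher.

    `encrypt` is a hypothetical bytes -> bytes callable; no real cipher is used here.
    """
    return encrypt(json.dumps(mapping).encode("utf-8"))


def load_mapping(blob, decrypt):
    """Reverse of serialize_mapping: returns a dict of hash -> filename."""
    return {h: p for h, p in json.loads(decrypt(blob).decode("utf-8"))}


if __name__ == "__main__":
    identity = lambda data: data  # placeholder for a real cipher
    blob = serialize_mapping(build_mapping(["ibdata1", "test/t1.ibd"]), identity)
    names = load_mapping(blob, identity)
    print(names[name_hash("ibdata1")])
```

Note that an unkeyed hash of a well-known name can be recomputed by anyone who guesses the name. A keyed hash would avoid that, but it goes beyond what the blueprint describes.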

Blueprint information

Status: Complete
Approver: None
Priority: Undefined
Drafter: Sergei Glushchenko
Direction: Needs approval
Assignee: Sergei Glushchenko
Definition: Drafting
Series goal: None
Implementation: Implemented
Milestone target: 2.3.1-beta1
Started by: Sergei Glushchenko
Completed by: Sergei Glushchenko

Related branches

Sprints

Whiteboard

Do we really need a hash->filename mapping file?
How else do we know which chunk belongs to which file upon restore? Encrypt not only the payload but the whole unencrypted chunk? Add an additional encrypted field?

Yes, basically make the full path part of the encrypted payload, or an extra encrypted field, depending on how you look at it. Keeping that info in a separate file doesn't solve the issue anyway: it has to be stored before all other files in the stream to be useful, but that imposes serious restrictions on how the stream is generated.
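A rough sketch of the alternative suggested here, where the path travels inside the data that gets encrypted rather than in the readable chunk header. The 2-byte length prefix and the names `wrap_payload` / `unwrap_payload` are assumptions for illustration; the encryption step itself is left out.

```python
import struct

# Assumed layout of the plaintext before encryption:
# 2-byte path length, path bytes, then the original chunk data.
PATH_LEN = struct.Struct("<H")


def wrap_payload(path, data):
    """Prepend the path to the chunk data; the result is what would be encrypted."""
    raw = path.encode("utf-8")
    return PATH_LEN.pack(len(raw)) + raw + data


def unwrap_payload(plain):
    """Split a decrypted payload back into (path, data)."""
    (n,) = PATH_LEN.unpack_from(plain, 0)
    start = PATH_LEN.size
    path = plain[start:start + n].decode("utf-8")
    return path, plain[start + n:]


if __name__ == "__main__":
    plain = wrap_payload("test/t1.ibd", b"page bytes")
    print(unwrap_payload(plain))
```

With this layout each chunk is self-describing once decrypted, so no mapping file has to precede the other files in the stream.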

The disadvantage of keeping the file name as part of the payload is the following: should we want to implement a command such as `xbcloud --list backup_name` to list the contents of a backup, we would have to download and decrypt at least one chunk of every file with a mangled file name.

Also, do we really need to obscure file names? They are pretty easy to guess anyway. Every backup will have ibdata1 and other common files.

https://github.com/percona/percona-xtrabackup/pull/31


Work Items
