Ideas on replacing tar

Registered by Kenneth Loafman

This is a collection of ideas on how to replace tar in duplicity. I'll be adding to and replacing the contents of the Whiteboard as I find candidates. Some of the candidates I'll need to harvest from the mailing list. Feel free to add comments and suggestions. If you find a candidate, please include the URL.

Blueprint information

Status: Not started
Approver: None
Priority: Undefined
Drafter: None
Direction: Approved
Assignee: None
Definition: New
Series goal: None
Implementation: Unknown
Milestone target: None

Whiteboard

http://code.google.com/p/libarchive/wiki/LibarchiveFormats

LibarchiveFormats

Summary of formats supported by the library and command-line tools.

Introduction

Libarchive is highly modular. It was designed from the beginning to make it relatively easy to add new archive formats and compression algorithms. Note, however, that each program that uses libarchive chooses which formats it wants to use, so support in libarchive does not guarantee support in any particular program. Of course, the bsdtar and bsdcpio programs included in the libarchive distribution do enable all libarchive formats by default.

For developers: Note that libarchive is modularized in such a way that you pay nothing for formats you don't use. If you choose to omit a particular format, no code for that format will be linked into your program. In particular, you only need zlib, bzlib, or lzma libraries if you specifically enable the corresponding formats.

Filter Support

Starting with libarchive 2.6, there is now support for multiple filters when reading archives. This is not of much practical use yet (it is rarely helpful to bzip2 a file that has already been gzipped), but will allow a future version of libarchive to automatically support tar.gz.uu and similar combinations.

    * gzip (read and write, uses zlib)
    * bzip2 (read and write, uses bzlib)
    * compress (read and write, uses an internal implementation)
    * separate command-line compressors with fixed-signature auto-detection
    * xz and lzma (read and write using liblzma)
    * lzma (if you lack liblzma, you can get read-only lzma support through the lzmadec library; this will likely be dropped as soon as liblzma is stable and widely-available)
    * Starting with libarchive 2.7, most of the above will fall back to using command-line tools if the libraries were unavailable at build time. Note that the command-line tools are usually slower than using the libraries directly.
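To make the "fixed-signature auto-detection" and "multiple filters" ideas concrete, here is a minimal Python sketch. It is not libarchive code: it uses only Python's standard gzip, bz2 and lzma modules, and the name peel_filters is made up for illustration. It looks at the leading magic bytes of the data, strips the matching compression layer, and repeats until nothing matches.

```python
import bz2
import gzip
import lzma

# Well-known leading byte signatures for a few compression filters.
SIGNATURES = [
    ("gzip", b"\x1f\x8b", gzip.decompress),
    ("bzip2", b"BZh", bz2.decompress),
    ("xz", b"\xfd7zXZ\x00", lzma.decompress),
]


def peel_filters(data):
    """Strip recognised compression layers; return (filter names, payload)."""
    applied = []
    while True:
        for name, magic, decompress in SIGNATURES:
            if data.startswith(magic):
                data = decompress(data)
                applied.append(name)
                break
        else:
            # No known signature matched, so treat the rest as the payload.
            return applied, data
```

A payload that happens to begin with one of these signatures would be misdetected, so a real reader would need stronger checks than this sketch performs.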

Archive Formats Supported

    * tar (read and write, including GNU extensions)
    * pax (read and write, including GNU and star extensions)
    * cpio (read and write, including odc and newc variants)
    * ISO9660 (read only, with some limitations)
    * Zip (read only, with some limitations, uses zlib; write support starting with libarchive 2.8)
    * mtree (read and write)
    * shar (write only)
    * ar (read and write, including BSD and GNU/SysV variants)
    * empty (read only; in particular, note that no other format will accept an empty file)
    * raw (read only, starting in libarchive 2.8)
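For comparison with what duplicity can already reach from Python, here is a small sketch of writing and reading the pax variant of tar. It uses Python's standard tarfile module rather than libarchive, and the helper names write_pax_archive and read_archive are only illustrative.

```python
import io
import tarfile


def write_pax_archive(members):
    """Build an in-memory pax-format tar archive from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.PAX_FORMAT) as tar:
        for name, data in members.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def read_archive(blob):
    """Return {name: bytes} for every regular file in a tar archive."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:") as tar:
        for info in tar:
            if info.isfile():
                out[info.name] = tar.extractfile(info).read()
    return out
```

The pax extended headers are what let member names longer than the classic 100-byte tar name field survive a round trip.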

----

Wouldn't using an existing format have many advantages?

http://en.wikipedia.org/wiki/7z
http://en.wikipedia.org/wiki/PEA_archive_format

----

I have been using DAR in place of TAR for a few years now, very satisfactorily.

http://dar.linux.free.fr
http://en.wikipedia.org/wiki/DAR_%28Disk_Archiver%29

  * Support for slices: archives split over multiple files of a particular size.
  * Option to delete files from the system when they have been removed in the archive.
  * Incremental backup.
  * Per-file compression with gzip or bzip2 (as opposed to compressing the whole archive). You can choose not to compress already-compressed files based on their filename suffix.
  * Fast extraction of files from anywhere in the archive.
  * Fast listing of archive contents, because the catalogue of files is saved in the archive.
  * Redundancy checksumming.
  * API.
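Several of the features above (per-file compression, skipping files by suffix, a saved catalogue, checksums, fast extraction) fit together naturally. The following Python sketch shows one way they could combine. It is a toy container invented for this note, not DAR's actual on-disk format; the suffix list and the layout (member data first, JSON catalogue last, then an 8-byte length) are assumptions.

```python
import io
import json
import struct
import zlib

# Hypothetical rule: suffixes assumed to be compressed already.
SKIP_SUFFIXES = (".gz", ".bz2", ".jpg", ".zip")


def write_container(files):
    """Pack {name: bytes}; each member is compressed on its own."""
    body = io.BytesIO()
    catalogue = {}
    for name, data in files.items():
        compress = not name.endswith(SKIP_SUFFIXES)
        stored = zlib.compress(data) if compress else data
        catalogue[name] = {
            "offset": body.tell(),
            "length": len(stored),
            "compressed": compress,
            "crc32": zlib.crc32(data),
        }
        body.write(stored)
    cat_bytes = json.dumps(catalogue).encode("utf-8")
    # Catalogue goes last, followed by its 8-byte big-endian length.
    return body.getvalue() + cat_bytes + struct.pack(">Q", len(cat_bytes))


def read_catalogue(blob):
    """List members without touching any member data."""
    (cat_len,) = struct.unpack(">Q", blob[-8:])
    return json.loads(blob[-8 - cat_len:-8].decode("utf-8"))


def extract(blob, name):
    """Seek straight to one member, decompress it, and verify its checksum."""
    entry = read_catalogue(blob)[name]
    stored = blob[entry["offset"]:entry["offset"] + entry["length"]]
    data = zlib.decompress(stored) if entry["compressed"] else stored
    if zlib.crc32(data) != entry["crc32"]:
        raise ValueError("checksum mismatch for " + name)
    return data
```

Because each member is compressed independently, extracting one file never requires decompressing the ones stored before it, which is the main contrast with a tar.gz stream.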

----

What about the XAR format?

http://code.google.com/p/xar/
http://en.wikipedia.org/wiki/Xar_%28archiver%29

It seems to have a lot of the features listed on the duplicity wishlist for a new archive format:

http://duplicity.nongnu.org/new_format.html#benefits

Including:

  * XML table of contents
  * Random access

(?)
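To illustrate how an XML table of contents enables random access, here is a rough Python sketch. The element names (toc, file, offset, length) and the separate "heap" of member bytes are assumptions chosen for illustration; this is not xar's real schema or file layout.

```python
import xml.etree.ElementTree as ET


def build_toc(entries):
    """entries: iterable of (name, offset, length) -> XML bytes."""
    root = ET.Element("toc")
    for name, offset, length in entries:
        node = ET.SubElement(root, "file", name=name)
        ET.SubElement(node, "offset").text = str(offset)
        ET.SubElement(node, "length").text = str(length)
    return ET.tostring(root, encoding="utf-8")


def read_member(toc_xml, heap, name):
    """Look up one member in the TOC and slice it out of the heap."""
    root = ET.fromstring(toc_xml)
    for node in root.iter("file"):
        if node.get("name") == name:
            offset = int(node.findtext("offset"))
            length = int(node.findtext("length"))
            return heap[offset:offset + length]
    raise KeyError(name)
```

Only the table of contents has to be parsed to locate a member, so a reader can fetch one file without scanning the rest of the archive.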
