Extend innochecksum to display file format in InnoDB files

Registered by Alexey Kopytov

In some cases it is useful to know the version and/or format which was
used to create an InnoDB data file. For example, importing data files
created with versions before 5.1.7 may result in server crashes, because
those versions did not initialize some fields in tablespace/page headers
correctly (see the note about incompatibility in
http://www.percona.com/doc/percona-server/5.5/management/innodb_expand_import.html?id=percona-server:features:innodb_import_table_from_xtrabackup&redirect=2
referring to bug #1000221 and bug #727704).

The goal of this blueprint is to introduce a new '-f' option to innochecksum.
When the '-f' option is specified, innochecksum should read file format
information from a given InnoDB data file by checking the page and
tablespace flags available in the first page, and then exit instead of
scanning the entire file. As only the first page needs to be read to
detect the format/version information, the option can also be used on a
running server.
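
As a rough illustration of why reading the first page is enough, the sketch
below pulls the two fields of interest out of the start of a data file. It is
not the innochecksum implementation. The function name read_format_fields is
hypothetical, and the byte offsets are assumptions based on the 5.5-era
on-disk layout (page type at byte 24 of the page header, tablespace header
starting at byte 38, flags 16 bytes into that header).

```python
import struct

# Assumed offsets from the 5.5-era InnoDB on-disk layout (not stated above):
FIL_PAGE_TYPE = 24        # 2-byte page type field in the page header
FSP_HEADER_OFFSET = 38    # tablespace header follows the page header
FSP_SPACE_FLAGS = 16      # 4-byte flags field, relative to the tablespace header


def read_format_fields(path):
    """Return (page_type, space_flags) read from the first page of a data file.

    Only the leading bytes of the file are read; the rest is never scanned.
    """
    needed = FSP_HEADER_OFFSET + FSP_SPACE_FLAGS + 4
    with open(path, "rb") as f:
        head = f.read(needed)
    if len(head) < needed:
        raise ValueError("file is too short to hold a tablespace header")
    # Both fields are stored big-endian.
    (page_type,) = struct.unpack_from(">H", head, FIL_PAGE_TYPE)
    (space_flags,) = struct.unpack_from(
        ">I", head, FSP_HEADER_OFFSET + FSP_SPACE_FLAGS
    )
    return page_type, space_flags
```

The file is opened read-only and less than 64 bytes are read, which is what
keeps the cost independent of the tablespace size.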

Blueprint information

Status:
Complete
Approver:
Alexey Kopytov
Priority:
Undefined
Drafter:
Alexey Kopytov
Direction:
Approved
Assignee:
Alexey Kopytov
Definition:
Approved
Series goal:
Accepted for 5.5
Implementation:
Implemented
Milestone target:
5.5.28-29.2
Started by
Alexey Kopytov
Completed by
Stewart Smith

Whiteboard

The following format combinations are available from the first
tablespace page:

Antelope, pre-5.1.7:

InnoDB versions before MySQL 5.1.7 always initialized the FIL_PAGE_TYPE
field in the page header to 0 (unless the page is an index page, but the
first page in a tablespace can never be an index page).

Antelope, 5.1.7 or later:

Newer InnoDB versions initialize the FIL_PAGE_TYPE field in the page header
to FIL_PAGE_TYPE_FSP_HDR (8).

Tablespace flags (the FSP_SPACE_FLAGS field in the tablespace header) in the
Antelope format are always 0.

Barracuda:

Tablespaces in the Barracuda format have DICT_TF_FORMAT_ZIP in
FSP_SPACE_FLAGS. Additionally, some bits in the flags hold the
compressed page size: a non-zero value for compressed tablespaces
(ROW_FORMAT=COMPRESSED), or 0 for uncompressed tablespaces
(ROW_FORMAT=DYNAMIC).
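
The three cases above can be sketched as a small decision function. This is
an illustration only, not the code that went into innochecksum. The name
describe_format is hypothetical, and the bit positions (compressed size in
bits 1-4, format in bit 5, compressed page size computed as 512 << ssize)
are assumptions based on the 5.5-era DICT_TF_* definitions. Only
FIL_PAGE_TYPE_FSP_HDR = 8 comes from the text above.

```python
FIL_PAGE_TYPE_FSP_HDR = 8  # value given in the text above

# Assumed bit layout of FSP_SPACE_FLAGS in 5.5-era sources:
DICT_TF_ZSSIZE_SHIFT = 1
DICT_TF_ZSSIZE_MASK = 0xF << DICT_TF_ZSSIZE_SHIFT
DICT_TF_FORMAT_SHIFT = 5
DICT_TF_FORMAT_ZIP = 1


def describe_format(page_type, space_flags):
    """Map the first page's type and tablespace flags to a format label."""
    if space_flags == 0:
        # Antelope: flags are always 0, so the page type tells the versions apart.
        if page_type == 0:
            return "Antelope, pre-5.1.7"
        if page_type == FIL_PAGE_TYPE_FSP_HDR:
            return "Antelope, 5.1.7 or later"
        return "unknown"
    # Assumption: the format occupies a single bit at DICT_TF_FORMAT_SHIFT.
    fmt = (space_flags >> DICT_TF_FORMAT_SHIFT) & 1
    if fmt != DICT_TF_FORMAT_ZIP:
        return "unknown"
    ssize = (space_flags & DICT_TF_ZSSIZE_MASK) >> DICT_TF_ZSSIZE_SHIFT
    if ssize == 0:
        return "Barracuda, uncompressed (ROW_FORMAT=DYNAMIC)"
    # Assumed size encoding: 1 -> 1024 bytes, 2 -> 2048 bytes, and so on.
    return "Barracuda, compressed, %d-byte pages" % (512 << ssize)
```

Anything that does not match one of the documented combinations is reported
as "unknown" rather than guessed at.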

It is also possible to differentiate between
ROW_FORMAT=REDUNDANT/COMPACT for the Antelope format, but that also
requires reading an index page, rather than the first page in the
tablespace, and is insignificant for the purposes of this task.


Work Items
