RPM

Commit to a UUID scheme for metadata identification

Registered by devzero2000

I am attaching the original mail. It is not a summary at all, but I find it difficult to track the mail otherwise, and it is important. Apologies.

**********************************************************************

In order to permit timely discussion, I need to describe
how UUID's can be attached to metadata in *.rpm packages.

The issue (nearing proposal stage; it's already more or less
what @rpm5.org is gonna do, though there are likely a few
more details that need to be worked through) is how
to attach an identifier to all *.rpm packages that
is both "portable" and "general".

(aside)
Feel free to invent your own counter-proposals if you wish; I'm
describing an existing implementation here, not particularly advocating
my own. I do believe that attaching UUID's
to *.rpm packages is necessary no matter how that is achieved.

The original implementation was done quite some time ago, the
thread is/was (at least the right timeframe) here:

      http://rpm5.org/community/rpm-devel/2614.html

And the current version of RPM in Cooker has been built --with-uuid,
enabling the functionality (thank you to Per Oyvind!). Note that there's one more patch
expected shortly to permit --queryformat display of UUID strings
from the CLI (as used in the two examples I will attach); the ability
to generate UUID's in binary (but not display) form is already deployed.

UUID's are described in RFC 4122 here:
 http://www.ietf.org/rfc/rfc4122.txt

The RPM implementation uses the OSSP UUID package (not the ext2 UUID library
that Linux tends to use), which has been in Cooker and other distros for quite some time.

RPM (from @rpm5.org, in Cooker) uses two mapping techniques, one for time and one for digests.
Time elements from metadata (like RPMTAG_BUILDTIME) are mapped directly into a
UUIDv1. Digest elements (like the header+payload MD5) are mapped into a UUIDv5
based on a SHA1 of a conventional string representation. For RPMTAG_PKGID, the plaintext
is a prefix with several configurable components, the
last of which is the header+payload MD5 digest displayed in hex. So literally
the configurable/conventional string from which a UUIDv5 is constructed for RPMTAG_PKGID
looks like this (in the @rpm5.org implementation) at the moment:
 http://rpm5.org/package/Pkgid/c9a13d03c9a45512781677bae89fe908

The lead-in "http://rpm5.org" dictates an administrative authority, not unlike
com.example.whatever in Java, and can be changed to taste. The administrative
authority for this UUID scheme is most definitely "rpm5.org" no matter what
anyone else chooses to do.

The "/package" piece is from D.J. Bernstein's packaging scheme, to inject
some hierarchical path ordering into the plain text. (Entirely arbitrarily
chosen, but I *am* a djb fanboi.)

RPM then adds the tag name (i.e. RPMTAG_PKGID -> "/Pkgid") and appends the
value in hex.
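
As a rough sketch of the string construction described above, the following Python builds the plaintext name and derives a UUIDv5 from it. Python's standard `uuid` module stands in for the OSSP library here. The choice of the RFC 4122 URL namespace (`uuid.NAMESPACE_URL`) is an assumption: the mail does not say which namespace UUID the @rpm5.org implementation passes in, so the result is not guaranteed to match what RPM prints. The names `pkgid_name` and `pkgid_uuid` are hypothetical.

```python
import uuid


def pkgid_name(pkgid_hex, authority="http://rpm5.org", tag="Pkgid"):
    """Build the conventional plaintext: authority + "/package" + "/Tag" + "/hex"."""
    return f"{authority}/package/{tag}/{pkgid_hex}"


def pkgid_uuid(pkgid_hex, authority="http://rpm5.org", tag="Pkgid"):
    """Derive a name-based (SHA-1) UUIDv5 from the conventional plaintext."""
    # ASSUMPTION: the URL namespace is used; the source does not state this.
    return uuid.uuid5(uuid.NAMESPACE_URL, pkgid_name(pkgid_hex, authority, tag))


pkgid = "c9a13d03c9a45512781677bae89fe908"  # hex digest from the example string
name = pkgid_name(pkgid)
u = pkgid_uuid(pkgid)
print(name)
print(u, "version", u.version)
```

Because the UUID depends only on the plaintext, the same digest under the same authority always yields the same identifier, and changing the authority prefix yields a different, unrelated one.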

Since UUID's are likely quite mysterious to most, here are two examples of
how to use the OSSP uuid tool with UUID's generated by RPM. /usr/bin/uuid isn't hard
to use at all, even if the global nature of attaching UUID's to all *.rpm packages
that have ever been produced is likely a bit daunting.

Two examples of build system related information mapped into UUID's should illustrate
how existing *.rpm packages could/should/would fit into UUID's as "identification",
as well as illustrate the OSSP /usr/bin/uuid executable:

1) RPMTAG_BUILDTIME -> UUIDv1 (direct conversion from {seconds,microsecs} time stamp)

$ uuid -d `rpm -q --qf '%{buildtime:uuid}\n' bash`
encode: STR: f22c5c00-4828-11e0-8000-0007e96e1a1a
       SIV: 321903502045740071773501337054363916826
decode: variant: DCE 1.1, ISO/IEC 11578:1996
       version: 1 (time and node based)
       content: time: 2011-03-06 19:35:52.000000.0 UTC
                clock: 0 (usually random)
                node: 00:07:e9:6e:1a:1a (global unicast)
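
The time mapping can be checked by hand. The sketch below packs a {seconds, microsecs} time stamp into the RFC 4122 version 1 field layout and unpacks it again, using Python's standard `uuid` module rather than the OSSP library. It assumes a clock sequence of 0 and takes the node as a parameter, matching the values in the output above; how RPM itself picks the node is not stated in this mail. The helper names are hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns ticks between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch).
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def uuid1_from_time(seconds, microseconds=0, node=0, clock_seq=0):
    """Pack a Unix time stamp into the UUIDv1 field layout."""
    ticks = (seconds * 1_000_000 + microseconds) * 10 + UUID_EPOCH_OFFSET
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000      # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


def time_from_uuid1(u):
    """Recover the UTC time stamp stored in a UUIDv1."""
    ticks = u.time - UUID_EPOCH_OFFSET
    return datetime(1970, 1, 1, tzinfo=timezone.utc) + timedelta(microseconds=ticks // 10)


built = uuid1_from_time(1299440152, node=0x0007E96E1A1A)
print(built)                   # f22c5c00-4828-11e0-8000-0007e96e1a1a
print(time_from_uuid1(built))  # 2011-03-06 19:35:52+00:00
```

The build time 1299440152 (seconds since the Unix epoch) round-trips to the same STR and time values that `uuid -d` reports above.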

2) RPMTAG_PKGID -> UUIDv5 (by appending hex to a conventional string based on djb's /package)

$ uuid -d `rpm -q --qf '%{pkgid:uuid}\n' bash`
encode: STR: 4210c283-5b32-5a0b-b9a1-606754021e72
       SIV: 87816069666117410060395078597516336754
decode: variant: DCE 1.1, ISO/IEC 11578:1996
       version: 5 (name based, SHA-1)
       content: 42:10:C2:83:5B:32:0A:0B:39:A1:60:67:54:02:1E:72
                (not decipherable: truncated SHA-1 message digest only)
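
The "content" line differs from the "STR" line in two bytes (0A for 5a, 39 for b9). Judging from the values shown, that is the UUID with the 4 version bits and the 2 variant bits cleared, leaving only the truncated SHA-1 bits. This reading is inferred from the output above, not taken from the OSSP documentation. A minimal sketch with Python's standard `uuid` module (the helper name `v5_content` is hypothetical):

```python
import uuid


def v5_content(u):
    """Return the UUID bytes with the version and variant bits masked out."""
    raw = bytearray(u.bytes)
    raw[6] &= 0x0F  # clear the 4 version bits (top nibble of byte 6)
    raw[8] &= 0x3F  # clear the 2 variant bits (top two bits of byte 8)
    return ":".join(f"{b:02X}" for b in raw)


u = uuid.UUID("4210c283-5b32-5a0b-b9a1-606754021e72")
print(u.version)      # 5
print(v5_content(u))  # 42:10:C2:83:5B:32:0A:0B:39:A1:60:67:54:02:1E:72
```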

Again, please note that it's the *next* build of RPM that has the ":uuid"
header format extension wired up to display UUID's in string (i.e. the "STR" representation)
format.

There's lots of ways to generate UUID's in scripting, and with bindings, and
with a tremendous amount of flexibility to handle other conventions. That
is in fact the power of using UUID's for an identification namespace: you
can avoid re-inventing the wheel again and again and again (Mageia might
wish to figger how their sensible binrepo identifiers map into UUID's too).

I'd suggest that there's only benefit to committing to a UUID namespace form
for *.rpm package identifiers no matter what RPM one chooses to use. The
above scheme retrofits onto all *.rpm packages that have RPMTAG_BUILDTIME
and RPMTAG_PKGID tags in metadata, generalizes (through the hierarchy
I've cobbled together with "/package/Taghere/Valuehere" mappings), and
also extends (by choosing NEVRA as a hierarchical identifier) directly
to binary/source package components rather straightforwardly.

But feel free to propose your own "identification" mapping if you wish;
this just happens to be already implemented, and the UUID namespace in RFC 4122
can support bajillions of "Have it your own way!" identification schemes
without any difficulty whatsoever.

Blueprint information

Status: Complete
Approver: Jeff Johnson
Priority: Medium
Drafter: Jeff Johnson
Direction: Approved
Assignee: Jeff Johnson
Definition: Review
Series goal: Accepted for 5.3
Implementation: Implemented
Milestone target: None
Started by: devzero2000
Completed by: devzero2000

Whiteboard

Yup, already done. Search CHANGES for "permit --qf '%{RPMTAG:uuid}' UUIDv1/UUIDv5 output display." (by jbj).
