Add a collation reproducing the pre-5.1.24 behavior of utf8_general_ci and ucs2_general_ci to Percona Server 5.1

Registered by Alexey Kopytov

The fix for MySQL bug http://bugs.mysql.com/bug.php?id=27877 introduced an incompatible change to the utf8_general_ci and ucs2_general_ci collations in MySQL 5.1.24: the German letter 'ß' (U+00DF, LATIN SMALL LETTER SHARP S) now compares equal to 's'. As a result:

1. Any indexes on columns that are defined with those collations and contain 'ß' must be rebuilt after upgrading from 5.0, or from 5.1.23 or earlier.
2. Unique constraints may be broken after the upgrade, because values that were previously distinct can become duplicates.
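
The second point can be illustrated with a toy model. The sketch below is not MySQL's collation code: it assumes a simplified comparison key that only folds ASCII case and, for the post-5.1.24 behavior, additionally treats 'ß' as 's'. The function names are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def key_old(value):
    """Toy pre-5.1.24 key: fold ASCII case only; 'ß' keeps its own weight."""
    return "".join(ch.upper() if ch.isascii() else ch for ch in value)


def key_new(value):
    """Toy post-5.1.24 key: like key_old, but 'ß' compares equal to 's'."""
    return key_old(value.replace("ß", "s"))


def new_duplicates(values):
    """Return groups of values that were distinct under key_old
    but compare equal under key_new."""
    groups = defaultdict(list)
    for v in values:
        groups[key_new(v)].append(v)
    # Keep only groups that held more than one distinct value before the change.
    return [g for g in groups.values() if len({key_old(v) for v in g}) > 1]
```

Under these assumptions, `new_duplicates(["Maß", "mas", "Mass"])` groups "Maß" with "mas": two rows that a unique index accepted before the upgrade would conflict after it.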

A patch proposed in http://bugs.mysql.com/bug.php?id=43593 addresses this problem by introducing new collations, utf8_german3_ci and ucs2_german2_ci, which reproduce the sort order of the pre-5.1.24 versions of the xxx_general_ci collations. After upgrading from 5.0, one can then run, for example, ALTER TABLE ... CONVERT TO CHARACTER SET utf8 COLLATE utf8_german3_ci on the affected tables, or change the collation of specific columns.

That patch, however, has not been merged into any MySQL tree at the time of writing. It is also unclear whether there are any plans to merge it.

This blueprint is to port the collations from that patch, possibly under different names. The name "german" is confusing to the many users whose utf8_general_ci columns held values not specific to the German language before the upgrade, and questions about the naming are quite frequent.

The proposed new names are utf8_general50_ci and ucs2_general50_ci.
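
When several tables are affected, the conversion statement mentioned above could be generated rather than typed by hand. The following is a minimal sketch, assuming the proposed collation name utf8_general50_ci is the one available on the server; the helper name and the table name in the usage note are hypothetical.

```python
def convert_statement(table, charset="utf8", collation="utf8_general50_ci"):
    """Build an ALTER TABLE ... CONVERT TO statement for one table.

    The default collation is the name proposed in this blueprint (an
    assumption about what the server provides).
    """
    # Quote the identifier MySQL-style, doubling any embedded backticks.
    quoted = "`" + table.replace("`", "``") + "`"
    return (
        f"ALTER TABLE {quoted} CONVERT TO CHARACTER SET {charset} "
        f"COLLATE {collation};"
    )
```

For a hypothetical table named customers, `convert_statement("customers")` yields a statement that can be reviewed and then run against the upgraded server.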

References:

http://bugs.mysql.com/bug.php?id=27877
http://bugs.mysql.com/bug.php?id=43306
http://bugs.mysql.com/bug.php?id=43593

Blueprint information

Status: Complete
Approver: Stewart Smith
Priority: Undefined
Drafter: Alexey Kopytov
Direction: Needs approval
Assignee: Alexey Kopytov
Definition: Approved
Series goal: Accepted for 5.1
Implementation: Implemented
Milestone target: 5.1.58-12.9
Started by: Alexey Kopytov
Completed by: Alexey Kopytov
