strings should be normalized
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself | Won't Fix | Low | Unassigned | | |
Bug Description
Unicode strings should be normalized in some form, probably NFC for better legacy compatibility.
Right now translators can type decomposed or composed characters, but Launchpad doesn't normalize the strings when saving or when searching.
For example, one translator might use a keyboard that produces precomposed characters such as 'é' (U+00E9), and another a keyboard that produces decomposed sequences such as 'é' (U+0065 followed by the combining acute accent U+0301). Launchpad doesn't consider these two to be the same, yet Unicode defines them as canonically equivalent.
Another example is search: searching for the composed "é" and searching for the decomposed "é" give different results when they should give the same results.
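To make the mismatch concrete, here is a minimal sketch using Python's standard `unicodedata` module (written for Python 3; the helper name `nfc` is only for illustration and is not Launchpad code). It shows that the two spellings compare unequal as raw strings and equal once both are normalized to NFC.

```python
import unicodedata

composed = "\u00e9"      # 'é' as one precomposed code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT


def nfc(text):
    """Return text in Unicode Normalization Form C (illustrative helper)."""
    return unicodedata.normalize("NFC", text)


# Raw comparison treats the two spellings as different strings.
raw_equal = composed == decomposed

# After NFC normalization both spellings become the same code point sequence.
nfc_equal = nfc(composed) == nfc(decomposed)

print(raw_equal, nfc_equal)
```

Under NFC the decomposed sequence collapses to the single code point U+00E9, which is why NFC tends to be the friendlier choice for legacy software that expects precomposed characters.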
NFC is strongly suggested since it is the form used by the W3C Character Model. See http://
See http://
Changed in launchpad: | |
importance: | Undecided → Medium |
See here for a function that can do this for us: http://www.python.org/doc/2.4/lib/module-unicodedata.html
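As a rough sketch of how that function could be applied, the snippet below normalizes strings both before storing them and before searching. The names `normalize_translation` and `search` are hypothetical and are not part of Launchpad; the only real API used is `unicodedata.normalize`, and the code is written for Python 3 rather than the 2.4 release linked above.

```python
import unicodedata


def normalize_translation(text):
    """Return text in NFC. Hypothetical hook to call before saving a string."""
    return unicodedata.normalize("NFC", text)


def search(stored_strings, query):
    """Naive substring search that normalizes both sides to NFC.

    Hypothetical stand-in for a real search; it only illustrates that the
    query and the stored data must go through the same normalization.
    """
    needle = normalize_translation(query)
    return [s for s in stored_strings if needle in normalize_translation(s)]


# One string stored with a precomposed 'é', one with a decomposed 'é'.
stored = ["caf\u00e9", "cafe\u0301", "tea"]

# A decomposed query now matches both stored spellings.
matches = search(stored, "e\u0301")
print(len(matches))
```

Normalizing once at save time would make the per-row normalization in `search` unnecessary for new data; it is kept here because strings saved before any such change could still be in either form.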