Merge lp:~max-rabkin/ibid/unihan-simp-trad into lp:~ibid-core/ibid/old-trunk-1.6
Status: | Merged |
---|---|
Approved by: | Stefano Rivera |
Approved revision: | 928 |
Merged at revision: | 928 |
Proposed branch: | lp:~max-rabkin/ibid/unihan-simp-trad |
Merge into: | lp:~ibid-core/ibid/old-trunk-1.6 |
Diff against target: | 123 lines (+35/-14), 1 file modified: ibid/plugins/conversions.py (+35/-14) |
To merge this branch: | bzr merge lp:~max-rabkin/ibid/unihan-simp-trad |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Keegan Carruthers-Smith | Approve | ||
Jonathan Hitchcock | Approve | ||
Stefano Rivera | Approve | ||
Review via email: mp+23995@code.launchpad.net |
Commit message
Add traditional and simplified variants to unihan information.
Clean up white-space style in unihan.
Description of the change
In the People's Republic of China and Singapore, some Han characters have been simplified. This can cause confusion when these characters are discussed and some participants are familiar with only one version. The Unihan database records the simplified and traditional variants, so this patch adds them to Ibid's Unicode character information.
Some examples for testing:
U+56FD (国) and U+570B (國) are respectively simplified and traditional versions of each other.
U+4E00 (一) and U+65E5 (日) are the same in the simplified and traditional script, though the former has other variants (which we ignore for now).
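For anyone checking these examples by hand, here is a minimal sketch of reading the simplified/traditional relation from lines in the tab-separated layout of Unihan's Unihan_Variants.txt (code point, field name, value). The sample lines and the helper name `parse_variants` are illustrative assumptions, not the code in this branch; the patch itself is in ibid/plugins/conversions.py and may obtain the data differently.

```python
# Illustrative sample in the Unihan_Variants.txt layout:
# code point <TAB> field name <TAB> space-separated values.
SAMPLE = (
    "U+56FD\tkTraditionalVariant\tU+570B\n"
    "U+570B\tkSimplifiedVariant\tU+56FD\n"
)

# Only these two fields are of interest here; other variant kinds are skipped.
FIELDS = ("kSimplifiedVariant", "kTraditionalVariant")


def parse_variants(text):
    """Map each character to {field: [variant characters]} (hypothetical helper)."""
    variants = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        codepoint, field, value = line.split("\t", 2)
        if field not in FIELDS:
            continue
        char = chr(int(codepoint[2:], 16))  # "U+56FD" -> "国"
        targets = [chr(int(v[2:], 16)) for v in value.split()]
        variants.setdefault(char, {})[field] = targets
    return variants


table = parse_variants(SAMPLE)
print(table["\u56fd"])           # entry for U+56FD
print(table.get("\u4e00", {}))   # U+4E00 has no entry in this sample
```

With this sample, U+56FD maps to U+570B as its traditional variant and vice versa, while U+4E00 and U+65E5 simply have no entry, which matches the "same in both scripts" case above.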
+ def variant (self):
Whitespace style, please.
+ variant, _ = variant.contents[0].split(None, 1)
I'm not mad about using _ for an ignore as it's often used as a translation function.
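One possible way to address this, sketched with a hypothetical input string (the real value comes from `variant.contents[0]`, whose exact text is not shown in this review): index the split result instead of unpacking the remainder into a throwaway name. The helper name `first_token` is an assumption for illustration only.

```python
def first_token(text):
    """Return the first whitespace-delimited token of text (hypothetical helper)."""
    # split(None, 1) splits on a run of whitespace at most once;
    # taking [0] keeps the first token without binding the rest to `_`.
    return text.split(None, 1)[0]


print(first_token("U+570B remaining text"))
```

This leaves `_` free for a translation function if one is ever imported into the module.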