Comment 42 for bug 989496

az (az-debian) wrote:

i've dug into this a bit more deeply and found an explanation for the logging gotchas under certain locales:

as per the python wiki (http://web.archive.org/web/20120425192131/http://wiki.python.org/moin/UnicodeEncodeError), when you call somestring.decode(whicheverencoding) on a somestring that is already unicode, python2 first implicitly encodes it to a byte string with the default codec (usually ascii) and only then decodes the result.

log.py uses s.decode("utf8", "ignore"), which fails to ignore errors on some locales: the "ignore" handler apparently only covers the decode itself, not the implicit encoding step that runs before it.
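a minimal sketch of that encode-then-decode behaviour, written so that it also runs on python3. py2_style_decode is a hypothetical helper that only mimics what python2 does implicitly, and the ascii default codec is an assumption (it is whatever sys.getdefaultencoding() reports on your system):

```python
def py2_style_decode(text, encoding, errors="strict", default_encoding="ascii"):
    """Mimic python2's unicode.decode(): implicit strict encode, then decode."""
    # the implicit encode is always strict; the caller's error handler
    # is only handed to the decode step that follows
    raw = text.encode(default_encoding)
    return raw.decode(encoding, errors)


already_unicode = u"caf\xe9"  # unicode text containing one non-ascii character

try:
    py2_style_decode(already_unicode, "utf8", "ignore")
    outcome = "decoded"
except UnicodeEncodeError:
    # raised by the implicit encode, so "ignore" never gets a chance to apply
    outcome = "UnicodeEncodeError"

print(outcome)
```

pure-ascii unicode passes through the same helper without complaint, which would explain why the problem only shows up once a non-ascii character reaches the logger.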

check out the attached test script, which contains an iso8859 string to log. if you run it (at least under python 2.6) with LC_CTYPE set to any utf8 locale or to plain C/POSIX, it works fine. if your LC_CTYPE is an iso8859-1 locale, the first decode of x works but the second decode fails, and we get the "'ascii' codec can't encode" complaint (a UnicodeEncodeError).

i've just changed the debian version (0.6.20-2) to do the decode in log.py conditionally:

_logger.log(DupToLoggerLevel(verb_level), s if (isinstance(s,unicode)) else s.decode("utf8", "ignore"))
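the same conditional decode can be sketched as a small standalone helper. to_text and text_type are hypothetical names used for illustration only, they are not part of log.py; the type(u"") trick is just there so the sketch runs on both python2 and python3:

```python
text_type = type(u"")  # unicode on python2, str on python3


def to_text(s, encoding="utf8", errors="ignore"):
    """Return s as unicode text, decoding only if it is still a byte string."""
    if isinstance(s, text_type):
        # already unicode: calling decode() here is what triggers the
        # implicit ascii encode on python2, so leave it alone
        return s
    return s.decode(encoding, errors)


print(repr(to_text(u"caf\xe9")))      # unicode input is returned unchanged
print(repr(to_text(b"caf\xc3\xa9")))  # valid utf8 bytes are decoded
print(repr(to_text(b"caf\xe9")))      # invalid utf8 byte is dropped by "ignore"
```

with the check in place the "ignore" handler only ever sees real byte strings, which is the case it was meant for.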

regards,
az