Comment 1 for bug 1408088

Kurt Huwig (k-huwig-f) wrote : Re: not able to upload binary files when booting a vm

For me, the description in the bug is confusing, since every binary file can be encoded into UTF-8, which is 8-bit clean.

The error message I get is

ERROR: 'ascii' codec can't decode byte 0xdc in position 24: ordinal not in range(128)

So to me it sounds as if the code tries to transcode the file from ASCII to UTF-8. Since ASCII only uses 7 bits, any byte with the high bit set cannot be interpreted. You get the same effect when trying this:

$ recode ascii..utf8 < /bin/ping > /dev/null
recode: Invalid input in step `ANSI_X3.4-1968..UTF-8'
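The same failure can be sketched in Python 3. This is only an illustration: the sample bytes are made up (ASCII padding with 0xdc placed at offset 24 to mirror the error message), and try_decode is a hypothetical helper, not code from the affected project.

```python
# Hypothetical sample data: 24 ASCII bytes, then 0xdc (high bit set),
# then more ASCII. This is not the content of any real file.
data = b"A" * 24 + b"\xdc" + b"B" * 4


def try_decode(raw: bytes, encoding: str) -> str:
    """Return 'ok' if raw decodes with the given encoding, else the error text."""
    try:
        raw.decode(encoding)
        return "ok"
    except UnicodeDecodeError as exc:
        return str(exc)


# ASCII defines only the values 0..127, so 0xdc (220) is rejected.
print(try_decode(data, "ascii"))
# Pure 7-bit input decodes without complaint.
print(try_decode(b"plain text", "ascii"))
```

The first call reports a problem with byte 0xdc at position 24; the second prints "ok".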

Also, you cannot interpret binary files as UTF-8, because not all byte sequences are valid UTF-8:

$ recode utf8..iso8859-1 < /bin/ping > /dev/null
recode: Invalid input in step `UTF-8..ISO-8859-1'
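A minimal Python 3 sketch of why arbitrary bytes are not necessarily valid UTF-8. The helper name is_valid_utf8 and the sample byte strings are assumptions chosen for illustration.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if raw is a well-formed UTF-8 byte sequence."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0xdc starts a two-byte sequence, so a continuation byte (0x80..0xbf)
# must follow. 0x42 ('B') is not one, so the sequence is malformed.
print(is_valid_utf8(b"\xdc\x42"))

# 0xff never appears anywhere in well-formed UTF-8.
print(is_valid_utf8(b"\xff"))

# A correctly encoded two-byte character is accepted.
print(is_valid_utf8("\u00dc".encode("utf-8")))
```

Binary files such as executables typically contain many sequences like the first two, which is why decoding them as UTF-8 tends to fail.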

IMHO the root problem is the interpretation of a binary file as a string in some encoding. If there is a need to interpret it as a string, I suggest using an 8-bit clean charset like ISO-8859-1:

$ recode iso8859-1..utf8 < /bin/ping > /dev/null

works fine. Of course, this would break multi-byte UTF-8 characters in text files if, and only if, they are interpreted as text rather than just copied somewhere. But this is the fundamental problem: binary files should be treated as such and not interpreted as text, while text files should be handled with the encoding configured for the platform, e.g. UTF-8.
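Both halves of this point can be sketched in Python 3: ISO-8859-1 maps every byte value to exactly one character, so a round trip is lossless, but UTF-8 text read that way turns into mojibake. The function name roundtrip_latin1 is hypothetical and used only for this illustration.

```python
def roundtrip_latin1(raw: bytes) -> bytes:
    """Decode bytes as ISO-8859-1 and encode them back unchanged."""
    return raw.decode("iso8859-1").encode("iso8859-1")


# Every possible byte value survives the round trip.
all_bytes = bytes(range(256))
print(roundtrip_latin1(all_bytes) == all_bytes)

# The cost: a multi-byte UTF-8 character becomes one character per byte
# as soon as the result is treated as text instead of just being copied.
utf8_bytes = "\u00e9".encode("utf-8")      # e with acute accent, two bytes in UTF-8
as_text = utf8_bytes.decode("iso8859-1")   # two separate Latin-1 characters
print(len(utf8_bytes), len(as_text))
```

As long as the decoded string is only passed along and encoded back with the same charset, the original bytes are preserved; the damage appears only when the string is displayed or processed as text.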