qbzr crashed with UnicodeDecodeError in run_subprocess_command()

Bug #686735 reported by Vasily
This bug affects 5 people
Affects        Status                   Importance  Assigned to     Milestone
Bazaar         Invalid                  Undecided   Unassigned
QBzr           Status tracked in Trunk
  0.20         Fix Released             Critical    Martin Packman
  Trunk        Fix Released             Critical    Martin Packman
bzr (Ubuntu)   Invalid                  Undecided   Unassigned

Bug Description

Binary package hint: bzr

Something about ASCII decoding - my English is bad :(

ProblemType: Crash
DistroRelease: Ubuntu 10.10
Package: bzr 2.2.1-0ubuntu1
ProcVersionSignature: Ubuntu 2.6.35-23.41-generic 2.6.35.7
Uname: Linux 2.6.35-23-generic i686
NonfreeKernelModules: nvidia
Architecture: i386
BzrDebugFlags: set()
BzrVersion: 2.2.1
CommandLine:
 ['/usr/bin/bzr',
  'qsubprocess',
  '--bencode',
  'l8:checkout13:--lightweight47:http://bazaar.ubercart.org/drupal7-uc3/ubercart21:/home/vasily/ubercarte']
CrashDb: bzr
Date: Tue Dec 7 22:48:18 2010
EcryptfsInUse: Yes
ExecutablePath: /usr/bin/bzr
FileSystemEncoding: UTF-8
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release i386 (20100429)
InterpreterPath: /usr/bin/python2.6
Locale: ru_RU.utf8
Platform: Linux-2.6.35-23-generic-i686-with-Ubuntu-10.10-maverick
ProcCmdline: /usr/bin/python /usr/bin/bzr qsubprocess --bencode l8:checkout13:--lightweight47:http://bazaar.ubercart.org/drupal7-uc3/ubercart21:/home/username/ubercarte
ProcEnviron:
 LANG=ru_RU.utf8
 SHELL=/bin/bash
PythonVersion: 2.6.6
SourcePackage: bzr
Title: bzr crashed with UnicodeDecodeError in run_subprocess_command()
UserEncoding: UTF-8
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare

Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 912, in exception_to_return_code
    return the_callable(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 1112, in run_bzr
    ret = run(*run_argv)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 690, in run_argv_aliases
    return self.run(**all_cmd_args)
  File "/usr/lib/python2.6/dist-packages/bzrlib/commands.py", line 705, in run
    return self._operation.run_simple(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 135, in run_simple
    self.cleanups, self.func, *args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/cleanup.py", line 165, in _do_with_cleanups
    result = func(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/bzrlib/plugins/qbzr/lib/commands.py", line 767, in run
    return run_subprocess_command(cmd, bencoded)
  File "/usr/lib/python2.6/dist-packages/bzrlib/plugins/qbzr/lib/subprocess.py", line 876, in run_subprocess_command
    val = unicode(val)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 13: ordinal not in range(128)

Revision history for this message
Vasily (piatachki) wrote :
tags: removed: need-duplicate-check
Martin Pool (mbp)
description: updated
Martin Pool (mbp)
visibility: private → public
Martin Packman (gz) wrote :

Can hit the problem by making the blackbox tests use a non-ascii path like so:

=== modified file 'bzrlib/tests/blackbox/test_conflicts.py'
--- bzrlib/tests/blackbox/test_conflicts.py 2010-11-07 16:32:51 +0000
+++ bzrlib/tests/blackbox/test_conflicts.py 2011-04-13 14:05:01 +0000
@@ -43,7 +43,7 @@
         ('%s/myfile' % (this_path,), 'contentsa2\n'),
         ('%s/my_other_file' % (this_path,), 'contentsa2\n'),
         ])
-    this_tree.rename_one('mydir', 'mydir3')
+    this_tree.rename_one('mydir', u'mydir3\xa0')
     this_tree.commit(message='change')
     this_tree.merge_from_branch(other_tree.branch)
     return this_tree, other_tree

Changed in bzr:
assignee: nobody → Martin [gz] (gz)
importance: Undecided → Low
status: New → In Progress
Martin Packman (gz) wrote :

A user on IRC has just run into this as well with QBzr 0.21.0.dev, also during a checkout.

Extract from his log is at:
<http://pastebin.com/H8vaabTX>

Changed in qbzr:
status: New → Confirmed
Martin Packman (gz) wrote :

Ignore comment #2 which is intended for bug 686161 not this one.

Changed in bzr:
assignee: Martin [gz] (gz) → nobody
importance: Low → Undecided
status: In Progress → Invalid
Martin Packman (gz) wrote :

This is a problem from the changes in r1234 (really), which blindly calls unicode() on all non-private attributes of an exception instance. If an exception has a non-ascii bytestring in its arguments, this exception will occur. Other objects that can't be stringified are also a potential issue.

The code is currently:

    try:
        return commands.run_bzr(argv)
    except Exception, e:
        d = {}
        for key, val in e.__dict__.iteritems():
            if not key.startswith('_'):
                if not isinstance(val, unicode):
                    val = unicode(val)
                d[key] = val
        print "%s%s" % (SUB_ERROR,
                        bencode.bencode((e.__class__.__name__,
                                         encode_unicode_escape(d))))
        raise

Adding more try/except handling is an option, as is being more careful about how the exception instance is accessed.
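
To make the failure mode concrete, here is a minimal sketch in Python 3 syntax. It is an illustration only: Python 2's `unicode(val)` on a byte string decodes implicitly as ASCII, which the sketch mimics with an explicit `.decode("ascii")`. The sample text and the `implicit_ascii` helper are assumptions, not taken from qbzr.

```python
# Sketch: why an implicit ASCII decode fails on a non-ASCII byte string.
# The sample value is hypothetical; any UTF-8 encoded Cyrillic text behaves alike.

def implicit_ascii(val):
    """Mimic Python 2's unicode(val) for a byte string: decode as ASCII."""
    return val.decode("ascii")

# Three Cyrillic letters, encoded as UTF-8; every resulting byte is >= 0x80.
raw = u"\u041d\u0435\u0442".encode("utf-8")

failed_at = None
bad_byte = None
try:
    implicit_ascii(raw)
except UnicodeDecodeError as err:
    failed_at = err.start       # index of the first byte the codec rejected
    bad_byte = raw[err.start]   # the rejected byte itself
    print("ascii decode failed at byte %d (0x%02x)" % (failed_at, bad_byte))
```

The rejected byte is the lead byte of a two-byte UTF-8 sequence, the same kind of byte (0xd0) named in the traceback in the bug description.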

Alexander Belchenko (bialix) wrote :

So qbzr is failing to properly report the error from the subprocess to the main GUI process. I think the wrong part here is the blind `val = unicode(val)`, which should explicitly specify the encoding, i.e. `val = unicode(val, ENCODING)`, and maybe even set the error handling to 'replace'.

But I'm not sure what the correct encoding should be here.
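
As a rough illustration of the explicit-encoding idea, the sketch below uses Python 3 syntax, where `bytes.decode(encoding, errors)` plays the role of Python 2's `unicode(val, ENCODING, errors)`. The helper name `decode_lossy` and the sample bytes are assumptions; which encoding qbzr should actually use is the open question above.

```python
def decode_lossy(raw, encoding):
    """Decode bytes with an explicit encoding; undecodable bytes become U+FFFD."""
    return raw.decode(encoding, "replace")

# UTF-8 bytes for a short Cyrillic word (hypothetical sample data).
raw = b"\xd0\x9d\xd0\xb5\xd1\x82"

right = decode_lossy(raw, "utf-8")  # correct guess: the text is recovered
wrong = decode_lossy(raw, "ascii")  # wrong guess: no crash, but the text is lost

print(repr(right))
print(repr(wrong))
```

With 'replace', a wrong encoding guess degrades the message instead of raising a second exception while reporting the first one, which is the trade-off being proposed.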

Changed in qbzr:
importance: Undecided → High
Alexander Belchenko (bialix) wrote :

Ghaa, the annotation suggests that I wanted to encode unicode with the `unicode-escape` encoding, but I didn't do that. Can't say why.

Changed in qbzr:
importance: High → Critical
Alexander Belchenko (bialix) wrote :

I don't think there is any problem with bzr.

Changed in bzr (Ubuntu):
status: New → Invalid
Alexander Belchenko (bialix) wrote :

@Martin[gz]: here is what I think should be the fix for this problem:

=== modified file 'lib/subprocess.py'
--- lib/subprocess.py 2010-10-21 11:32:48 +0000
+++ lib/subprocess.py 2011-04-14 12:55:40 +0000
@@ -890,9 +890,12 @@
         d = {}
         for key, val in e.__dict__.iteritems():
             if not key.startswith('_'):
-                if not isinstance(val, unicode):
-                    val = unicode(val)
-                d[key] = val
+                if isinstance(val, unicode):
+                    d[key] = val
+                elif isinstance(val, str):
+                    d[key] = unicode(val, osutils.get_user_encoding(), 'replace')
+                else:
+                    d[key] = repr(val)
         print "%s%s" % (SUB_ERROR,
                         bencode.bencode((e.__class__.__name__,
                                          encode_unicode_escape(d))))

What do you think about it?

@Vasily: if you can reproduce the problem on your machine, please test my patch.
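
For readers following along on Python 3, the three-way branch in the patch can be sketched as below. This is an approximation under stated assumptions: `str`/`bytes` stand in for Python 2's `unicode`/`str`, the `encoding` parameter stands in for `osutils.get_user_encoding()`, and the function name `stringify_exception_attrs` is hypothetical.

```python
def stringify_exception_attrs(exc, encoding="utf-8"):
    """Return the public attributes of an exception as a dict of text values.

    Hypothetical helper mirroring the patch: text passes through, byte
    strings are decoded leniently, everything else goes through repr().
    """
    result = {}
    for key, val in vars(exc).items():
        if key.startswith("_"):
            continue                      # skip private attributes
        if isinstance(val, str):
            result[key] = val             # already text
        elif isinstance(val, bytes):
            result[key] = val.decode(encoding, "replace")
        else:
            result[key] = repr(val)       # arbitrary object
    return result


# Usage with made-up attribute values:
err = ValueError("boom")
err.path = b"/home/\xd0\x9d"   # non-ASCII byte string
err.msg = "plain text"
err.extra = 42
err._private = object()
attrs = stringify_exception_attrs(err, "utf-8")
print(attrs)
```

Checking the type before decoding means the two-argument decode is only ever applied to byte strings, so arbitrary attribute values cannot trigger a TypeError.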

Alexander Belchenko (bialix) wrote :

I've attached the patch for qbzr, please test it if you can reproduce this problem.

Changed in qbzr:
status: Confirmed → In Progress
assignee: nobody → Alexander Belchenko (bialix)
milestone: none → 0.20.1
Martin Packman (gz) wrote :

That's the right idea Alexander, but the ability to pass multiple parameters to the unicode() function is there to catch you out - it only works on buffer-type objects, and exception instance attributes can be anything.

    >>> unicode(object())
    u'<object object at 0x00AB14C8>'
    >>> unicode(object(), "ascii")
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: coercing to Unicode: need string or buffer, object found

I'm still not quite sure what exactly this exception stringifying needs, but will put up a branch for review.
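
The same trap exists in Python 3, where `str(obj, encoding)` only accepts bytes-like objects. The sketch below is an illustration, not qbzr code; the helper name `try_str_with_encoding` is an assumption.

```python
def try_str_with_encoding(val, encoding="ascii"):
    """Return ('ok', text) or ('TypeError', message) for str(val, encoding)."""
    try:
        return ("ok", str(val, encoding))
    except TypeError as err:
        return ("TypeError", str(err))


one_arg = str(object())                         # one argument: works for any object
bytes_case = try_str_with_encoding(b"abc")      # bytes-like: decoded normally
object_case = try_str_with_encoding(object())   # arbitrary object: rejected

print(one_arg)
print(bytes_case)
print(object_case)
```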

Martin Packman (gz) wrote :

Sorry Alexander, your patch actually avoids that problem, I misread. There's another corner case with repr I think may be worth handling though.

summary: - bzr crashed with UnicodeDecodeError in run_subprocess_command()
+ qbzr crashed with UnicodeDecodeError in run_subprocess_command()
Changed in qbzr:
assignee: Alexander Belchenko (bialix) → nobody