scanner core dumps on loggerhead trunk + python 2.6.2

Bug #586122 reported by Max Kanat-Alexander
This bug affects 1 person
Affects: Meliae
Status: Fix Released
Importance: High
Assigned to: John A Meinel
Milestone: 0.2.1rc1

Bug Description

I modified my local loggerhead to use bzrlib.breakin, and then did a SIGQUIT immediately after loggerhead started up.

Then, doing scanner.dump_all_objects('meliae.dump') causes this:

 python: Objects/typeobject.c:2672: type_traverse: Assertion `type->tp_flags & (1L<<9)' failed.
 Aborted (core dumped)

Max Kanat-Alexander (mkanat) wrote :

Here's a normal "bt" (not "bt full"), which may be easier to read if you don't need the (mostly-optimized-out) local variables.

John A Meinel (jameinel) wrote : Re: [Bug 586122] [NEW] scanner core dumps on loggerhead trunk + python 2.6.2

Max Kanat-Alexander wrote:
> Public bug reported:
>
> I modified my local loggerhead to use bzrlib.breakin, and then did a
> SIGQUIT immediately after loggerhead started up.
>
> Then, doing scanner.dump_all_objects('meliae.dump') causes this:
>
> python: Objects/typeobject.c:2672: type_traverse: Assertion `type->tp_flags & (1L<<9)' failed.
> Aborted (core dumped)

 sigh

I've never seen this before in the wild, but it seems the failing assertion is this one in type_traverse:

 /* Because of type_is_gc(), the collector only calls this
    for heaptypes. */
 assert(type->tp_flags & Py_TPFLAGS_HEAPTYPE);

I'm using tp_traverse to find referenced objects. I could check that an
object is in the heap first, but I actually make active use of objects
that aren't in GC, but do reference other objects. (StaticTuple)

Maybe this is happening because you are running a debug build of python,
while I've only run with release versions, so I've never seen the assert
trip.

I suppose we could do a specific 'if is PyType and not TPFLAGS_HEAPTYPE'
check...
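A rough Python-level illustration of that check follows. It is only a sketch: the real scanner works at the C level and would test tp_flags directly, and the helper names here are hypothetical. It relies on the fact that CPython exposes tp_flags as `__flags__` on type objects and that Py_TPFLAGS_HEAPTYPE is bit 9 (the `1L<<9` in the assertion).

```python
# Py_TPFLAGS_HEAPTYPE is bit 9 of tp_flags in CPython.
HEAPTYPE_FLAG = 1 << 9


def is_static_type(obj):
    """True for type objects that are not heap types (e.g. built-ins like int)."""
    return isinstance(obj, type) and not (obj.__flags__ & HEAPTYPE_FLAG)


def safe_to_traverse(obj):
    """Hypothetical guard: skip traversal only for statically allocated types.

    Non-type objects (including ones outside the GC) are still traversed.
    """
    return not is_static_type(obj)
```

With this guard, a class defined in Python is still traversed, while built-in types such as int or dict are skipped, which is the case the assertion in type_traverse rejects.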

For now, I would try to avoid that check somehow, if you're just trying
to get info. I've never had problems myself, so I don't really
understand why the assertion is there.

John
=:->

Max Kanat-Alexander (mkanat) wrote :

  Okay. FWIW, this is the current version of Python that comes with Fedora 12. Looking at how many "value optimized out" statements there are in the backtrace, it looks more like a heavily-optimized one than a debug build, but I will look at the build options used in the SRPM.

Max Kanat-Alexander (mkanat) wrote :

Okay, so looking over it, it's actually built with NDEBUG, and --with-pydebug is not set. Here's some python-config output:

[build@es-compy ~]$ python2.6-config --cflags
-I/usr/include/python2.6 -I/usr/include/python2.6 -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC
[build@es-compy ~]$ python2.6-config --libs
-lpthread -ldl -lutil -lm -lpython2.6
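The same information can be read from inside the interpreter. The sketch below uses the standard `sysconfig` module (Python 2.7 and later; on 2.6 the equivalent call lives in `distutils.sysconfig`). The function name is hypothetical. Note that it reports the flags recorded at configure time, just as python-config does, which is not a guarantee of how every object file in the binary was compiled.

```python
import sys
import sysconfig


def build_flags_summary():
    """Summarize what the interpreter reports about its own build flags."""
    # These variables can be None on some platforms (e.g. Windows).
    cflags = sysconfig.get_config_var("CFLAGS") or ""
    opt = sysconfig.get_config_var("OPT") or ""
    return {
        # Set when Python was configured --with-pydebug.
        "pydebug": bool(sysconfig.get_config_var("Py_DEBUG")),
        # Whether the recorded compiler flags disable C assert().
        "ndebug_in_flags": "-DNDEBUG" in (cflags + " " + opt),
        # sys.gettotalrefcount only exists in --with-pydebug builds.
        "has_gettotalrefcount": hasattr(sys, "gettotalrefcount"),
    }
```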

Max Kanat-Alexander (mkanat) wrote :

John: Could I see the output of the equivalent python-config commands on your system where meliae can successfully scan loggerhead?

John A Meinel (jameinel) wrote : Re: [Bug 586122] Re: scanner core dumps on loggerhead trunk + python 2.6.2

Max Kanat-Alexander wrote:
> John: Could I see the output of the equivalent python-config commands on
> your system where meliae can successfully scan loggerhead?
>

Well, I'm normally using Windows, which I don't think helps you much.
I've also been successful with Ubuntu (Lucid), which has this:

% python-config --libs
-lpthread -ldl -lutil -lm -lpython2.6
% python-config --cflags
-I/usr/include/python2.6 -I/usr/include/python2.6 -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes
% python-config --ldflags
-L/usr/lib/python2.6/config -lpthread -ldl -lutil -lm -lpython2.6

I tried to push up a fix here:
  lp:///~jameinel/meliae/skip_static_type_traverse_bug_586122

Can you give that branch a try, and see if it works for you?

John
=:->

Max Kanat-Alexander (mkanat) wrote :

Hmm. Well, it no longer crashes, but the dump file is 273 bytes and contains only three lines, even though loggerhead is using 473MB of RAM.

Perhaps it's related to Pyrex version? I have 0.9.8.4 installed.
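A quick way to sanity-check a dump from the same shell is to compare its size and line count against what the process is known to hold. This is a generic sketch with a hypothetical helper name; it does not parse the dump format.

```python
import os


def dump_stats(path):
    """Return (size_in_bytes, line_count) for a dump file."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        lines = sum(1 for _ in f)
    return size, lines
```

A result of a few hundred bytes and a handful of lines for a process using hundreds of megabytes suggests the scan stopped almost immediately.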

John A Meinel (jameinel) wrote :

Max Kanat-Alexander wrote:
> Hmm. Well, it no longer crashes, but the dump file is 273 bytes and
> contains only three lines, even though loggerhead is using 473MB of RAM.
>
> Perhaps it's related to Pyrex version? I have 0.9.8.4 installed.
>

Certainly doesn't seem right. And I really don't see how the changes I
made would have caused that. It would probably be easiest for us to
debug if we can find some overlap time on IRC.

I certainly can say that I ran a quick 'bzr branch' on the local system,
used SIGQUIT to interrupt it and get a debug shell, dumped memory and
got out a 129MB dump file, which is at least ballpark correct, (not 273
bytes and only 3 lines. :).

For now, all I can say is: make sure you properly rebuilt the library
and are using the correct version.

John
=:->

Max Kanat-Alexander (mkanat) wrote :

Hey John. Well, I'm pretty sure that I rebuilt it properly and am using the right version. I removed any other meliae file or directory in /usr/lib64/python2.6/site-packages/ and then did a "python setup.py install" on the checkout. Then I did a SIGQUIT on serve-branches to get a shell, and then I did:

from meliae import scanner
scanner.dump_all_objects('my.dump')

John A Meinel (jameinel) wrote :

Max Kanat-Alexander wrote:
> Hey John. Well, I'm pretty sure that I rebuilt it properly and am using
> the right version. I removed any other meliae file or directory in
> /usr/lib64/python2.6/site-packages/ and then did a "python setup.py
> install" on the checkout. Then I did a SIGQUIT on serve-branches to get
> a shell, and then I did:
>
> from meliae import scanner
> scanner.dump_all_objects('my.dump')
>

hmmm... you might want to set PYTHONPATH and then run from source.

python setup.py build_ext -i

SIGQUIT
> import sys
> sys.path.append(...) # or insert(0, ...)
> from meliae import scanner
> scanner.dump_all_objects('my.dump')
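To confirm which copy of a module the debug shell actually picked up, its file path can be checked after import. The helper below is a hypothetical sketch using the standard `importlib` module (Python 2.7 and later).

```python
import importlib


def module_path(name):
    """Import a module by dotted name and return the file it was loaded from."""
    module = importlib.import_module(name)
    # Built-in modules have no __file__, so fall back to None.
    return getattr(module, "__file__", None)
```

For example, module_path("meliae.scanner") should point into the source checkout rather than site-packages if the path setup above took effect.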

John
=:->

Max Kanat-Alexander (mkanat) wrote :

I tried removing all the meliae stuff from /usr/lib64/ and symlinking in the built lib.x86_64.whatever/meliae/ directory into my loggerhead directory, with the same result.

John A Meinel (jameinel)
Changed in meliae:
milestone: none → 0.2.1rc1
status: New → Fix Released
importance: Undecided → High
assignee: nobody → John A Meinel (jameinel)
Max Kanat-Alexander (mkanat) wrote :

So, the real problem here is that even though python-config --cflags says NDEBUG, the Fedora 12 python (and the Fedora 11 python) are actually built without setting NDEBUG.

So perhaps the checks only need to be wrapped in ifdefs for when we're on a debug build.
