X server at 100% CPU due to superkaramba?

Bug #103603 reported by Zoltan Peczoli
Affects: linux-restricted-modules-2.6.20 (Ubuntu)
Status: Won't Fix
Importance: Undecided
Assigned to: Unassigned

Bug Description

Running Feisty with daily updates. xserver-xorg (7.2-0ubuntu11) freezes when left running for a long period. I use Kubuntu on both machines, so it might be a KDE bug. When I leave my Kubuntu box running overnight, in the morning X takes 100% CPU time and does not redraw the window decorations (oddly enough, it does redraw the window contents and the contents of kicker). I managed to reproduce this on two machines, one with an ATI card and one with an NVidia card, both running the restricted-modules driver. See my strace output, top and dpkg -l output attached.

Revision history for this message
Ralph Janke (txwikinger) wrote :

There is no swap space configured. Do you have the same problem with swap space configured?

Revision history for this message
Zoltan Peczoli (peczoli) wrote :

The attached files came from the NVidia box. The ATI box has 2GB swap, and exhibits the same problem.

Ralph Janke (txwikinger)
Changed in xorg:
assignee: rjanke → nobody
status: Needs Info → Unconfirmed
Revision history for this message
Zoltan Peczoli (peczoli) wrote :

Trying to investigate the cause, I'm starting to suspect the bug is related to a lack of desktop activity for some period of time. Last time it happened after ~2 hours of inactivity. The screensaver was disabled; next I will disable DPMS to see if it has any effect.

Revision history for this message
Zoltan Peczoli (peczoli) wrote :

I turned off DPMS in the KDE system settings, and the problem's gone on both machines.

Revision history for this message
Zoltan Peczoli (peczoli) wrote :

Unfortunately I was wrong. The bug _did_ occur with DPMS switched off. Tell me if you need more info.

Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

It might be interesting to see dmesg for the nvidia machine when this happens...

Revision history for this message
Zoltan Peczoli (peczoli) wrote :

It happened again on the nvidia box. See the attached dmesg; I couldn't see anything wrong in it. Also see a new strace and the X log.

Revision history for this message
Voltaire (jkrueger-muenster) wrote :

After upgrading to Feisty I have the same problem. When I leave my computer alone for several hours, the desktop seems to work (the clock changes, Knewsticker is running), but when I move the mouse the display freezes and the keyboard stops working. In some cases this happens after only a few minutes. Two days ago I disabled the screensaver, but the bug still occurs.

The xorg process is running at nearly 100% CPU time. An strace shows that it loops on this:

--- SIGALRM (Alarm clock) @ 0 (0) ---
sigreturn() = ? (mask now [])

dmesg shows something interesting:

[ 153.572298] NET: Registered protocol family 4
[ 153.636382] NET: Registered protocol family 3
[ 153.727573] NET: Registered protocol family 5
[ 159.658624] ISO 9660 Extensions: Microsoft Joliet Level 3
[ 159.931681] ISO 9660 Extensions: RRIP_1991A
[36590.032279] NETDEV WATCHDOG: eth1: transmit timed out
[36590.032286] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=2035.
[36590.480003] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf26
[36598.474546] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf27
[36606.469088] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf28
[36614.463632] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf29
[36622.458174] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf2a
[36630.452719] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf2b
[36638.447264] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf2c
[36646.441807] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf2d
[36654.436351] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf2e
[36662.430896] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf2f
[36670.425442] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf30
[36678.419985] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf31
[36686.414530] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf32
[36694.409075] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf33
[36702.403618] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf34
[36710.398162] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf35
[36936.795626] NETDEV WATCHDOG: eth1: transmit timed out
[36936.795633] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=2036.
[37134.660593] NETDEV WATCHDOG: eth1: transmit timed out
[37134.660600] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=46288.
[37144.653773] NETDEV WATCHDOG: eth1: transmit timed out

Take a look at the gap between [ 159.931681] and [36590.032279].
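
For anyone who wants to put a number on that gap, here is a minimal Python sketch. It assumes the bracketed value on each dmesg line is seconds since boot; the helper name `parse` is only illustrative, and the sample lines are copied from the excerpt above. For these lines the quiet gap works out to roughly 10 hours, and the NVRM Xid messages repeat about every 8 seconds.

```python
import re

# Matches the "[seconds.microseconds]" prefix on each dmesg line.
STAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")

def parse(lines):
    """Return (seconds_since_boot, message) pairs for timestamped lines."""
    out = []
    for line in lines:
        m = STAMP.match(line.strip())
        if m:
            out.append((float(m.group(1)), m.group(2)))
    return out

# Lines copied from the dmesg excerpt in this comment.
sample = [
    "[ 159.931681] ISO 9660 Extensions: RRIP_1991A",
    "[36590.032279] NETDEV WATCHDOG: eth1: transmit timed out",
    "[36590.480003] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf26",
    "[36598.474546] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf27",
    "[36606.469088] NVRM: Xid (0005:00): 16, Head 00000000 Count 005eaf28",
]

events = parse(sample)

# Time during which nothing was logged, between the first two sample lines.
quiet_gap = events[1][0] - events[0][0]

# Spacing between consecutive NVRM Xid messages.
xid_times = [t for t, msg in events if msg.startswith("NVRM: Xid")]
xid_steps = [b - a for a, b in zip(xid_times, xid_times[1:])]

print(f"quiet gap: {quiet_gap:.1f} s ({quiet_gap / 3600:.1f} h)")
print("seconds between Xid messages:", [round(s, 2) for s in xid_steps])
```

The same `parse` helper could be fed the full output of dmesg to look for other long silences or regular repeats.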

After killing xorg with kill -9 there are new entries in dmesg:

[36936.795626] NETDEV WATCHDOG: eth1: transmit timed out
[36936.795633] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=2036.
[37134.660593] NETDEV WATCHDOG: eth1: transmit timed out
[37134.660600] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=46288.
[37144.653773] NETDEV WATCHDOG: eth1: transmit timed out
[37144.653780] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=1289.
[37450.445078] NETDEV WATCHDOG: eth1: transmit timed out
[37450.445086] eth1: Tx timed out, excess collisions. TSR=0x1e, ISR=0x8, t=74039.
[37460.438258] NETDEV WATCHDOG: eth1: transmit timed out
[37460.438265] eth1: Tx ...


Revision history for this message
Sitsofe Wheeler (sitsofe) wrote :

Voltaire:
Looks like a different, more severe problem. The only people who can help you decode NVRM: Xids are NVIDIA... Probably best to contact them or post your issue on their forums and hope for the best: http://www.nvnews.net/vbulletin/forumdisplay.php?f=13

Revision history for this message
Voltaire (jkrueger-muenster) wrote :

This is now fixed for me. After removing the Network Card (eth1) the system runs stable.

Revision history for this message
Patrick Salami (pat-entitycom) wrote :

This appears to be identical to bug 120347.

I'm running Kubuntu Feisty and I have an nvidia geforce 6200. I encounter the exact same problem: after leaving the computer running overnight, I come back and xorg is at 100% (on one of my CPUs) and the window decorations have disappeared. Since I have a dual-CPU system, my guess is that it is more tolerant of this, so although X is completely frozen (only the mouse moves), I can ctrl+alt+F1 to a console and take down the system cleanly. (However, if I switch back to the X server before rebooting, the window decorations are gone.)

After a reboot everything is fine. I also ran strace on xorg when it goes berserk, but the only noteworthy thing I came up with was also the SIGALRM message. Comparing it to an strace of xorg under normal conditions (when it's not going berserk), however, revealed that the SIGALRM message also comes up during normal operation, and neither that nor any other call that xorg makes while it's berserk indicates anything out of the ordinary.
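
A rough way to make that comparison less dependent on eyeballing is to tally how often each syscall or signal shows up in each trace. The Python sketch below is only an illustration: the helper names `tally` and `shares` are made up, it assumes plain strace output with no PID or timestamp prefix on each line, and the only sample lines used are the two quoted earlier in this thread.

```python
import re
from collections import Counter

# A syscall line starts with a name followed by "(";
# a signal line looks like "--- SIGALRM (Alarm clock) @ 0 (0) ---".
SYSCALL = re.compile(r"^(\w+)\(")
SIGNAL = re.compile(r"^--- (SIG\w+)")

def tally(lines):
    """Count how often each syscall or signal appears in strace output."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        m = SIGNAL.match(line) or SYSCALL.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

def shares(counts):
    """Convert raw counts into fractions of all recorded events."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}

# The loop reported earlier in this thread, repeated twice.
busy = [
    "--- SIGALRM (Alarm clock) @ 0 (0) ---",
    "sigreturn() = ? (mask now [])",
] * 2

busy_counts = tally(busy)
print(busy_counts)
print(shares(busy_counts))
```

Running `tally` over one capture taken while X is stuck and one taken during normal use, then comparing the `shares` of each, would show whether SIGALRM is merely present in both or dominates only the stuck trace.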

Further, I have taken advantage of the dual CPUs to examine all logs in detail as the problem occurs: nothing out of the ordinary, in fact, since the system has been sitting idle, hardly any activity is logged. I checked the KDE settings and they are showing the screen saver to be off. I do have kpowermanager installed (because I had a problem where my network card would stop working after a while otherwise), but it's set to "Performance" and the screen saver, auto-suspend for the screen, and in fact all other power-saving options are unchecked. My screens do turn blank, however, after a period of inactivity, so it's possible that this is related.
On a side note, I have three monitors on two nvidia cards, running on xinerama with the 9755 nvidia binary drivers. All signs point to this not being a driver problem, however, unless nvidia and intel use some of the same libraries in their drivers.

I also wanted to point out that I usually don't lock my screen when I leave, so that might not be relevant to this particular problem.

In addition, my GLX does not work for some reason. Whenever I try to run a GLX-enabled app, the app crashes, even though GLX appears to be installed: the glxinfo command outputs a grid with the GLX info.
Zoltan, please let me know the status of your GLX (try running a GLX screensaver and see what happens), as well as any power management software (such as kpowersave) that you may have running on your system. Also, do you have a screen saver enabled, and do your screens go into auto-suspend mode after a period of inactivity?

This is a really tricky problem and I really have to find a solution because it's starting to happen even after shorter periods of inactivity. The intervals are seemingly random, making the problem virtually impossible to reproduce on demand and difficult to troubleshoot. It usually happens after long periods of idle time (although it has occurred after only a few minutes of idle time), and every time it happens my workflow is interrupted, so it's really starting to bother me. It's also embarrassing if you're about to show someone (e.g. a client) something on your computer and it's co...


Changed in xorg:
status: New → In Progress
Revision history for this message
Zoltan Peczoli (peczoli) wrote :

Do you use superkaramba? Here the problem has been gone on both computers (ATi at home, nVidia at work) since I stopped using superkaramba and disabled DPMS. My GLX is set up with the ATi driver at home. I don't know whether either of these changes, or something else, actually solved the problem; I'm just guessing.

Revision history for this message
Patrick Salami (pat-entitycom) wrote :

I am using superkaramba, but I don't know if the problem is related. How did you turn off DPMS? I can try shutting down acpid but I don't know if that will prevent my monitor from going into standby mode.

Revision history for this message
Zoltan Peczoli (peczoli) wrote :

Just try disabling superkaramba and see if it helps. You can always turn off DPMS with "xset -dpms", but I really suspect superkaramba.
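
To confirm that DPMS really stayed off after running that command, the status reported by "xset q" can be checked. The Python sketch below is a hedged illustration, not a tested tool: the function names are made up, and it assumes the status appears on a line reading "DPMS is Enabled" or "DPMS is Disabled", which is worth checking against the output on your own machine.

```python
import subprocess

def dpms_enabled(xset_q_output):
    """Return True or False from `xset q` text, or None if no status line is found.

    Assumption: the status is printed as "DPMS is Enabled" / "DPMS is Disabled".
    """
    for line in xset_q_output.splitlines():
        text = line.strip()
        if text == "DPMS is Enabled":
            return True
        if text == "DPMS is Disabled":
            return False
    return None

def current_dpms_state():
    """Ask the running X server; needs xset installed and DISPLAY set."""
    result = subprocess.run(
        ["xset", "q"], capture_output=True, text=True, check=True
    )
    return dpms_enabled(result.stdout)
```

Calling `current_dpms_state()` from a terminal inside the X session before leaving the machine idle would show whether something re-enabled DPMS behind your back.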

Revision history for this message
Bryce Harrington (bryce) wrote : linux-restricted-modules-2.6.20 is obsolete

This package has become obsolete so we're closing out the bug report as WONTFIX.
Thanks for reporting it though!

Changed in linux-restricted-modules-2.6.20:
status: In Progress → Won't Fix