[nvidia-glx] Frequent lockups when NV 3d is enabled (solved)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
linux-restricted-modules-2.6.20 (Ubuntu) | Won't Fix | Undecided | Unassigned | | |
Bug Description
Feisty worked well until Nvidia 3D was enabled. (Nice new driver manager, btw ;) )
Now the system locks up hard within 3 to 30 seconds (unless 3D is disabled).
Browsing the forums, it seems several people have this problem; a few have fixed it by using older (?) drivers or by enabling 3D via Envy or Automatix.
Amd64
Ubuntu 7.04(64), herd 5- updated
Nvidia 7300
(btw, why won't the 'package choice' in this bug report acknowledge Feisty 7.04?)
Ric95 (pamric-shaw) wrote : | #1 |
Ric95 (pamric-shaw) wrote : | #2 |
Narrowing the bugsearch:
I have changed to Kubuntu 6.10, lockups are now very rare. The only lockup I have had with this is when dragging a Blender render window.
So it seems likely to be a mix of the Nvidia 3d driver with the later (gnome) window functions ( post Dapper).
I'd be happy to continue working on this bug if anyone wants to ask me to try anything in particular...
Sitsofe Wheeler (sitsofe) wrote : | #3 |
Ric95:
See if there is mention of NVRM: Xid in /var/log/messages ...
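A minimal sketch of that check, for anyone following along. The helper name `find_xid` is made up for illustration, and the default path is simply the one mentioned above; some systems may keep kernel messages in a different log file.

```shell
#!/bin/sh
# Print every line containing "NVRM: Xid" from a log file.
# find_xid is a hypothetical helper name, not part of any NVIDIA tool.
find_xid() {
    log="${1:-/var/log/messages}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    # grep exits with status 1 when no line matches.
    grep 'NVRM: Xid' "$log"
}
```

Calling `find_xid` with no argument reads `/var/log/messages`; passing a path checks another log instead. No output and an exit status of 1 means no Xid message was logged.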
Phillip Heath (mstlyevil) wrote : Re: [nvidia-glx] Frequent lockups when NV 3d enabled | #4 |
I can confirm this bug, as I had the same result using the restricted driver manager on a fresh Feisty installation. I uninstalled the driver using the RDM, then proceeded to install the drivers manually. Manual installation did not fix the issue, and neither did reconfiguring xorg using dpkg-reconfigure xserver-xorg. The kernel module continued to refuse to load, forcing me to do a fresh installation.
After doing a fresh installation I installed the nvidia drivers both manually and with Automatix and this bug was not reproduced either time.
Phillip Heath (mstlyevil) wrote : | #5 |
I forgot to mention I am also on Feisty AMD64 7.04 and I used the alternate installation CD.
Phillip Heath (mstlyevil) wrote : | #6 |
Also I am using a Nvidia 7600 GT.
John Jason Jordan (johnxj) wrote : | #7 |
I was using the nvidia-glx on my laptop with a GeForce4 420 Go 64M and Ubuntu Edgy amd64. All was well until I upgraded to Feisty. At the end of the upgrade the reboot dumped me to the command line and I had to edit xorg.conf and change "nvidia" to "nv" in order to get X to load. I tried reinstalling it, but it won't boot. I also tried the nvidia-glx-new driver, but it also won't boot. I really need the nVidia driver back because it works so much better than the nv driver. Is there any news on when something might be fixed?
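The xorg.conf change described above (swapping "nvidia" for "nv") can be sketched as follows. The helper name `use_nv_driver` is hypothetical, and the sketch only prints the edited file to stdout so nothing is modified until the result has been reviewed.

```shell
#!/bin/sh
# Print a copy of an xorg.conf with Driver "nvidia" changed to Driver "nv".
# use_nv_driver is a hypothetical helper; it never edits the file in place.
use_nv_driver() {
    sed 's/^\([[:space:]]*Driver[[:space:]]*\)"nvidia"/\1"nv"/' "$1"
}
```

A cautious way to use it would be to write the output to a new file, compare it with the original, and only then copy it over the real configuration with a backup in place.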
Sitsofe Wheeler (sitsofe) wrote : | #8 |
John:
There is not enough information in your comment to identify what the problem might be (plus it doesn't sound like your particular issue is a lock up). Please can you file a new bug report (attaching your xorg.conf file and Xorg.0.log) and post a link back here?
Ric95 (pamric-shaw) wrote : | #9 |
Solved.
I've noted the times when the system locks up, and in my case there were no events logged in the log files. But I switched to Xubuntu 7.04 and it runs clean and stable :)
[See if there is mention of NVRM: Xid in /var/log/messages ...]
Sorry, but I've wiped the partition when I installed Xubuntu. :(
Sitsofe Wheeler (sitsofe) wrote : | #10 |
Ric95's last comment makes this sound like Bug #13530. If so, Ric95 is less likely to see the problem, but assuming that the nvidia binary module is still being used, the issue will flare up with programs like Firefox...
Ric95 (pamric-shaw) wrote : | #11 |
Ya, it looks like several people have had problems with that.
Firefox hasn't locked up, but my beloved Blender has (rarely).
What can I do to fix this and hopefully help the community?
Can I recompile the NV driver? (I read an old page describing an Nvidia-supplied compiler to tweak for the system... still available?)
I'm interested in compiling the kernel for my hardware; will that help?
Would it help to compile Blender from source? A static compile is an option too.
Sitsofe Wheeler (sitsofe) wrote : | #12 |
Ric95:
There's very little to recompile - it's a binary-only driver. The only thing you could recompile is the glue layer, which isn't really going to make much difference. Given that this bug locks up your entire system, the problem can't possibly lie with Blender, so recompiling it would likely not help (the only thing you could possibly do is try to work out exactly what operation Blender did which led to a lockup and see if you could rewrite Blender not to do it).
It might be interesting to find out whether disabling RenderAccel (as described here; http://
Ric95 (pamric-shaw) wrote : | #13 |
Too early to say if "RenderAccel" "0" helps, but trying "nvidia-glx-new" broke it badly with a version mismatch. Reverting back to "nvidia-glx" didn't work, and I ended up re-installing from scratch! (Now it occurs to me that I should have purged nvidia, then re-installed.)
Ric95 (pamric-shaw) wrote : | #14 |
Nope. "RenderAccel" "False" ( not "0" btw) Doesn't help.
But at least Blender is the only thing that crashes it. ( pity, Blender is the most important piece of software for me.)
Blender is available in a static build that doesn't use system openGL libraries. I'll try that.
If it works I could submit the static version to repositories, but there are probably too few people who would benefit from that.
Sitsofe Wheeler (sitsofe) wrote : | #15 |
Ric95:
That's too bad. This is a long shot but I'm out of ideas:
Add the following new line just after #! /bin/sh in /etc/init.
exit 0
and then reboot. Any change?
Ric95 (pamric-shaw) wrote : | #16 |
I worried I may be too optimistic, but that seems to have cured it !!!!
On a hunch, I first tried turning off the service... and locked it up 30 seconds later. But then I edited that line in and have been working windows around blender just fine :) Thanks!,... what does "exit 0" do anyhow?
Bryce Harrington (bryce) wrote : | #17 |
'exit 0' simply causes a script to exit. So effectively, that change causes powernowd to exit without doing anything.
I gather this indicates the bug has something to do with power management and cpu frequency scaling.
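To illustrate the point about 'exit 0', here is a small self-contained sketch. It does not touch any real init script; `demo_exit0` is a made-up helper that builds a throwaway script with `exit 0` directly after the `#! /bin/sh` line and runs it.

```shell
#!/bin/sh
# Build a throwaway script whose second line is "exit 0", then run it.
# demo_exit0 is a hypothetical helper used only for this illustration.
demo_exit0() {
    script=$(mktemp) || return 1
    printf '%s\n' '#! /bin/sh' 'exit 0' 'echo "daemon would start here"' > "$script"
    sh "$script"        # stops at "exit 0", so the echo line never runs
    status=$?
    rm -f "$script"
    echo "exit status: $status"
}
demo_exit0
```

The only output is `exit status: 0`. Everything after the inserted line is skipped, and the script reports success to whatever called it, which is why the service appears to start normally while doing nothing.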
Sitsofe Wheeler (sitsofe) wrote : | #18 |
Ric95:
Exactly what Bryce said.
Turning off services can be a bit fraught and the powernowd service has two different scripts which can cause it to be run. Unfortunately powernowd doesn't seem to support a setting in /etc/default/
Bryce:
You are bang on the money. This looks like Bug #109643 (or rather that bug looks like this one since this was filed first).
Ric95 (pamric-shaw) wrote : Re: [nvidia-glx] Frequent lockups when NV 3d is enabled and cpu scaling is on | #19 |
I'm just glad we could resolve it :) That's weird that Nvidia would affect that, but now that I think about it, that sort of makes sense.
Bryce Harrington (bryce) wrote : | #20 |
Ric95, excellent to hear the problem is resolved for you. :-) Of course, the underlying issue of powernowd being bugged still exists, but that's covered by the other bug report.
Sitsofe, cool, that bug already has one dupe for it, so I'll mark this a dupe of it as well, and update its description.
Ric95 (pamric-shaw) wrote : | #21 |
Yes, this is definitely a big improvement.
But I still get a lockup very rarely. More like once every 2 hours rather than once every 10 minutes.
Is it safe to uninstall powernowd?
Are Nvidia and the coders of powernowd working to resolve this? (Maybe nvidia-glx-new already has the fix...)
Sitsofe Wheeler (sitsofe) wrote : | #22 |
Ric95:
Drat. It looks like powernowd wasn't the sole cause of your problems. It's probably unwise to uninstall powernowd though. There's nothing the coders of powernowd can do about it because their program is just a daemon that asks the kernel to change the speed of the CPU. There's nothing the kernel developers can do because they don't have the source to the NVIDIA binary drivers. If there is a problem with CPU scaling and the NVIDIA binary drivers, the only people who can explain where the problem lies and how to fix it are NVIDIA, as they are the only ones with the source code to their driver. That's the way it goes with binary-only drivers...
As for whether things are any better with nvidia-glx-new... I don't know. In theory your card is supported, so you might be able to try those drivers out (PS: I suspect your inability to revert back to the nvidia-glx driver last time was because of Bug #106217)...
Ric95 (pamric-shaw) wrote : | #23 |
I may try nvidia-glx-new when I have time to tinker with it. It would be nice if there was an easier way to revert back to nvidia-glx. Hopefully with Gutsy both drivers will be in the driver manager. It sounds like they may build a sort of x-org crash recovery system. Cool.
When I do try, what command could completely remove nvidia-glx-new from the terminal? [ sudo purge nvidia ] ?
Then I could [ apt-get install nvidia ]
Sitsofe Wheeler (sitsofe) wrote : | #24 |
Unduplicating this bug based on Ric95's recent comment.
Ric95:
Generally speaking only one driver (the latest driver that supports your card) will be recommended for a given card. Anything else may coincidentally work but won't be "supported" by NVIDIA (but you will have to check with NVIDIA on that).
I suspect an apt-get purge won't be any better than a remove. Use apt to remove the package then remember to go and remove the dotfile mentioned in Bug #106217 afterwards...
Ric95 (pamric-shaw) wrote : | #25 |
Hi again. I've been exploring many other distros and they seem to have the same problems with Nvidia combined with the more recent Linux kernels. A Red Hat bug chase may implicate the 'i2c handler' (what that is goes beyond my knowledge).
Sitsofe Wheeler (sitsofe) wrote : Re: [nvidia-glx] Frequent lockups when NV 3d is enabled | #26 |
Ric95:
I suspect your best bet is to talk to NVIDIA directly about this (e.g. via http://
Sitsofe Wheeler (sitsofe) wrote : | #27 |
Ric95:
Can you include the output of
lspci -nn | grep -i nv
in this bug report?
Changed in linux-restricted-modules-2.6.20: | |
status: | Unconfirmed → Needs Info |
Ric95 (pamric-shaw) wrote : | #28 |
ric@ric-desktop:~$ lspci -nn | grep -i nv
01:00.0 VGA compatible controller [0300]: nVidia Corporation GeForce 7300 GS [10de:01df] (rev a1)
[I suspect your best bet is to talk to NVIDIA directly about this] . Ya, I would need to use the most up to date driver. They didn't exactly make that easy, what driver is in the gutsy betas ?
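When quoting this kind of output to NVIDIA or in another bug report, the useful part is usually the bracketed vendor:device pair. A minimal sketch for pulling it out, assuming `lspci -nn` style lines on stdin; `pci_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the vendor:device PCI ID from "lspci -nn" style lines on stdin.
# pci_ids is a hypothetical helper name.
pci_ids() {
    # Class codes such as [0300] contain no colon, so they are not matched.
    # Prints one ID per matching line.
    sed -n 's/.*\[\([0-9a-f]\{4\}:[0-9a-f]\{4\}\)].*/\1/p'
}
```

Used as `lspci -nn | grep -i nv | pci_ids`, the output line above would yield `10de:01df`, where `10de` is NVIDIA's PCI vendor ID and `01df` is the device ID reported for this GeForce 7300 GS.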
Sitsofe Wheeler (sitsofe) wrote : | #29 |
Ric95:
Currently _not_ the most up to date one - see Bug ##120943, .
Changed in linux-restricted-modules-2.6.20: | |
status: | Needs Info → Unconfirmed |
Sitsofe Wheeler (sitsofe) wrote : | #30 |
(That should have been Bug #120943)
Ric95:
Just before you switch to newer drivers can you also check whether the module parameter specified in Bug #115267 makes any difference?
BTW: if you want to try building a package of the newer drivers you may want to give Envy (http://
Sitsofe Wheeler (sitsofe) wrote : | #31 |
Ric95:
Could you also post the output of
lspci -vnn | grep -i nv -A 5
?
Ric95 (pamric-shaw) wrote : | #32 |
ric@ric-desktop:~$ lspci -vnn | grep -i nv -A 5
01:00.0 VGA compatible controller [0300]: nVidia Corporation GeForce 7300 GS [10de:01df] (rev a1) (prog-if 00 [VGA])
Subsystem: Unknown device [19f1:1fe2]
Flags: bus master, fast devsel, latency 0, IRQ 18
Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at fb000000 (64-bit, non-prefetchable) [size=16M]
I'll try installing the newer drivers asap.
I also want to try adding 'acpi=off' and 'noapic' to the kernel parameters line in /boot/grub/menu.lst
..but one thing at a time :)
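A rough sketch of what that menu.lst change would look like, written as a preview rather than an in-place edit. `add_kernel_params` is a hypothetical helper, and it assumes boot entries use lines beginning with `kernel`, as in GRUB legacy; parameters containing `/` would need a different sed delimiter.

```shell
#!/bin/sh
# Print a copy of a GRUB legacy menu.lst with extra parameters appended to
# every line that starts with "kernel". add_kernel_params is a hypothetical
# helper; it writes to stdout and leaves the file untouched.
add_kernel_params() {
    file="$1"; shift
    sed "/^[[:space:]]*kernel[[:space:]]/ s/\$/ $*/" "$file"
}
```

For example, `add_kernel_params /boot/grub/menu.lst acpi=off noapic` prints the file with both parameters added to each kernel line, so the result can be checked before anything is changed. Testing one parameter at a time makes it easier to tell which one, if any, makes a difference.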
Sitsofe Wheeler (sitsofe) wrote : | #33 |
Ric95:
Hopefully you will get this before you switch drivers...
Can you check that using the module parameter specified in https:/
Ric95 (pamric-shaw) wrote : | #34 |
It's hard to say. But I'm now thinking that I'm chasing two bugs that cause the same lockup :(
After making that change as per Bug #115267 I didn't notice the random lockup, but I could still induce a lockup by dragging a window. I'll try to find time to chase that separately.
Would it be possible for the developers to make a hacked kernel ( with debugging features compiled in ) available in repositories ?
Sitsofe Wheeler (sitsofe) wrote : | #35 |
(I'm not an Ubuntu dev). I don't think you are going to have much luck debugging this unless you know how to set up a serial console and I would be surprised if someone built a debug kernel just for you, however there's nothing stopping you rebuilding your own kernel with various extra options. My feeling is that having a kernel with more debug features on would not help you as it sounds like the problem lies in the binary part of the NVIDIA module and you will not have any debug symbols for that (only NVIDIA do). It sounds more and more like your best chance with this is to talk to NVIDIA (http://
Ric95 (pamric-shaw) wrote : | #36 |
I'm back just to post an update in hopes I can help make Ubuntu the best ever. ( a postcard from the edge ;) )
I've used many Linux distros lately, and almost all had a similar lockup on my hardware. ( note to self: no more compaq )
Now I'm using Sabayon/Gentoo, 2.6.21, 64 bit. It had the lockup problem until someone mentioned using the nvidia driver version 1.0.9755-r1. This has completely fixed that random lockup :)
http://
I still like Ubuntu. I'm crossing fingers and toes hoping you guys use 2.6.23 with nv 1.0.9755-r1 for 8.04 ;)
Ric95 (pamric-shaw) wrote : Re: [nvidia-glx] Frequent lockups when NV 3d is enabled(solved) | #37 |
My apologies to everyone who tried to help me.
The problem was in my BIOS. When I installed my PCI-Express video card, the BIOS set itself to PCI, so the OS would end up looking in the wrong slot for video :(
Bryce Harrington (bryce) wrote : linux-restricted-modules-2.6.20 is obsolete | #38 |
This package has become obsolete so we're closing out the bug report as WONTFIX.
Thanks for reporting it though!
Changed in linux-restricted-modules-2.6.20: | |
status: | New → Won't Fix |
I have now downgraded to Edgy 6.10 (64bit) but still occasionally get the same lockup. None of the log files record the event, neither does terminal.
How can I help find this bug? ( I can reproduce it by moving windows around for a while)