kernel 2.6.20-xx incorrectly claims processor overheating

Bug #94862 reported by Felix Braun
Affects                        Status         Importance   Assigned to   Milestone
linux-source-2.6.20 (Ubuntu)   Won't Fix      High         Unassigned
linux-source-2.6.22 (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

Binary package hint: linux-image-2.6.20-12-generic

I've recently updated my laptop to Feisty. However, I have problems getting the computer to boot into the new kernel. On boot-up the kernel very often claims that it has detected overheating of the CPU. Ironically, this seems to happen especially often when the computer has just been started. The ACPI subsystem reports that the temperature has reached 255 C and brings the computer to a halt. Later in the halting process the correct temperature is reported, something along the lines of
"CPU reached critical trip point (26 C) halting now."

This problem does not occur on the same hardware using Edgy or Gentoo (both running an older kernel).

Changed in linux-source-2.6.20:
assignee: nobody → ubuntu-kernel-acpi
importance: Undecided → High
status: Unconfirmed → Confirmed
Felix Braun (felix-braun) wrote :

I've just compiled 2.6.20.3-vanilla on my Gentoo partition and experienced a similar problem. This bug seems to affect vanilla kernels too.

Henri Cook (henricook) wrote :

I have a similar problem using 2.6.20-13-generic: after about an hour of running, the kernel claims the CPU has exceeded 100 C and shuts the machine down. This does not happen with 2.6.20-12-generic and below.

Please contact me if any additional info is required; I'm online as 'ph8' or reachable via email.

Tommi Asiala (tommi-asiala) wrote :

I get the critical temperature error only on the first boot, not on the reboot after the error. This is a desktop computer, and I've checked the temperature in the BIOS; it's not 91 C.

I'm running Linux 2.6.20-10-generic with Feisty. I can't run 2.6.20-11 or later due to another bug.

willalbro (willalbro) wrote :

I upgraded my wife's laptop from Edgy, which was very stable, to Feisty on 4/8/07. She is using an HP ZE 5400. The computer would run for a few minutes after startup, then I would get the critical temperature error, with the kernel claiming that the CPU had exceeded 100 C, and it would shut the machine down. I was running Linux 2.6.20-14-386-generic. I did a clean install of Edgy and the problem is gone. I wanted to report this; until it is fixed I will stay with Edgy.

JaumeFigueras (jaume-figueras) wrote :

Same here: perfectly stable in Edgy, but impossible to install Feisty because the machine shuts down for overheating (120 C). I'm using an HP-Compaq NX9010.

Henri Cook (henricook) wrote :

Confirming this still occurs in 2.6.20-15 for me. I randomly get messages that the board temperature exceeded 100 C and the system is shutting down, and I have to use -12 to stop this happening.

JaumeFigueras (jaume-figueras) wrote :

Same again.
I played with the Ubuntu Feisty LiveCD and managed to install it, but after the install the system shut down due to temperature problems. Kernel 2.6.20-15.

Art Jennings (noddy) wrote :

(This bug seems to me to be a duplicate of #22336.)

I can confirm this bug on Kubuntu/Ubuntu (Feisty). My machine is an ACER 1522 notebook and has previously run WinXP, SUSE10, 10.1, 10.2 with no problems.

When compiling a large project, the system shuts down after maybe 4 minutes. I was able to repeat this 5 times in a row before stumbling on this bug report.
Note that the fan is working, etc.

Relevant section from /var/log/messages:

Apr 21 17:07:56 user-linux -- MARK --
Apr 21 17:18:44 user-linux kernel: [ 1248.734964] ACPI: Critical trip point
Apr 21 17:18:48 user-linux gconfd (user-5426): Received signal 15, shutting down cleanly
Apr 21 17:18:48 user-linux gconfd (user-5426): Received signal 15, shutting down cleanly
Apr 21 17:18:48 user-linux gconfd (user-5426): Exiting
Apr 21 17:18:56 user-linux exiting on signal 15

It seems a shame that I will have to go back to openSUSE, as the rest of the OS is great, especially the Synaptic package manager. Oh well...
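
For anyone checking their own logs for the same event, here is a minimal Python sketch that picks out the "ACPI: Critical trip point" line from an excerpt like the one above. The regular expression is written against the line layout in that excerpt only; other syslog configurations may format the prefix differently, and the function name `find_trip_events` is just an illustrative choice.

```python
import re

# Matches the kernel line from the excerpt: syslog timestamp, host name,
# bracketed uptime in seconds, then the ACPI message.
TRIP_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+(?P<host>\S+)\s+kernel:\s+"
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+ACPI: Critical trip point"
)


def find_trip_events(lines):
    """Return (timestamp, uptime_seconds) for every critical-trip-point line."""
    events = []
    for line in lines:
        match = TRIP_RE.match(line)
        if match:
            events.append((match.group("stamp"), float(match.group("uptime"))))
    return events


# Lines copied from the excerpt in this comment.
sample = [
    "Apr 21 17:07:56 user-linux -- MARK --",
    "Apr 21 17:18:44 user-linux kernel: [ 1248.734964] ACPI: Critical trip point",
    "Apr 21 17:18:48 user-linux gconfd (user-5426): Received signal 15, shutting down cleanly",
]

for stamp, uptime in find_trip_events(sample):
    print(f"{stamp}: critical trip point after {uptime / 60:.1f} min of uptime")
```

In practice the lines would come from reading /var/log/messages rather than a hard-coded list.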

Felix Braun (felix-braun) wrote :

There seem to be several different bugs with similar manifestations. In my case (don't morph my bug! :-) the problem /only/ occurs at initial startup. Once I get past the first couple of seconds, the computer keeps running without any problems, no matter how much I stress the CPU.

As I stated in my initial report, this seems to be some kind of underflow: the CPU is so cold that the kernel interprets it as being very hot. Once the CPU gets warm, everything works fine.
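
The underflow hypothesis cannot be verified from this thread, but it helps to know what the raw numbers look like. ACPI thermal zones report temperature in tenths of a kelvin, which is then converted to Celsius for display. The Python sketch below is an illustration only: the 273.2 K offset is an approximation, and the plausibility window and function names are assumptions made for this example, not kernel code.

```python
def decikelvin_to_celsius(raw):
    """Convert a reading in tenths of a kelvin to whole degrees Celsius.

    Uses 273.2 K as the zero point, which is an approximation.
    """
    return round((raw - 2732) / 10)


def is_plausible(celsius, low=0, high=110):
    """Hypothetical sanity window; the bounds are illustrative only."""
    return low <= celsius <= high


# Raw values chosen so that they map to the two temperatures quoted in the
# original report (26 C and 255 C); they were not captured from real hardware.
for raw in (2992, 5282):
    celsius = decikelvin_to_celsius(raw)
    verdict = "plausible" if is_plausible(celsius) else "implausible"
    print(f"raw={raw} -> {celsius} C ({verdict})")
```

A reading of 255 C would need a raw value far outside anything a working laptop produces, which is consistent with a misread or misconverted value rather than real heat.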

JaumeFigueras (jaume-figueras) wrote :

I think this is my last report.
I'm happy because I have upgraded my BIOS and everything seems to be OK. No shutdown in eight hours. I've installed Beryl, Eclipse, 3D games, etc., and everything runs perfectly.
You should check your BIOS and test again.
Sorry for blaming the devs for a bug that was MY bug :(

Madeira (joaopedrohenriques) wrote :

Unfortunately it's not only your bug. There are a lot of buggy ACPI implementations.
You are lucky that your vendor corrected it; I don't have a new BIOS update available.
With Edgy everything is running fine. It ignored or corrected the bugs somehow.
When I boot my laptop with the Feisty CD, at startup it warns me that a critical temperature was reached, showing a temperature that is either stupidly high, like 1800 °C, or even normal, like 45 °C.

Sébastien Boisvert (sebastien-boisvert) wrote :

Hi, my computer is overheating when watching videos.

Hp Pavilion dv 4000

Ubuntu Feisty Fawn GNU/Linux 7.04

$ vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r b swpd free buff cache si so bi bo in cs us sy id wa
 1 0 0 436468 14668 264540 0 0 95 61 538 1459 21 3 74 2
$ uname -a
Linux mephisto 2.6.20-15-generic #2 SMP Sun Apr 15 07:36:31 UTC 2007 i686 GNU/Linux

Sébastien Boisvert (sebastien-boisvert) wrote :

Adding dmesg from my computer.

Tommi Asiala (tommi-asiala) wrote :

This is a BIOS bug for me. After I restart the computer twice after the BIOS stage, the BIOS does report the correct temperature.

Felix, what does your BIOS report as the CPU temp? If it's incorrect for you too, then I think this bug should be closed, since it's a hardware/BIOS problem (as Jaume Figueras already reported).

Henri Cook (henricook) wrote :

Please read my description of the problem; I think you'll agree it's certainly not a BIOS problem. I'm still using -12-generic to stop my system shutting down (randomly?).

It could only be a BIOS problem if -13 was the first one to actually start checking the system temp, maybe?

footer (footer) wrote :

I would like to add my .02 to this discussion. I'm running Feisty and kernel 2.6.20-15-generic. I'm getting this entry in my /var/log/messages:

May 6 23:01:07 kubuntu64 gconfd (user-9919): Received signal 15, shutting down cleanly
May 6 23:01:08 kubuntu64 gconfd (user-9919): Exiting

about every 24 hours. I have an AMD X2 dual core running 64bit Feisty ... which is all I've ever run on this machine (new in January). I've got the latest BIOS on the motherboard (an Asus M2N-SLI Deluxe, 0903 BIOS ver). This shutdown is not a restart of the machine but rather a logout. When I wake it up the next morning, it's waiting at the login prompt. My CPUs normally run between 25-35C.

Let me know if I need to provide any further information.

Thanks!

Felix Braun (felix-braun) wrote :

Tommi:

unfortunately I have no way to know the BIOS's idea of my CPU's temperature because my laptop's BIOS setup does not display this information. However, it's very unlikely that this is a BIOS bug because neither 2.6.17-ubuntu nor 2.6.21-vanilla (which I'm using at the moment) exhibit this behaviour.

Frank Abel (frankabel) wrote :

I just want to say that I found a very similar bug report at https://bugs.launchpad.net/ubuntu/+bug/22336 and, as you can see, the "Assigned To" fields are different ;)

Len Brown (len-brown) wrote :

> "ACPI subsystem reports that the temperature has reached 255 C and brings the computer to a halt.
> This bug seems to affect vanilla kernels too.

Felix,
Please make sure that lmsensors is disabled on this machine.

If you still have the problem, with 2.6.21.y or later, then please file an upstream bug here:

http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
category = Power-Thermal

Please attach the output from a kernel booted with CONFIG_ACPI_DEBUG=y,
the output from acpidump, and the contents of your /proc/acpi/thermal_zone/ tree.
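
Collecting the contents of the /proc/acpi/thermal_zone/ tree by hand is tedious, so here is a small Python sketch that walks the directory and gathers every readable file. It assumes the procfs layout used by the kernels discussed in this report; the directory may not exist on newer kernels, in which case the function returns an empty result. The function name `dump_thermal_zones` is illustrative only.

```python
from pathlib import Path


def dump_thermal_zones(root="/proc/acpi/thermal_zone"):
    """Return {relative_path: contents} for every readable file under root."""
    base = Path(root)
    report = {}
    if not base.is_dir():
        # Nothing to collect, for example on a kernel without this procfs tree.
        return report
    for path in sorted(base.rglob("*")):
        if path.is_file():
            name = str(path.relative_to(base))
            try:
                report[name] = path.read_text().strip()
            except OSError as err:
                report[name] = f"<unreadable: {err}>"
    return report


if __name__ == "__main__":
    for name, text in dump_thermal_zones().items():
        print(f"== {name} ==")
        print(text)
```

Redirecting the output to a file gives a single attachment for the bug report.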

Pierre-Étienne Messier (pierre-etienne-messier) wrote :

Here's my comment from #22336, since it's closely related...

I've got a similar problem on my laptop (Dell Inspiron 9300, 1.73GHz Centrino, Kubuntu 7.04 [upgraded from 6.10]). The problem wasn't there before the upgrade.

I set the CPU Policy to "Dynamic" (aka "on demand"). When doing normal browsing for example:
$ cat /proc/acpi/thermal_zone/THM/temperature
temperature: 51 C

At this point, the CPU is at 800MHz.

Then, let's do some intensive task, like compiling a simple app. The CPU goes to 1.73GHz as expected, but after just 2 seconds:

$ cat /proc/acpi/thermal_zone/THM/temperature
temperature: 66 C

And it continues to increase! There is NO WAY that the temperature is rising by 15C in 2 seconds!

Also, just being idle and setting the CPU Policy to "Performance" raises the temperature to 62C (and rising)...

And of course, when doing tasks that are too intensive, the trip point will be reached and the system will shut down. As I said, this didn't happen when I was on Edgy...
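
To put a number on the jump described above, the two readings can be parsed and turned into a rate of rise. This is a minimal Python sketch; the line format is taken from the `cat` output quoted in this comment, while the function names and the 2 C/s threshold are assumptions made for illustration, not values from the kernel or any specification.

```python
import re

TEMP_RE = re.compile(r"temperature:\s*(-?\d+)\s*C")

# Hypothetical limit: anything faster than this is treated as suspicious.
MAX_PLAUSIBLE_RATE = 2.0  # degrees Celsius per second


def parse_temperature(text):
    """Extract whole degrees Celsius from a 'temperature: 51 C' style line."""
    match = TEMP_RE.search(text)
    if match is None:
        raise ValueError(f"unrecognised reading: {text!r}")
    return int(match.group(1))


def rise_rate(before, after, seconds):
    """Degrees Celsius per second between two readings taken `seconds` apart."""
    return (parse_temperature(after) - parse_temperature(before)) / seconds


# The two readings quoted above, taken roughly 2 seconds apart.
rate = rise_rate("temperature: 51 C", "temperature: 66 C", 2)
suspicious = rate > MAX_PLAUSIBLE_RATE
print(f"{rate:.1f} C/s, suspicious={suspicious}")
```

Logging such readings once per second while a build runs would show whether the rise is a smooth ramp or a sudden step.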

Sébastien Boisvert (sebastien-boisvert) wrote : Re: [Bug 94862] Re: kernel 2.6.20-xx incorrectly claims processor overheating

Hi Pierre-Étienne,

Are you still in engineering?

Like you, I was in the gulus.

Glad to know my computer isn't the only one heating up with Feisty...


Felix Braun (felix-braun) wrote :

Len,

thanks for your comment. My problem with falsely detected overheating only affects 2.6.20.x kernels (both distro and vanilla). I have not been able to reproduce it with 2.6.21.y.

James Valentine (jamesdavidvalentine) wrote :

In my opinion, bug #22336 is _not_ related. That talks about actual overheating, whereas what we are experiencing here is alleged overheating when the hardware is cool.

Closer matches would be:
    * #94862: kernel 2.6.20-xx incorrectly claims processor overheating
    * #111460: acpi misreports temperature, shuts down laptop

Russ Price (rjp-ubu) wrote :

This is also happening in Gutsy with 2.6.22-14-generic (installed version 2.6.22-14.47) i386 on an Abit AN-M2 motherboard with an Athlon 64 X2 4800+.

The lm_sensors readings are normal when this occurs, and a front panel temperature gauge which reads from a temperature probe stuck in the heat sink doesn't show anything abnormal, either. The system randomly decides that it has reached 121 C (and /proc/acpi/thermal_zone/THRM/temperature shows this wacky temperature as well) and forces a shutdown. The CPU and case fans are running. I tried updating to the latest BIOS from Abit, and there has been no change.

The problem tends to occur out of the blue, and seems independent of CPU activity - I've had it happen while browsing packages in synaptic (lm_sensors reports 26 C), and while transcoding in MythTV (lm_sensors reads 52 C). It sometimes happens after days of uptime, sometimes after minutes of uptime.

I've attached a snippet of /var/log/syslog where this has happened.

Russ Price (rjp-ubu) wrote :

To elaborate on what is happening on my Abit AN-M2 running Gutsy, I've plotted a graph of temperature readings. Note the 121 C spike in the ACPI reading that is not reflected in the reading from lm_sensors. In this instance, the spike lasted only five seconds, not long enough to force a shutdown.

I've also reported this problem to Abit, just in case it's a motherboard and/or BIOS issue. Perhaps the kernel needs an "acpi=ignoretemp" option to deal with this. If I run with "acpi=off" I get problems shutting down (manual power-off needed), suboptimal IRQ assignments, and no CPU frequency scaling.
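
The comparison described above, ACPI readings against lm_sensors readings over time, can also be done without a plot. The following Python sketch flags samples where the two sources disagree by more than a chosen margin. The sample values are made up for illustration (only the 121 C and 26 C figures echo this report), and the 30-degree margin and the function name are assumptions.

```python
def find_disagreements(samples, max_gap=30):
    """Return samples where the two sensors differ by more than max_gap degrees C.

    samples: iterable of (seconds, acpi_celsius, lm_sensors_celsius) tuples.
    """
    return [s for s in samples if abs(s[1] - s[2]) > max_gap]


# Made-up series for illustration only, not measured data.
samples = [
    (0, 27, 26),
    (5, 121, 26),
    (10, 27, 26),
]

for seconds, acpi, lm in find_disagreements(samples):
    print(f"t={seconds}s: ACPI says {acpi} C, lm_sensors says {lm} C")
```

A spike that appears in only one of the two sources points at how that source reads the sensor rather than at real heat.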

Zamiere Vonthokikkeiin (kikkeartworx) wrote :

Look at this:
http://ubuntuforums.org/showthread.php?p=4305576
Csim found massive errors in DSDT.
It's an Acer 5315.

Leann Ogasawara (leannogasawara) wrote :

Hi Felix,

You had commented that you were not able to reproduce this bug with a 2.6.21.y kernel. Can you confirm then that this bug was resolved with the 7.10 Gutsy Gibbon final release? Also, have you tested with the latest Hardy Heron Alpha release (http://www.ubuntu.com/testing) by chance? I'd appreciate any feedback from you since you are the original bug reporter. Thanks!

Changed in linux-source-2.6.22:
status: New → Incomplete
pdebelak (pdebelak) wrote :

Gave up on Linux


Felix Braun (felix-braun) wrote :

I can confirm that the bug was fixed with the kernel delivered in gutsy-final. The hardy kernel doesn't have this particular bug either.

Changed in linux-source-2.6.22:
status: Incomplete → Fix Released
Leann Ogasawara (leannogasawara) wrote :

Thanks Felix. I'm just going to close out the 2.6.20 task.

Changed in linux-source-2.6.20:
status: Confirmed → Won't Fix
przemo24555 (przemo2) wrote :
Curtis Hovey (sinzui)
Changed in linux-source-2.6.20 (Ubuntu):
assignee: Registry Administrators (registry) → nobody