CPU overheats during high usage "throttling <not supported>"

Bug #22336 reported by Daugirdas on 2005-09-23
This bug affects 1 person
Affects Status Importance Assigned to Milestone
acpi
Fix Released
Unknown
Baltix
Undecided
Unassigned
Fedora
New
Undecided
Unassigned
acpi-support (Ubuntu)
Undecided
Unassigned
linux (Ubuntu)
Undecided
Unassigned
linux-source-2.6.17 (Ubuntu)
High
Unassigned
linux-source-2.6.20 (Ubuntu)
Critical
Unassigned

Bug Description

I can't compile anything larger than ndiswrapper. I couldn't even produce a
backtrace for gthumb. I can't play tuxracer either. My laptop overheats while
running Linux, and is powered off by the kernel:
Sep 23 12:21:20 localhost kernel: [ 728.519975] Critical temperature reached
(80 C), shutting down.

I must stress that this issue doesn't occur while running WinXP x64 edition. There I
ran CPU Burn-in for 7 min without any problems, while on Linux the shutdown comes after only 3 min.
On Windows I can play games for, say, half an hour (after that I just lose interest).

My hardware is: Acer Aspire 1522WLMi (AMD64 3000+).
/etc/modules contains powernow-k8, and cpufreq_userspace.
I haven't tried any other distro on this laptop properly so I can't confirm if
it is ubuntu specific.

Matt Zimmerman (mdz) wrote :

CPU frequency scaling doesn't keep your laptop from overheating; is the fan
activating? Are the fan and thermal modules loaded?

Matthew Garrett (mjg59) wrote :

Is the fan switching on?

Daugirdas (daugirdas) wrote :

Yes, thermal and fan are loaded. However, it seems only the BIOS has control of the fan
speed (I know this for sure, just don't ask me where from; possibly acer.com).
The fan spins up to full speed when CPU usage increases, so that all
seems OK. I guess you might argue that my laptop is defective, but on the other
hand I was not able to reproduce this on Windows. There is an AMD processor driver for
Windows which might actually be doing all the work.

Sorry I can't be any more specific.

Matt Zimmerman (mdz) wrote :

If the fan is running at full speed and yet the CPU is overheating, I don't see
how this can be the fault of the operating system

Daugirdas (daugirdas) wrote :

Yet somehow Windows works. I guess there should be some module/daemon which
tries to reduce the CPU voltage when too much heat is generated.

Matthew Garrett (mjg59) wrote :

This should be happening automatically when your system reaches the passive trip
point in /proc/acpi/thermal_zone/*/trip_points. Can you attach the contents of
that file?

Daugirdas (daugirdas) wrote :

Created an attachment (id=4031)
/proc/acpi/thermal_zone/THRC/

Daugirdas (daugirdas) wrote :

Created an attachment (id=4032)
.../THRS/

When the temperature reaches 80C the laptop is shut down (an orderly software
shutdown rather than a hard poweroff). As you mentioned in your message, the temperature
should be lowered by readjusting [something] instead.

Matthew Garrett (mjg59) wrote :

Hmm. Interesting. Can you attach the contents of /proc/acpi/dsdt ?

Daugirdas (daugirdas) wrote :

Created an attachment (id=4035)
/proc/acpi/dsdt

Matthew Garrett (mjg59) wrote :

Ok. THRS is your system temperature rather than your CPU temperature - the
system is supposed to start slowing the CPU down once it gets to 75 degrees. Can
you try the following:

watch cat /proc/acpi/thermal_zone/THRS/trip_points

then do something CPU intensive. When the temperature gets above 75 degrees, cat
/proc/acpi/processor/*/throttling and see if it changes - a * should appear by
the currently active field.

In *theory*, once the temperature gets above 75 degrees, the processor should
throttle down heavily. In practice, 5 degrees may be sufficiently little that
the kernel may not ramp up the throttling fast enough to help. Your DSDT states
that the resampling should only take place every 30 seconds - if the kernel
doesn't throttle the machine sufficiently, it may reach the trip point before
that period has elapsed.
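The check described above can be sketched as a small script. This is only an illustration, assuming the /proc/acpi text format quoted in this report ("temperature: 54 C", "passive: 75 C: ..."); the helper names first_celsius and zone_status are made up for the example.

```shell
#!/bin/sh
# Sketch: compare each ACPI thermal zone's temperature with its passive
# trip point. Assumes lines such as:
#   temperature:             54 C
#   passive:                 75 C: tc1=2 tc2=5 tsp=300 ...

# first_celsius FILE KEY: print the "<number> C" value found on the line
# of FILE that starts with KEY (hypothetical helper).
first_celsius() {
    sed -n "s/^$2[^:]*:[[:space:]]*\([0-9][0-9]*\) C.*/\1/p" "$1" 2>/dev/null | head -n 1
}

# zone_status DIR: summarise one thermal zone directory (hypothetical helper).
zone_status() {
    temp=$(first_celsius "$1/temperature" temperature)
    passive=$(first_celsius "$1/trip_points" passive)
    if [ -z "$temp" ]; then
        echo "unreadable"
    elif [ -z "$passive" ]; then
        echo "no-passive-trip temp=${temp}C"
    elif [ "$temp" -ge "$passive" ]; then
        echo "above-passive temp=${temp}C passive=${passive}C"
    else
        echo "below-passive temp=${temp}C passive=${passive}C"
    fi
}

# Report every zone that exists on this machine (prints nothing otherwise).
for zone in /proc/acpi/thermal_zone/*; do
    if [ -r "$zone/temperature" ]; then
        echo "$(basename "$zone"): $(zone_status "$zone")"
    fi
done
```

A zone reported as "no-passive-trip" has only a critical trip point, so there is no temperature at which the kernel would be asked to slow the CPU before shutting down.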

Daugirdas (daugirdas) wrote :

daugirdas@dtr-linux64:~$ cat /proc/acpi/processor/CPU0/throttling
<not supported>

I didn't run anything CPU-hungry yet, as I don't like [the above] at all. Could any
modules or anything else be missing here?

Matthew Garrett (mjg59) wrote :

Ooh. That would explain it. I'll look into why that might be appearing.

Matthew Garrett (mjg59) wrote :

Ok, your system doesn't support throttling. That's a little surprising, but not
necessarily fatal. Can you attach the output of lsmod?

Matthew Garrett (mjg59) wrote :

Ah. I bet I know what it is. The ACPI layer should slow your CPU down, but
powernowd immediately notices that and speeds it up again. Can you stop
powernowd, make sure that powernow-k8 is still loaded and see if the system
still fails?
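A quick way to confirm the state asked for here (daemon stopped, driver still loaded) might look like the following sketch. It assumes pgrep is installed and relies on the kernel listing module names with underscores (powernow_k8) in /proc/modules; module_loaded is a hypothetical helper.

```shell
#!/bin/sh
# module_loaded NAME [FILE]: succeed if NAME appears in /proc/modules
# (or in FILE, which has the same format). Hyphens are mapped to
# underscores because that is how the kernel lists module names.
module_loaded() {
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    file=${2:-/proc/modules}
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$file" 2>/dev/null
}

if module_loaded powernow-k8; then
    echo "powernow-k8: loaded"
else
    echo "powernow-k8: not loaded"
fi

# The daemon is stopped with: sudo /etc/init.d/powernowd stop
if pgrep -x powernowd >/dev/null 2>&1; then
    echo "powernowd: still running"
else
    echo "powernowd: not running"
fi
```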

Daugirdas (daugirdas) wrote :

Created an attachment (id=4045)
output of lsmod

Daugirdas (daugirdas) wrote :

Created an attachment (id=4046)
CPU capabilities

generated on windows. The file supports the idea that throttling is not enabled
on this CPU.

Daugirdas (daugirdas) wrote :

Well I tried disabling powernowd. That doesn't seem to help at all. It still
powers down. ../THRS/temperature indicated 54C just before shutting down. There
is that enormous increase in just a fraction of a second.

Matthew Garrett (mjg59) wrote :

Hmm. Are you running the latest kernel (2.6.12-9)? That has some code that may
help in this respect (temperature events being reported slowly).

Daugirdas (daugirdas) wrote :

Yes, the stock Ubuntu 2.6.12-9 k8 kernel. I've always been running the latest Ubuntu kernel.

Matthew Garrett (mjg59) wrote :

Still with powernowd disabled, can you try

echo 1 >/proc/acpi/thermal_zone/THRS/polling_frequency

and see if that results in the temperature being read any more smoothly?

Daugirdas (daugirdas) wrote :

Still I get poweroff @ 5X C. I wonder if it is the actual temperature that
kernel responds to in this case.

Chris Moore (dooglus) wrote :

I just noticed this bug.

I have the same problem. If I run a CPU intensive task, my laptop switches off.
 If I want to recompile the kernel for instance, I have to keep hitting
control-Z to pause the job.

I used to run both Windows XP and Mandrake Linux and didn't see the problem in
either of those.

Øivind Hoel (eruin) wrote :

Well, I guess this is a "me too". I haven't actually bothered reporting anything
until now as I've grown quite fond of the computer resetting and that famous
"recovering journal" message ;-)

I don't have overheating problems in windows, even while running very
cpuintensive games, but things like running aclocal while compiling software
will easily kill my poor laptop... Same story here - fan switches on until it
gets rather loud, then black screen and a lovely reboot.

The laptop in question is an elitegroup g556e (
http://www.ecsusa.com/products/g556.html ).

eruin@ubuntu:~$ cat /proc/acpi/processor/CPU1/throttling
<not supported>

Øivind Hoel (eruin) wrote :

Created an attachment (id=4149)
output of lspci

Øivind Hoel (eruin) wrote :

Created an attachment (id=4150)
Output of lsmod, of course...

Daugirdas (daugirdas) wrote :

I installed SUSE 10.0 amd64 and it seems to be much more stable. I haven't
managed to reproduce the problem using SUSE yet.

Windows is not affected at all.

Matt Zimmerman (mdz) wrote :

Is this still a problem with 6.06 beta 2 or current Dapper?

Michele Campeotto (michelec) wrote :

Yes, I get similar problems on a P4 desktop. We have three identical computers (Acer 7600GT with NVidia FX6600) running Dapper here and two of them have this problem.

$ cat /proc/acpi/thermal_zone/THRM/trip_points
critical (S5): 90 C
$ cat /proc/acpi/thermal_zone/THRM/cooling_mode
<setting not supported>
cooling mode: critical
$ cat /proc/acpi/thermal_zone/THRM/polling_frequency
<polling disabled>
$ cat /proc/acpi/thermal_zone/THRM/state
state: ok

I just tried disabling powernowd to see if it helps.

Michele Campeotto (michelec) wrote :

Upon more investigation, it seems that 90°C is NOT my CPU's critical temperature, I don't know the exact CPU model I have, but all Intel's datasheets for P4s have lower values (67-73°C).

I stopped powernowd and set the polling frequency to 1s, the temperature hasn't changed so far. Let's see.

Michele Campeotto (michelec) wrote :

More info: my system is running at 62-63°C, while an identical system next to mine with Windows XP runs at about 54°C.

GreatBunzinni (greatbunzinni) wrote :

I've got an Acer Aspire 1524 that is running Kubuntu 6.06 and right now it shut down due to overheating while watching a google video.

If anyone wants some kind of system info, please ask and preferably specify which command's output you want me to log.

ChristianGramsch (kozzah) wrote :

I am experiencing the same problem here on a Fujitsu-Siemens Notebook (Amilo A1650G) with a Mobile Sempron 3400+.

It happened on Ubuntu 6.06, so I switched to Suse 10.1 because 10.0 worked fine a few months ago. But now I have the same problem as described above, so maybe it is not a problem with Ubuntu but a problem with a new version of some piece of software.

When the CPU switches to 2000 MHz, the Notebook switches off within 5 minutes when the cpu-load keeps being high, /proc/acpi/thermal_zone/TZS0/temperature showing ~75 °C.

The problem doesn't occur when running very cpu-intensive applications on windows.

One thing seems very strange to me - I often read on several websites, that my notebook doesn't even have a sensor for the temperature, and on previous versions of Suse and Ubuntu I could not get any temperatures, the same on windows.

When I let the system run at 2000 MHz for a while and switch it manually to 800 MHz the temperature shown falls down from ~75 to ~50 within a second.

Daugirdas (daugirdas) wrote :

Well, I upgraded to SUSE 10.1 AMD64 and it is working fine. Therefore the problem is Ubuntu-specific. Since I am not running Ubuntu on a laptop anymore (my mum is still using Breezy on an i686 desktop, and that one is fine), I can't check it unless I buy a new HDD. Could someone please try booting into Ubuntu with the SUSE 10.1 amd64 kernel? If the problem is gone, we would at least know where to begin.

Michele Campeotto (michelec) wrote :

I think I have my problem fixed. I'm still testing, but the missing piece seems to be the module p4-clockmod. I have loaded it (and added it to /etc/modules), and now powernowd seems to work and (most importantly) my CPU is way cooler.

Michele Campeotto (michelec) wrote :

err... no... with the conservative governor, the clock at 50% (1.27GHz) the temperature is back up to 62°C...

Sylvain (s-delahaies) wrote :

I've got the same problem, ie my toshiba SPA40 crashes when it gets too hot, which happens very often. I am using ubuntu (?), I don't know which version, how can I find which version I am running? I used Debian for about a year on this laptop and I never had any problem, same with Fedora core 4, and with Demudi.
Inspired by the last posts I used cpufreq-set to fix my cpu at 1.6 GHz , it works fine but a bit slow, no overheating so far, the problem now is that I can't change frequency anymore!!

Chris Moore (dooglus) wrote :

I run ubuntu and debian on the same laptop. ubuntu crashes if I use the CPU for more than a few minutes at a time, debian doesn't. I can tell when ubuntu is about to power down because the fan in the laptop starts running at full speed continuously, making quite a loud noise, whereas in debian the fan alternates between full speed and something slower. It's as if debian is noticing that the fan isn't able to cool the CPU enough and does something to make the CPU generate less heat, whereas ubuntu doesn't.

What are the significant differences between ubuntu and debian with regards to the CPU speed management? Hopefully I can find a way of either reproducing the problem in debian, or making it disappear in ubuntu - but what should I tinker with?

Chris C Moore (moochris) wrote :

I have the same laptop as the originator of this bug, slightly different model due to uprated CPU (Acer Aspire 1524Wlmi).

I have been experiencing the same problems due to overheating. I have fixed any compile errors and warnings in the DSDT, but still have no throttling support and the following CPU info is reported:

cat /proc/acpi/processor/CPU0/info
processor id: 0
acpi id: 0
bus mastering control: no
power management: no
throttling control: no
limit interface: no

powernow-k8 is loaded and I've tried stopping powernowd and changing the polling frequency, but didn't seem to help.
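Several reports in this thread quote the same info file, so a small sketch for pulling out the relevant lines may help when comparing machines. It assumes the "key: value" layout shown above; acpi_flag is a hypothetical helper name.

```shell
#!/bin/sh
# acpi_flag FILE KEY: print the value after "KEY:" in an ACPI info file
# such as /proc/acpi/processor/CPU0/info (hypothetical helper).
acpi_flag() {
    sed -n "s/^$2:[[:space:]]*//p" "$1" 2>/dev/null | head -n 1
}

# Print the three capabilities discussed in this report for every
# processor that exposes an info file (prints nothing otherwise).
for cpu in /proc/acpi/processor/*; do
    if [ -r "$cpu/info" ]; then
        for key in "power management" "throttling control" "limit interface"; do
            echo "$(basename "$cpu") $key: $(acpi_flag "$cpu/info" "$key")"
        done
    fi
done
```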

Trae McCombs (traemccombs) wrote :

I've got to pipe up with a "me too"[tm].

HP DV4150US Intel Pentium M Processor 1.6 Ghz with 2G of ram.

When I watch a video on YouTube, or do anything fairly CPU intensive for 3-5 minutes, the whole laptop shuts off. I've tried some of the above things, and get the same "throttling control: no & limit interface: no" stuff.

This is a major bug... you can typically find me on #ubuntu or #ubuntu-offtopic as "Trae" if you are looking for some sympathy or you know how to help me fix this issue.

tx in advance!
Trae

Trae McCombs (traemccombs) wrote :

Just as a note to my above comment. I went and nuked the same laptop, and put Gentoo Linux 2006.0 on it. I did an emerge of gnome, which took about 12hrs to compile and it would never overheat.

Then, I went and installed S.L.E.D. ( Novell's Suse Linux Enterprise Desktop ), and I was able to play a video for as long as I wanted without any problems.

Did anyone find a fix for this yet? I was thinking about trying FC6-T2 just as another definitive test, but... I'm missing Ubuntu. :(

GreatBunzinni (greatbunzinni) wrote :

It's a shame because it seems that Ubuntu is the only distribution which suffers from this bug. To make matters worse, this bug has been causing problems in Ubuntu for over a year without a solution. Two releases of Ubuntu have passed and until now... nothing. Still overheating.

I love Kubuntu and I believe that until now it was the best linux distribution I had the pleasure to use but this bug is starting to be unbearable, not mentioning that it is a danger to the hardware.

Trae McCombs (traemccombs) wrote :

I just realized when you said this bug was a year old that, indeed, it was approaching a year. How come something that affects so many people gets put off for so long?

The sad thing is, I am jailed, held captive, as every other distro out there besides Ubuntu stinks. :) I've tried all of the recent offerings, Gentoo, S.L.E.D., Fedora Core, etc... and Ubuntu is the best, hands down.

Is anyone even looking at this or working on it? I am happy to do whatever it takes from my "non-programmer" abilities to help fix this bug. I'll do tests or whatever.

Thanks,
Trae

Øivind Hoel (eruin) wrote :

Re the status of this bug as NEEDINFO; what info is required at this point?

Personally I find I have to manually get the CPU fan going by slowly stressing metacity harder by moving windows around the screen before I can do any really CPU-intensive work. Seems my fan just doesn't want to start spinning fast - fast enough.

I'm on a centrino/PM1.6 laptop. Perhaps powernowd is the culprit?

Luka Renko (lure) wrote :

I was just burnt by this one today on my HP nw8240: I was building kdebase several times in a row and it takes around one hour for a clean build. The third try the machine just powered off. Interesting that it did not power on for 15-20 minutes until it cooled off a bit.

Nikolaus Filus (nfilus) wrote :

Hi, this just happened to me on my Samsung P35 XVM1600 notebook with Dapper 6.06.01 (installed 4 days ago). I was logged into GNOME, with mplayer paused in fullscreen. I went away for some minutes, came back, and it was reacting to keypresses/events only once every few tens of seconds. I tried to log into a root console to find out more, but was unsuccessful, and the laptop powered off (thank god!) after around 10 minutes. It was extremely hot, but the fan was off. The next boot showed 70°C.
I never had this before on Gentoo, even with the patched DSDT with raised fan values.
Power management seems broken on Ubuntu and I have reduced battery life, which also seems to be related to GNOME's frequent polling, but that is another story.
How can I provide more info without destroying my notebook?

Trae McCombs (traemccombs) wrote :

Possible Fix?

First, I'd like to thank Keybuk from #ubuntu-devel for his help on this. He took quite a lot of time walking me through things to check.

The thing that seems to have worked was this:
[do it in this order please!]

sudo /etc/init.d/powernowd stop

sudo sh -c 'echo -n ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'

...

Then, I was able to monitor my battery temp by doing the following:
watch "acpi -V"

Some other commands to get interesting information:
cat /proc/acpi/thermal_zone/*/cooling_mode
cat /proc/acpi/thermal_zone/*/trip_points

With me, this problem now seems to be fixed doing the above two commands { powernowd stopping, and ondemand echo }

I was able to play a video on YouTube for 22mins straight without it shutting off. My temps got up to around 94C at one point, but they would scale down here and there, and my laptop never shut off.

Thanks again Keybuk for all of your help! Let's hope this leads to this bug being resolved. Please post your notes on this here to this bug.

Trae

Øivind Hoel (eruin) wrote :

That didn't do it for me. Yes, the CPU is more responsive, but the fan now struggles even more to keep up, since the step to 1.6GHz is even quicker.

The problem (here, at least) seems to be the fan not starting fast enough.

Trae McCombs (traemccombs) wrote :

Bad news to report...

Tonight I was watching something on Youtube for about 5 mins in and it shut off on me again. :(

The odd thing is, I watched several videos the other day, with no problem. I wonder if the stuff in /etc/rc.local is actually getting parsed. Here is what I have:

# Powernowd is causing problems...
/etc/init.d/powernowd stop
# Use ondemand instead of powernowd...
sh -c 'echo -n ondemand > sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'
# Turn off the pesky touchpad
/sbin/rmmod psmouse

Does this look right? Do I need & at the end?

Tx,
Trae

Øivind Hoel (eruin) wrote :
GuitarFingers (idrconsultants) wrote :

I have to add a "Me Too" here also. I reported this on Ubuntu Forums some time ago but the issue disappeared (probably because I wasn't doing anything particularly CPU intensive until now). Converting sound files guarantees me a shutdown within a minute or two of starting the process.

Running Ubuntu 6.06 on an HP Pavilion zd8000 laptop. This particular bug is pretty major to me, as it's beginning to affect my day-to-day work as well. I'll keep a close eye on this topic and give any assistance where I can. :-)

GuitarFingers

GreatBunzinni (greatbunzinni) wrote :

Isn't there a way to manually scale the CPU/fan to minimize the overheating problem? It would not be a solution to the problem but it would help in all those times that we need to do something demanding like compiling some app, watching a video or even using clamAV.

So please, if someone knows how to hand-scale the CPU/fan... please let us know.

Nikolaus Filus (nfilus) wrote :

@GreatBunzinni: Yes, it's possible. But that would be only a workaround for people suffering from the problem until a real fix is found and implemented.
Have a look at /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq. AFAIK this works only for kernel-space scaling and NOT if you are using powernowd and the like. In that case there should be a config option.

@all:
fixed for me with: stop powernowd and use ondemand scaling

breaks for me: after resume from hibernation, the thermal zone sometimes shows 0 and the fan doesn't kick in. After a short time maximum CPU throttling is enabled (by the kernel?), which makes the system unusable because it reacts to events only every ten seconds.
workaround: build thermal and fan into the kernel, NOT as modules (tested before on Gentoo, not verified yet on Ubuntu)
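As a rough illustration of the scaling_max_freq workaround: the sketch below picks the highest frequency not above a chosen limit and prints the command that would apply it, without changing anything itself. It assumes the driver exposes scaling_available_frequencies (values in kHz); the 1600000 kHz limit and the helper name freq_cap are arbitrary choices for the example.

```shell
#!/bin/sh
# freq_cap LIMIT_KHZ FREQ...: print the highest listed frequency that
# does not exceed LIMIT_KHZ; print nothing if none qualifies
# (hypothetical helper).
freq_cap() {
    limit=$1
    shift
    best=""
    for f in "$@"; do
        if [ "$f" -le "$limit" ]; then
            if [ -z "$best" ] || [ "$f" -gt "$best" ]; then
                best=$f
            fi
        fi
    done
    if [ -n "$best" ]; then echo "$best"; fi
}

cpufreq=/sys/devices/system/cpu/cpu0/cpufreq
if [ -r "$cpufreq/scaling_available_frequencies" ]; then
    # Word splitting of the file contents is intentional here.
    cap=$(freq_cap 1600000 $(cat "$cpufreq/scaling_available_frequencies"))
    if [ -n "$cap" ]; then
        echo "To cap cpu0 at $cap kHz, run as root:"
        echo "  echo $cap > $cpufreq/scaling_max_freq"
    fi
fi
```

Printing the command rather than running it keeps the sketch harmless; as noted above, a userspace daemon such as powernowd may override the setting.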

Is it too much to hope this fix will be in Edgy? And/Or released backwards
for Dapper once it's found? This, to me, is a MAJOR bug, as it keeps me
from viewing any video on the net on my laptop.

--
    Trae McCombs || http://occy.net/
  Founder - Themes.org // Linux.com
  CivicSpaceLabs - http://civicspacelabs.com/

@Nikolaus: Your suggestion worked! Well, it isn't perfect, but being able to run a full ClamAV scan with the motherboard temperature never going beyond 70C was a big thing for me. Thanks! But is it possible to apply the same method to throttle up the CPU fan? That would indeed be the best workaround I could get.

P.S.: It seems that klaptop should be able to switch scaling governors. Unfortunately I don't know why but it doesn't. It displays the correct scaling governor when it is tweaked by hand though.

[private mail]

Hi,

No, I don't think so, as most laptops have either a broken or limited fan
control, which is heavily restricted by the BIOS and not controllable from the
OS. I, for example, had to modify the DSDT ACPI table in my BIOS to change the
on/off values for my fan. I can't even see the results in /proc - it's always on.

Nikolaus

Trae McCombs (traemccombs) wrote :

Errr... I realized I made a typo in my post above. Here is what one SHOULD
have in their /etc/rc.local

{Note: This still doesn't fix the problem for me}

# Powernowd is causing problems...
/etc/init.d/powernowd stop

# Use ondemand instead of powernowd...
echo -n ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
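The rc.local lines above only touch cpu0. A slightly more defensive variant, sketched here under the assumption that every CPU exposes a cpufreq directory under /sys/devices/system/cpu, loops over all of them and reads the value back so a silently ignored write is visible. The function name set_governor is made up for the example.

```shell
#!/bin/sh
# set_governor GOVERNOR [SYSFS_ROOT]: write GOVERNOR to every writable
# cpuN/cpufreq/scaling_governor and print what each file reads back.
# SYSFS_ROOT defaults to the real sysfs location; note the absolute path.
set_governor() {
    gov=$1
    root=${2:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -w "$dir/scaling_governor" ] || continue
        echo "$gov" > "$dir/scaling_governor"
        echo "$(basename "$(dirname "$dir")"): $(cat "$dir/scaling_governor")"
    done
}

# In /etc/rc.local (which runs as root) this would be called as:
#   set_governor ondemand
```

If nothing is printed, no writable scaling_governor file was found, which usually means no cpufreq driver is loaded or the script is not running as root.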

Hi All,

I've managed to shut down powernowd as per the instructions but can't use ondemand. I don't have "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor" on my laptop; under cpu0 the only directory I have is cache. I've tried to create the directories using sudo but get an Operation Not Permitted error. Can anyone please help or offer other suggestions? This error is becoming a real problem and I'd prefer not to have to install something other than Ubuntu, as I've finally got everything configured exactly how I like it.... :-(

GuitarFingers

Nikolaus Filus (nfilus) wrote :

For using the ondemand scaler you need to load the cpufreq_ondemand module.

# load module for current session
   sudo modprobe cpufreq_ondemand
# and add it permanently to the boot sequence
# (plain "sudo echo ... >> file" fails because the redirection runs as the normal user)
   echo cpufreq_ondemand | sudo tee -a /etc/modules

BTW: the ondemand scaler gives me around 10°C cooler system (47 against 58 with performance scaler, which seems default for the ubuntu kernel). The fan is most times off, which also causes a better battery runtime.
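Before writing to scaling_governor it can help to check that the governor is actually offered. The sketch below assumes the standard scaling_available_governors file (a space-separated list); governor_available is a hypothetical helper. If the cpufreq directory is missing altogether, as reported above, that typically means no cpufreq driver (for example powernow-k8 or p4-clockmod) is loaded, and loading the governor module alone will not create it.

```shell
#!/bin/sh
# governor_available GOVERNOR FILE: succeed if GOVERNOR is listed in
# FILE, which has the format of scaling_available_governors.
governor_available() {
    for g in $(cat "$2" 2>/dev/null); do
        if [ "$g" = "$1" ]; then return 0; fi
    done
    return 1
}

avail=/sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
if [ ! -e "$avail" ]; then
    echo "no cpufreq directory for cpu0: is a cpufreq driver loaded?"
elif governor_available ondemand "$avail"; then
    echo "ondemand is available"
else
    echo "ondemand is not listed; try: sudo modprobe cpufreq_ondemand"
fi
```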

GreatBunzinni (greatbunzinni) wrote :

I've just tried that, and nevertheless opening up YouTube drives my laptop's thermal2 reading up to 80C in no time, and it keeps on rising. When I close my browser, the thermal2 reading takes a bit of time to come below 60C.

malluguy (biju-chacko) wrote :

I have the same problem. My Toshiba (Centrino 1.7 Ghz, 512 MB) heats up when I run Ubuntu. I have the Beta release of Edgy Eft. I have always wondered why my laptop heats up when I run Ubuntu. Ubuntu is by far the best distro I have seen. I would be delighted to see this bug fixed in the near future.

Ugh, I was hoping this would be fixed in Edgy. Surely this won't go a whole
year before being fixed. (If it doesn't make it into Edgy, it will have to
wait until April before it gets in!)

:(

--
    Trae McCombs || http://occy.net/
  Founder - Themes.org // Linux.com

I've got an HP Compaq nx9005 laptop with an Athlon XP Mobile 2500+ and CPU scaling works great for me: the laptop never overheats, and I see that the fan is running a lot less than in Windows XP (which came preinstalled on the laptop) or Mandriva, and performance in Kubuntu Edgy Eft is better, too.

GreatBunzinni (greatbunzinni) wrote :

Once again the laptop crashed on me due to overheating. This time it was while I was upgrading KDE through the Adept package manager.

I really hope this nasty bug is fixed in kubuntu 6.10. It is a very very nasty PitA.

Have you all simply given up on Ubuntu? How are you coping with the effects
of this bug? It has forced me to try just about every single distro out
there, and sadly, it affects most major distro's. The only ones I've found
it doesn't affect are Mandriva 2007 and Gentoo 2006.1 Both of which I feel
are just not up to the level of experience I want that Ubuntu gives me.

The only reason I even REMOTELY considered switching to another distro was
because of this bug. Now I'm stuck. My laptop is broken, and I can't use
anything else. And Edgy won't be fixed either it seems. This means, it'll
be a whole nother 6 months I have to wait, and hope this bug is fixed.

I can't afford to buy another laptop. I won't use windows. My only option
is to try and either use the busted Ubuntu, with the overheat issue (which
I've dealt with for 6 months -- at the peril of a lot of work being lost at
times), or go with Gentoo which is quite a painful thing for someone used to
the ease of Ubuntu. (Even if you've used Linux full-time for 10 years like
I have)

What are you guys doing with relation to this bug? Has everyone simply just
"dealt" with it, or have you switched distro's.

Thanks,
Trae

PS. I want my favorite distro back! [Ubuntu]

Hi All, I just got a new used laptop and decided to try a distro other than Fedora Core, which I've been using for years (Red Hat). I really like Ubuntu/Debian more than FC. I'm supposed to be using Solaris 10 x86, but that's a longer story. The problem is that the laptop keeps overheating, just like all these others. I have an HP Pavilion dv1000. I can't get the fan to even turn on! Luckily, I'm mainly using the terminal for what I'm doing at work (Sun Fire E25K VCS cluster), so if it croaks, I'm really not out anything. What gives? I see hints all over the place, but no solutions. Is there a summary of this issue in one place? I've tried all the solutions listed here and on many other sites to no avail. By the way, my /proc/acpi/fan directory is empty, and lm-sensors doesn't find any sensors. Am I out of luck with this laptop?

On Tue, 2006-10-17 at 01:31 +0000, Trae McCombs wrote:

> The only reason I even REMOTELY considered switching to another distro was
> because of this bug. Now I'm stuck. My laptop is broken, and I can't use
> anything else. And Edgy won't be fixed either it seems. This means, it'll
> be a whole nother 6 months I have to wait, and hope this bug is fixed.
>
This mail isn't really very helpful.

Nagging or wailing on a bug doesn't get it fixed.

This bug hasn't not been fixed (urgh, green wiggly lines) because
developers don't believe it's that important.

This bug has not been fixed because the developers have no idea why this
happens, and not enough information to find out. We don't even know
where to begin to ask further questions.

At this point, the most useful thing somebody with this bug can do is
get their laptop in front of a developer. Then the bug can be
demonstrated, and explicitly demonstrated as not occurring with Mandriva
2007.

Also we'd really like to know whether this bug occurs with the pristine
upstream kernel or not.

We can then also try various kernel packages on the laptop, bisecting
until we figure out exactly which line of code causes the behaviour to
occur or go away.

Perhaps somebody could get to UDS-MTV and bring their laptop?

Scott
--
Scott James Remnant
<email address hidden>

At least for me, this hasn't happened since I deleted my /etc/default/powernowd.

On 10/17/06, Christian Bjälevik <email address hidden> wrote:
>
> Atleast for me, this hasn't happened since I deleted my
> /etc/default/powernowd.

root@exultate:/etc/default# ls *power*
ls: *power*: No such file or directory
root@exultate:/etc/default#

I don't seem to have powernowd

--
    Trae McCombs || http://occy.net/
  Founder - Themes.org // Linux.com

No, you have to add it yourself if you want to tweak the default settings of the daemon. When I hear the fan run for a while I DO manually revert to the powersave governor though :-).

Hi there,

I've got an Aspire 1522 Wlmi too, and have had problems with overheating since shortly after I got it. I managed to kill it after about 11 months from overheating under both Linux and Windows with any serious CPU intensive task. It got worse the older the laptop got, and although I found how to avoid the problem not long after I got it, in the end it couldn't restore a backup before it switched off, which meant I couldn't work around it any more.

Although it does mean you can't use the CPU to its fullest potential, the two methods for Windows and Linux that keep me going are:

Install RightMark under windows, and force it to 1.1V/800 Mhz.

Under Linux:
    modprobe powernow-k8
    modprobe cpufreq_powersave
    echo "powersave" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Yeah I know, it'll force your CPU to run at its slowest speed, but if you're desperate, then give it a go. There's a package called cpufrequtils in the repositories, and you can edit /etc/default/cpufrequtils and change the governor to powersave so that it takes effect on boot (use BUM to disable powernowd if that's still around).
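
The commands above only touch cpu0. As a rough sketch, assuming each CPU exposes its own cpufreq/scaling_governor file under /sys/devices/system/cpu, the same write can be looped over every CPU. The function name set_governor and its optional second argument are made up for this illustration; the second argument only exists so the loop can be tried against a scratch directory before touching /sys.

```shell
#!/bin/sh
# Hypothetical helper: write one governor name to every
# cpuN/cpufreq/scaling_governor file under a sysfs CPU directory.
# Assumes one scaling_governor file per CPU; run as root on a real system.
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip when the glob matched nothing or the CPU has no cpufreq support.
        [ -e "$f" ] || continue
        echo "$governor" > "$f" || return 1
    done
}
```

After loading powernow-k8 and cpufreq_powersave as shown above, calling "set_governor powersave" as root would then cover every core rather than just the first.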

James

GreatBunzinni (greatbunzinni) wrote :

Well that sucks and sucks a lot. I've just installed Kubuntu 6.10 and it still suffers from the overheating problem. Those are not encouraging words...

eBobster (ebobster) wrote :

More data for the problem:
Compaq V4000 (laptop) - Centrino 1.73G. Max CPU Temp (100 C) (from Intel)
Using Ubuntu Dapper. Under high load would claim critical temp reached and halt. dmesg shows CPU reached (102 C).
Using powersave and kpowersave

For me this problem started out of the blue - not following any kernel change or particular apt-get updates.

I use "watch -n 1 acpitool -tfc" to keep watch over the system atm. Noteworthy mentions are:

  Fan : <not available>
  Throttling control : no
  Limit interface : no
  critical (S5): 100 C
  passive: 95 C: tc1=2 tc2=5 tsp=300 devices=0xdffea660

The first three seem wrong for a Centrino -- but I guess that is a problem with the ACPI interface to the BIOS here.

The fan (there is one) responds autonomously -- probably BIOS controlled? So does the above really matter?

Doing something like kernel compile I would see the CPU temp hovering between 80-100. Passive would kick in every now and again.

polling_interval was set to 2, I changed this to 30 and observed that sometimes the CPU temp spiked at 102, 105, 107 but for no more than 1 second then immediately dropped back to sub-100. No instability, so could be a glitch?

Sometimes Linux will hit 100+ on 30 seconds and halt.

My conclusions:

The polling is far too rigid. Perhaps it should take an average over a longer interval, or require a sustained critical temperature before shutting the system down (and make this user configurable under /proc/acpi/, as the rest is). I like the idea of polling_interval being 2, but my system would be fine if it only acted on the critical temperature when the CPU was at 100+ for more than 3 of these intervals.
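
To make the "sustained critical temperature" idea concrete, here is a minimal sketch of the check as a shell filter. It is not how the kernel behaves; it only illustrates the proposed policy. The function name sustained_critical and the one-reading-per-line input format are assumptions for the example.

```shell
#!/bin/sh
# Hypothetical debounce check: read one integer temperature per line on
# stdin and succeed (exit 0) as soon as `needed` consecutive readings are
# at or above `limit`. Fail (exit 1) if the input ends first.
sustained_critical() {
    limit=$1
    needed=$2
    run=0
    while read -r temp; do
        if [ "$temp" -ge "$limit" ]; then
            run=$((run + 1))
            if [ "$run" -ge "$needed" ]; then
                return 0
            fi
        else
            # A single cooler reading resets the streak, so brief spikes
            # never add up to a shutdown.
            run=0
        fi
    done
    return 1
}
```

With a limit of 100, "more than 3 intervals" corresponds to needed=4: a lone 102 or 107 between sub-100 readings is ignored, while four readings in a row at 100 or above would still trip.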

The passive trip could be wrong, but that depends on the interpretation of the 100+ spikes.

I currently avoid the problem by changing things to:

  echo 5 > /proc/acpi/thermal_zone/THR0/polling_frequency
  echo 5 > /proc/acpi/thermal_zone/THR1/polling_frequency
  echo "110:102:90:60:50:40" > /proc/acpi/thermal_zone/THR0/trip_points

The 110 attempts to offset the spike (which is a rare spike); the 90 sets the passive kick-in which takes the CPU speed to 1.3G during the passive region.
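
The string written to trip_points is a colon-separated list. The explanation above pins down two positions (110 is critical, 90 is passive); the sketch below assumes the full order is critical:hot:passive:active0:active1:..., which agrees with those two positions but is otherwise an assumption about the old /proc interface. The function name describe_trip_points is made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: label the fields of a trip_points string.
# ASSUMED field order: critical:hot:passive:active0:active1:...
describe_trip_points() {
    old_ifs=$IFS
    IFS=:
    # Split the single argument on colons into positional parameters.
    set -- $1
    IFS=$old_ifs
    # Need at least critical, hot and passive to say anything useful.
    [ $# -ge 3 ] || return 1
    echo "critical=$1"
    echo "hot=$2"
    echo "passive=$3"
    shift 3
    n=0
    for t in "$@"; do
        echo "active$n=$t"
        n=$((n + 1))
    done
}
```

Running describe_trip_points on a string before echoing it into trip_points is a cheap way to check that each value lands in the field you intended.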

Powersave and co (tried a few) seemed to be doing their job. (note Klaptop is the only thing that can successfully suspend to RAM for me)

I'm of the belief that my hardware (1+ year old always working) is showing some minor cracks with the 100+ temp spike. But I also think the kernel could be more forgiving of it.

http://www.columbia.edu/~ariel/acpi/acpi_howto.html was a very good read.

Careful!!

100ºC is a lot. A processor should work at about 25ºC, and its
life and performance decrease above 50ºC or 70ºC.

You should find out whether your system really is hot, or whether it is
a crashed/buggy thermal sensor.

If your fan is running like "crazy", I'll bet that the sensor is broken. All
fans, at top speed, can keep the processor under 80ºC, no matter the
work it is doing.

--
Pedro Martínez Juliá
\ <email address hidden>
)| WebLog: http://www.pedromj.com/blog
/ Website: http://www.pedromj.com
GoogleTalk: <email address hidden>
HispaLinux member #311
Linux user #275438 - http://counter.li.org

Intel said the Max temp of a Centrino is (100 C).

Mine idles at 50-60 (C) @ 800 MHz with no fan.

The fan is "crazy" from about 80 I think. This seems normal.

Btw, I am assuming the Thermal zone 1 is my CPU.

Can anyone point me at proper _fact_ statistics for Centrino temperatures - I actually found it hard to get the Max temp.

GreatBunzinni (greatbunzinni) wrote :

I'm far from being an expert on thermal effects on hardware but I do have some grasp on applied mechanics. Thermal differences like the ones present in working hardware (room temperature to 60C and then 60C to 100C) induce considerable thermal stresses, so those aggressive cooling cycles do indeed add quite a lot of wear and tear onto the material, which ends up trashing the stuff. So the temperature climbing over the critical limit isn't the only way to break the hardware with the hardware-produced temperature. Therefore please be careful with what you do with those thermal limits. It may cost you quite a few bucks.

eBobster (ebobster) wrote :

Thanks for the concern; it's worth everyone giving it thought. However, once again, we need some facts here.

If anything I am helping the situation by causing the CPU to back-off earlier with the passive 90. The 110 is an attempt to deal with what seems to be a glitch. The rest seems like normal function. As far as I can tell from this laptop it is designed to run at (50 C) and progress quickly to (90 C) when *I* demand it -- or I want my money back.

Was none of this apparent from the post??

I have the same problem:

Laptop: IBM T41p
Ubuntu: Dapper works perfectly/Edgy gives Critical Temperature Reached messages in kern.log and halt with T ~= 94 C
Upgrade procedure from Dapper: /home kept, format and install of /
Good point: acpi kicks in and protects my hardware!
Bad point: CPU freq + throttling would be better.

Here is some info about the system:

loa@cargo:~ $ cat /proc/acpi/thermal_zone/THM0/trip_points
critical (S5): 93 C
passive: 89 C: tc1=8 tc2=5 tsp=600 devices=0xdff4d338

loa@cargo:~ $ cat /proc/acpi/thermal_zone/THM0/polling_frequency
<polling disabled>

- This is strange; I echoed 2 into it and got:

# echo 2 > /proc/acpi/thermal_zone/THM0/polling_frequency
# cat /proc/acpi/thermal_zone/THM0/polling_frequency
polling frequency: 2 seconds

# cat /proc/acpi/processor/CPU/info
processor id: 0
acpi id: 1
bus mastering control: yes
power management: yes
throttling control: yes
limit interface: yes

If I do a dmesg | grep -i acpi I have the following strange stuff:

 ACPI: Looking for DSDT ... not found!

So I suppose that the DSDT shipped with Edgy is broken but was OK with Dapper. Where is it possible to get the Dapper DSDT? I could test it and see whether it is actually the problem.

I am attaching the dmesg I have. My kernel etc... are all out of the box from Edgy.

Ok, I took a look at my DSDT and tried to recompile it with iasl and got some errors. So I will explore the DSDT case further. If you want to recompile your DSDT:

To get the compiler: https://help.ubuntu.com/community/ACPIBattery
To get your current DSDT and fix it: http://forums.gentoo.org/viewtopic.php?t=122145

I try to document my "work" to fix my problem here as most likely people will come here when having this problem and you may react and provide some info. If you consider that I should put somewhere else like forum/wiki, just tell me.

Interesting: after upgrading my BIOS and microcontroller version from 2.11 to the latest 3.21, I get no errors when I check my DSDT file and recompile it.
http://www.thinkwiki.org/wiki/BIOS_Upgrade

More interesting for non-T41p owners with such problems: as you can see in my previous comment, I had the following:

$ cat /proc/acpi/thermal_zone/THM0/trip_points
critical (S5): 93 C
passive: 89 C: tc1=8 tc2=5 tsp=600 devices=0xdff4d338

It means that I was switching from passive cooling to active at 89C for the CPU. It's like doing an emergency brake with a car: you need to absorb a lot of energy to stop when at high speed. Here we need to extract a lot of heat very fast, which was not possible, hence hitting the 93C mark and the emergency shutdown.

Solution: Put the passive cooling mark lower:

# echo -n "90:80:50:55:50:45" > /proc/acpi/thermal_zone/THM0/trip_points

That way it starts to be active at 50C and throttles/scales the frequency early to stay cool. I have updated the /etc/rc.local script to run the command at boot time and also set the polling frequency. Note that the DSDT file is still not found in the dmesg output.

Warning: My thermal zone is THM0; yours may be different, so check first!

GreatBunzinni (greatbunzinni) wrote :

I've just looked into my thermal_zone values and my ./THRS/trip_points lists the following values:

rui@laptop:THRS$ cat trip_points
critical (S5): 80 C
passive: 75 C: tc1=2 tc2=5 tsp=300 devices=0xdff5d310

My ./THRC/trip_points lists the following values:

rui@laptop:THRC$ cat trip_points
critical (S5): 97 C
passive: 90 C: tc1=2 tc2=5 tsp=300 devices=0xdff5d310

It seems to me that THRC is the CPU temperature and THRS is the motherboard (system?) temperature. I admit that I'm clueless but it looks to me that setting the active temperature trip point for the CPU at 90 C and for the system at 75C is a bit silly. Aren't those values too high? Moreover, it doesn't seem reasonable to set a 5C and 7C threshold between passive cooling and shutting the system down.

Could someone please give me some insight on this?

I've had the same issue with my TravelMate 4404WLMi with a Turion ML-34 processor. Dapper worked fine, both x86 and amd64, while I had to run Edgy amd64 with scaling_max_freq bolted to 800000 to prevent overheating. It turned out that polling was disabled. After setting TZS0/polling_frequency to 1 second, things started working: the ondemand governor now knows about the CPU temperature and I have no need for powernowd.

I wonder why polling wasn't enabled in the first place.
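
Since several reports here come down to "<polling disabled>", a quick way to check every zone at once may help. This is only a sketch: it assumes the /proc/acpi/thermal_zone layout shown in the comments above, and the function name zones_without_polling is invented for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every thermal zone whose
# polling_frequency file reports "<polling disabled>".
# The optional argument overrides the zone directory (useful for testing).
zones_without_polling() {
    tz_root=${1:-/proc/acpi/thermal_zone}
    for f in "$tz_root"/*/polling_frequency; do
        [ -r "$f" ] || continue
        if grep -q 'polling disabled' "$f"; then
            zone=${f%/polling_frequency}   # strip the file name
            echo "${zone##*/}"             # keep only the zone directory name
        fi
    done
}
```

Each zone it prints can then be given an interval by hand, as root, in the same way as the echo commands quoted above.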

Hugo Vincent (hugo-vincent) wrote :

I confirm that this affects my Compaq Evo n800v as well - polling was disabled. Running:
$ sudo sh -c 'echo 5 > /proc/acpi/thermal_zone/TZ1/polling_frequency'
enabled polling which seemed to cause the fan to ramp up properly. My dmesg output also contains:
  ACPI: Looking for DSDT ... not found!

I never noticed this problem on Dapper. This seems like quite a major and important bug (with the potential for hardware damage).

Hajo Brunne (hajo-brunne) wrote :

My IBM ThinkPad R50p is affected as well. I was running SUSE and FC4 before and had no problems with shutdowns due to high temperatures. I do not remember which kernel version I used, but I installed the kernel sources for 2.6.17 (with Ubuntu patches) and noticed that the ibm_acpi.c driver is outdated (compared to the driver in 2.6.20.1).

pirast (pirast) wrote :

Could you please try if the same problem occurs with Feisty Herd 4 which ships with Kernel 2.6.20 (Just boot the LiveCD function)?

Hajo Brunne (hajo-brunne) wrote :

Probably hard to reproduce ... I was installing opencms (postgres + java activity). Now I tried a
int main(){while(1);} to heat up the CPU, but when it gets to around 92 degrees, it switches back from 1.7 GHz to 600 MHz for some seconds, which lets the temperature drop dramatically in a short moment. Any other CPU burner recommended?

pirast (pirast) wrote :

Try stress (available in universe). Run it with stress -c xx (xx can be any number), there is more information and more options available on the manpage.
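
To turn a burn-in run into something that can be attached to the report, it may help to log the zone temperature while the load is running. The sketch below assumes the zone's temperature file holds a single line of the form "temperature: 54 C"; that format, and the names read_temp and log_temps, are assumptions for the example.

```shell
#!/bin/sh
# Hypothetical helper: print the integer temperature from a
# /proc/acpi/thermal_zone/<ZONE>/temperature style file.
# ASSUMED file format: "temperature:             54 C"
read_temp() {
    sed -n 's/^temperature:[[:space:]]*\([0-9][0-9]*\) C.*/\1/p' "$1"
}

# Print `count` timestamped readings from `file`, `interval` seconds apart.
log_temps() {
    file=$1
    interval=$2
    count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date +%H:%M:%S) $(read_temp "$file")"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, start "stress -c 1" in one terminal and run "log_temps /proc/acpi/thermal_zone/THM0/temperature 2 150" in another, substituting your own zone name.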

Alejandro Zanotti (aleza66) wrote :

I am going to hitchhike with a "me too" [tm].
I had the same problem with my Acer Aspire 1522WLMi: I used to leave Jukebox playing music, and when I came back the laptop had shut down.
I tried beryl and I couldn't use my computer at all!

Rimas Kudelis (rq) wrote :

Count me in. I have the same problem on a non-laptop P4 computer.

GreatBunzinni (greatbunzinni) wrote :

It seems that the next release of Ubuntu (feisty fawn - 7.04) is just around the corner. If I'm not mistaken that will be the 4th release since this bug report was submitted. Will this bug be finally fixed in this new release?

I know that bitching and complaining doesn't get things fixed but this is starting to get very frustrating. The only action this bug report gets is from comments of disgruntled users who repeatedly announce that this bug exists and is endemic to Ubuntu. To add to the frustration it seems that besides those who suffer from this bug, everyone is simply ignoring this problem.

Personally I find it very disappointing, especially since last week my laptop, where I ran Kubuntu since the days of 5.04, died mysteriously. Maybe it wasn't even due to this bug but come on, all those crashes due to overheating obviously did not do any good to it either.

pirast (pirast) on 2007-03-18
Changed in linux-source-2.6.20:
assignee: nobody → ubuntu-kernel-team
status: Unconfirmed → Confirmed
Elliot Hughes (elliot-hughes) wrote :

I am confirming this since this bug has been reported in a lot of releases, with a lot of people experiencing the same bug and logs available below. Sorry for the delay.

Changed in acpi-support:
status: Unconfirmed → Confirmed
Changed in linux-source-2.6.20:
importance: Undecided → High

This report is quite hard to pin down as these are several different models of computer being reported (effectively there are several bug reports muddled up in one page).

The common trait I found between two of the outputs is that doing:

  $ cat /proc/acpi/processor/CPU0/throttling

reports "<not supported>". If your laptop doesn't match this, please open a new bug report.
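
For anyone checking whether their machine matches, the test can be scripted. This is a sketch only; the function name throttling_supported is made up, and the processor directory is not always CPU0 (one report above shows /proc/acpi/processor/CPU/info), so the path is a parameter.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the throttling file does NOT say
# "<not supported>", exit 1 if it does, exit 2 if the file is unreadable.
throttling_supported() {
    f=${1:-/proc/acpi/processor/CPU0/throttling}
    [ -r "$f" ] || return 2
    if grep -q 'not supported' "$f"; then
        return 1
    fi
    return 0
}
```

A return status of 1 means the machine matches this report; 2 usually means the processor directory has a different name and the path needs adjusting.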

The next question is /why/ we're getting that message; the hardware and BIOS indicate whether throttling (rather than scaling) is possible and this can be controlled via ACPI based on the safe temperature limits provided by the manufacturer of that machine.

Somebody said that they'd had a specific success with Suse 10, in the case where Ubuntu hadn't worked---what does the above file contain when running on SUSE without the problem manifesting itself, does this proc file still indicate "<not supported>"?

Since the last version of Ubuntu, the attempt now is to always use the kernel's built-in "ondemand" scaling wherever possible rather than running the separate userspace application "powernowd" to make decisions about what scaling speed (not throttling) would be best.

There's a possibility that a mix-up is occurring and 'powernowd' is allowed to remain running in cases where it should not be. This appears to be what Trae McCombs is reporting (stopping 'powernowd' and manually selecting 'ondemand' solves the problem).

Trae: what type of CPU do you have? Could you tell me from the output of running 'lsmod' whether 'p4-clockmod' (listed as 'p4_clockmod') is in use, or another CPU scaling module?

Daugirdas (daugirdas) wrote :

Dear Paul,

I am glad to see some significant progress on this bug. I guess you referred to me in your post... Yes, I have been running openSUSE 10.0-2 AMD64 for about 2 years now. I have had only 2 shutdowns over that time with typically heavy machine load. openSUSE uses the ondemand governor by default. I am sure you are all familiar with the concept of this pm utility. In essence, when the temperature reaches a trigger point the CPU is scaled down and the system cools down a few degrees, then resumes at full power, etc.

daugirdas@dtrsuse64:~> cat /proc/acpi/processor/CPU0/throttling
<not supported>
daugirdas@dtrsuse64:~>

The same is stated in http://librarian.launchpad.net/1511721/cpuid.txt attachment generated from WinXP x64.
So having ondemand working cleanly is critical at least on Acer Aspire 1520 series. Performance governor kills SUSE instantly for example. Same goes with userspace.

In other developments my laptop is showing some "great" hot results. I was hoping to convert my c: vfat drive to ntfs but it overheated just a mere 2 minutes into the operation (I had to restart into some special safe mode for that). Windows install cd in recovery mode overheats nearly instantly. Surprisingly I was able to complete both XP x32 and x64 setup w/o any problems in the past. Maybe they load an AMD driver by some chance...
The system simply locks up if left in BIOS setup for a couple of minutes. And yes, it is possible to use it to fry an egg!

Paul Sladen (sladen) wrote :

Thanks for that, Daugirdas. In your particular case, I'd pop off the keyboard and check that the CPU heatsink is bonded correctly to the CPU using a layer of thermal glue or a thermal pad.

The machine shouldn't be able to overheat itself /that/ quickly, and something is causing the CPU's last-resort thermal trip diode to kick in.

(Yes, I've heard reports of 2? (I think) machines where overheating issues were caused by a missing pad at manufacture time and I've personally come across two machines on the second-hand market where the heat-sinks had been stuck back on with *blue tack* by the previous owner...).

Changed in linux-source-2.6.17:
assignee: mjg59 → ubuntu-kernel-team
GreatBunzinni (greatbunzinni) wrote :

Paul, I do believe this problem is a tough cookie and hard to pin down. Yet, it's impossible to pin anything down when there is no effort to attempt any pinning whatsoever throughout periods which go as far as 8 months. That's what is happening with this bug and it has been the case for nearly 2 years now.

Ben Collins (ben-collins) wrote :

Marking this critical. I can't guarantee that this will get fixed, because quite honestly, I've no way to reproduce it. What's more, I really believe this is a lack of information from hardware vendors for how to operate properly with this sort of system.

First things first, in feisty, powernowd is a one shot on boot, it doesn't stay running. Secondly, ondemand is our default governor in most cases now.

This should resolve things for most people.

For others, let's see if anyone is willing to be gracious enough to donate/loan their equipment to one of our kernel developers. We have East Coast US, Ottawa, North West US, and UK covered. We also have a UDS coming up in Seville, Spain. About the only way to get this fixed is to debug right on the machine (remote ssh access is not very useful).

Any takers? :)

Changed in linux-source-2.6.20:
importance: High → Critical
mathew (meta23) wrote :

I have the same problem on an IBM ThinkPad laptop, a T41p. Same error message about DSDT in dmesg.

In my case, the system still believes CPU throttling is supported according to /proc/acpi; however, the CPU never actually changes speed.

I tried rebuilding my ACPI firmware as per the comments, but that made things even worse--X then caused the system to shut down.

mathew (meta23) wrote :

Uh, sorry, I mean a T42p. The T41p works fine.

Also, I tried upgrading my BIOS to the latest version, didn't help. Running Edgy Kubuntu.

I want to contribute; I have Ubuntu running on my Aspire 1522.
How can I contribute? What can I post, or check?
alex

--
Alejandro Zanotti
-------------------------------------------------------------------------------
"There are 10 types of people: those who can read binary and those who can't"

lion71 (wrycatcher) wrote :

Acer Aspire 1640Z series. 1 year old. Runs maybe 4 hours a day at different intervals, normal power down in between uses.

Up until 2 weeks ago, this notebook ran Windows XP exclusively. And it NEVER had a problem with excessive heating, certainly not tripping any hardware BIOS emergency shutdown.

2 weeks ago, I installed Ubuntu Edgy. It is up to date with the "recommended updates" (that little orange and white icon on the task panel). This machine has shut itself down at least 4 times and it is definitely heat-related, judging firstly from empirical evidence. In fact, in one really bad shutdown, the computer refused to power on for several minutes, as it was apparently really hot. I've since installed the monitor applets (battery, CPU/hardware temp, and CPU throttling). So far I've seen it topped out at like 50-62 C, which is fairly warm, but not dangerously so. I've seen the CPU throttle up and down from 600 MHz to 1.7 GHz (max). Temperature spikes during CPU throttle up and then drops on throttle down. At idle, it's in the 35-45 C range.

I have to admit, I have yet to do any of the customizations that I've seen in this thread, such as the 'powernowd' and 'ondemand', nor have I mucked with the trip points for passive mode cooling -> active mode cooling. I am going to try those two. I am also going to look in the log files for more clues. The only thing I have done is set the suspend timeouts to lower values, just to keep it from idling hot when it doesn't need to be...and I also updated my BIOS just in case.

I'm following a lot of the discussion here without much trouble except for this DSDT. Can someone explain to me what DSDT is and what role it plays in all this?

sardion (ubuntu-sardion) wrote :

I am in Los Angeles and would be willing to let my laptop be used for some experimenting for a while. Please contact me via email about this if it can be done.

FWIW, I have "solved" the problem by simply setting scaling_max_freq to 1GHz (even though I have a 1.6GHz PM chip).

I am not 100% certain that this bug is all the same bug (I doubt anyone is) but if it will help, someone can borrow my machine for a while.

Alejandro Zanotti (aleza66) wrote :

Is there a way to set a max speed for the CPU? I can't encode an MPEG to a
DVD.
I don't mind if it takes longer, but I need to do it without an "auto
shutdown".


sardion (ubuntu-sardion) wrote :

For those of you who have this problem, my workaround is:

echo -n 1000000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

Though I actually use the sysfs-utils package to set this automatically at boot time.

Alejandro Zanotti (aleza66) wrote :

Could you explain it better? I am not a pro... (yet) :)
What is it that you do?


sardion (ubuntu-sardion) wrote :

Sure. Here's the easiest way:

In terminal do:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

This will give you a list of what speeds your processor can run at. Make a note of them.
Now in the terminal do:

sudo aptitude install sysfsutils

once it's done installing that, do (again in terminal):

gksudo gedit /etc/sysfs.conf

That will open a file, you need to add the following lines to that file (at the end of it):

devices/system/cpu/cpu0/cpufreq/scaling_governor = ondemand
devices/system/cpu/cpu0/cpufreq/scaling_max_freq = MAX
devices/system/cpu/cpu0/cpufreq/scaling_min_freq = MIN

but replace MIN with whatever number was smallest of the available_frequencies in the first step and replace MAX by a number from available_frequencies that seems reasonable (I use 600000 for MIN and 1000000 for MAX, i.e. 600MHz and 1GHz).
Then save the file and restart.

I pretty much used trial and error, lowering MAX (make sure it is always set to one of the available_frequencies values of course) until there are no longer overheating problems.
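
The "pick a value from the available frequencies" step can be sketched as a small function: given a ceiling and the list printed by scaling_available_frequencies, it returns the highest listed frequency that does not exceed the ceiling. The name pick_max_freq is made up for the example, and all values are in kHz as in the sysfs files.

```shell
#!/bin/sh
# Hypothetical helper: print the highest frequency from the argument list
# that is less than or equal to `cap` (all values in kHz).
# Returns 1 if every listed frequency is above the cap.
pick_max_freq() {
    cap=$1
    shift
    best=
    for f in "$@"; do
        if [ "$f" -le "$cap" ]; then
            if [ -z "$best" ] || [ "$f" -gt "$best" ]; then
                best=$f
            fi
        fi
    done
    [ -n "$best" ] || return 1
    echo "$best"
}
```

For instance, "pick_max_freq 1000000 $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies)" prints a value that can be used in place of MAX in /etc/sysfs.conf; lower the ceiling and repeat if the machine still overheats.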

Alejandro Zanotti (aleza66) wrote :

Thanks a lot. Did this stop it from shutting down?


sardion (ubuntu-sardion) wrote :

Yes, on my laptop, setting the max_freq to 1GHz prevents overheating and the computer does not shut down on its own. I have left it running for more than a day doing CPU intensive tasks and it's fine when it's slowed down.

Alejandro Zanotti (aleza66) wrote :

thanks for the tip!


amitm02 (amit-man) wrote :

I just wanted to add that I thought I had the same bug. I fought with it a lot, and used many of the tips in the previous comments.
In the end it turned out that my laptop fan was just too dirty! A quick clean with compressed air and the problem was solved.
Worth checking out. :)

amit man

p.s.
Yes, I did feel like an idiot.

Alejandro Zanotti (aleza66) wrote :

Do you have a disassembly guide for the laptop?


Rimas Kudelis (rq) wrote :

Again: the problem for me arises even on a non-notebook computer which I assembled just half a year ago, so it's not yet choking with dust.

Those "emergency shutdowns" piss me off. Especially when GRUB is now so sensitive about uncleanly unmounted partitions, forcing me to log into single-user mode just to press ctrl+alt+del quite often...

Giuseppe Pennisi (giupenni78) wrote :

I'd like to add a similar bug: https://bugs.launchpad.net/ubuntu/+bug/104069

I think it is the same problem.

gp

Paul Harrison (peharri) wrote :

I think some people are barking up the wrong tree with this one. There is a problem with the CPU overheating, but it has little or nothing to do with throttling, as the complaints are about the CPU overheating when CPU usage is *supposed* to be high (ie compiling applications, etc)

Throttling is a technique meant to be used during periods of *low* CPU usage. The purpose is to reduce the power consumption of the CPU when it isn't doing much. You're not supposed to throttle the CPU when it's doing a lot of work except in a dire emergency, because that defeats the purpose of having a fast CPU in the first place. A CPU that's throttled whenever it's doing a lot of work is essentially one that may feel slightly more responsive than an equivalent machine with a CPU that runs at the slower speed by default, but is otherwise just as poorly performing.

I have a Thinkpad T60. Until a couple of weeks ago, I was running Debian sarge with a 2.6.16 kernel. I can tell you what's different between it and the Feisty install I have today: on the Debian machine, when the fans needed to come on full blast, they did. On the Feisty install, they don't. Even typing "# echo level 7 > /proc/acpi/ibm/fan" to force the fans to their highest "supported" speed is not enough to actually get them to go at their documented rate. They certainly aren't making the same amount of noise. The only way I can permanently prevent my laptop from overheating is to override the rate control altogether with "# echo level disengaged > /proc/acpi/ibm/fan" which does get the fans up to full speed, but makes it permanent (whereas under Debian, the fans would only go to full speed if the laptop really was doing a lot of work. The various OpenGL screensavers and Unreal Tournament would do that.)

If "level 7" fans meant the same thing under Feisty as it did under Sarge, I don't think there'd be a problem.

I can manually enable throttling and have done so, but the result has been somewhat unusable: whenever I start anything from installing packages (gzip uses CPU...) to compiling a large program, the CPU speed plummets to something barely usable. I'm failing to see the point. I can see using it if the CPU approaches an unsafe temperature even when fans are on full blast (essentially as a last resort, to save the system from either burning up or turning off), but not if simply bumping up the fan speed would do the job.

Alejandro Zanotti (aleza66) wrote :

The thing here, I think, is that when the CPU throttles up due to processor needs
and the temperature rises, it should underclock if the temperature rises too
much and comes close to the critical temp, until the temperature comes down... but it
doesn't.
That's the thing, I think.
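
The behaviour described here can be written down as a tiny policy function. This is only an illustration of the idea, not what the kernel or any governor actually does: when the temperature is within a margin of the critical trip point, step down to the next lower available frequency; otherwise keep the current one. The name next_freq and the margin parameter are assumptions for the example.

```shell
#!/bin/sh
# Hypothetical policy sketch: next_freq TEMP CRITICAL MARGIN CURRENT FREQS...
# Prints CURRENT while TEMP is below CRITICAL - MARGIN; otherwise prints the
# highest listed frequency below CURRENT (or CURRENT if already the lowest).
next_freq() {
    [ $# -ge 4 ] || return 2
    temp=$1
    critical=$2
    margin=$3
    current=$4
    shift 4
    if [ "$temp" -lt $((critical - margin)) ]; then
        echo "$current"
        return 0
    fi
    lower=
    for f in "$@"; do
        if [ "$f" -lt "$current" ]; then
            if [ -z "$lower" ] || [ "$f" -gt "$lower" ]; then
                lower=$f
            fi
        fi
    done
    echo "${lower:-$current}"
}
```

With a critical point of 80 C and a margin of 5, a reading of 76 C at 1800000 kHz would step down to the next available frequency, while 70 C would leave the speed alone.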

--
Alejandro Zanotti
-------------------------------------------------------------------------------
"There are 10 types of people: those who can read binary and those who can't"

Paul Harrison (peharri) wrote :

The fans should come on first. It should never underclock the CPU except if
the CPU is idle, or as a last resort.

Daugirdas (daugirdas) wrote :

Reply to Paul's comment:

Ideally, the CPU should never overheat, and throttling is then useful only for reducing power consumption. However, we are talking about very buggy hardware. A lot of laptops on the market today need special drivers to keep their temperature under control. Some models, such as the Acer Aspire 1520 series, don't even support throttling. Therefore software CPU speed control is the only way to keep my machine going.

Alejandro Zanotti (aleza66) wrote :

Ouch... I own an Aspire 1522... damn.
I tried limiting the max speed to 1.6 GHz instead of 1.8 GHz, but when I try to do
some video work the temperature still rises too much.

zeddock (zeddock) wrote :

Just a "Me too!"

Dell Inspiron 8500.

zeddock

Benjamin COUHE (voraistos) wrote :

Just a comment: my CPU doesn't overheat under high CPU usage, but it does when usage is low. (Yep, how strange is that?)

With previous versions of Ubuntu (starting from Hoary), the laptop never had any problems, and I always got full hardware support. Thanks, Elitegroup, for using standards :) Anyway, I have been running cpuburn (burnP6) for 20 minutes now, and OpenOffice.org loads just as fast as usual, and Firefox sucks as usual as well.

Now the important and "funny" point: when I don't use my CPU, or use it in a way that doesn't require more than 600 MHz, frequency scaling goes down automatically to the 600 MHz minimum. BUT on version 7.04 (with the new kernel), when I watch the frequency in cpufreq "ondemand" mode, it keeps jumping between 600 MHz and 1.5 GHz (usually I see it go to 1.2 GHz before it goes to 1.5 GHz). Those jumps are very fast, and CPU usage is 0 or 1% (which is normal when I am not using the PC). It shouldn't go to 1.5 GHz if it doesn't need to; besides, at 1.5 GHz the fan normally always spins, and in this case it doesn't.

If I cap the CPU frequency at some point, the PC won't crash. If I leave it on "ondemand", it will. It has been 25 minutes now with no sign of the bug, and even my HDD still spins (it's known to be sensitive; my HD sucks ass). I hope this info about the "ondemand" and "conservative" modes helps you guys find the bug.
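
The frequency cap mentioned in this comment can be sketched as a small shell helper. This is only an illustration: it assumes the usual cpufreq sysfs layout (one scaling_max_freq file per CPU, values in kHz), the name cap_max_freq and the example limit are made up, and writing to the real files needs root.

```shell
# Sketch: cap the maximum frequency of every CPU found under a cpufreq root.
# cap_max_freq is a hypothetical helper name, not an existing tool.
# $1 = limit in kHz
# $2 = CPU root directory (assumed default: /sys/devices/system/cpu)
cap_max_freq() {
    limit_khz=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu*/cpufreq/scaling_max_freq; do
        # Skip the unexpanded pattern and directories without this file.
        [ -e "$f" ] || continue
        echo "$limit_khz" > "$f"
    done
}
```

For example, running cap_max_freq 1200000 as root should keep every CPU at or below 1.2 GHz until the value is changed again or the machine is rebooted.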

Benjamin COUHE (voraistos) wrote :

Still has not crashed... And I can touch the radiator with my hand without burning myself...

Art Jennings (noddy) wrote :

I can confirm this bug on Kubuntu/Ubuntu (Feisty). My machine is an ACER 1522 notebook and has previously run WinXP, SUSE10, 10.1, 10.2 with no problems.

When compiling a large project, the system shuts down after maybe 4 minutes. I was able to reproduce this 5 times in a row before stumbling on this bug report.

relevant section from /var/log/messages..

Apr 21 17:07:56 user-linux -- MARK --
Apr 21 17:18:44 user-linux kernel: [ 1248.734964] ACPI: Critical trip point
Apr 21 17:18:48 user-linux gconfd (user-5426): Received signal 15, shutting down cleanly
Apr 21 17:18:48 user-linux gconfd (user-5426): Received signal 15, shutting down cleanly
Apr 21 17:18:48 user-linux gconfd (user-5426): Exiting
Apr 21 17:18:56 user-linux exiting on signal 15

Seems a shame I will have to go back to opensuse as the rest of the OS is great, especially the synaptic package manager. oh well...

FarAndre (spiritth) wrote :

The same problem on an Asus Z92T laptop (CPU: Turion TL52).
The CPU temperature is always over 60 °C when I'm doing nothing, and as soon as I start some programs, for example XMMS, the temperature starts increasing (to over 72 °C) and the fan runs like crazy.
I installed Ubuntu Feisty 64 yesterday and the problem appeared. Before that I was using Ubuntu Dapper 64, and the CPU temperature and fan were both fine.

Alejandro Zanotti (aleza66) wrote :

It's really messed up, I can't figure this out. I've tried almost everything
possible.

mathew (meta23) wrote :

I've just tried Feisty.

On the plus side, the CPU is now scaling, whereas under Edgy it always seemed to run at top speed.

On the minus side, doing something CPU-intensive still causes overheating.

Thomas Renninger (trenn) wrote :

AFAIK Ubuntu has picked up a patch from:
http://bugzilla.kernel.org/show_bug.cgi?id=5534
IIRC they used quite an early version, one of Peter Wainwright's.
Just a wild guess, but maybe it works without it?

GULLI.ver (bugs-simon) wrote :

I first experienced this bug on kernel 2.6.20.14, where frequency scaling began to work on my IBM T43p.
Now, when the CPU is scaled up to 100%, my laptop halts because it reaches the CPU temperature limit.

The problem seems to be the scaling steps in /proc/acpi/ibm/fan.

I can manually speed the fan up with "echo level 7 > /proc/acpi/ibm/fan",
which brings the speed up to "4263".
On the lowest level (level 1) the speed is reported as "3268".

Comparing the noise with the fan noise while running Windows, level 7 does not seem to be the maximum at all.

With the fan on auto level under heavy CPU load, the fan speed increases slowly but never goes above "4263".

With the fan in disengaged mode (echo 0x2F 0x40 > /proc/acpi/ibm/ecdump), it speeds up to its full speed, "6859".

Why is fan speed controlled by ibm-acpi and not by the bios?

Running Feisty (2.6.20-15-386) on a T43p with a 1.86 GHz Pentium M.
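
To compare the numbers in this comment, a sketch like the following reads a field from the fan status file and expresses one speed as a percentage of another. It assumes the file contains "speed:" and "level:" lines in the way ibm-acpi prints them; fan_field and speed_ratio_percent are hypothetical helper names.

```shell
# Sketch: read one field from the ThinkPad fan status file.
# fan_field and speed_ratio_percent are hypothetical helper names.
# $1 = field name without the colon (e.g. speed, level)
# $2 = status file (assumed default: /proc/acpi/ibm/fan)
fan_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/acpi/ibm/fan}"
}

# Print speed $1 as a whole-number percentage of speed $2.
speed_ratio_percent() {
    echo $(( $1 * 100 / $2 ))
}
```

With the figures quoted above, speed_ratio_percent 4263 6859 prints 62, i.e. level 7 reaches only about 62% of the disengaged speed.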

Art Jennings (noddy) wrote :

I solved this using the suggestion on
http://collegegeek.org/index.php/2007/03/06/cpu-temperature-solution/

In my case (ACER aspire 1522) I entered..
echo -n "95:80:60:75:65:60" > /proc/acpi/thermal_zone/THRC/trip_points
echo 4 > /proc/acpi/thermal_zone/THRC/polling_frequency

(you may want to put this in /etc/rc.local as it gets reset on boot)

Note that my clean install of 7.04 did not set up any polling, unlike SUSE 10.2/3. As a matter of interest, the /proc/acpi directories for Feisty and for a distro that works (SUSE 10.2) are almost identical except for the missing polling. However, even if I set them up the same, it still crashes under Feisty unless I change the trip_points to lower values. They default to critical = 97 C and passive = 90 C for this machine, but Feisty doesn't seem to throttle at all until it gets to 90 C, and then it just shuts itself down (it may be trying to hibernate, if that is where the 'hot' trip_point is), whereas in SUSE the processor is throttled back to 800 MHz when the temperature gets too high. To my ear the fan reacts similarly in both cases.

These settings may not be optimal as I do not know enough about acpi trip_points, polling freqs and cpu tolerances but it does work.. for me anyways.
I would wait for an official solution if an unmelted laptop is important to you though :)
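
Since the values are reset on boot, the two echo commands above could be wrapped in a helper for /etc/rc.local. This is a minimal sketch, not a tested fix: apply_thermal_settings is a made-up name, and the zone name and trip point values are the ones from this comment, which will differ on other machines.

```shell
# Sketch: write trip points and a polling interval to one ACPI thermal zone.
# apply_thermal_settings is a hypothetical helper name.
# $1 = thermal zone directory, e.g. /proc/acpi/thermal_zone/THRC
# $2 = trip point string, e.g. "95:80:60:75:65:60"
# $3 = polling interval in seconds
apply_thermal_settings() {
    zone=$1
    if [ ! -d "$zone" ]; then
        echo "no such thermal zone: $zone" >&2
        return 1
    fi
    # printf '%s' writes the string without a trailing newline, like echo -n.
    printf '%s' "$2" > "$zone/trip_points"
    echo "$3" > "$zone/polling_frequency"
}
```

In rc.local this would be called as: apply_thermal_settings /proc/acpi/thermal_zone/THRC "95:80:60:75:65:60" 4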

ksosez (methone) wrote :

I'm seeing this as well:

IBM Thinkpad Z60m

Fresh install of Feisty...was working great on Dapper.

Here is a typical syslog entry:

Apr 23 07:46:28 ksosez-laptop kernel: [112521.624000] ACPI: Critical trip point
Apr 23 07:46:28 ksosez-laptop kernel: [112521.624000] Critical temperature reached (98 C), shutting down.
Apr 23 07:46:28 ksosez-laptop kernel: [112521.636000] ACPI: Critical trip point
Apr 23 07:46:28 ksosez-laptop kernel: [112521.636000] Critical temperature reached (98 C), shutting down.
Apr 23 07:46:28 ksosez-laptop init: tty4 main process (4342) killed by TERM signal
Apr 23 07:46:28 ksosez-laptop init: tty5 main process (4343) killed by TERM signal
Apr 23 07:46:28 ksosez-laptop init: tty2 main process (4348) killed by TERM signal
Apr 23 07:46:28 ksosez-laptop init: tty3 main process (4349) killed by TERM signal
Apr 23 07:46:28 ksosez-laptop init: tty1 main process (4350) killed by TERM signal
Apr 23 07:46:28 ksosez-laptop init: tty6 main process (4351) killed by TERM signal
Apr 23 07:46:34 ksosez-laptop kernel: [112527.848000] Critical temperature reached (92 C), shutting down.
Apr 23 07:46:36 ksosez-laptop exiting on signal 15
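
For anyone collecting the same evidence, a sketch like this pulls the reported temperatures out of a log. It assumes each kernel message sits on a single line (not wrapped by the terminal) and uses the "Critical temperature reached (N C)" wording seen in this report; critical_temps is a hypothetical helper name.

```shell
# Sketch: print the temperature from every "Critical temperature reached" line.
# critical_temps is a hypothetical helper name.
# Arguments are log files; with no arguments it reads standard input.
critical_temps() {
    sed -n 's/.*Critical temperature reached (\([0-9][0-9]*\) C).*/\1/p' "$@"
}
```

For example, critical_temps /var/log/syslog | sort -n | tail -1 would show the highest temperature logged before a shutdown.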

Thomas Renninger (trenn) wrote :

I remember a bug where passive cooling kicked in too late.

I don't currently have a machine with a passive cooling trip point at hand, but you can debug this
with a command like this:
watch -n1 cat /proc/acpi/thermal_zone/*/{temperature,trip_points,state} /sys/devices/system/cpu/cpu0/cpufreq/{scaling_max_freq,scaling_cur_freq}

As soon as the passive trip point is exceeded, the cpufreq max must be reduced immediately and the state of the thermal zone must switch to passive.
There was a bug where this was only done in the next measurement cycle.
The temperature is checked and adjusted every polling_frequency seconds, or whenever a thermal ACPI event happens.
*This is not the case once passive cooling kicks in*. In that case a BIOS value, tsp (in 1/10 s), is used, which is very high on ThinkPads (600 by default).
That means that if passive cooling kicks in (which is quite normal on the latest ThinkPads under high load, since they reach high frequencies very quickly), the temperature
is only checked once a minute after the passive trip point has been reached. Reducing max_freq by one frequency step should be enough, but that must
happen as soon as passive cooling kicks in.

The tsp value (the interval, in 1/10 s, at which the temperature should be polled while passive cooling is on) can be overridden by passing it as a thermal module parameter.
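
The arithmetic behind "600 means once a minute" can be checked with a small sketch that reads trip_points output and converts tsp from tenths of a second to seconds. It assumes the passive line carries a "tsp=N" token, as in the trip_points listings quoted elsewhere in this report; passive_poll_seconds is a hypothetical helper name.

```shell
# Sketch: print the passive-cooling polling period in whole seconds.
# passive_poll_seconds is a hypothetical helper name.
# Arguments are trip_points files; with no arguments it reads standard input.
passive_poll_seconds() {
    sed -n 's/.*tsp=\([0-9][0-9]*\).*/\1/p' "$@" | while read -r tsp; do
        # tsp is in tenths of a second; integer division rounds down.
        echo $((tsp / 10))
    done
}
```

A line containing tsp=600 yields 60 (seconds), and tsp=10 yields 1. On a live system the input would be something like: cat /proc/acpi/thermal_zone/*/trip_points | passive_poll_seconds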

The mentioned fan issue:
Updating the EC firmware might help a bit (on ThinkPads). There was a firmware (EC, not BIOS) update stating "avoiding fan noise", IIRC this change got reverted
in some later firmware update. But I doubt that is the real problem. In fact, all this must work without fans, cpufreq should just go down to lower steps.

Another thing that might help:
If you have an ATI Radeon graphics card, make sure
Option "DynamicClocks" "on"
(-> better check if this is correct) is added to your xorg.conf.
I heard this can reduce temperature significantly on some machines.

Installing Feisty with upgraded bios on my T41p, the problem is still there (no temperature polling), but this time I have a clean way to solve the problem.
I have added a script to set the polling and the trip points at start and resume:

/etc/acpi/resume.d/99-set-acpi-polling-trip-points.sh
/etc/acpi/start.d/99-set-acpi-polling-trip-points.sh

/!\: You need to edit it if you want to use it, to use your laptop thermal zone and the trip points for your system!

Thomas Renninger (trenn) wrote :

Adding a default polling value is a good idea and a sane workaround
(the value should be a bit higher; e.g. 10 or 15 should be enough, otherwise
it might cause problems on e.g. recent HP laptops).

Overriding trip_points generally is a bad idea for a distribution and
won't work out.

BTW: Is latest mainline kernel also affected?

Alejandro Zanotti (aleza66) wrote :

Can I try this on my Acer Aspire 1522?

Vikrant (vikrant82) wrote :

I am not seriously affected by this bug, but it was interesting to go through the thread.
I thought it would be nice to hear from the group of people who are not noticing the overheating.

First things first ----> It's overheating ...
I am on a Presario V4122 with a Centrino [1.73 GHz]. I have used XP Pro on it for over a year.

I switched to Ubuntu [7.04] about a week ago.
Although there were no reboots or hand burns, I definitely felt that my laptop ran a little hotter.

When I researched it, I landed here. I have no doubt that it runs hotter than under Windows XP.

At present, with Mozilla [7-8 tabs open] + a few messengers and a player running + a lot of applets in the panel: CPU frequency @ 46%, temperatures at 54 and 55.
.
.
OK, now I start a video encode. I let it run for 5 minutes or so ...
.
.
CPU frequency @ 100%, temperatures @ 54 and 70-75. So I suppose the second one is the CPU temperature. Let's wait 5 more minutes ...
.
.
Uh .. 75, 76, 77 ... would it touch 80? Nope, it stayed pretty much in that range ... CPU frequency @ 100% all this time ...

I will return with temperatures for the same activity in XP Pro on the same machine ... [ 78,79!!! ]

And by the way, I just loved Ubuntu ..

zeddock (zeddock) wrote :

Gotta tell you all that I decided to take apart the laptop that was shutting
down. I vacuumed the fins of the heat sinks, removed the CPU and put on a fresh
dab of heat-sink paste, and that bad boy has no further issues!

The fan hardly ever comes on anymore but it is stable and not hot.

Hope this helps others.

zeddock

Alejandro Zanotti (aleza66) wrote :

zeddock, which laptop did you do this on?
alex

zeddock (zeddock) wrote :

Dell Inspiron 8500

zeddock

jojo4u (bugzilla-freedom-x) wrote :

I'm not affected and do not own a laptop at all, but I see some knowledge lacking here about how a CPU should react when overheating. I will talk about Intel CPUs here. From the Pentium M on, Thermal Monitor 2 supplements Thermal Monitor 1 in case of overheating.

Info material:
Pentium-M: http://www.digit-life.com/articles2/cpu/intel-thermal-features-pm.html
Core 2 Duo: http://www.digit-life.com/articles2/cpu/intel-thermal-features-core2.html

There are two overheating points on Intel mobile processors.
1st is the "throttling temperature": the CPU starts throttling from this point on (PROCHOT# gets activated). On Intel Core 2 Duo mobile CPUs this is 100 °C.
2nd is the "shutdown temperature": the CPU just shuts down (THERMTRIP# gets activated). On Intel Core 2 Duo mobile CPUs this is set to ~125 °C.

When the system gets over 100 °C, Thermal Monitor 2 is supposed to put the CPU into its lowest power state (i.e. idle VID and FID 6x, as in C1E/lowest EIST). If this is not sufficient, and "extended throttling" is configured, Thermal Monitor 1 jumps in and throttles the CPU (i.e. inserts wait states) down to 33%. If even this isn't enough, the temperature rises to 125 °C and the CPU shuts itself down.

I can see that Linux tries to rely on the ACPI tables of the BIOS for thermal management.
First, take a look here: http://acpi.sourceforge.net/documentation/thermal.html and http://www.columbia.edu/~ariel/acpi/acpi_howto.html#thermal_management
It says there are two cooling modes, one active and one passive. This is controlled by the trip_points.
E.g. active trip point: 60, passive: 90. That means that above 60 °C the fan should kick in, and above 90 °C throttling kicks in as well.
E.g. active trip point: 0, passive: 90, as seen here in this bug. That means the fan is off all the time and throttling occurs above 90 °C. "Matthew Garrett said on 2005-09-24" explains this nicely.
On the other hand, the fan sometimes seems to be controlled by the BIOS.
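
The two examples above can be written down as a tiny decision function. This is only a sketch of the rules as described in this comment, not of what the kernel actually does: cooling_state is a made-up name, it treats reaching a trip point as exceeding it (an assumption), and it takes an active trip point of 0 to mean "no active cooling".

```shell
# Sketch: which cooling mode applies at a given temperature.
# cooling_state is a hypothetical helper name.
# $1 = temperature, $2 = active trip, $3 = passive trip, $4 = critical trip
# (all in whole degrees C; an active trip of 0 means the fan is never used)
cooling_state() {
    temp=$1
    active=$2
    passive=$3
    critical=$4
    if [ "$temp" -ge "$critical" ]; then
        echo shutdown
    elif [ "$temp" -ge "$passive" ]; then
        # Passive cooling (throttling) comes on top of the fan, if there is one.
        if [ "$active" -gt 0 ]; then echo fan+throttle; else echo throttle; fi
    elif [ "$active" -gt 0 ] && [ "$temp" -ge "$active" ]; then
        echo fan
    else
        echo none
    fi
}
```

With active 60, passive 90 and critical 105, a reading of 70 gives "fan" and 95 gives "fan+throttle"; with active 0 the same 95 gives just "throttle".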

I can see that many DSDTs have a critical point below 100 °C. This prevents Thermal Monitor from kicking in. I'd suggest setting critical to 115 °C, hot to 0 °C (disabled, since there is no time to do a software suspend), passive to 100 °C and active to whatever you want. E.g.
echo -n "110:0:90:60:0" | sudo tee /proc/acpi/thermal_zone/*/trip_points
followed by a
cat /proc/acpi/thermal_zone/*/trip_points
to see whether the settings got applied.

Btw, I have no clue how "extended throttling" is managed in Linux.
Btw2, of course your fan should keep the CPU below 100 °C. If it doesn't, you should correct that, but that's outside the scope of my post.

Core 2 Duo mobile datasheet: ftp://download.intel.com/design/mobile/datashts/31407803.pdf

jojo4u (bugzilla-freedom-x) wrote :

Damn, no edit in Launchpad.
OK, there is one error. For my suggested values it has to be:
echo -n "115:0:100:60:0" | sudo tee /proc/acpi/thermal_zone/*/trip_points

I'd suggest everybody disable powernowd, since the kernel governors are more straightforward (http://www.thinkwiki.org/wiki/How_to_make_use_of_Dynamic_Frequency_Scaling)

I'm not too sure about the interactions between powernowd, cpufreq and throttling. In theory a passive trip point of e.g. 80 and a critical point of 90 should be sufficient and put less strain on the hardware. I don't know whether powernowd prevents throttling. And what about cpufreq throttling? Is it as effective as Thermal Monitor?

Btw, if your temperature is not read correctly, don't forget to enable temperature polling, as said before. (http://acpi.sourceforge.net/documentation/thermal.html)
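
Switching to a kernel governor by hand could look roughly like this. It is a sketch under assumptions: it relies on the cpufreq sysfs files scaling_available_governors and scaling_governor, set_governor is a made-up name, and on a real system it needs root and has to be repeated for each CPU.

```shell
# Sketch: select a cpufreq governor, but only if the kernel offers it.
# set_governor is a hypothetical helper name.
# $1 = governor name, e.g. ondemand
# $2 = cpufreq directory (assumed default: /sys/devices/system/cpu/cpu0/cpufreq)
set_governor() {
    governor=$1
    cpu_dir=${2:-/sys/devices/system/cpu/cpu0/cpufreq}
    # scaling_available_governors holds a space-separated list on one line;
    # padding with spaces makes the match whole-word only.
    case " $(cat "$cpu_dir/scaling_available_governors") " in
        *" $governor "*)
            echo "$governor" > "$cpu_dir/scaling_governor"
            ;;
        *)
            echo "governor not available: $governor" >&2
            return 1
            ;;
    esac
}
```

Checking the available list first avoids a confusing write error when the governor's module is not loaded.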

UrkoM (urko-masse) wrote :

jojo4u: your suggestions seem very good. However I have one question:
Your suggested values only show 5 numbers, where previous examples included 6 numbers. Can you explain?
In any case, it looks like setting a proper governor and forgetting about userspace daemons is the way to go. I'll give it a try as soon as I can. Thanks!

By the way, I am also feeling this problem, my laptop is getting really hot with Feisty. It is a Sony Vaio VGN-FS315M.

Alejandro Zanotti (aleza66) wrote :

Does this apply to any computer, or only Intel-based ones? I am on an AMD64.

SMiTTY (mike-ftl) wrote :

 Guess I'll chime in as well....I too am seeing this overheat problem with Feisty. Oddly I didn't see this behavior under Edgy *shrug*

 As soon as I upgraded to 7.04 (the upgrade didn't work, so I ended up doing a fresh install), I noticed the laptop running hotter, and it has crashed on me a few times. Basically the screen would get pixelated, and everything but the mouse would lock up, followed by a power-off.

 I have an HP Pavilion dv8235nr. Any progress or fixes on this? It does boggle me that 6.10 didn't show this.

 Also, of note, the CPUs should have a max of 1.66GHz, though 7.04 limits them to 1.00GHz

 Let me know if I should provide any more info.

Thanks

 - Mike

UrkoM (urko-masse) wrote :

Hey! I just noticed something. I was reading up on this document:
http://acpi.sourceforge.net/documentation/thermal.html
And it says:
"cat trip_points" sample output:
critical (S5): 110 C
passive: 105 C: tc1=2 tc2=10 tsp=100 devices=0xdf72e380
active[0]: 48 C: devices=0xc157fec0

However, my output is:
critical (S5): 105 C
passive: 85 C: tc1=1 tc2=5 tsp=10 devices=0xdf924ec8

And I have set the trip_points with at least 5 numbers. So it seems the active cooling is not being detected!!!

My laptop is a Sony Vaio VGN FS315M, with Intel(R) Pentium(R) M processor 1.73GHz. If more info is needed, just ask. I want to help!

creamdog (creamdog-macnytt) wrote :

I can confirm the same issue with my laptop, which is an Acer Aspire 1360 (AMD Sempron 2800+) running Feisty.
No problems at all when it was running heavy applications in winXP and this is the first Linux installation on the laptop.
Can't really say I have noticed any difference in the way the cpu fan is activated with increased load.

Just another data point:

I'm experiencing these overheating problems on an Acer Aspire 3620 laptop. It's in stock condition, 512 MB RAM, 1.6 GHz Celeron processor. I'm running Ubuntu 6.06.1 LTS.

The laptop runs _hot_. For the first few weeks I had it, it frequently exhibited massive slowdowns that I now believe are the result of passive cooling kicking in. (It took a while to diagnose the problem.) I've taken to propping it up on a book on the left side. The intake vent is on the bottom on the right side. Raising it a half inch or so made a big difference and made the machine much more usable.

So, propped up, and using powernowd, the machine would run hot at about 78-82 degrees Celsius while idling at the slowest speed (200 MHz). My usual workload is pretty light, so while hot, things seemed to work. However, Flash animation, especially full-screen video (say, YouTube), is the big stressor. It will quickly drive the temperature up to 90. Extended use can drive the temperature to 95, at which point the passive cooling makes the machine unbearably slow. Oddly enough, it doesn't seem to help the temperature much.

I recently tried switching to the kernel's ondemand scaling. (sudo /etc/init.d/powernowd stop ; sudo sh -c 'echo -n ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor' as noted repeatedly above) I can't explain why this would change anything, but the machine is idling a bit cooler, 75-78. YouTube video gets the machine up to 90-91, which is hot, but usable. Obviously this isn't a real fix; I'm pretty sure 90 degrees Celsius is bad for the machine, but it's an acceptable workaround for now. (I knew what I was getting into when I bought such a cheap machine.)

A few bits of information; I'm happy to provide more upon request.

$ cat /proc/acpi/thermal_zone/TZS0/cooling_mode
<setting not supported>
cooling mode: passive
$ cat /proc/acpi/thermal_zone/TZS0/polling_frequency
<polling disabled>
$ dmesg > /tmp/dmesg-bug
((attached))

GULLI.ver (bugs-simon) wrote :

I have to revoke my problem report from above...

On Ubuntu Edgy, SpeedStep didn't work on my IBM/Lenovo T43p (1.86 GHz Pentium M) and the speed was limited to 800 MHz; on Feisty, SpeedStep worked at full CPU speed and the problems occurred.

But: since I cleaned my fan grille, which was totally jammed, the overheating problem is finally solved. Temperatures dropped by 30 °C.

I still don't know why windows on the same machine never overheated.

Falconix (falconix) wrote :

I'm suffering from this bug too, on my Dell Inspiron 9300.
Critical temperature reached (100 C), shutting down.

Cool_J_77 (cool-j-77) wrote :

     I had been suffering from this bug for quite some time until now. I have an HP Pavilion ze5170 notebook with a P4 at 2 GHz running Ubuntu 7.04. I kept getting the message that the critical temp was hit, and that powernowd could not be started, just like many of you have. I tried stopping powernowd and setting trip points, as well as using ondemand. None of that worked for me. Well, actually it had worked the first night I tried it, but after a restart, and even after doing all the steps over again at startup and writing startup scripts, it would shut down and give me that same error.

    Now however I have found the solution. At least for my computer I have. I used the synaptic package manager and installed powersave. Doing this it had to uninstall powernowd and apmd. Now using a temp monitor called computertemp I see that my laptop is constantly at 45C and my cpu is maxed at 2.0 at all times. I can play UT on wine for as long as I want without shutdown, and watching videos was no problem. My temp now maxes out at 63C and even then it only just touches 63 then it's back to 62 and below.

    There is one problem with this solution, however. After restarting my computer it took almost 15 minutes for Ubuntu to fully start up again. Everything still works, it's just a little bit slower to open new programs. I believe it has something to do with Ubuntu's desktop needing apmd, but I haven't tried reinstalling it. I figure that as long as the major bug is out of the way I'll have time to fix the others that may come up, since my computer doesn't restart every 5 minutes now.

Cool_J_77 (cool-j-77) wrote :

Well now that ubuntu has been running for 20minutes everything is back to full speed. Hope this fix helps you.

ksosez (methone) wrote :

This is still bad for me, running a Z60m Thinkpad. Even with tp-fancontrol I'm running at least 20 degrees C above what it is in Windows. My next move is to go back to the default video drivers (no problems with heat on Dapper).

Is anyone who can actually fix this reading this? Or am I just spitting into the wind? I am willing to help anyone who needs information to track this down, since it makes Ubuntu almost unusable.

Cool_J_77 (cool-j-77) wrote :

UPDATE!!!
   Almost instant start up now. I figured out that I had messed up the startup when I removed the cpufreq_ondemand line I had added to /etc/modules (with $ echo cpufreq_ondemand | sudo tee -a /etc/modules). Now that I have added that back in, with Powersave installed and powernowd and apmd uninstalled, I no longer have shutdown issues, nor overheating issues. So if this overheating problem has you boggled, give this a try; it just may keep you going (at least until this bug is fixed).

Rimas Kudelis (rq) wrote :

hmm...
I guess my problem is out too, as my non-laptop restarts seem to have been caused by a misconfiguration of lm-sensors (/etc/sensors.conf).
Anyway, my PC is now undergoing warranty repair, as yesterday I found out that my PSU fan wouldn't spin (causing emergency reboots again, heh).

Aldrin Martoq (amartoq) wrote :

I think there is a bug in how Ubuntu handles thermal events. There are two possible approaches:

1.- Slow everything down: this is what Microsoft Windows does. Just install MobileMeter (http://www.geocities.co.jp/SiliconValley-Oakland/8259/), play your favorite CPU-intensive Windows game and see. If your system is getting hot, Windows slows down to the minimum (800 MHz). I found out that all my games are played at the lowest speed because of temperature problems.
2.- Tell the user that a critical temperature has been reached and *SUSPEND* the machine, to RAM or to disk.

In my case, the problem was that the fan was really, really dirty. I've cleaned it and now everything is fine. Some pictures are here:
http://img187.imageshack.us/slideshow/player.php?id=img187/557/1178842006ygq.smil

HTH,

I have the same issue, right at startup (running the Ubuntu/Xubuntu live CD, it doesn't matter which). The laptop (Compaq Presario 2107EA) had been running for months with Ubuntu 6.10 without any issues, and ran smoothly for a couple of days on 7.04, but now it reaches critical temperature at once, often before GNOME is even fully loaded. I do not have to run anything CPU-heavy; I just leave it after startup and it shuts down. I get the same problem with the live CD of 7.04 or Xubuntu 7.04, so my installation does not seem to be the problem. However, the computer runs perfectly when booted with the 6.06 or 6.10 live CD, so it seems to be an issue associated with 7.04 and its new kernel.

The computer has been running XP for years without problems (OK, maybe not without problems, we're talking Windows... but at least without this specific problem).

UrkoM (urko-masse) wrote :

Possible workaround:
I have improved the temperature of my laptop a lot, by doing the following:
1- setting Trip Points and Polling Frequency in /etc/rc.local, as described above:
echo -n "105:0:85:50:40:35" > /proc/acpi/thermal_zone/THRM/trip_points
echo 10 > /proc/acpi/thermal_zone/THRM/polling_frequency

2- following the information in the UbuntuGuide.org website about SpeedStep for my processor:
How to enable your CPU's Power Saving/Frequency Scaling features
http://ubuntuguide.org/wiki/Ubuntu:Feisty#How_to_enable_your_CPU.27s_Power_Saving.2FFrequency_Scaling_features
I chose the OnDemand governor.
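
The two rc.local lines from step 1 can be wrapped in a small helper so that a missing or read-only thermal zone does not abort the script. This is only a sketch: the function name apply_thermal_settings is made up for illustration, and the THRM zone name and the values are simply the ones quoted in this comment, so they will differ on other machines.

```shell
# apply_thermal_settings ZONE_DIR TRIP_POINTS POLLING_SECONDS
# Writes the trip-point string and the polling interval into one ACPI
# thermal zone directory. The name of this helper is hypothetical.
apply_thermal_settings() {
    zone_dir=$1       # e.g. /proc/acpi/thermal_zone/THRM
    trip_points=$2    # e.g. 105:0:85:50:40:35
    polling_secs=$3   # e.g. 10

    # Bail out if this machine or kernel does not expose a writable zone.
    if [ ! -w "$zone_dir/trip_points" ]; then
        echo "skipping: $zone_dir/trip_points is not writable" >&2
        return 1
    fi

    # No trailing newline for trip_points, matching the "echo -n" used above.
    printf '%s' "$trip_points" > "$zone_dir/trip_points" || return 1
    printf '%s\n' "$polling_secs" > "$zone_dir/polling_frequency" || return 1
}

# Example call for /etc/rc.local (runs as root):
#   apply_thermal_settings /proc/acpi/thermal_zone/THRM "105:0:85:50:40:35" 10
```

Passing the zone directory as an argument keeps the helper usable on machines whose zone is called THR0, THM, TZS0 and so on, as seen elsewhere in this report.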

My laptop is a Sony Vaio VGN FS315M, with a Pentium-M 1.73 GHz. I am running the 386 kernel, because this laptop seems to have issues with the Generic kernel. It now runs most of the time at 798 MHz, but it increases the frequency quickly as needed. It feels more responsive than before, when I just used the Trip Points and Polling Frequency.

However, it is very obvious that the fan is still not being used as much as I would like. Ideally, when the laptop is plugged to main power, I would like to set it on Performance mode, and use the fan as much as possible. When I do:
cat /proc/acpi/thermal_zone/THRM/*
I see that there is absolutely no information about Active cooling Trip Points.

Once again: if I need to provide any info to anybody working on this problem, just let me know.
Which makes me wonder: who is working on this? Any idea of when a new version of the kernel will come out, hopefully with some improvements?

eBobster (ebobster) wrote :

I'm a long time sufferer of this, found a way to live with it ensuring
that my laptop simply under performs, ... until recently.

Previously I requested verification of normal operating temperatures
-- how hot is 90-100 C for P-M1.73Ghz really?? At full power mine is
like this, apparently this is really killing it, but I don't think so.
 Any one know for sure _before_ commenting?

I'm concerned that there hasn't been any absolute statement about how
this should work and what to expect. Seems that it works ok for
enough people not to worry about it. But why does Windows use this
CPU so much better??

However, I can now get much better than before results from a fresh
Feisty Kubuntu install using the following:

echo "115:0:95:80:0" > /proc/acpi/thermal_zone/THR0/trip_points

and polling seems to be at 1.

From my point of view, I don't care about the 115 because the chip
will thermally sort itself out -- besides it never reaches this, and
this is the point, I don't want a shutdown! Every now and again I see
a temp of 102, 105, 107.. these are all just blips that immediately
return to 90 something and only happen at full pelt which isn't that
often for typical operation. The fan does its job, the chip scales
itself to 1300 (something) as and when. So I'm happy to let it rip
right to limit (hence the 95)

The other numbers don't really matter, it's for the board which doesn't do a lot.
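
For anyone unsure which number is which in the string written to trip_points, here is a small sketch that labels the fields. It assumes the field order is critical:hot:passive:active0:active1:..., which matches the critical and passive values reported back by acpitool in this thread but should be checked against the ACPI thermal documentation linked earlier. The function name label_trip_points is made up for illustration.

```shell
# label_trip_points STRING
# Prints one "name=value" line per colon-separated field.
# ASSUMPTION: fields are ordered critical:hot:passive:active0:active1:...
label_trip_points() {
    old_ifs=$IFS
    IFS=:
    # Split the argument on colons into the positional parameters.
    set -- $1
    IFS=$old_ifs

    index=0
    for value in "$@"; do
        case $index in
            0) name=critical ;;
            1) name=hot ;;
            2) name=passive ;;
            *) name="active$((index - 3))" ;;
        esac
        echo "$name=$value"
        index=$((index + 1))
    done
}

# Example:
#   label_trip_points "115:0:95:80:0"
```

Under that assumption, a five-number string simply sets one active trip point fewer than a six-number string.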

I'm running in ondemand, and can run performance (though don't see the point).

I've noticed now that my machine has started using C-States properly
also. I had 3+ hours instead of the usual <2. Some of this I'm sure
was due to the previous install of X being a bitch after several days
of use. Uptime is now 18 days and its running as nice as it does from
first boot.

I watch the acpiinfo all the time (recommended during this problem)

  watch acpitool -e

Can people please start posting this also for comparisons:

  CPU type : Intel(R) Pentium(R) M processor 1.73GHz
  Min/Max frequency : 798/1729 MHz
  Current frequency : 798 MHz
  Frequency governor : ondemand
  Freq. scaling driver : centrino
  Cache size : 2048 KB
  Bogomips : 1598.15
  Processor ID : 0
  Bus mastering control : yes
  Power management : yes
  Throttling control : no
  Limit interface : no
  Active C-state : C2
  C-states (incl. C0) : 5
  Usage of state C1 : 10 (0.0 %)
  Usage of state C2 : 19506306 (38.2 %)
  Usage of state C3 : 10372833 (20.3 %)
  Usage of state C4 : 21221897 (41.5 %)

  Thermal zone 1 : ok, 54 C
  Trip points :
  -------------
  critical (S5): 115 C
  passive: 95 C: tc1=2 tc2=5 tsp=300 devices=0xdf83ce50

  Thermal zone 2 : ok, 48 C
  Trip points :
  -------------
  critical (S5): 82 C

Q's:
where does the default trip information come from?
Is the default trip information wrong?
what should they be?
exactly what is the relationship between these trips, the cpu and the
CRIT-shutdown?
does the kernel have any control of the fans for this chip?
how is it possible that this doesn't affect every P-M1.73Ghz owner?

btw, you...


Frank Abel (frankabel) wrote :

I just want to say that I found a very similar bug report at https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.20/+bug/94862 and, as you can see, the "Assigned To" field is different ;)

eBobster (ebobster) wrote :

It started differently, 255 C?! hmmm... but seems to have mixed its
way into the same sort of bug...

So what has changed recently? Did the older kernels simply not bother
shutting things down?

Does it mix-up the reading of CRITICAL?? Is it possible that the
passive trip should be the 100 C in my (and many others) case?

Rob.

I've got a similar problem on my laptop (Dell Inspiron 9300, 1.73GHz Centrino, Kubuntu 7.04 [upgraded from 6.10]). The problem wasn't there before the upgrade.

I set the CPU Policy to "Dynamic" (aka "on demand"). When doing normal browsing for example:
$ cat /proc/acpi/thermal_zone/THM/temperature
temperature: 51 C

At this point, the CPU is at 800MHz.

Then, let's do some intensive task, like compile a simple app. The CPU goes to 1.73GHz as expected. but just after 2 seconds:

$ cat /proc/acpi/thermal_zone/THM/temperature
temperature: 66 C

And it continues to increase! There is NO WAY that the temperature is rising by 15C in 2 seconds!

Also, just being idle and setting the CPU Policy to "Performance" raises the temperature to 62C (and rising)...

And of course, when doing overly intensive tasks, the trip point will be reached and the system will shut down. As I said, this didn't happen when I was on Edgy...
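
To put a number on how fast the reading climbs, the two cat commands above can be turned into a small sampler. This is a sketch under the assumption that the temperature file contains a single line of the form "temperature: 51 C", as in the output quoted above; the function names read_temp_c and temp_rise are made up for illustration.

```shell
# read_temp_c FILE
# Prints the integer from a line like "temperature:             51 C".
read_temp_c() {
    awk '/^temperature:/ { print $2; exit }' "$1"
}

# temp_rise FILE SECONDS
# Prints how many degrees C the reading changed between two samples
# taken SECONDS apart.
temp_rise() {
    before=$(read_temp_c "$1")
    sleep "$2"
    after=$(read_temp_c "$1")
    echo $((after - before))
}

# Example: start a compile, then run
#   temp_rise /proc/acpi/thermal_zone/THM/temperature 2
```

A jump of 15 within two seconds, as described above, would show up directly as the printed value.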

Daniel Scherdel (danschel) wrote :

Hi,

I have a similar problem: if I'm using the CPU at full power and without any limitations, the THZ* temperatures rise above 105°C/80°C (yes, it is that high) and the system goes down, but thankfully in a controlled manner (no file system check needed on next startup).

But I found a more or less working solution if I run (as root)

$ echo 1330000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

OR

$ echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

With that I can use the system even if it is working at full throttle for a whole day.
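
The value written to scaling_max_freq has to be one the driver actually offers. A minimal sketch of picking the highest offered frequency at or below a chosen cap follows; the function name pick_capped_freq is made up, and the example frequency list is hypothetical rather than taken from this machine. On drivers that expose it, the real list can be read from scaling_available_frequencies in the same cpufreq directory.

```shell
# pick_capped_freq LIMIT_KHZ FREQ_KHZ...
# Prints the highest frequency in the list that does not exceed LIMIT_KHZ.
# Prints nothing and returns non-zero if every frequency is above the limit.
pick_capped_freq() {
    limit=$1
    shift
    best=
    for freq in "$@"; do
        if [ "$freq" -le "$limit" ]; then
            if [ -z "$best" ] || [ "$freq" -gt "$best" ]; then
                best=$freq
            fi
        fi
    done
    [ -n "$best" ] && echo "$best"
}

# Example (as root; the cap of 1400000 kHz is an arbitrary choice):
#   dir=/sys/devices/system/cpu/cpu0/cpufreq
#   pick_capped_freq 1400000 $(cat "$dir/scaling_available_frequencies") \
#       > "$dir/scaling_max_freq"
```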

The processor is a Intel(R) Pentium(R) M processor 1.86GHz with 8 steppings

Feel free to ask for more information.
Greetings Dan

Andrew Bonham (bonham) wrote :

These problems have been affecting my desktop, making it difficult to play games, compile programs, and transcode video.

Enabling cpufrequtils (essentially by modprobing p4_clockmod) was done per this guide: http://ubuntuguide.org/wiki/Ubuntu:Feisty#How_to_enable_your_CPU.27s_Power_Saving.2FFrequency_Scaling_features

This works fine, but the "current policy" for cpufrequtils is set between 400 MHz and 1.28 GHz, despite it listing 3.2 GHz (my processor speed) as an available frequency. It is not possible to change /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq either using sudo or sudo -i. Using sudo, it fails with "permission denied". If I sudo -i and then echo 3200000 to it, it fails silently, accepting the echo but not changing the value when read by cat.
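
A quick way to confirm that the policy really is clamped below what the hardware reports is to compare cpuinfo_max_freq with scaling_max_freq in the same directory. The sketch below only reads those two files and does not explain why the write is ignored; the function name report_freq_cap is made up for illustration.

```shell
# report_freq_cap CPUFREQ_DIR
# Compares the hardware maximum with the current policy maximum (both in kHz).
report_freq_cap() {
    hw_max=$(cat "$1/cpuinfo_max_freq") || return 1
    policy_max=$(cat "$1/scaling_max_freq") || return 1
    if [ "$policy_max" -lt "$hw_max" ]; then
        echo "capped: policy max $policy_max kHz < hardware max $hw_max kHz"
    else
        echo "not capped: policy max $policy_max kHz"
    fi
}

# Example:
#   report_freq_cap /sys/devices/system/cpu/cpu0/cpufreq
```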

Since I want to use above 1.2 GHz speed, this is not a workable solution either. Others have encountered this problem in setting max scaling frequency in Ubuntu before, see http://ubuntuforums.org/archive/index.php/t-364548.html and http://ubuntuforums.org/showthread.php?t=308163

My CPU stays around 80 to 90 degrees under use, and while doing processor heavy applications goes to 93 and slowly creeps to 100, where the system shuts down. Yikes!


Thanks Thomas, for pointing out this bug report.

There are a lot of reports here, not all the same.

Note firstly that four people have said their problems went away
when they cleaned out their fan.

A bunch more said that their problems went away when they
disabled powernowd and instead used the in-kernel cpufreq governors.
This may explain the Ubuntu-specific aspect of a bunch
of these reports.

If you have cleaned out your fan, and you are able to reproduce the
issues using the in-kernel governors instead of userspace daemons,
then I urge you to file a bug (one bug per system model please)
in the upstream bugzilla:

http://bugzilla.kernel.org/enter_bug.cgi?product=ACPI
in the Power-thermal category.

But a couple of more things need clarification:

There are two types of systems here and the way to address
them are totally different.

1. ACPI controlled fan that doesn't come on

If /proc/acpi/thermal_zone/*/trip_points
has lines that begin with "active", then
you have active trip-points and ACPI fan control.

If the temperature rises above these trip points
and the fan fails to come on, then you have an
ACPI fan issue. There were several famous ones,
particularly the HP nx6125 and HP nx6325 which
have a very "interesting" BIOS that only recently
has Linux started to learn to deal with.

Note that ACPI thermals should NOT require polling.
If you need to set the polling frequency in /proc/acpi/../thermal_zone/polling_frequency
to anything but 0, then there is a kernel bug
that you are not helping to fix by working around it.

(Indeed, one could argue that the way this override
 is available is a bug and should be removed)

2. ACPI passive/critical trip points only

If you don't have active lines in your trip_points file,
but have just passive and critical, then you've
got motherboard controlled fans, and Linux
and ACPI have no influence on them.

Yes, sometimes a BIOS upgrade changes the firmware
fan control policy. Sometimes there are also BIOS SETUP
options for cool vs quiet as well.
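
The distinction between the two types of system can be checked mechanically. The sketch below applies only the rule stated above (lines beginning with "active" mean ACPI fan control, otherwise the fan is left to the motherboard firmware); the function name fan_control_type and the labels it prints are made up for illustration.

```shell
# fan_control_type TRIP_POINTS_FILE
# Prints "acpi" if the file has at least one line starting with "active",
# "firmware" if it has none, and "unknown" if the file cannot be read.
fan_control_type() {
    if [ ! -r "$1" ]; then
        echo "unknown"
        return 1
    fi
    if grep -q '^active' "$1"; then
        echo "acpi"
    else
        echo "firmware"
    fi
}

# Example:
#   for f in /proc/acpi/thermal_zone/*/trip_points; do
#       echo "$f: $(fan_control_type "$f")"
#   done
```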

But in both the ACPI and motherboard fan cases above,
where the fan is spinning fast
and the temperature continues to climb...
CLEAN YOUR FAN! Many laptops suck air in
from the bottom and blow it through a fine grill.
This grill can get blocked with dust, making your
fan ineffective. (no, this doesn't address the
"windows works, linux doesn't" cases)

If you've got the thing open already, consider
checking that the cooling solution is properly
bolted down -- and when you check, consider
cleaning it off and using a dab of Arctic Silver
as the TIM when you replace it...

> ACPI: Looking for DSDT ... not found!

Ignore this -- it is Ubuntu "value add" from the patch they use to allow
overriding the DSDT with the initrd method.

> ondemand vs userspace governors

This may make the average power under typical use less.
However, ondemand is the same as "performance" when
the machine is fully utilized, i.e. it may make a cooling
problem less common, but it doesn't address the worst case.

> customizing the DSDT

Don't bother. DSDT customization is primarily a debugging method.
If Linux can't handle the DSDT that came with your machine as well
as Windows does, ...


When I filed my initial report (the Acer Aspire 3620 laptop), I hadn't actually tried cleaning the fan/heatsink. The laptop had exhibited symptoms very early on, within a month or two of initial purchase. I don't usually have dust problems with the many other computer systems in the household, so I discounted the possibility of that being the cause. Just to be perfectly certain, I have since cleaned the fan with a few blasts from a can of compressed air in the intake (bottom) and exhaust (right side). I ejected a fair amount of dust and the machine now runs about 20 degrees cooler. With cpuburn running, I'm hovering around 65 degrees Celsius with occasional very brief spikes to 75 degrees. This is _without_ propping the laptop up on a book. Previously, with the laptop propped up on a book to improve circulation, I was idling around 80 degrees and could easily hit 95 during moderately intense use.

So, disregard my report. And to anyone else experiencing these symptoms on an Acer Aspire 3620: it's possible that some batches are shipping unusually dirty. (Naturally, you abuse your poor fan by blowing compressed air on it at your own risk.)

StevePEI (stevedegrace) wrote :

I have a trippy observation which may be a Clue. At least for some people's problem. I don't know how to use this Clue, but I toss it out there in case it makes an "aha" for anyone.

I have been having a related problem on my Presario 2100. I recently wiped WinXP and replaced it with Feisty. For about a week everything was awesome and I was really loving it. Then mysteriously this problem just started happening - I couldn't do anything remotely computationally intensive (like open Firefox) without the computer shutting down and getting these CPU scaling and critical temperature complaints from the kernel. So at first I thought I had the Ubuntu analog of a virus - it must be something I downloaded. Tried removing a bunch of stuff and nope, problem still there. So I decided to re-install Feisty. And interestingly *it would not work from the Live CD either* - where before it was *perfect* for hours at a time. The reinstall always causes one of these critical temperature failures and never completes.

So here's the trippy observation. Fact one, when I was doing all this fiddling with Ubuntu, I was unplugging and plugging the laptop a lot as I was moving it from where I normally keep it to next to the router because I need an internet connection to install ndiswrapper to make my wireless card work - so my battery was always draining a bit. Fact two, I have a crappy battery, and just for fun I configured Gnome to always show the battery monitor. All the times I was using Ubuntu I always saw the icon in the shape of a battery and I always assumed that this was just how it looked. Then one day - just before my problems started - I notice it in the shape of a power cord and it says I am running on AC power. This is odd thinks I, because I was not used to seeing this, but since my battery is crap it makes sense that it would be charging practically forever if you let it drain even a little bit.

So after doing a lot of reading, two and two came together into a seemingly improbable possibility, and I tried unplugging the power, starting Ubuntu, letting the battery drain a little, and starting my install. Suddenly, miraculum est, I have the persistent battery icon and my install is going without a hitch. Hmmm.

As odd as it seems, could the problem be related in some cases and in some possibly indirect way to something that should be a total one-off - the battery charging? As long as my battery is not fully charged I seem to be rolling in puppies over here, but they turn to rottweilers when the battery gets a full charge.

BTW - I am sure I should clean out my fan and I will take this as a wake-up call to do so. But my fan/temperature to my qualitative perception seems no different than on WinXP when Ubuntu is actually working for me. It was working fine under Windows (well, not "fine", but not doing anything like this). Something else is going on here.

On 24/05/07, StevePEI <email address hidden> wrote:

Steve,

I think you are simply operating in powersave mode which for me keeps
the CPU running around the 800Mhz from the possible 1.7+Ghz. Hence
your CPU remains cool. I am doing this deliberately now just to keep
from the reboot.

You can evaluate this for yourself by installing acpitool and keeping
watch of "acpitool -e".

Please, if you have the time try it and see how our results compare.

> BTW - I am sure I should clean out my fan and I will take this as a
> wake-up call to do so. But my fan/temperature to my qualitative
> perception seems no different than on WinXP when Ubuntu is actually
> working for me. It was working fine under Windows (well, not "fine", but
> not doing anything like this). Something else is going on here.

Indeed.

I haven't compared on the same machine CPU Freq/Temp of Linux vs
Windows, but I have another machine running P-M1.8 with CPUCool
telling me temperatures under Windows and I can't inspire that to heat
up to 80C. I will assume that this has to be due to insufficient dust
on this machine yet - for now, though it isn't new either - however, I
hope at least to get more attention from Len Brown, since his
excellent post dispelled so much community speculation (in the face of
lack of anything better).

Rob.

StevePEI (stevedegrace) wrote :

Hi Rob,

If I'm running any kind of power management it's not intentionally. Anything I know about power management and Linux (or Windows) I learned in the last few days - it's always been a topic I've been happy to ignore. I swear I didn't deliberately touch anything to do with power management until after the problem appeared.

Actually, as a side note, I don't even have anything under the directory /sys/devices/system/cpu/cpu0 except topology, so a lot of the stuff people are talking about above can't be applicable to me.

I did clean out my fan and it is running better now. I reinstalled Ubuntu successfully and I am having no problems now. Of course, I also have not allowed the battery to fully charge yet to try and test my theory again. But the problem came and went away in a fashion that could not have anything to do with anything I had installed, or any problem with my fan - I was still having the problem trying to use the LiveCD (system stopping under what could never be called an excessive load, just starting Firefox would do it), and it went away when I allowed my battery to drain a little, again running from the LiveCD. I can't think of any good explanation for this. It seems like the battery is fully charged but I have a battery icon rather than a cord icon in Gnome, FWIW.

The fan is speeding up and slowing down with load just like I was used to it doing on Windows - maybe my BIOS is taking care of this?

Here is my output from acpitool -e with a fair number of things running:

stephen@stephen-laptop:~$ acpitool -e
  Kernel version : 2.6.20-15-gener20060707 - ACPI version : 20060707
  -----------------------------------------------------------
  Battery #1 : present
    Remaining capacity : 4192 mAh, 100.0%
    Design capacity : 4400 mAh
    Last full capacity : 4192 mAh, 95.27% of design capacity
    Capacity loss : 4.727%
    Present rate : 0 mA
    Charging state : charged
    Battery type : rechargeable, LION
    Model number : 02KT
    Serial number : 20353

  AC adapter : on-line
  Fan : <not available>

  CPU type : Mobile Intel(R) Celeron(R) CPU 1.80GHz
  CPU speed : 1794.356 MHz
  Cache size : 256 KB
  Bogomips : 3592.00
  Processor ID : 0
  Bus mastering control : yes
  Power management : yes
  Throttling control : yes
  Limit interface : yes
  Active C-state : C2
  C-states (incl. C0) : 3
  Usage of state C1 : 264290 (3.8 %)
  Usage of state C2 : 6601745 (96.2 %)
  T-state count : 8
  Active T-state : T0

  Thermal zone 1 : ok, 50 C
  Trip points :
  -------------
  critical (S5): 96 C
  passive: 90 C: tc1=4 tc2=3 tsp=40 devices=0xeae54874

   Device Sleep state Status
  ---------------------------------------
  1. PCI0 5 disabled
  2. MDEM 4 disabled
  3. LAN 5 disabled
  4. COM1 4 disabled
  5. LID 3 * enabled

Len Brown (len-brown) wrote :

Steve,
Perhaps you can remove the "battery charging" aspect of your observation by
removing the battery and running directly from AC w/o a battery present?
(not every laptop will run in this mode, but most will)

It is possible that your laptop enables some power savings modes when on DC
that it is not enabling when on AC. Eg. most laptops boot in low-frequency-mode
when on DC, but some boot in high-frequency-mode when on AC...
You'd be able to tell pretty quick by comparing the bogomips from the AC vs DC
boot messages.
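
One way to make that comparison is to save the boot messages after an AC boot and after a DC boot (for example with dmesg > ac.log and dmesg > dc.log) and pull the BogoMIPS figure out of each. This is only a sketch: it assumes the kernel prints the number immediately before the word "BogoMIPS" in its calibration line, and the function name bogomips_from_log and the file names are made up for illustration.

```shell
# bogomips_from_log FILE
# Prints the number that immediately precedes the first "BogoMIPS" word
# found in a saved copy of the boot messages.
bogomips_from_log() {
    awk '{
        for (i = 2; i <= NF; i++) {
            if (tolower($i) == "bogomips") { print $(i - 1); exit }
        }
    }' "$1"
}

# Example:
#   echo "AC: $(bogomips_from_log ac.log)  DC: $(bogomips_from_log dc.log)"
```

A clearly lower figure after the DC boot would suggest the firmware started the CPU in a low-frequency mode.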

While sometimes these difference are hidden in firmware, sometimes they
are exposed to ACPI. Please attach the output from "acpidump" so we can
figure out what the laptop exposes. Also, the output from dmesg -s64000.

StevePEI (stevedegrace) wrote :

Hi Len,

Running without the battery, I have not been able to reproduce the issue. So I'm even more baffled, although perhaps my having cleaned out the fan is helping too - maybe I should try watching some videos for a while with the battery removed. My problem seems to have gone away, but since I have no idea what caused it in the first place, and it showed a complete lack of dependency on my particular Ubuntu installation (the problem still existed with the CD), I can't say it won't come back, either. In any case, I'll do whatever I can to contribute to research on this issue.

Output of acpitool -e while battery removed:

stephen@stephen-laptop:~$ acpitool -e
  Kernel version : 2.6.20-15-gener20060707 - ACPI version : 20060707
  -----------------------------------------------------------
  Battery #1 : slot empty

  AC adapter : on-line
  Fan : <not available>

  CPU type : Mobile Intel(R) Celeron(R) CPU 1.80GHz
  CPU speed : 1794.226 MHz
  Cache size : 256 KB
  Bogomips : 3592.07
  Processor ID : 0
  Bus mastering control : yes
  Power management : yes
  Throttling control : yes
  Limit interface : yes
  Active C-state : C1
  C-states (incl. C0) : 3
  Usage of state C1 : 39561 (6.3 %)
  Usage of state C2 : 585696 (93.7 %)
  T-state count : 8
  Active T-state : T0

  Thermal zone 1 : ok, 50 C
  Trip points :
  -------------
  critical (S5): 96 C
  passive: 90 C: tc1=4 tc2=3 tsp=40 devices=0xeae54874

   Device Sleep state Status
  ---------------------------------------
  1. PCI0 5 disabled
  2. MDEM 4 disabled
  3. LAN 5 disabled
  4. COM1 4 disabled
  5. LID 3 * enabled

======================
Output of acpidump:

Too much to get off the console, attached instead.

======================
Output of dmesg -s64000:

I'll attach this output to a subsequent message.

StevePEI (stevedegrace) wrote :

Here is my sudo dmesg -s64000 output.

StevePEI (stevedegrace) wrote :

I duplicated the issue. I left the laptop plugged while shut down. When I next turned on the computer, I saw the cord icon instead of the battery icon and I had a sinking feeling. Sure enough, starting Firefox caused the laptop to shut down, complaining that CPU scaling was not enabled and that a critical temperature of 51C had been reached (!). It lasted maybe three minutes. Took out the power cable, started up again, and this time she's still going.

Seems like two conditions are needed to make Ubuntu flaky for me:

1. The battery must be present and *fully* charged from being charged while the computer is off.

2. The AC cord must be attached.

Weird.

StevePEI (stevedegrace) wrote :

And to follow up on that... Feisty had a fit the second after I wrote the above, ergo, problem not fixed.

I was not able to duplicate my fix at first. I let the battery drain and restarted the computer - the GNOME power applet was saying AC power *even with the AC unplugged* and acpitool seemed to be saying I didn't have a battery at all - even though I was running entirely on battery!

I found a fix, but it is as wacky as everything else with this. See bug 122378, a rejected bug. What I did was boot from an Edgy CD (maybe a Feisty CD would have worked too) while the battery was partly drained and the AC disconnected. The battery seemed to be properly recognised. I then rebooted Feisty from the hard drive. Now all of a sudden I have a (correct) battery icon that says I am running on AC power (I am) and that the battery is charging, and it is tracking the degree of charge. acpitool says I have a battery present. Not having any meltdowns.

There seems to be a communication and a persistence between Ubuntu and the hardware on these Compaq Presario laptops that is not dependent on the hard disk! Or something. There is a mystery here. Maybe if it can be solved, it can expand the breadth of Ubuntu's unproblematic hardware support.

bmjbmj (bmjbmj) wrote :

Hello (sorry for my bad English)

I think I have "solved" the "bug" for the 15xx line of Acer computers under Kubuntu 7.04.

We are switching to Kubuntu at my work, and I started installing it on 37 Aspire computers. All of them are 1-2 years old and used in an office that is cleaned twice a week. Of the 37 computers, 11 crashed during installation. The message said something about 90 degrees and being over the limit, then came a black screen and shutdown. I'm an electrical engineer and was very surprised that Windows didn't report the same. I investigated what I believed was a bug and read this bug report. Then I checked the CPU speed under Windows and it was 800 MHz on all computers. This is not a bug, it is a construction fault! And I'm surprised that the computer does not sound an alarm (this is standard on desktops and servers).

I phoned Acer but they blamed Linux, so I moved 3 computers down to our factory and disassembled them. A thick mat of dust had built up around the air duct from the fan (mounted at the top of the keyboard area) and the exhaust at the back of the computer. I used clean oil-free air at 2 bar from a wall outlet. The dust did not go away, so I increased the pressure to 6 bar and finally the dust exploded in a small cloud. I didn't disassemble the rest of the computers but used compressed oil-free air on them too.

After cleaning all the computers, Kubuntu installed without any problem. I also stress-tested 7 of them for 24 hours and they ran at 2200 MHz without problems. I have installed a dust filter on top of the air intake and told users to periodically test 2200 MHz operation for 30 minutes from the control panel. The CPU now changes speed from 2200 MHz under full load (CAD calculation) to 800 MHz under e.g. Mozilla.

Thank you all for building an OS that has a warning system to inform the operator of faults, and thanks for this bug info!

eBobster (ebobster) wrote :

Nice one bmjbmj.

I would still like the OS to not shutdown given there are other
options to this problem. A best of both would be really nice.

I managed to clean my heatsink this weekend. Inspection from the
outside looked OK, but inside there was a big dust mass definitely
affecting airflow. Now that this has been shifted, the laptop compiled
a kernel at full speed without incident.

Heat transfer compound was also added - their strip of tape looked a bit shit.

acpitool actually shows my CPU temperature occasionally going to 100C
but no shutdowns; on the whole it's running 10-20C cooler than before.

Rob.

GreatBunzinni (greatbunzinni) wrote :

bmjbmj

I do not think that is the case. At all.

First of all, dust bunnies clogging the fan intake do not explain the different thermal behaviour that this line of laptops shows under Windows (works smoothly) and Linux (massive overheating problems). What you saw there was Windows correctly handling the CPU scaling, which doesn't happen in Ubuntu.

Second, I just got my Acer Aspire 1524 laptop back from Acer's customer support, where it was repaired. The laptop went up in smoke and they replaced the motherboard, graphics card and keyboard. It was as clean as it could be. Nonetheless, as soon as I got it I tried to install Kubuntu and, as expected, it crashed due to overheating. There were no dust bunnies nor any speck of dust inside the laptop, and yet the overheating problem persists. And yes, that problem doesn't happen in Windows. Again.

It doesn't justify why the temperature would be increasing/decreasing by 10C
within approx. 2 seconds either (see my bug entry for details).

P-É

Len Brown (len-brown) wrote :

StevePEI
Can you reproduce this with a 2.6.21.stable kernel using a sane config
(e.g. defconfig)? Your dmesg shows that Ubuntu is loading just about everything
under the sun. If that includes something like lm-sensors (what does # sensors say?),
that could conflict with ACPI on your platform, which could result
in erratic behaviour.

Len Brown (len-brown) wrote :

Pierre-Étienne Messier,
Same comment as for Steve...
Can you reproduce the problem with a 2.6.21.stable kernel --
in particular, one with CONFIG_HWMON=n?

Len Brown (len-brown) wrote :

GreatBunzinni
I agree that Linux isn't handling a thermally challenged situation as well as Windows
in some of these reports.

eg. if you can tape up your fan and Windows can handle it through passive cooling
and throttling your system down, Linux should be able to handle that as well.

It might merit a big fat warning, but running slow is better than heating up
until a critical thermal trip takes your system away.

Hello,

I did a few things on my laptop during the weekend, such as upgrading the kernel
and cleaning the fans.

First, I upgraded the kernel from 2.6.20-15 to 2.6.20-16, then I shut the
computer down in order to clean it (I know, I should have done these
separately...). There was quite a lot of dirt in the fans and heatsinks (my
laptop is 2 years old). I removed it all using a can of compressed air, and
I added some thermal paste on the CPU core (the original thermal paste
wasn't too visible; I think Dell originally put on a very thin layer).

When I booted the computer, the idle temperature was much lower, around 35C
(instead of 50C). Putting it into "performance" mode did increase the
temperature, but not as much as before. The fans are quieter too. Since the
kernel upgrade was a security fix (correct me if I'm wrong), I think that
cleaning my laptop solved the problem.

I could compile a custom 2.6.21.3 kernel without rebooting (the temperature
was around 65C IIRC). Using that kernel, the situation is similar: the
temperature is much lower than it was before I cleaned my laptop. Therefore,
I do not think that the kernel is the main problem in this scenario.

Still, I believe that there may be something strange (but not broken) with
the temperature reporting, since it increases and decreases really fast. It
may be that the hardware is designed that way, though. I did not try on
Windows (on a separate hard disk that I use about once a year) to see if the
behavior is similar; I'll add it to my TODO list.

Could other people with this problem clean their machines to see if it
helps?

I totally agree with you, Len!

Pierre-Étienne

scott (sahendrickson) wrote :

Hi all,

I'm not sure if this is related, but my computer shuts down every few days because it "overheats". However, I wasn't sure that it was actually overheating, so I set up a cron job to "cat /proc/acpi/thermal_zone/*/*" to a file every minute. Last time it shut down, I had the following entries in the messages log file ...

Jun 10 20:23:13 scott-server kernel: [46118.903677] ACPI: Critical trip point
Jun 10 20:23:13 scott-server kernel: [46118.956637] ACPI: Unable to turn cooling device [df85e798] 'on'
Jun 10 20:23:14 scott-server gconfd (scott-17284): Received signal 15, shutting down cleanly
Jun 10 20:23:14 scott-server gconfd (scott-17284): Exiting

... and the following information from my cron job output ...

cooling mode: active
<polling disabled>
state: ok
temperature: 38 C
critical (S5): 70 C
passive: 55 C: tc1=4 tc2=3 tsp=60 devices=0xdf855338
active[0]: 55 C: devices=0xdf85e798

It doesn't seem that the CPU is actually overheating to me. I modified the cron job to store more information each minute so that next time I'll have more to work with.

date
cat /proc/acpi/thermal_zone/*/*
acpitool -e
smartctl -d ata /dev/sda -a
smartctl -d ata /dev/sdb -a

Any ideas on what I could log to figure out what is happening?
I have cleaned the fans, and I know that they are turning on and off when the system gets hot.
Finally, I'm also pretty sure that the system didn't actually overheat, as nothing was being done when it overheated. Is it possible that there's a bug that misreports the temperature or a trip point sometimes?
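One way to make the cron log easier to judge is to turn each sample into numbers and compare the reading against its own trip points. The following is only a rough sketch, assuming the field labels look like the /proc/acpi/thermal_zone output quoted above; the helper names (parse_sample, headroom) are made up for illustration and are not part of any existing tool.

```python
import re

# Hypothetical helpers; the labels mirror the thermal_zone output quoted above.
_PATTERNS = (
    ("temperature", re.compile(r"^temperature:\s*(\d+)\s*C", re.MULTILINE)),
    ("critical", re.compile(r"^critical \(S5\):\s*(\d+)\s*C", re.MULTILINE)),
    ("passive", re.compile(r"^passive:\s*(\d+)\s*C", re.MULTILINE)),
)


def parse_sample(text):
    """Return the integer readings (in degrees C) found in one logged sample."""
    fields = {}
    for name, pattern in _PATTERNS:
        match = pattern.search(text)
        if match:
            fields[name] = int(match.group(1))
    return fields


def headroom(sample):
    """Degrees C left before the critical trip point, or None if unknown."""
    if "temperature" in sample and "critical" in sample:
        return sample["critical"] - sample["temperature"]
    return None


# The sample logged by the cron job shortly before the shutdown.
THERMAL_SAMPLE = """\
cooling mode: active
<polling disabled>
state: ok
temperature: 38 C
critical (S5): 70 C
passive: 55 C: tc1=4 tc2=3 tsp=60 devices=0xdf855338
active[0]: 55 C: devices=0xdf85e798
"""

reading = parse_sample(THERMAL_SAMPLE)
print(reading, headroom(reading))
```

If the last sample before a shutdown still shows plenty of headroom, that would support the suspicion of a misreported temperature or trip point, although a one-minute sampling interval can easily miss a short spike.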

I'd appreciate any input or help.

Thanks,
-- Scott

scott (sahendrickson) wrote :

Same shutdown this morning. But it's clear from the cron output that neither the CPU nor the hard drives were overheating. Is there any way to temporarily turn off temperature monitoring?

Matt Wallace (mmw-old) wrote :

/me too!!!

I've only just started to encounter this issue as of this morning, despite having Feisty running pretty much since it was released.

I'm going to give some of the above a shot over the next few days and I'll report back, just thought I should add my voice to this...

Cheers,

Matt.

Rimas Kudelis (rq) wrote :

hmm...

I escaped those shutdowns by commenting out all i2c related modules from /etc/modules (I suppose I added them myself previously, when trying to set up fan speed control).

scott (sahendrickson) wrote :

I was able to stop the shutdowns (at least for the last week, so far) by adding pci=noacpi to the boot options. But I don't actually know what that does or why it worked (or what I lost by doing that).

Julien MARY (jmary) wrote :

I have similar trouble on a Compaq Evo N160. The trouble did not exist on Edgy.
My fan never starts, and the reported temperature always shows 47C, whatever the real one is.
In /proc/acpi/thermal_zone/TZ0/ I have the files :
cooling_mode polling_frequency state temperature trip_points

A cat on them shows respectively :
<setting not supported>
<polling disabled>
state: ok
temperature: 47 C
critical (S5): 100 C
passive: 98 C: tc1=0 tc2=1 tsp=150 devices=CPU0

dmesg | grep -i acpi shows :

[ 0.000000] BIOS-e820: 0000000027f60000 - 0000000027f6fc00 (ACPI data)
[ 0.000000] BIOS-e820: 0000000027f6fc00 - 0000000027f80000 (ACPI NVS)
[ 0.000000] ACPI: RSDP 000F6400, 0014 (r0 PTLTD )
[ 0.000000] ACPI: RSDT 27F69E0D, 002C (r1 PTLTD RSDT 6040000 LTP 0)
[ 0.000000] ACPI: FACP 27F6FB64, 0074 (r1 COMPAQ CPQ2C01 6040000 PTL 1)
[ 0.000000] ACPI: DSDT 27F69E39, 5D2B (r1 COMPAQ U22C01 6040000 MSFT 100000D)
[ 0.000000] ACPI: FACS 27F7FFC0, 0040
[ 0.000000] ACPI: BOOT 27F6FBD8, 0028 (r1 PTLTD $SBFTBL$ 6040000 LTP 1)
[ 0.000000] ACPI: PM-Timer IO Port: 0x1008
[ 16.658023] ACPI: Core revision 20070126
[ 16.661168] ACPI: setting ELCR to 0200 (from 0e20)
[ 16.663314] ACPI: bus type pci registered
[ 16.689177] ACPI: Interpreter enabled
[ 16.689185] ACPI: (supports S0 S1 S4 S5)
[ 16.689208] ACPI: Using PIC for interrupt routing
[ 16.697085] ACPI: PCI Root Bridge [PCI0] (0000:00)
[ 16.697396] PCI quirk: region 1000-107f claimed by ICH4 ACPI/GPIO/TCO
[ 16.697972] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
[ 16.698315] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
[ 16.698479] ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCIB._PRT]
[ 16.700672] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 9 10 *11 14 15)
[ 16.700843] ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 *5 7 9 10 11 14 15)
[ 16.700992] ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 7 9 10 11 14 15) *0, disabled.
[ 16.701162] ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 9 *10 11 14 15)
[ 16.701330] ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 *9 10 11 14 15)
[ 16.701544] ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 9 10 11 14 15) *0, disabled.
[ 16.701696] ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 9 10 11 14 15) *0, disabled.
[ 16.701846] ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 9 10 11 14 15) *0, disabled.
[ 16.702031] pnp: PnP ACPI init
[ 16.702048] ACPI: bus type pnp registered
[ 16.709008] pnp: PnP ACPI: found 11 devices
[ 16.709013] ACPI: ACPI bus type pnp unregistered
[ 16.709021] PnPBIOS: Disabled by ACPI PNP
[ 16.709118] PCI: Using ACPI for IRQ routing
[ 17.101708] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 10
[ 17.101720] ACPI: PCI Interrupt 0000:02:06.0[A] -> Link [LNKD] -> GSI 10 (level, low) -> IRQ 10
[ 19.911001] ACPI: CPU0 (power states: C1[C1] C2[C2])
[ 19.913617] ACPI: Thermal Zone [TZ0] (47 C)
[ 20.668836] ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 11
[ 20.668848] ACPI: PCI Interrupt 0000:00:1d.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
[ 20.772903] ACPI: PCI Interrupt 0...

Leszek Trenkner (olaf-post) wrote :

For me - though my laptop has never overheated - just turning off powernowd lowered the system temperature by 7-8 degrees Celsius (from 61 to 53 degrees at idle, with the T5500 CPU still using the "ondemand" frequency scaling governor). With powernowd, the system ran at full speed almost all the time, even when the CPUs weren't loaded with anything serious, so the ondemand governor was not functioning properly - or rather, powernowd broke it. Removing powernowd from the init scripts fixes that problem, and for text editing or web browsing the CPUs now usually run at 60% of nominal frequency. So maybe powernowd's bad influence on the scaling governors causes overheating?

Note - despite proper DSDT my T5500 has somewhat limited throttling capabilities:
$ cat /proc/acpi/processor/CPU1/info
processor id: 0
acpi id: 1
bus mastering control: no
power management: yes
throttling control: no
limit interface: no
Stock Ubuntu Feisty with updated kernel & everything else: Linux black 2.6.20-16-generic #2 SMP Thu Jun 7 20:19:32 UTC 2007 i686 GNU/Linux
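For anyone comparing machines, the "key: value" lines from /proc/acpi/processor/CPU*/info are easy to check in a script. This is just a sketch under the assumption that the file always looks like the output pasted above; parse_processor_info and supports_throttling are made-up names, not an existing API.

```python
def parse_processor_info(text):
    """Split 'key: value' lines into a dict, keeping the values as strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon at all
            info[key.strip()] = value.strip()
    return info


def supports_throttling(info):
    """True only when the 'throttling control' field reads 'yes'."""
    return info.get("throttling control", "").lower() == "yes"


# The output quoted above for the T5500.
PROCESSOR_INFO = """\
processor id: 0
acpi id: 1
bus mastering control: no
power management: yes
throttling control: no
limit interface: no
"""

info = parse_processor_info(PROCESSOR_INFO)
print("throttling supported:", supports_throttling(info))
```

On a live system the text would come from reading the info file itself; the sample string here just stands in for that.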

mikko (mikko-) wrote :

My Acer laptop shuts down at least once a week. It's not fun. I was listening to just ONE wav file with Audacity and suddenly the same shutdown happened again.

Jul 4 22:08:18 unelma-laptop kernel: [ 1772.376000] ACPI: Critical trip point
Jul 4 22:08:18 unelma-laptop kernel: [ 1772.376000] Critical temperature reached (80 C), shutting down.

I didn't have problems before. But after updating to Edgy, and now with Feisty, this has been a very annoying problem. In XP I can play at least 15 wav files with 15 VST effects and still record a new track at the same time, without any heating. And XP has never shut down by itself (without the user :). So the machine itself is not the problem. In Linux, I can barely listen to one sound file... fun.

I can hardly touch the underside of my laptop after a shutdown, it's so hot. Indeed.

Alejandro Zanotti (aleza66) wrote :

My problem was due to excessive dust in the heatsink air intake. You may wanna
take a look.

--
Alejandro Zanotti

GreatBunzinni (greatbunzinni) wrote :

Alejandro, Mikko stated that he doesn't experience any overheating problems with XP. Your dust bunnies suggestion only sticks if somehow the dust bunnies are playing tricks on them evil linux users.

mikko (mikko-) wrote :

I added pci=noacpi to the GRUB boot options..
But after boot, Ubuntu's login sound played endlessly: plop, plop, plop, plop, plop..
My machine was very slow; the mouse cursor moved at 1 cm/second. I also tried Rhythmbox: it worked for only one second and then looped that one second endlessly. pci=noacpi is not the answer for me. I removed it and the problems went away again, leaving only the heating.

eBobster (ebobster) wrote :

On 05/07/07, GreatBunzinni <email address hidden> wrote:
> Alejandro, Mikko stated that he doesn't experience any overheating
> problems with XP. Your dust bunnies suggestion only sticks if somehow
> the dust bunnies are playing tricks on them evil linux users.

Does this imply that in spite of the "dust bunnies", XP is able to
manage the heat of the CPU without causing a shutdown? Perhaps
Microsoft bought them too?

Fact - dust bunnies reduce heat dissipation from CPU.
Fact - Windows still works.
Fact - Linux will start shutting down because of overheating.
Fact - I liberated my dust bunnies and haven't had an overheat since
and don't have to run in powersave all the time anymore to get > 5
mins worth of experience from Linux.
Desire - a more reasonable resolution.

We still don't know how general this problem is to generic 'Linux'.

So there is a problem to fix.

Alternatively, there is a business model for OSS dust bunny liberation
- we can charge per microgram of "bunite".

--
The OssDBLF

sorry just modified it a little bit :)

librano (librano04) wrote :

sunix! that script sounds like a real solution... kind of like what windows does... how exactly did u implement that script?

Nic Hanno (nichanno) wrote :

Len, thanks for the clarification. Your comments are not 'pedantic'.

After installing Ubuntu Feisty on an IBM/Lenovo ThinkPad Z60m, I'm quite impressed - support for *most* laptop power management hardware.

I really think a FAQ needs to be authored on this topic. I mean, laptops are really becoming the norm nowadays and it's good that people are interested in power saving - albeit for battery life, but this does also help the environment.

I think clarification is also needed on the differences between 'find /sys -iname acpi' and 'find /proc -iname acpi' - thermal_zone here but cpufreq there! Why!?!?

Anyway, thanks again for the fact Len.

Nic.

librano :
it's just a loop with a sleep of 1 sec. It parses the temperature from the file and sets the governor according to your settings.
However, I'm not sure this is a good way to do it: maybe using inotify would be a good enhancement. I'm also not sure whether this kind of trick has already been done by another tool (actually, I didn't really understand the beginning of this bug report... has something been done, is something going to be done, or what? Why does the reboot happen in Feisty instead of trying to slow down the CPU first? What can we do?)

I read this bug report again, and
echo 1 > /proc/acpi/thermal_zone/TZS0/polling_frequency
seems to scale the CPU according to the temperature :)
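For readers who want to try the loop idea, here is a rough Python sketch of it. It is not the original script. The thresholds (75 and 65 degrees) are made-up values, the governor path is the usual cpufreq sysfs location but may differ on your machine, and the TZS0 zone name is simply taken from the command above.

```python
import re
import time

# Assumed paths: TZS0 is the zone named in the post above; the cpufreq sysfs
# file is the common location but is not guaranteed on every machine.
TEMP_FILE = "/proc/acpi/thermal_zone/TZS0/temperature"
GOVERNOR_FILE = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor"


def parse_temperature(text):
    """Extract the integer reading from a line like 'temperature: 79 C'."""
    match = re.search(r"temperature:\s*(\d+)\s*C", text)
    if match is None:
        raise ValueError("no temperature reading found")
    return int(match.group(1))


def choose_governor(temp_c, current, hot=75, cool=65):
    """Pick a governor, with a gap between thresholds to avoid flapping."""
    if temp_c >= hot:
        return "powersave"
    if temp_c <= cool:
        return "ondemand"
    return current  # between the thresholds: keep whatever is set now


def run(poll_seconds=1.0):
    """Poll forever; writing the governor file needs root."""
    current = "ondemand"
    while True:
        with open(TEMP_FILE) as handle:
            temp_c = parse_temperature(handle.read())
        wanted = choose_governor(temp_c, current)
        if wanted != current:
            with open(GOVERNOR_FILE, "w") as handle:
                handle.write(wanted)
            current = wanted
        time.sleep(poll_seconds)
```

The two thresholds are deliberate: with a single cut-off the governor would flip back and forth every second once the temperature hovers around it. Calling run() as root starts the loop; only cpu0 is handled here, so a multi-core machine would need the same write for each CPU.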

Benjamin COUHE (voraistos) wrote :

Hi, I posted something earlier about my now-almost-defunct laptop, which was overheating. The CPU was a Pentium M. In fact, if I remember correctly, it was overheating on Feisty while previously (before an upgrade or update) it did not. I also noticed, strangely, that at some point the hard drive would lock up and stop spinning - a feature that was not present on the actual machine - in fact, the hard drive was heating up so much that it was actually crashing (or call that a panic :P). Since the PC barely had time to boot before it smelled like something burning and crashed, I stopped using it.

Recently I discovered, through an IDE/USB interface, that the hard drive was in fact intact and not overheating on a "good" system running the same version of Ubuntu. More recently I decided to boot Ubuntu from a live CD on the "dead" machine. It didn't overheat, even when I wanted it to. I then plugged my hard drive into the USB port, installed Ubuntu on it, updated... No crash, no heat, nothing.

My point is: I don't know if it's kernel-package related, but somehow something blew up my IDE controller, or more likely has been driving my chipset crazy, with the chipset driving everything else (the CPU?) crazy as well. I recall that the CPU was changing its frequency very fast all the time (trusting the GNOME applet). The thing that controls this is probably in the chipset (but I don't know :O), and is probably controlled by some sort of basic kernel driver or ACPI. Something overrides the normal, natural, BIOS-defined way of keeping the computer safe, or lies to the BIOS, or the BIOS doesn't detect this as an error and does its usual job, driving the rest of the system crazy and making the box crash.

Matthew Garrett (mjg59) wrote :

Not an acpi-support issue

Changed in acpi-support:
status: Confirmed → Invalid

ctirpak (chris-tirpak) wrote :

The rest of the story:

1. Glad I wasn't in the passenger seat.
2. The insurance company said it wasn't a total loss and used non-OEM parts to
rebuild it

:-)

Matthew Garrett wrote:
> Not an acpi-support issue
>
> ** Changed in: acpi-support (Ubuntu)
> Status: Confirmed => Invalid
>
>

ctirpak (chris-tirpak) wrote :

Argh, please accept my apologies for the prior post (it should be deleted), I replied to two emails and crossed up my replies.

Why is this not an acpi-support issue, and what should it be filed under? This bug caused me to stop using Feisty on a brand new (i.e. no clogged fan) Lenovo T60p. Having my laptop reboot for no real reason moments after startup was more than I could handle. Then to have it tell me it had overheated was just too much. Using openSuSE it works fine.

Is this bug just mis-categorized? Because if it affects you, it is critical. I suspect there are quite a few more people affected who haven't reported it. I had nothing to add, so I was just watching it.

Thanks
Chris

Matthew Garrett (mjg59) wrote :

It's a kernel issue - acpi-support is a set of helper scripts, but basic functionality like this is provided by the kernel. The bug was already filed against the kernel, so I've just closed the acpi-support part of it.

Thanks for the explanation. That makes sense.

Chris

Gustavo Niemeyer (niemeyer) wrote :

Same problem. IBM/Lenovo T60, Fan doesn't go up and CPU burns up to shutdown.

Michael Chang (thenewme91) wrote :

Just curious, but on pentium-based laptops, are p4_clockmod (etc.) getting loaded by default?

If not, should they be?

(I find on my desktop P4 computer, I get "throttling <not supported>" and have to load p4_clockmod manually (in /etc/modules or via modprobe) in order to get throttling support. For me, this is significant, because even P4-based desktops I've seen seem to get too hot over extended periods of "normal" use.)

mikko (mikko-) wrote :

I tried to rip a cd to mp3 with Grip. Just after two minutes:
mikko@unelma-laptop:~$ cat /proc/acpi/thermal_zone/THRC/temperature && cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 79 C
temperature: 65 C

then I tried abcde from terminal:

temperature: 62 C
temperature: 49 C

Very interesting: these were the highest values, never above 62C. Why? I was also using Mozilla and other programs at the same time. With abcde, the MP3 encoding time was at least the same as in Grip. Maybe even faster.

Tim O'Callaghan (timo-linux) wrote :

I suspect that this - at least in my case - may be more than just ACPI/throttling.

I've been running Windows XP on my 5024wlmi (AMD64 Turion ML-34, ATI Mobility Radeon X700) for the last year or so, and while my machine has run hot, even when left on overnight as usual for BitTorrent-type downloads, it's never given me any trouble.

So I want to do some serious work, and decided to finally convert to a 64-bit Linux distro, which is one of the reasons why I bought it.

Now every attempt I have made to do this has been after a few hours of doing stuff on Windows - email, music, podcasts etc.

I attempt to run a live CD, and it shuts down very quickly. I switch to console mode to see why, and I'm told my machine is running at 102 degrees and Linux must shut down. Probably impossible numbers, but there you go.

I have had this same problem when I try to use the Ubuntu, Kubuntu, Kubuntu64 and DSL live CDs, which I tried one after the other.

Len Brown (len-brown) wrote :

Gustavo,
If your fan is clean and you can reproduce the issue on your T60
using the latest stable kernel.org kernel (currently 2.6.22),
please file a sighting on bugzilla.kernel.org against ACPI.
I don't expect to find ACPI fan control on the T60 -- as ThinkPads
have never had it before, but let's have a look. Please attach
(do not paste) the output from acpidump to the report.
Note that you should try with and without the thinkpad/ibm_acpi driver loaded.

Tim,
5024wlmi is an Acer, yes? Acer _does_ tend to use ACPI fan control,
so we may have a fighting chance of doing something with that box.
As above, please file sighting upstream.
Also, to collect the output from acpidump, you may have to boot
the system with "acpi=off", which will likely enable some SMM-based
fan control that will keep the system fans functioning properly.

Michael,
If p4_clockmod is necessary to keep a system cool, then the system either
has some design issues, or failing cooling hardware. This is true also
of using any other cpufreq frequency control. A properly designed
and functioning system should be able to cool itself sufficiently to
sustain maximum performance as long as necessary. cpufreq may make the system
run cooler on average, but doesn't address the case where the system
is fully utilized, where cpufreq does nothing.

mikko (mikko-) wrote :

Ripping a CD with sound-juicer. Normally my machine shuts down automatically if the temperature goes above 80. Why not now? Why was I able to rip many CDs while the temperature stayed at 86C for almost an hour?

mikko@unelma-laptop:~$ cat /proc/acpi/thermal_zone/THRC/temperature && cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 78 C
temperature: 55 C
mikko@unelma-laptop:~$ cat /proc/acpi/thermal_zone/THRC/temperature && cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 77 C
temperature: 55 C
mikko@unelma-laptop:~$ cat /proc/acpi/thermal_zone/THRC/temperature && cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 83 C
temperature: 56 C
mikko@unelma-laptop:~$ cat /proc/acpi/thermal_zone/THRC/temperature && cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 83 C
temperature: 56 C
mikko@unelma-laptop:~$ cat /proc/acpi/thermal_zone/THRC/temperature && cat /proc/acpi/thermal_zone/THRS/temperature
temperature: 86 C
temperature: 58 C

leexgx (leexgx) wrote :

Why is this bug still here? Has 7.04 fixed this bug?

Why can't you just let the laptop use its own built-in automatic fan control? As far as I know, Windows does not normally control the fan under normal use unless you have loaded a driver or a program to control it. It just looks like the Linux kernel ACPI code or Ubuntu is overriding the fan controls (and thermal protection) even when the laptop has working ones, and this may damage the laptop over a long time.

I have a T60 laptop to test with, but I have to bend it all the time to make it work (it looks like it was not put back together right and is missing screws {not me, it was given to me like that}). I'll fix this laptop (pull it apart and rebuild it) and see if I can get it working fine (it boots into XP fine when the laptop is bent, lol).

If it overheats but other Linux distros do not, it's a problem with the Ubuntu code messing with the fan speeds.

zechs (the-principal-ideal) wrote :

"Me too" X 2

I have the same problem on two laptops: a ThinkPad R51 and a ThinkPad X61. The processors are a Pentium M and a Core Duo, respectively. The X61 doesn't actually shut down (though I haven't been pushing it very hard), but gets very hot, while the R51 shuts down and reports the temperature at 85 or 86 degrees. The heat is too much even when just sitting at a blank desktop, but increases a lot when watching videos on YouTube, for example. I can report that the heat is much lower when I'm using Gentoo installed on the same computers.

udude (igal) wrote :

Hello all
My laptop (Acer Aspire 5560) also suffers from this issue.
A few instances of a script that runs "ls" in an endless loop can bring my CPU temperature to 95C. At that point it smells like melting plastic, so I stop the scripts.
The air vents are clean as a whistle, and Windows XP (it's a dual-boot system) never gets close to those temperatures no matter what I run.

I've been subscribed to updates on this bug for a while now, reading similar messages over and over, week after week.
If I am not mistaken, this bug was opened in Sep 2005. It has been two years since it was reported, yet this critical issue still exists.
It seems to me that there are many people subscribed to this thread who will gladly assist by providing more debug info and beta testing possible solutions.
Can the Ubuntu engineers explain why this has not been fixed by now?
It also seems to me that it should be mentioned here if/when this issue is expected to be resolved.
I would really like to know, so I can plan ahead and maybe switch my laptop to a different Linux distro.
I really like Ubuntu. I have it on my desktop, it's doing great there, and I wish I could keep it on my laptop (it's supposed to be for mobile systems as well, no?).
That said, it does not make much sense in the long term to run a distro that melts the hardware it runs on. This issue requires Ubuntu's response, one way or another.

SMiTTY (mike-ftl) wrote :

 I fully agree with the previous comments....This bug has been open for WAY too long with no movement.
Frankly, the comments to check your fans/vents/etc. have been played out. There are too many people having this issue and we can't all have dirty boxes.

 When are the developers going to step up and give this bug some attention?

 I really like Ubuntu, but as people have pointed out... it's not happening on other distros and not happening under Windows.
So what gives? Do we all change away from Ubuntu?

 BTW : I'm running on an HP dv8000

 Let me know if I can give any debug info so we can squash this.

 - SMiTTY

DantePasquale (dantepasquale) wrote :

I would have gladly helped out by donating my laptop (See old posting from quite some time ago), but the overheating finally caused the motherboard to melt and my laptop is now dead. This was not a good thing!

I have the same problem on my HP Pavilion ZV5000.
For now I use the script made by sunix, which works for me.
In Windows my machine works fine, so I think this is a major bug in the ACPI implementation.
I tried setting /proc/acpi/thermal_zone/THRM/polling_frequency but could not find good documentation.
Can any of you help me? I want to replace sunix's script with a polling_frequency setting.
Sorry for my English.

michelef (michelef) wrote :

I had the same problem with a Kubuntu 7.10 install on a Gateway Solo 9500 notebook. I found that specifically turning off APM (no-apm) at install made it go away. I also turned off PM (Power Management) in the BIOS and set Intel SpeedStep to Auto in the BIOS. I still haven't managed to complete the install, but now the CPU fan takes a break when nothing is happening (i.e. idle at the install desktop). This doesn't seem to be as mainstream as the other posts, but it might narrow the search. Knoppix on the same machine caused no problems and does not require the no-apm switch.

GreatBunzinni (greatbunzinni) wrote :

This bug still persists in Ubuntu 7.10. I've tried to install it on my laptop and I had to endure the frustration of having it overheat during the install process in all 3 attempts at installing that I did.

On the other hand, the opensuse installation process runs flawlessly, without a single overheating incident.

tempura (tempura) wrote :

Hi all!
Same problem here. I used 7.04 for a long time and everything was running smoothly.
After the update to 7.10 I am experiencing the same problem: as soon as the temperature reaches about 74°C (according to ACPI), my laptop switches off (HP Pavilion zd8000 series, P4 3 GHz). trip_points shows a critical temperature of 81°C.
I cannot run any CPU-intensive applications, since the computer will switch off after about 1-2 minutes, which makes it useless in my case.
Furthermore, suspend does not work (it was working with 7.04), but this is another issue.

tempura (tempura) wrote :

Hi again!
I have just reverted to kernel 2.6.20-15-generic (feisty) and this fixes the problem for me. During CPU intensive tasks (~100% CPU load) the CPU temperature peaks at 75°C but the system will not shut down.
It seems that something got messed up (possibly in ACPI) between 2.6.20-15-generic and 2.6.22-14-generic. Suspend/resume does work again as before, so for now I will stick with 2.6.20-15-generic.
Unfortunately this does not help users who experienced this behaviour already in feisty...

GreatBunzinni (greatbunzinni) wrote :

On a side note, I've just realized that opensuse in fact also suffers from this problem. It doesn't hit it as hard as it hits Ubuntu because its overheating limits appear to be higher. I've just installed opensuse, gave it a trial run, and my laptop shut down with warning messages announcing that the system temperature was at about 104 C.

Thomas Renninger (trenn) wrote :

Isn't that the time when the nohz/clocksource/highres patches came in?
Can you check whether C-states are working correctly?
Can you also check with "watch -n1 cat /proc/interrupts" whether you get thousands of timer interrupts per second?
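
To put a number on the timer interrupt check, here is a rough Python sketch that sums the per-CPU counters of every /proc/interrupts row whose description mentions "timer" and turns two snapshots into a per-second rate. The function names are made up for illustration, and the row layout is assumed to be the usual "label: one count per CPU, then a description" form.

```python
import time


def timer_counts(text):
    """Return {irq label: count summed over CPUs} for rows mentioning 'timer'."""
    counts = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # the header row listing CPU names has no colon
        fields = rest.split()
        per_cpu = []
        for field in fields:
            if not field.isdigit():
                break  # first non-numeric field starts the description
            per_cpu.append(int(field))
        description = " ".join(fields[len(per_cpu):]).lower()
        if per_cpu and "timer" in description:
            counts[label.strip()] = sum(per_cpu)
    return counts


def timer_rate(before, after, seconds):
    """Interrupts per second between two snapshots taken `seconds` apart."""
    a, b = timer_counts(before), timer_counts(after)
    return {irq: (b[irq] - a[irq]) / seconds for irq in b if irq in a}


def sample_rate(path="/proc/interrupts", seconds=1.0):
    """Hypothetical helper: take two snapshots of `path` and report the rate."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return timer_rate(before, after, seconds)
```

Calling sample_rate() on the affected machine under both kernels would give directly comparable figures instead of eyeballing the output of watch.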

tempura (tempura) wrote :

I have tried to check C-states with powertop, but it says that detailed information is only available for mobile processors (which mine is not; I think it is a desktop P4 HT crammed into a notebook).
cpuinfo shows:
...
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 3
cpu MHz : 3000.000
...

About the interrupts: on 2.6.20-15 I get about 300 timer interrupts/sec. On 2.6.22-14 the timer is very quiet - about 200 interrupts after 5 minutes uptime.

Mircea Deaconu (mirceade) wrote :

Hello! First of all let me say I truly regret doing this. This is a spam message sent to all critical bug message lists. Its purpose: making this (https://bugs.launchpad.net/ubuntu/+source/acpi-support/+bug/59695) bug critical too. This is a long-standing bug and has a very serious impact on laptop-type hardware. Its priority is set to "wishlist" and I just cannot take this anymore. I DO NOT CARE if my account gets suspended. I am doing what's right for all my friends using Ubuntu on their laptops.

Thomas Renninger (trenn) wrote :

ThinkPad (T41/T42/T43/R40/...) owners in particular may want to have a look at this bug:
IBM T41p shuts down, powersave, Temperature state changed to critical
https://bugzilla.novell.com/show_bug.cgi?id=333043

Alex Masidlover (amasidlover) wrote :

I can also confirm this bug, I have a Rock Pegasus TL, which is a centrino based laptop with a pentium M 1.6 processor. Powernowd is not running and cpu frequency scaling is ondemand. I have a temporary fix which is: echo 1200000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq (obviously this value is specific to the characteristics of my processor and fan, but basically I just played around running a long compile and gradually increased the max scaling freq from being the same as the min, until I got a value which kept the CPU temp sensible...)

Thomas Renninger (trenn) wrote :

Alex, this problem could be very machine-specific and may be unrelated to the other reports.
Please first make sure you are running the latest available BIOS, and make sure your fans'
performance matches the needs of your CPU and that they are clean and not clogged with dust.

Would you mind opening a separate bug at bugzilla.kernel.org against the ACPI component?
Please post /proc/acpi/thermal_zone/*/trip_points and acpidump there.
Also do a:
cat /proc/acpi/event
(you need to kill other applications accessing /proc/acpi/event first -> fuser).
Do you see (thermal or CPU-related) ACPI events when trip points are exceeded or when it gets hot?

Does it work if you have only 1 process producing load?
(E.g. if you compile a kernel with make -j6 it shuts down, but it works with just make?)

For several kernel versions there has been an issue where an ACPI event may not be
processed/scheduled quickly enough to lower the frequency.

Happy holidays to everyone.

I am also experiencing this problem on my brand new Athlon 64 X2 with a Biostar motherboard, running Ubuntu 6.06 LTS, 2.6.15-29-amd64-server. It is unusable at this point.

I may have found the source of this problem: when the powernow-k8 module loads the following message appears in the messages log:

"Dec 23 22:21:35 flamingo kernel: [10705.713846] powernow-k8: Processor cpuid 60fb1 not supported."

A search for "cpuid 60fb1" at AMD's website led me to this page:

http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_1260_1202%5E1073%5E871%5E13118,00.html

which has a downloadable driver for kernels between 2.6.10 and 2.6.18.

Since the kernel in this version is 2.6.15, it does not include the driver for the Athlon 64 chips, and requires this download. However, the instructions from AMD say to compile the kernel. I'm not prepared to do that; I only want to compile the module. I have the kernel headers because VMWare needed them to compile its module, and I would like to compile only the powernow-k8 module. Can someone provide the necessary steps?

Thanks!

GreatBunzinni (greatbunzinni) wrote :

I'm the owner of an Acer Aspire 1524 laptop, which suffers from this nasty overheating problem. I gave up on running Linux on it and now it runs Windows exclusively, installed from the original Windows XP Home that was bundled with the hardware.

I've just reinstalled windows on that computer and, following my experience taken from that nasty job, I have to announce something to this group. Here it goes.

This overheating problem isn't a linux bug nor a ubuntu bug. It's a hardware design/support problem, caused by the OEMs themselves and their majestic incompetence.

Here's how I arrived at that conclusion. I picked up the laptop, took the Windows XP install CDs, rebooted the machine, placed the first install CD in the optical drive and started reinstalling Windows from scratch. A couple of minutes into the installation process, the laptop hung and then crashed. Due to overheating. The laptop crashed while attempting to install the original OS which came pre-installed and bundled with the hardware.

Well, that could've been one of those rare crashes that only happen once in a while. I waited a while for the laptop to cool down and tried again to reinstall Windows. Again, it hung. Again, due to overheating.

Now, that crash could also be caused by faulty hardware. So I ran the install process yet again, this time pointing a hair dryer directly into the laptop's air intake. That meant spending about an hour holding a hair dryer pumping cool air at full blast into the laptop. The install process succeeded and I managed to reinstall Windows on that computer. After installing a few applications and an anti-virus, I noticed that the laptop wasn't displaying any problems, which seemed weird after that nasty install process. To sum things up, I installed a game (America's Army) and gave it a try. Well, the game played smoothly without a single incident for over an hour. On a computer which couldn't even get through the first 5 minutes of the Windows XP install process without overheating.

So it seems that's that. It isn't a Linux problem after all. It's a hardware design and support problem, caused simply by the OEM's unlimited incompetence.

On 15/09/2007, Len Brown <email address hidden> wrote:
> Tim,
> 5024wlmi is an Acer, yes? Acer _does_ tend to use ACPI fan control,
> so we may have a fighting chance of doing something with that box.
> As above, please file sighting upstream.
> Also, to collect the output from acpidump, you may have to boot
> the system with "acpi=off", which will likely enable some SMM-based
> fan control that will keep the system fans functioning properly.
>

Yes

Tim O'Callaghan (timo-linux) wrote :

On 15/09/2007, Len Brown <email address hidden> wrote:
> Tim,
> 5024wlmi is an Acer, yes? Acer _does_ tend to use ACPI fan control,
> so we may have a fighting chance of doing something with that box.
> As above, please file sighting upstream.
> Also, to collect the output from acpidump, you may have to boot
> the system with "acpi=off", which will likely enable some SMM-based
> fan control that will keep the system fans functioning properly.
>

Try again...

Yes, 64bit AMD, the only affordable 64bit laptop on the market
at the time....

To be honest i have given up on it for the moment. I hope to do
something with it in the new year. The issue for me is actually
installing Ubuntu or similar so that i have the tools to analyze
the problem before the machine throws an overheating exception.

That is the real problem with getting this fixed: it takes too
much fekking around with kernel drivers before you can start to
look into the issue. I will trawl this thread a bit for tips,
but i do not have the time right now.

Tim.

gunashekar (gunashekar) wrote :

This happens with my laptop with Celeron M (64 bit) processor.
the heating does not happen with windoze vista or XP.
I have tested feisty and gutsy, as well as the development versions of hardy and some other distros, and the problem persists.

Tim O'Callaghan (timo-linux) wrote :

Fixed in latest (8.04 Alpha?) release. Was able to install it and run
the livecd with few problems.

Tim.

On 25/01/2008, Giovanni Lovato <email address hidden> wrote:
> ** Also affects: linux-source-2.6.24 (Ubuntu)
> Importance: Undecided
> Status: New

Just adding a note that I'm reassigning the Ubuntu Hardy kernel source package from 'linux-source-2.6.24' to just 'linux'. Beginning with the Hardy release the package naming convention changed from linux-source-2.6.x to just linux. Sorry for any confusion.

Also just curious if anyone else has tested with a Hardy Alpha release? Thanks.

Changed in linux:
status: New → Incomplete

csim says:

"I noticed the following message during boot:

ACPI: Looking for DSDT ... not found!

So i went on installing IASL and looking at what's wrong with the DSDT, when i compiled the resulting dsdt.dsl file, it gave me 9 errors and 23 warnings.

http://gentoo-wiki.com/HOWTO_Fix_Common_ACPI_Problems

I found out that by replacing all _T_0 with T_0, _T_1 with T_1 and _T_2 with T_2 would fix the errors and it did! Now i only had 23 warnings, i was only able to fix 2, since i wasn't sure about the other ones."

>"I noticed the following message during boot:
>
>ACPI: Looking for DSDT ... not found!

You can ignore that message.
It is normal for Ubuntu, which prints it when
no DSDT override is found in the initrd.

przemo24555 (przemo2) wrote :

I solved this problem yesterday, it's not about linux distribution or kernel version :)
All you need is 3 min :)

http://przemo2.blogspot.com/2008/03/cpu-overheats-cpu-si-przegrzewa.html

gunashekar (gunashekar) wrote :

The overheating problem was resolved after a bios upgrade on my HP/Compaq Presario V6000 laptop

iMatt (anti-spam-imatt) wrote :

My T60p overheating problem with 7.10 appears to be solved.

I did a BIOS upgrade to version 2.21 from the IBM/Lenovo website. Released 2008/02/13.
http://www-307.ibm.com/pc/support/site.wss/document.do?sitestyle=lenovo&lndocid=MIGR-63027

The temperature skyrocketed when I ran a cpuburn test (up past 80C for 20 mins) but it wouldn't lock up. Minutes prior to the update, the machine would lock up in the 60-65C range.

It also appears the fan is running about 600rpm faster by default. Verified by "cat /proc/acpi/ibm/fan". I no longer have to physically set the fan to speed "7" anymore.

This leads me to believe it wasn't solely a CPU temperature issue, as the machine can run MUCH warmer now without lockup.
To all out there experiencing this issue, check for a BIOS upgrade - it just may fix the problem.

Should the problem reoccur I will post up.

tempura (tempura) wrote :

For me the problem was resolved by updating to Hardy (8.04). Now everything seems to run fine.

clickwir (clickwir) wrote :

CPU scaling not working on my laptop with latest hardy.
See bug: https://bugs.launchpad.net/ubuntu/+source/powernowd/+bug/231534

My desktop, is working with CPU frequency scaling. That's an AMD Athlon X2 4000+ Brisbane.

What else can I provide to help get this fixed?

Sergio Zanchetta (primes2h) wrote :

The 18 month support period for Edgy Eft 6.10 has reached its end of life. As a result, we are closing the linux-source-2.6.17 Edgy Eft kernel task. However, please note that this report will remain open against the actively developed kernel. Thank you for your continued support and help as we debug this issue.

Changed in linux-source-2.6.17:
status: Confirmed → Invalid
mathew (meta23) wrote :

Still getting overheating in Hardy, on an IBM ThinkPad T42p.

It's sufficiently bad that I can't rsync at full speed or the system overheats and shuts down.

Daugirdas (daugirdas) wrote :

I tested 8.04 x64 kubuntu live cd on my notorious Acer Aspire 1522Wlmi. Unfortunately, the issue is still there.

It reached 90C PASSIVE point and shut down

The issue is now in suse as well. It was introduced in 10.3 as a "bugfix". I am feeling a bit hopeless.

I'll try the 32bit kubuntu. If that works I'll just give the laptop away to mum and leave it. If not, it is hard to say this, but it will be a WINDOWS ONLY machine.

Regards,
Daugirdas

On Tuesday 01 July 2008 01:58:17 Daugirdas wrote:
> I tested 8.04 x64 kubuntu live cd on my notorious Acer Aspire 1522Wlmi.
> Unfortunately, the issue is still there.
>
> It reached 90C PASSIVE point and shut down
>
> The issue is now in suse as well. It was introduced in 10.3 as a
> "bugfix". I am feeling a bit hopeless.

This could be a good hint to find it.
Since when do you see this happening?
Which kernel was still working?
Can you do a:
rpm -q --changelog kernel-xy-0.00-0 |head -n30
of the working and the not working kernel and send it, pls.

Thanks,

        Thomas

Daugirdas (daugirdas) wrote :

I did some testing today. I tried kubuntu 32 bit 8.04 - no luck.

vmlinuz-2.6.25.5-1.1-vanilla kernel from suse 64bit worked beautifully. The system goes to 800MHz once it hits 90C and stays at that speed until it cools down to 75C. The graph shows it nicely. The system did not shut down and this is very important.

So we need to narrow it down to a specific patch. Here is the list of suse patches: http://www.mirrorservice.org/sites/ftp.opensuse.org/pub/opensuse/update/10.3/repodata/
http://www.mirrorservice.org/sites/ftp.opensuse.org/pub/opensuse/update/10.3/repodata/patch-kernel-4749.xml --- looks particularly suspicious and would roughly fit the timescale. I'll try to do some more testing in the evening to narrow it down.

patches.arch/acpi_thermal_passive_blacklist.patch: Avoid
 critical temp shutdowns on specific ThinkPad T4x(p) and
 R40 [#333043]

Daugirdas

pirast (pirast) wrote :

When you say that you have tried the vanilla kernel then it is the one without any patches which can be found on kernel.org.

So Ubuntu applies some patch which breaks things.

Daugirdas (daugirdas) wrote :

This is the summary of the kernel package I used:
"kernel-vanilla - The Standard Kernel - without any SUSE patches

The standard kernel - without any SUSE patches Source Timestamp: 2008-06-07 01:55:22 +0200"

So that would imply both *ubuntu and SUSE have some patch which breaks power management.

pirast (pirast) wrote :

That means that it is up to the Ubuntu devs which are not very responsive here...

pirast (pirast) wrote :

Bug in some patch that Ubuntu ships. Bug does not happen with upstream tarball.

Changed in linux:
assignee: nobody → ubuntu-kernel-team
status: Incomplete → Confirmed
Daugirdas (daugirdas) wrote :

I did some source comparison between the suse and vanilla 2.6.25.5 kernels. The ./drivers/acpi/processor_thermal.c and processor_throttling.c files were identical. The thermal.c files differed (vanilla on the left):

daugirdas@dtrsuse64:~/Desktop/linux-2.6.25.5/drivers/acpi> diff thermal.c thermal.cs
443,445c443
< if (ACPI_FAILURE(status))
< tz->trips.passive.flags.valid = 0;
< else
---
> if (ACPI_SUCCESS(status)) {
447,452c445,454
<
< if (memcmp(&tz->trips.passive.devices, &devices,
< sizeof(struct acpi_handle_list))) {
< memcpy(&tz->trips.passive.devices, &devices,
< sizeof(struct acpi_handle_list));
< ACPI_THERMAL_TRIPS_EXCEPTION(flag, "device");
---
> if (memcmp(&tz->trips.passive.devices, &devices,
> sizeof(struct acpi_handle_list))) {
> memcpy(&tz->trips.passive.devices, &devices,
> sizeof(struct acpi_handle_list));
> ACPI_THERMAL_TRIPS_EXCEPTION(flag, "device");
> }
> } else {
> tz->trips.passive.flags.valid = 0;
> ACPI_EXCEPTION((AE_INFO, status, "Invalid passiv trip"
> " point\n"));

I can't read this but hopefully this would suggest something. Especially since thermal.c contains these lines further down:

 /* take no action if nocrt is set */
 if(!nocrt) {
  printk(KERN_EMERG
   "Critical temperature reached (%ld C), shutting down.\n",
   KELVIN_TO_CELSIUS(tz->temperature));
  orderly_poweroff(true);
 }

Another point: THRC critical point on my system is 97C. 90C is PASSIVE, but I get shutdowns at 90C. That may also mean kernel confuses CRITICAL with PASSIVE!

Daugirdas (daugirdas) wrote :

files in drivers/thermal and drivers/cpufreq are identical

Changed in linux-source-2.6.17:
status: Invalid → Won't Fix
dtsmith1984 (dtsmith1984) wrote :

I am having the same problem with an Acer Aspire 3620 on 8.04.

Critical point is either 80 or 85C. And it reaches that quite easily with any graphics-intensive program. 10 minutes of tuxkart and it shuts down.

Does compiling a new kernel from kernel.org solve the problem?

This is the first time I've ever put Ubuntu (or Linux for that matter) on a laptop. My girlfriend prefers Linux and wanted me to get rid of Windows, which I gladly did. But now her laptop is overheating. She never had any problems in Windows. I would like to find a solution so that I don't have to go back to Windows.

I didn't have this problem with Ubuntu Hardy, however, since I upgraded to Ibex, my laptop is getting this problem and the battery life went down a lot. I have a Lenovo Thinkpad X61s.

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

clickwir (clickwir) wrote :

clickwir@lappy:~$ sudo modprobe powernow-k8
FATAL: Error inserting powernow_k8 (/lib/modules/2.6.27-1-generic/kernel/arch/x86/kernel/cpu/cpufreq/powernow-k8.ko): No such device
clickwir@lappy:~$ sudo modprobe acpi-cpufreq
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.27-1-generic/kernel/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko): No such device
clickwir@lappy:~$ uname -a
Linux lappy 2.6.27-1-generic #1 SMP Sat Aug 23 23:19:01 UTC 2008 x86_64 GNU/Linux

Didn't work for me. Note, I just added a few intrepid repos and then updated only what was related to 2.6.27 and its supporting modules.

My Intel wireless Pro/2200BG works fine, automatically picked up and connected to the network. Seems to be just fine, just no cpu scaling going on. This is an Acer Aspire 3004 laptop.

Colin Muller (colin-durbanet) wrote :

2.6.27-1-generic from kernel.ubuntu.com made no difference on my notebook.

I tried both with polling disabled and with polling at 2 seconds, as set in /proc/acpi/thermal_zone/THRM/polling_frequency then ran a test with stress, and the temperature just kept climbing until it hit critical, which ACPI detected, shutting down the machine.

The notebook (currently running Hardy, but the problem has been present since Warty):
http://www.durbanet.co.za/colin/mecer-linux/mecer_n223ii_notebook_ubuntu_linux.html

On machines like this, which don't raise an alert via ACPI at any temperature apart from CRITICAL, but which do have a constantly-updated record accessible via ACPI of what the current temperature is, is it not possible for the kernel to do the following:

a. Poll the current temperature at a user-configurable period, with a reasonable default
b. Turn on throttling or whatever else is required if the temperature goes above CRITICAL minus n, where n is a user-configurable value with a reasonable default.
c. Turn off the protective throttling (or whatever) when the temperature drops below CRITICAL minus n minus m, where m is user-configurable with a reasonable default.
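
Steps a to c amount to a polling loop with hysteresis. A minimal user-space sketch in Python, not a kernel patch: the function names and the defaults n=10, m=5 and a 2 second period are assumptions for illustration, the THRM zone name is specific to this machine, and the loop needs root to write scaling_max_freq.

```python
import time

# Paths as quoted in this thread; the zone name (THRM) differs between machines.
TEMP_PATH = "/proc/acpi/thermal_zone/THRM/temperature"
CPUFREQ_DIR = "/sys/devices/system/cpu/cpu0/cpufreq"


def parse_temperature(text):
    """Parse a line such as 'temperature:             77 C' into 77."""
    return int(text.split(":", 1)[1].split()[0])


def should_throttle(throttled, temp_c, critical_c, n=10, m=5):
    """Hysteresis rule: throttle at CRITICAL - n, release below CRITICAL - n - m."""
    if temp_c >= critical_c - n:
        return True
    if temp_c < critical_c - n - m:
        return False
    return throttled  # inside the band: keep the current state


def read_text(path):
    with open(path) as f:
        return f.read().strip()


def watch(critical_c, n=10, m=5, period_s=2.0):
    """Hypothetical polling loop: cap the CPU at its minimum frequency when hot."""
    low = read_text(CPUFREQ_DIR + "/cpuinfo_min_freq")
    high = read_text(CPUFREQ_DIR + "/cpuinfo_max_freq")
    throttled = False
    while True:
        temp_c = parse_temperature(read_text(TEMP_PATH))
        wanted = should_throttle(throttled, temp_c, critical_c, n, m)
        if wanted != throttled:
            with open(CPUFREQ_DIR + "/scaling_max_freq", "w") as f:
                f.write(low if wanted else high)
            throttled = wanted
        time.sleep(period_s)
```

The gap m between the two thresholds is what keeps the cap from flapping on and off when the temperature hovers around CRITICAL minus n.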

In the past, I've tried without success to achieve the above using powersave (which was not part of my Hardy install, so I haven't tried that again recently). I currently keep the machine permanently throttled by having this line in /etc/rc.local:

/bin/echo -n 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq

Luka Renko (lure) wrote :

Similar solution (in kernel) was discussed in kernel.org Bugzilla, but proposed solution was not accepted for upstream kernel:
http://bugzilla.kernel.org/show_bug.cgi?id=10658

clickwir (clickwir) wrote :

Latest kernel and modules not working for me.

clickwir@lappy:~$ uname -a
Linux lappy 2.6.27-3-generic #1 SMP Wed Sep 10 16:18:52 UTC 2008 x86_64 GNU/Linux
clickwir@lappy:~$ sudo modprobe powernow-k8
FATAL: Error inserting powernow_k8 (/lib/modules/2.6.27-3-generic/kernel/arch/x86/kernel/cpu/cpufreq/powernow-k8.ko): No such device
clickwir@lappy:~$ sudo modprobe acpi-cpufreq
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.27-3-generic/kernel/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko): No such device

Anything else I should try?

clickwir (clickwir) wrote :

clickwir@lappy:~$ uname -a
Linux lappy 2.6.27-6-generic #1 SMP Tue Oct 7 04:15:23 UTC 2008 x86_64 GNU/Linux
clickwir@lappy:~$ sudo modprobe powernow-k8
FATAL: Error inserting powernow_k8 (/lib/modules/2.6.27-6-generic/kernel/arch/x86/kernel/cpu/cpufreq/powernow-k8.ko): No such device
clickwir@lappy:~$ sudo modprobe acpi-cpufreq
FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.27-6-generic/kernel/arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.ko): No such device

Nada, no cpu frequency scaling. :-(

Sergio Zanchetta (primes2h) wrote :

The 18 month support period for Feisty Fawn 7.04 has reached its end of life. As a result, we are closing the linux-source-2.6.20 Feisty Fawn kernel task. However, please note that this report will remain open against the actively developed kernel. Thank you for your continued support and help as we debug this issue.

Changed in linux-source-2.6.20:
status: Confirmed → Invalid
zeddock (zeddock) wrote :

I am now on 2.6.27 and problem still persists under 8.10.
Please continue this bug?

Thanx!

zeddock

mplexus (mike-plexousakis) wrote :

Hello everyone!

My laptop is an Acer 1524 WLMi with an AMD64 3400+ (2.2 GHz). It definitely suffers from shutting down due to overheating. I am using Ubuntu 8.10 64bit with the default kernel 2.6.27-9-generic (though, as many in this thread state, the problem has existed since earlier versions).

I clean the dust from the heatsink often enough and when I do it takes more time until the laptop shuts down from overheating - yet it still does shut down eventually.

As I understand it has something to do with thermal trip points.
I didn't compile my own kernel (as to fix DSDT errors etc).
I use the on-demand governor (as default).
These are the power-specific modules that are typically loaded:

# lsmod | grep power
powernow_k8 23684 0
cpufreq_powersave 10368 0
freq_table 13568 3 powernow_k8,cpufreq_stats,cpufreq_ondemand
processor 47800 2 thermal,powernow_k8
#

my /proc/acpi/thermal_zone/THRC has the following contents:

# cat cooling_mode
0 - Active; 1 - Passive
# cat trip_points
critical (S5): 97 C
passive: 90 C: tc1=2 tc2=5 tsp=300 devices=CPU0
# cat polling_frequency
<polling disabled>
#

On-demand governor works fine - the cpu frequency varies from 800-1200-2000-2200 MHz up and down according to cpu load. The problem is that when reaching 90 degrees of cpu temperature the high-scale frequency should scale down a bit to let cpu cool off and then rise back up to cope with heavy load. This scaling down does not happen and critical temperature is reached shutting down the system.
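
To make the expected window concrete, here is a small Python sketch that reads a trip_points listing like the one above and reports the gap between the passive and critical points. The function names are made up for illustration, and the regular expression assumes the "name: NN C" layout shown in this thread.

```python
import re

# Matches lines such as "critical (S5): 97 C" or "passive: 90 C: tc1=2 ...".
_TRIP_RE = re.compile(r"^\s*(\w+)[^:\n]*:\s*(\d+)\s*C", re.MULTILINE)


def parse_trip_points(text):
    """Return {trip name: temperature in C}; repeated names keep the last value."""
    return {name: int(value) for name, value in _TRIP_RE.findall(text)}


def passive_headroom(trips):
    """Degrees between the passive and critical points, or None if one is missing."""
    if "critical" in trips and "passive" in trips:
        return trips["critical"] - trips["passive"]
    return None
```

For the values above this gives a 7 C window (90 C to 97 C) in which passive cooling is supposed to bring the frequency down before the critical shutdown fires.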

After reading Thomas Renninger's post,

[Thomas Renninger wrote on 2007-04-25: ...The tsp value (time in 1/10s how often temp should be polled when passive cooling is on) can be overridden by passing as thermal module parameter. ...]

I started to think maybe I should change thermal's module parameters.

This is the output of modinfo thermal in my system:

filename: /lib/modules/2.6.27-9-generic/kernel/drivers/acpi/thermal.ko
license: GPL
description: ACPI Thermal Zone Driver
author: Paul Diefenbaugh
srcversion: 1787CE9FEB053C917D031A9
alias: acpi*:LNXTHERM:*
depends: processor
vermagic: 2.6.27-9-generic SMP mod_unload modversions
parm: act:Disable or override all lowest active trip points. (int)
parm: crt:Disable or lower all critical trip points. (int)
parm: tzp:Thermal zone polling frequency, in 1/10 seconds. (int)
parm: nocrt:Set to take no action upon ACPI thermal zone critical trips points. (int)
parm: off:Set to disable ACPI thermal support. (int)
parm: psv:Disable or override all passive trip points. (int)

I edited my /etc/modprobe.d/options file and added

options thermal tzp=30 act=0 crt=0 psv=0

The option tzp=30 means 3 seconds of polling, and as for the other zeros, I thought they would let me override the thermal trip points (I have already tried overriding them manually with sudo -i ... echo etc., only to conclude that newer kernels don't allow it [ubuntu forums]).

This was a hasty action just to see what happens. It turns out this has some good effects on the thermal behaviour of my sy...


dr.spock (dr.spock) wrote :

I'm using 8.10 32 bit edition, but my CPU is a 2.0 Dothan, so power management is done by acpi-cpufreq instead of powernow_k8.

$ lsmod | grep cpufreq
acpi_cpufreq 15500 0
cpufreq_userspace 11396 0
cpufreq_stats 13188 0
cpufreq_powersave 9856 1
cpufreq_ondemand 14988 0
freq_table 12672 3 acpi_cpufreq,cpufreq_stats,cpufreq_ondemand
cpufreq_conservative 14600 0
processor 42156 3 acpi_cpufreq,thermal

My /proc/acpi/thermal_zone/ has a THRM directory, not THRC, and it contains:

$ cat cooling_mode
<setting not supported>

$ cat trip_points
critical (S5): 100 C

$ cat polling_frequency
<polling disabled>

Adding "options thermal tzp=30 act=0 crt=0 psv=0" to /etc/modprobe.d/options does not change this value after a reboot; it keeps showing "<polling disabled>". Once, though, I managed to unload the thermal module and reload it with these parameters, and it showed "3 seconds". Anyway, it does not work, and the machine shuts down when it reaches the critical trip point.

Now I'm trying to repeat test but I can't unload thermal, because it says it's in use.

Any hint?

My kernel is 2.6.27-20-generic, but thermal module shows the same info:

spock@vulcan:/proc/acpi/thermal_zone/THRM$ modinfo thermal
filename: /lib/modules/2.6.27-10-generic/kernel/drivers/acpi/thermal.ko
license: GPL
description: ACPI Thermal Zone Driver
author: Paul Diefenbaugh
srcversion: 1787CE9FEB053C917D031A9
alias: acpi*:LNXTHERM:*
depends: processor
vermagic: 2.6.27-10-generic SMP mod_unload modversions 586
parm: act:Disable or override all lowest active trip points. (int)
parm: crt:Disable or lower all critical trip points. (int)
parm: tzp:Thermal zone polling frequency, in 1/10 seconds. (int)
parm: nocrt:Set to take no action upon ACPI thermal zone critical trips points. (int)
parm: off:Set to disable ACPI thermal support. (int)
parm: psv:Disable or override all passive trip points. (int)

Cheers.

mplexus (mike-plexousakis) wrote :

Well, it turns out that after a reboot my good thermal state was all gone and i was back to shutting down due to overheating - and my polling frequency was still <polling disabled>.

I removed the thermal module and modprobed it back again and things worked all right again, now my polling freq is back to 3 seconds.

So, adding the lines below into /etc/rc.local did it for me:

rmmod thermal
modprobe thermal

You probably cannot unload thermal because in your terminal you "are" inside /proc/acpi/thermal_zone directory. Get out and try rmmod and modprobe.
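If rmmod still says the module is in use, lsmod can help narrow it down: its third column is the use count and the fourth lists the modules that depend on it. Here is a rough Python sketch of reading one lsmod row. The function name parse_lsmod_line is made up for illustration, and the sample row is copied from the lsmod output earlier in this thread. It only reads text and does not tell you which process is holding a reference.

```python
def parse_lsmod_line(line):
    """Split one lsmod row into (name, size, use_count, users)."""
    fields = line.split()
    name, size, use_count = fields[0], int(fields[1]), int(fields[2])
    # The fourth column is optional: a comma-separated list of modules
    # that depend on this one.
    users = [u for u in fields[3].split(",") if u] if len(fields) > 3 else []
    return name, size, use_count, users


# Row taken from the lsmod output earlier in this thread.
name, size, use_count, users = parse_lsmod_line(
    "processor 42156 3 acpi_cpufreq,thermal")
print(name, use_count, users)  # processor 3 ['acpi_cpufreq', 'thermal']
```

A use count above zero with an empty user list means something other than a dependent module is holding it.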

Note: using this trick feels the right thing to do for my laptop. It just feels thermal behaviour is in its best state ever: for the first time i can see my cpu use all 4 scales (800, 1200, 2000 and 2200) of frequencies (for several minutes not "instantly") where before it only scaled from lowest straight to highest (and stayed there on high load). Cool ! :-)

dr.spock (dr.spock) wrote :

Thanks mplexus, I certainly hadn't noticed I was inside a directory created by the module.

Now I have modified rc.local and the polling frequency is set to 3 secs. I have tested it again with the command 'yes | sha1sum' while monitoring CPU temp and freq: the freq rises immediately to 2GHz (max), and the machine shuts down when it reaches 100º (curiously, a message is shown on the console that says it will shut down because it has reached 72º).

Well, in my case polling is working fine, but I think it doesn't help because my trip_points file only contains 'critical (S5): 100 C', so the only possible event is shutting down the computer.
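The reasoning above can be sketched in a few lines of Python: read the trip_points text and check whether anything other than a critical entry is defined. The names parse_trip_points and has_only_critical are made up for illustration. Only the critical line comes from this thread; the sketch assumes any other entries would follow the same "name: N C" shape.

```python
import re

# Matches lines like "critical (S5): 100 C"; other trip kinds are
# assumed (not confirmed here) to use the same "name: N C" layout.
TRIP_RE = re.compile(r"^\s*([a-z]+)[^:]*:\s*(\d+)\s*C")


def parse_trip_points(text):
    """Return {trip_kind: temperature_in_C} for each line that matches."""
    trips = {}
    for line in text.splitlines():
        match = TRIP_RE.match(line)
        if match:
            trips[match.group(1)] = int(match.group(2))
    return trips


def has_only_critical(trips):
    # With no other entry, the critical shutdown point is the only
    # event the thermal zone can trigger.
    return set(trips) == {"critical"}


trips = parse_trip_points("critical (S5): 100 C\n")
print(trips, has_only_critical(trips))  # {'critical': 100} True
```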

mplexus (mike-plexousakis) wrote :

Actually, I don't know if 1 second polling is better..

Anyway, I am still testing this thermal module and what possibilities it gives me. Experiment is all i can do :-)

Another note: for the first time power management works as supposed to in regards to battery reaching critical : now, it does what i told it to do, shut down the system cleanly. Previously, it ignored me and waited until battery went totally empty and then instantly hard-power-off. So, this new behavior is a good thing.

Truly changing the trip points should be the goal. Newer kernels don't allow this i suppose. Thermal module's options say something about it.. We'll see.

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Changed in acpi:
status: Unknown → Confirmed
Antonio Salazar (asalazarmx) wrote :

Just to add another laptop model to the list: A Sony Vaio GN-FZ250FE.

This laptop came with Windows Vista, which runs the CPU significantly cooler than Ubuntu, the latter needing to use the cooling fan all the time because its temperature rarely drops below 55 Celsius. The NVidia GPU runs about 5 degrees hotter than the CPU, even when using Metacity instead of Compiz Fusion.

I've seen this behavior in a brand-new Toshiba laptop too, but I don't have it near to check the model, and reading all the comments I realize many people have this problem too. When asked, it's kinda embarrassing to admit that Windows Vista is greener than Ubuntu 8.10 on many laptops.

Changed in acpi:
status: Confirmed → Fix Released
Changed in acpi:
status: Fix Released → Confirmed
Changed in acpi:
status: Confirmed → Fix Released
Leho Kraav (lkraav) wrote :

hi everyone

i just ran into this overheating+shutdown issue with an amilo a1650 laptop, mobile sempron 3100+ (800, 1600,1800 MHz steps), gentoo 2.6.24 and 2.6.28-r10. in my case this was clearly a problem related to horrific thermal paste situation. afaik nobody has touched the cpu+hsf since the machine came from factory some years ago, but the cpu idle temperature was 56 C @ 800MHz. using performance or ondemand governor the cpu temperature would almost immediately skyrocket to 100+ C and the machine would almost immediately power off.

before, this laptop had run windows xp for almost 2 years, where it also exhibited some occasional freezes and shutdowns, although it would not do it so abruptly, and freezes also seemed to be related to which Mobility Radeon 200M IXP video driver i would use (the newer, the worse!). but overall it would be in a working condition, although you could visibly see (ProcExp) it would throttle to 800 MHz way too often and bog down any task which hogged the cpu for any period of time.

yesterday i opened the heatsink up and saw the thermal paste was in a terrible situation, as in it looked like nothing you would expect from a decent thermal paste application - hardened into pieces, scattered around the core, with random blobs stuck on the heatsink. applied some arctic cooling mx-1 and voila - it looks to have worked. cpu idle temperature was 36 C after booting up, stabilizing at around 45 C @ 800 MHz after staying on for a while. i'm using ondemand governor, so 800 MHz is the usual working speed. in performance situations, the temperature does ratchet up double and more, but thanks to decent thermal paste, the fan has more time to kick in at higher speed and now the maximum temperature during 'make bzImage' was 98 C for a second. the cpu does get automatically throttled down to 800 MHz at high temperatures with performance governor, then switched up again at around 65-70C.
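the throttle-down / switch-up behaviour described here looks like simple hysteresis. a toy python sketch of the idea follows; it is not what the firmware or kernel actually runs, which this comment does not identify. the function name next_state, the throttle_at value and the temperature readings are all invented for illustration; only resume_at is picked from inside the 65-70 C range mentioned above.

```python
def next_state(throttled, temp_c, throttle_at=95.0, resume_at=68.0):
    """Return True if the CPU should be throttled after this reading.

    throttle_at is a made-up value; resume_at sits inside the 65-70 C
    range reported above.
    """
    if not throttled and temp_c >= throttle_at:
        return True       # too hot: drop to the lowest frequency
    if throttled and temp_c <= resume_at:
        return False      # cooled down enough: allow full speed again
    return throttled      # in between: keep the current state


state = False
history = []
for temp in [60, 90, 96, 80, 70, 67, 75]:   # invented readings
    state = next_state(state, temp)
    history.append(state)
print(history)  # [False, False, True, True, True, False, False]
```

the gap between the two thresholds is what keeps the cpu from flapping between speeds on every reading.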

i also monitored and graphed it (attached).

as you can see, these crappy low end laptops are just not very good performance or gaming machines. the cpu burns up easily and hardware has to scale back the MHz, killing any chance at high performance for extended periods. fortunately this one will do nothing but sit idle most of the time with office work, so 800 MHz with occasional bump-up should work fine from now on. but obviously it's a worrying sign that coming out of the factory, it was ill-prepared to do even that. googling "amilo overheat" is a clear indicator of that.

Manoj Iyer (manjo) wrote :

Looks like this is an old bug for which a fix has already been released. Marking as Fix Released. If this is still an issue on Jaunty/Karmic kernels please open a separate bug.

Changed in linux (Ubuntu):
status: Confirmed → Fix Released
Leho Kraav (lkraav) wrote :

as a final note, i've just switched the mobile sempron 3100+ (62W TDP) out for a turion mt-34 (25W TDP) for $15 off ebay. switching on performance governor for full 1800 MHz the turion temperature tops out at around 56C and does not get higher... which is almost the same as usual 800 MHz temperature for sempron! hopefully this ends the overheating issue.

edward stroupe (e-stroupe) wrote :

processor is presently running @ 50% plus - with just system monitor open, recently started doing this when opening yahoo e-mail account. Did this with Windows XP, but not with Ubuntu @ early install and until just a few days ago. When running video clips near HD, off line, fan turns on. Use an under pad with 2-fans to add cooling plus a side desk fan. 2004 Dell model I-5150. The 'Van Halen' of computers?! Ed.
