Comment 264 for bug 22336

mplexus (mike-plexousakis) wrote :

Hello everyone!

My laptop is an Acer 1524 WLMi with an AMD64 3400+ (2.2 GHz). It definitely suffers from shutting down due to overheating. I am using Ubuntu 8.10 64-bit with the default kernel 2.6.27-9-generic (though, as many in this thread state, the problem also exists in earlier versions).

I clean the dust from the heatsink often enough, and when I do, it takes longer before the laptop shuts down from overheating, yet it still does shut down eventually.

As I understand it, this has something to do with thermal trip points.
I haven't compiled my own kernel (e.g. to fix DSDT errors).
I use the ondemand governor (the default).
These are the power-related modules that are typically loaded:

# lsmod | grep power
powernow_k8 23684 0
cpufreq_powersave 10368 0
freq_table 13568 3 powernow_k8,cpufreq_stats,cpufreq_ondemand
processor 47800 2 thermal,powernow_k8
#

My /proc/acpi/thermal_zone/THRC has the following contents:

# cat cooling_mode
0 - Active; 1 - Passive
# cat trip_points
critical (S5): 97 C
passive: 90 C: tc1=2 tc2=5 tsp=300 devices=CPU0
# cat polling_frequency
<polling disabled>
#
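
For anyone who wants to compare these values in a script, here is a minimal sketch that pulls the trip temperatures out of a trip_points file in the format shown above. The function name trip_temp is made up for illustration, and the default path assumes the THRC zone from this post; other machines will use a different zone name.

```shell
#!/bin/sh
# trip_temp KIND [FILE]: print the temperature (degrees C) of the first
# trip point whose line starts with KIND ("critical" or "passive").
# trip_temp is a hypothetical helper name; FILE defaults to the THRC zone
# quoted in this post and can be overridden for other zones or for testing.
trip_temp() {
    kind=$1
    file=${2:-/proc/acpi/thermal_zone/THRC/trip_points}
    # Lines look like "critical (S5): 97 C" or "passive: 90 C: tc1=2 ..."
    sed -n "s/^${kind}[^:]*:[[:space:]]*\([0-9][0-9]*\) C.*/\1/p" "$file" | head -n 1
}
```

With the trip points quoted above, trip_temp critical would print 97 and trip_temp passive would print 90, so there are only 7 degrees between the point where passive cooling should start and the forced shutdown.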

The ondemand governor works fine: the CPU frequency moves up and down between 800, 1200, 2000 and 2200 MHz according to CPU load. The problem is that when the CPU temperature reaches 90 degrees, the frequency should scale down for a while to let the CPU cool off and then rise again to cope with heavy load. This scaling down does not happen, so the critical temperature is reached and the system shuts down.
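
To watch whether the frequency really drops when the passive trip point is reached, something like the following sketch could log temperature and frequency side by side. The function names (zone_temp, cpu_mhz, watch_thermal) are made up for illustration. The default paths assume the THRC zone from this post and the usual cpufreq sysfs file for cpu0, which reports kHz; both may differ on other systems.

```shell
#!/bin/sh
# zone_temp [FILE]: print the integer from an ACPI "temperature:   NN C" line.
# The default path assumes the THRC zone discussed in this post.
zone_temp() {
    awk '/^temperature:/ { print $2; exit }' \
        "${1:-/proc/acpi/thermal_zone/THRC/temperature}"
}

# cpu_mhz [FILE]: print the current CPU frequency in MHz (sysfs reports kHz).
cpu_mhz() {
    khz=$(cat "${1:-/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq}")
    echo $(( khz / 1000 ))
}

# watch_thermal [INTERVAL] [COUNT] [TEMP_FILE] [FREQ_FILE]:
# print COUNT samples, one every INTERVAL seconds.
watch_thermal() {
    interval=${1:-3}
    count=${2:-20}
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "$(date +%T) temp=$(zone_temp "$3")C freq=$(cpu_mhz "$4")MHz"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Running watch_thermal 3 40 in a second terminal during a heavy load would give about two minutes of samples. If passive cooling works, the freq value should fall shortly after temp reaches the passive trip point.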

After reading Thomas Renninger's post,

[Thomas Renninger wrote on 2007-04-25: ...The tsp value (time in 1/10s how often temp should be polled when passive cooling is on) can be overridden by passing as thermal module parameter. ...]

I started to think that maybe I should change the thermal module's parameters.

This is the output of modinfo thermal on my system:

filename: /lib/modules/2.6.27-9-generic/kernel/drivers/acpi/thermal.ko
license: GPL
description: ACPI Thermal Zone Driver
author: Paul Diefenbaugh
srcversion: 1787CE9FEB053C917D031A9
alias: acpi*:LNXTHERM:*
depends: processor
vermagic: 2.6.27-9-generic SMP mod_unload modversions
parm: act:Disable or override all lowest active trip points. (int)
parm: crt:Disable or lower all critical trip points. (int)
parm: tzp:Thermal zone polling frequency, in 1/10 seconds. (int)
parm: nocrt:Set to take no action upon ACPI thermal zone critical trips points. (int)
parm: off:Set to disable ACPI thermal support. (int)
parm: psv:Disable or override all passive trip points. (int)

I edited my /etc/modprobe.d/options file and added:

options thermal tzp=30 act=0 crt=0 psv=0

The option tzp=30 means polling every 3 seconds. As for the zeros, I thought they would let me override the thermal trip points. (I have already tried overriding them manually with sudo -i, echo and so on, only to conclude, per the Ubuntu forums, that newer kernels don't allow it.)
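
Since tzp is given in tenths of a second, the conversion is easy to get wrong. A small sketch of the arithmetic follows, plus a check of what the loaded module actually picked up. The helper names tzp_seconds and module_param are made up for illustration, and reading /sys/module/thermal/parameters/tzp assumes the kernel exposes that parameter in sysfs, which may not hold for every parameter or kernel version.

```shell
#!/bin/sh
# tzp_seconds TENTHS: convert a value in 1/10 seconds to seconds.
# (hypothetical helper name, integer arithmetic only)
tzp_seconds() {
    echo "$(( $1 / 10 )).$(( $1 % 10 ))"
}

# module_param MODULE PARAM [ROOT]: print the current value of a module
# parameter if sysfs exposes it, otherwise print "unavailable".
# ROOT defaults to /sys/module and is overridable for testing.
module_param() {
    f="${3:-/sys/module}/$1/parameters/$2"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "unavailable"
    fi
}
```

Here tzp_seconds 30 prints 3.0. If the tsp value is also in tenths of a second, as the quoted post says, then the tsp=300 in the trip_points output above corresponds to tzp_seconds 300, which prints 30.0. After a reboot or module reload, module_param thermal tzp shows whether the option was applied.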

This was a hasty action just to see what happens. It turns out to have some good effects on the thermal behaviour of my system. I created a CPU-consuming task ( yes | sha1sum ) and a few seconds (it must be 3) after the THRC (CPU) temperature reached 90 degrees, the CPU scaled down to its lowest frequency (800 MHz) for a few seconds, giving it a chance to cool down enough not to reach the critical point. After this brief break the CPU scaled up again to maximum frequency (2.2 GHz), and I waited to see what would happen when the temperature reached 90 again. The same thing happened: for a short period it scaled back to 800 MHz and then up again to 2.2 GHz, but it didn't shut down. I got bored and terminated the task, happy as ever that my laptop stayed alive!
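
For anyone who wants to repeat the test without having to kill the task by hand, here is a minimal sketch that runs the same yes | sha1sum load for a fixed number of seconds and then stops it. The function name burn_cpu is made up for illustration. On a machine that still overheats, this can of course trigger the critical shutdown, so start with a short duration.

```shell
#!/bin/sh
# burn_cpu [SECONDS]: run the "yes | sha1sum" load for SECONDS (default 60),
# then stop it. burn_cpu is a hypothetical helper name.
burn_cpu() {
    secs=${1:-60}
    # $! is the PID of sha1sum, the last command of the background pipeline.
    # Once sha1sum is killed, yes exits on its next write to the broken pipe.
    yes | sha1sum >/dev/null 2>&1 &
    pid=$!
    sleep "$secs"
    kill "$pid" 2>/dev/null || true
    wait "$pid" 2>/dev/null || true
    echo "load stopped after ${secs}s"
}
```

Calling burn_cpu 120 keeps one core busy for two minutes, which should be long enough to see whether the CPU drops to its lowest frequency around the passive trip point.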

I am going to keep checking this over the coming days, but I hope this is the end of my problems!

The polling frequency in /proc/acpi/thermal_zone/THRC is now set to 3 seconds (before, it was disabled).
All other values stayed the same (e.g. CPU throttling still shows as not supported...).
In the past I tried simply echoing 3 into /proc/acpi/thermal_zone/THRC/polling_frequency, but the results were not as good as they are now.
As I said, I am probably drawing a conclusion too quickly. We will see.

Anyway, I'm new to this "forum" and grateful I found this thread! Thanks to everyone who shared their opinions, suggestions and experience!