thermal.ko fails -> no fan on laptop

Bug #250241 reported by gpk
Affects: linux (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

The fan never runs when Linux is running, despite reasonably
high temperatures inside an HP Pavilion G6062EA laptop (G6000 series).
It is running in 64-bit mode.

The problem can be seen in /var/log/syslog:
Jul 19 15:45:08 nglap kernel: [ 16.228237] ACPI Exception (thermal-0339): AE_BAD_DATA, No critical threshold [20070126]

As a result, /proc/acpi/thermal_zone is empty,
which means you can't even control the fans with user-mode software.

Whether or not the ACPI of the laptop is buggy, this is a bug in
thermal.ko. Even if the laptop doesn't set a thermal limit,
the fans still exist and should be controllable. The current behavior
of thermal.ko effectively sets an infinitely high thermal limit
and thus could be indirectly responsible for damage to
hardware. It'd be much better if it had a default thermal limit
built in at some reasonable value (e.g. 45C).

I know the laptop has working sensors, because using "sensors" from
the "lm-sensors" package works and the reported temperatures are
sensible. Oddly, though, it reports each temperature twice
and the temperatures don't exactly agree. (see log, attached).

I attach a dump from acpidump, if that helps.

So:
1) any help in getting thermal.ko to work would be very much appreciated,
2) thermal.ko does not fail gracefully. If no limit is set by the hardware,
    it should set a default, print a warning message and continue.
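
The fallback requested in point 2 could look roughly like the following Python sketch. It is only an illustration of the requested behavior, not how thermal.ko is implemented: the function name, the warning text, and the rule that zero or negative values count as invalid are all assumptions, and 45C is simply the figure suggested above.

```python
from typing import Optional, Tuple

# Hypothetical default: 45C is only the value suggested in this report.
DEFAULT_CRITICAL_C = 45.0


def effective_critical_threshold(
    firmware_c: Optional[float], default_c: float = DEFAULT_CRITICAL_C
) -> Tuple[float, Optional[str]]:
    """Return (threshold in C, warning text or None).

    firmware_c is None when the firmware reports no critical threshold.
    Zero or negative values are treated as invalid too (an assumption).
    """
    if firmware_c is None or firmware_c <= 0:
        warning = (
            "no valid critical threshold from firmware, "
            "using default %.0fC" % default_c
        )
        return default_c, warning
    return firmware_c, None


# Warn and continue instead of giving up on the thermal zone.
threshold, warning = effective_critical_threshold(None)
if warning:
    print("thermal: " + warning)
```

The point of returning the warning alongside the value is that the caller can log it once and carry on with a usable limit.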

gpk (gpk-kochanski) wrote :
Yann Sionneau (yann-sionneau) wrote :

Thank you for taking the time to report this bug and helping to make Ubuntu better. Unfortunately we can't fix it, because your description does not yet have enough information.

Please include the following additional information, if you have not already done so (pay attention to lspci's additional options), as required by the Ubuntu Kernel Team:
1. Please include the output of the command "uname -a" in your next response. It should be one, long line of text which includes the exact kernel version you're running, as well as the CPU architecture.
2. Please run the command "dmesg > dmesg.log" after a fresh boot and attach the resulting file "dmesg.log" to this bug report.
3. Please run the command "sudo lspci -vvnn > lspci-vvnn.log" and attach the resulting file "lspci-vvnn.log" to this bug report.

For your reference, the full description of procedures for kernel-related bug reports is available at https://wiki.ubuntu.com/KernelTeamBugPolicies . Thanks in advance!

P.S. You may want to add some information about ACPI too. This page explains what you can provide: https://wiki.ubuntu.com/DebuggingACPI

Thank you!

Changed in linux:
status: New → Incomplete
gpk (gpk-kochanski) wrote :

Thanks for the help. Dmesg.log is attached.

gpk (gpk-kochanski) wrote :

lspci-vvnn.log is attached.

gpk (gpk-kochanski) wrote :

uname-a.log is attached

gpk (gpk-kochanski) wrote :

version.log is attached

gpk (gpk-kochanski) wrote :

dmidecode >dmidecode.log
attached.

gpk (gpk-kochanski) wrote :

cp -r /proc/acpi /tmp;
cd /tmp
tar -cvjf acpi.tar.bz acpi
attached.

gpk (gpk-kochanski) wrote :

Oh yes. I should say that I'm running Xubuntu, nearly out of the box,
except that I had to add the "noapic" boot parameter in order to get
the USB to work.

gpk (gpk-kochanski)
Changed in linux:
status: Incomplete → New
Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

gpk (gpk-kochanski) wrote :

I gave 2.6.27-2 a try. It was not a neat process, as the laptop depends on Nvidia drivers and Madwifi,
so I was network-less and graphics-less.

I did not hear the fan turn on. However, there were various things relating to thermal
devices in /sys/* which were not there before. I tried playing with some of them, i.e.
typing
echo 7 >status
and similar, and they had no obvious effect. However, I'm not sure if some modules
were missing -- I had to hit <esc> dozens of times during the boot sequence to
un-stick the process.

In /var/log/syslog, there was still an error message about "ACPI Exception (thermal-0339): AE_BAD_DATA, No critical threshold". However, there seem to be rather more ACPI lines in the log file than before.

So, no strong conclusion, I think. Sorry for the lack of detail, but I haven't got the wireless
working, so I can only move data by human memory right now.

I added a line to /etc/apt/sources.list to bring in the ibex archives,
then did
aptitude install linux-kernel-2.6.27-2-generic.
Aptitude listed over 100 things that it was holding back, but it installed the new kernel.
Then I rebooted.

I can try something again, if you like.

Leann Ogasawara (leannogasawara) wrote :

Hi gpk,

Maybe give the LiveCD a try now that Alpha5 is out. I hope it will resolve the network issues so you can at least grab some log files, such as updated dmesg output, to attach here. Thanks.

gpk (gpk-kochanski) wrote :

I've installed 2.6.27-2 on the alpha-5 CDROM.

No fan. Running one processor full throttle
raises the temperature to 55C in a minute or two, and no fan
turns on.

I attach the new syslog. Note that it was booted with
noacpi, and hangs 13 times during the boot process,
after t=3.825618, 81.826138, 105.798974, 116.424470,
126.118154, 139.364775 (but it seems to have woken up
by itself from this last one), 174.156999, 220.06960,
222.519582, 296.707953, 307.506641, and after
"Begin: Running scripts/init-bottom
Done."

gpk (gpk-kochanski) wrote :

Booting again with the noapic and noapictimer flags
gives much the same result, except that it doesn't hang and I don't
have to keep hitting keys to make the boot continue.

However, my last post was somewhat wrong.

The fan *is* running, but at a constant low speed. You can hear it if you
put your ear up to the air vents, and feel a breath of warm air. This may
be new behavior, but it's far from full speed -- you can hear it rev up in the
first few seconds of the boot process.

Most importantly, it's not being thermostatically controlled.
CPU temperature will get up to 55C, and the fan gets no noisier even when the CPU
is quite hot.

There are no files named *thermal* in /proc, except the empty /proc/acpi/thermal_zone
directory:
$ cd /proc;
$ find . -name '*thermal*' -print
./acpi/thermal_zone
$

There are some thermal devices in /sys, and I attach the list as sys_thermal.txt

Also, note that the error in /var/log/syslog is still there:
Sep 11 20:21:40 nglap kernel: [ 2.379917] ACPI Exception (thermal-0377): AE_OK, No or invalid critical threshold [20080609]
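
To compare the old and new forms of this message across log files, the lines can be pulled apart with a small Python sketch like the one below. The field names are my own labels, and the reading of the number after "thermal-" as a source line and of the bracketed number as the ACPI code version date is an assumption based on how the messages look, not something confirmed in this report.

```python
import re

# Matches lines such as:
#   ... ACPI Exception (thermal-0339): AE_BAD_DATA, No critical threshold [20070126]
# Group names are illustrative labels, not official field names.
ACPI_EXCEPTION = re.compile(
    r"ACPI Exception \((?P<module>\w+)-(?P<line>\d+)\): "
    r"(?P<status>AE_\w+), (?P<message>.*?) \[(?P<version>\d{8})\]"
)


def parse_acpi_exception(log_line):
    """Return a dict of fields, or None if the line is not an ACPI exception."""
    match = ACPI_EXCEPTION.search(log_line)
    return match.groupdict() if match else None


samples = [
    "Jul 19 15:45:08 nglap kernel: [ 16.228237] ACPI Exception (thermal-0339): "
    "AE_BAD_DATA, No critical threshold [20070126]",
    "Sep 11 20:21:40 nglap kernel: [ 2.379917] ACPI Exception (thermal-0377): "
    "AE_OK, No or invalid critical threshold [20080609]",
]
for sample in samples:
    print(parse_acpi_exception(sample))
```

Run against the two lines quoted in this report, this shows that both the status code (AE_BAD_DATA versus AE_OK) and the bracketed version differ between the two kernels.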

gpk (gpk-kochanski) wrote :

Trying this just a few minutes ago, after
sudo aptitude update
sudo aptitude full-upgrade
on a vanilla intrepid alpha-5 install still gives the same result.
This is now using a 2.6.27-3 kernel.

It does thermostat now, though the temperature is rather hot.
At about 50C, the fan comes on (as above) very quietly, and
at about 60C, the fan comes on faster. The fan now keeps the processor from
getting hotter than 62C, even when both cores are running at 100%.

So, good job! It's now safe from melting.
But, is there some way to set the temperature a little lower?
A 60C laptop is really hard on the legs.

Perhaps an argument could be added to thermal.ko?

FYI, I attach /var/log/syslog.

Anish Bhatt (anish7) wrote :

The thermal module does not work for me. modprobe does not fail, but I get "ACPI Exception (thermal-0339): AE_BAD_DATA, No critical threshold [20070126]" in dmesg every time the module is loaded, and /proc/thermal_zone is empty.

gpk (gpk-kochanski) wrote :

Indeed, /proc/acpi/thermal_zone is empty for me, too. (And, I've been getting that error all along.) Odd that the fan turns on at 60C, then. I suppose it's some sort of hardware thermal limit?

gpk (gpk-kochanski) wrote :

I note that there is a DSDT.aml file for this laptop in bug 254688, along with a comment that there are a couple of errors in the laptop's original DSDT file. (DSDT.aml is all part of the ACPI stuff.)
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/254688 .

Incidentally, by 2.6.27-7 kernel, the error message has mutated to this:
Oct 19 11:16:12 nglap kernel: [ 2.370165] ACPI Exception (thermal-0377): AE_OK, No or invalid critical threshold [20080609]

Using the repaired DSDT.aml was not the cause of the changed error message. Presumably, something was modified in the kernel.

FYI, these are the thermal devices I see with a 2.6.27-7 kernel, Intrepid Beta release as of today.
Anyone know what I should expect to see?

$ cd /proc
gpk@nglap:/proc$ find . -name '*thermal*' 2>/dev/null
./acpi/thermal_zone
gpk@nglap:/proc$ cd ../sys
gpk@nglap:/sys$ find . -name '*thermal*' 2>/dev/null
./devices/virtual/thermal
./devices/LNXSYSTM:00/ACPI0007:00/thermal_cooling
./devices/LNXSYSTM:00/ACPI0007:01/thermal_cooling
./devices/LNXSYSTM:00/device:00/PNP0A08:00/device:23/device:25/thermal_cooling
./bus/acpi/drivers/thermal
./class/thermal
./module/processor/holders/thermal
./module/thermal
gpk@nglap:/sys$

It's still true that the fan is on (only the merest whisper), but real cooling only starts
when the processor temperature exceeds 60C.
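
On kernels that expose the generic thermal class in sysfs, working zones normally show up as /sys/class/thermal/thermal_zone* directories, each with type, temp and trip_point_N_temp / trip_point_N_type files. The Python sketch below lists whatever is there. It assumes that layout and that temperatures are in millidegrees Celsius; the function name and the returned dict keys are made up for illustration. Whether this laptop would show any zone at all is exactly the open question here, so an empty result would match the empty /proc/acpi/thermal_zone.

```python
from pathlib import Path


def read_thermal_zones(root="/sys/class/thermal"):
    """Return one dict per thermal zone found under root (may be empty)."""
    zones = []
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            kind = (zone / "type").read_text().strip()
            # Assumed unit: millidegrees Celsius, converted to degrees here.
            temp_c = int((zone / "temp").read_text()) / 1000.0
        except (OSError, ValueError):
            continue  # zone is unreadable or malformed; skip it
        trips = []
        for temp_file in sorted(zone.glob("trip_point_*_temp")):
            type_file = temp_file.with_name(
                temp_file.name.replace("_temp", "_type")
            )
            try:
                trips.append(
                    (type_file.read_text().strip(),
                     int(temp_file.read_text()) / 1000.0)
                )
            except (OSError, ValueError):
                continue  # skip trip points that cannot be read
        zones.append(
            {"name": zone.name, "type": kind, "temp_c": temp_c, "trips": trips}
        )
    return zones


for z in read_thermal_zones():
    print(z["name"], z["type"], z["temp_c"], z["trips"])
```

Taking the root directory as a parameter keeps the function usable on a copy of /sys saved from another machine.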

kernel-janitor (kernel-janitor) wrote :

Hi gpk,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux-image-`uname -r` 250241

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
tags: added: kj-triage
Changed in linux (Ubuntu):
status: New → Incomplete
gpk (gpk-kochanski) wrote : Re: [Bug 250241] Re: thermal.ko fails -> no fan on laptop

In 8.10 it was semi-fixed. The laptop fan turns on at
60C and the temperature never exceeds 62C. This is uncomfortably
toasty, but it does keep the processor from melting.

I think the 9.04 situation is similar.

That laptop is currently out of the country, toasting someone else's
lap. I ought to have access to it again in September.

Changed in linux (Ubuntu):
status: Incomplete → New
status: New → Incomplete
gpk (gpk-kochanski) wrote : apport-collect data

Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: NVidia [HDA NVidia], device 0: CONEXANT Analog [CONEXANT Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: gpk 1761 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'NVidia'/'HDA NVidia at 0xf6380000 irq 21'
   Mixer name : 'Conexant CX20561 (Hermosa)'
   Components : 'HDA:14f15051,103c30ea,00100000'
   Controls : 14
   Simple ctrls : 7
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=3679f683-8870-4725-abce-b4192c97f88c
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
Lsusb:
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 002: ID 04f2:b055 Chicony Electronics Co., Ltd
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
MachineType: Hewlett-Packard HP G6000 Notebook PC
NonfreeKernelModules: nvidia
Package: linux-image-2.6.31-14-generic 2.6.31-14.48
PackageArchitecture: amd64
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.31-14-generic root=UUID=f71a9a78-3cc1-4b79-9d43-11c258f9ae86 ro quiet splash
ProcEnviron:
 SHELL=/bin/bash
 PATH=(custom, user)
 LANG=en_GB.UTF-8
ProcVersionSignature: Ubuntu 2.6.31-14.48-generic
RelatedPackageVersions:
 linux-backports-modules-2.6.31-14-generic N/A
 linux-firmware 1.24
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
Uname: Linux 2.6.31-14-generic x86_64
UserGroups: adm admin cdrom dialout lpadmin plugdev sambashare
XsessionErrors:
 (gnome-settings-daemon:1777): GLib-CRITICAL **: g_propagate_error: assertion `src != NULL' failed
 (polkit-gnome-authentication-agent-1:1880): GLib-CRITICAL **: g_once_init_leave: assertion `initialization_value != 0' failed
 (nautilus:1872): Eel-CRITICAL **: eel_preferences_get_boolean: assertion `preferences_is_initialized ()' failed
 (gnome-panel:1871): Gtk-WARNING **: gtk_widget_size_allocate(): attempt to allocate widget with width -5 and height 24
dmi.bios.date: 12/07/2007
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.05
dmi.board.name: 30EA
dmi.board.vendor: Quanta
dmi.board.version: 86.09
dmi.chassis.type: 10
dmi.chassis.vendor: Quanta
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.05:bd12/07/2007:svnHewlett-Packard:pnHPG6000NotebookPC:pvrRev1:rvnQuanta:rn30EA:rvr86.09:cvnQuanta:ct10:cvrN/A:
dmi.product.name: HP G6000 Notebook PC
dmi.product.version: Rev 1
dmi.sys.vendor: Hewlett-Packard

gpk (gpk-kochanski) wrote : AlsaDevices.txt
gpk (gpk-kochanski) wrote : AplayDevices.txt
gpk (gpk-kochanski) wrote : BootDmesg.txt
gpk (gpk-kochanski) wrote : Card0.Amixer.values.txt
gpk (gpk-kochanski) wrote : Card0.Codecs.codec.0.txt
gpk (gpk-kochanski) wrote : CurrentDmesg.txt
gpk (gpk-kochanski) wrote : Dependencies.txt
gpk (gpk-kochanski) wrote : IwConfig.txt
gpk (gpk-kochanski) wrote : Lspci.txt
gpk (gpk-kochanski) wrote : PciMultimedia.txt
gpk (gpk-kochanski) wrote : ProcCpuinfo.txt
gpk (gpk-kochanski) wrote : ProcInterrupts.txt
gpk (gpk-kochanski) wrote : ProcModules.txt
gpk (gpk-kochanski) wrote : UdevDb.txt
gpk (gpk-kochanski) wrote : UdevLog.txt
gpk (gpk-kochanski) wrote : WifiSyslog.txt
Changed in linux (Ubuntu):
status: Incomplete → New
tags: added: apport-collected
gpk (gpk-kochanski) wrote :

Running under a newly installed Karmic Koala (updated as of today), you can feel a faint airflow at 39C core temperature. Turn on one core 100%, and the fan doesn't change, even though the hot core gets up to about 58C. When you turn on the second core, and the hotter core gets to 60C or 61C, then the fan goes into high speed mode, and the temperature stabilizes at 62C or 63C.

Turn the load off, and the fan immediately returns to the faint airflow that we had at the beginning.

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
penalvch (penalvch) wrote :

gpk, thank you for reporting this and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? As per https://wiki.ubuntu.com/Kernel/PowerManagementASPM a patch was issued to address thermal issues. Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

tags: added: hardy intrepid karmic
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-therm
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired