CPU fan not activating on HP NX6125

Bug #24308 reported by Alessandro
Affects                        Status         Importance   Assigned to   Milestone
linux-source-2.6.15 (Debian)   Fix Released   High         -             -
linux-source-2.6.15 (Ubuntu)   Invalid        Medium       Ben Collins   -

Bug Description

Good morning,

Before reporting this bug I described my problem on the Italian mailing list and on the
devel mailing list, but I received no answer.

MY LAPTOP:

HP NX6125 - AMD Turion 64-bit 1.8 GHz - 80 GB HD - 512 MB RAM - ATI Mobility Radeon
X300 video - OS: Breezy Colony 4, updated.

MY PROBLEM:

To work with my laptop I have to add the following option to the korpline:

no_timer_check

Without it I can't work with my PC; it runs too slowly. Now I have the following
problem:

Sometimes (very often) the CPU fan doesn't start cooling when the temperature
goes above the safety limit. The result is that my PC shuts itself down.

Two weeks ago I noticed some peculiar behaviour on my laptop. During a
work session I ran the command acpi -t and the CPU
fan started. At first I thought it was a coincidence. After
an hour I tried acpi -t again, and the CPU fan again started cooling the CPU.

I decided to run an experiment. For a week I ran acpi -t regularly (every 5-10
minutes) during work sessions. The result was that the CPU fan worked
normally. After the work session (when I was not using the PC
and not running acpi -t) I left the laptop switched on. The result: the
laptop switched itself off every time.

For now I have worked around the problem using crontab: I scheduled the
command acpi -t to run every 5 minutes. The CPU fan works very well. It starts at around
55-60°.
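A minimal Python sketch of the workaround described above, assuming (as a later comment in this report guesses) that reading the temperature files under /proc/acpi/thermal_zone is essentially what acpi -t does. The function names and the 300-second default are illustrative only and do not belong to any existing tool.

```python
import glob
import os
import time

# Path taken from later comments in this report; it only exists on kernels
# that still expose the old /proc/acpi interface (an assumption for any
# other system).
THERMAL_GLOB = "/proc/acpi/thermal_zone/*/temperature"


def read_temperatures(pattern=THERMAL_GLOB):
    """Read every matching temperature file and return {zone_name: contents}."""
    readings = {}
    for path in sorted(glob.glob(pattern)):
        zone = os.path.basename(os.path.dirname(path))
        with open(path) as handle:
            readings[zone] = handle.read().strip()
    return readings


def poll_forever(interval_seconds=300, pattern=THERMAL_GLOB):
    """Re-read the temperatures at a fixed interval (300 s = every 5 minutes)."""
    while True:
        for zone, text in read_temperatures(pattern).items():
            print(zone, text)
        time.sleep(interval_seconds)
```

Calling poll_forever() would stand in for the cron job. Like the cron job, it would only work around the symptom rather than explain it.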

Unfortunately I can't find the cause of the CPU fan problem or the
reason for this peculiar behaviour of the acpi -t command (I thought acpi
-t simply prints the temperature; I didn't think it could have any
influence on the CPU fan).

What do you think about that?

Thanks.

Alessandro

http://bugzilla.kernel.org/show_bug.cgi?id=5534

Matt Zimmerman (mdz) wrote :

Please file one bug report per distinct problem; you seem to have two
(no_timer_check and this fan issue).

What is a korpline?

Alessandro (alessandro-eterni) wrote :

(In reply to comment #1)
> Please file one bug report per distinct problem; you seem to have two
> (no_timer_check and this fan issue).
>
> What is a korpline?

I'm sorry, that was a typo. I meant the kopt line:

## ## Start Default Options ##
## default kernel options
## default kernel options for automagic boot options
## If you want special options for specific kernels use kopt_x_y_z
## where x.y.z is kernel version. Minor versions can be omitted.
## e.g. kopt=root=/dev/hda1 ro
# kopt=root=/dev/hda2 ro console=tty0 no_timer_check

I reported these two problems together because I thought they were connected in some way. I think that the no_timer_check option causes my CPU fan problem.

If you think it's better that I open two bug reports, I will do so.

Tell me what you think.

Thanks.

Matthew Garrett (mjg59) wrote :

no_timer_check really shouldn't be necessary on the 6125 if you're running the
AMD64 version of Breezy.

Alessandro (alessandro-eterni) wrote :

(In reply to comment #3)
> no_timer_check really shouldn't be necessary on the 6125 if you're running the
> AMD64 version of Breezy.

The AMD 64-bit version is running on my PC. One month ago, when I installed Colony 4, I couldn't work with my laptop. After I added no_timer_check I
could work normally with it (except for the CPU fan problem). My PC now runs the updated version of Breezy. Maybe I can try disabling the
no_timer_check option, but I don't think my laptop will work without it.

PS: I posted these problems twice on the devel mailing list but received no answer, so I am reporting them in Bugzilla.

Matthew Garrett (mjg59) wrote :

The fix for the no_timer_check bug went in after Colony 4. Please try removing it.

Alessandro (alessandro-eterni) wrote :

(In reply to comment #5)
> The fix for the no_timer_check bug went in after Colony 4. Please try removing it.

OK. I'm at the office now. I will try this evening at home and keep you informed.

Alessandro (alessandro-eterni) wrote :

(In reply to comment #5)
> The fix for the no_timer_check bug went in after Colony 4. Please try removing it.

I've tested my laptop without the no_timer_check option. It works well. Now I'd
like to disable the crontab entry that I wrote, to see whether the CPU fan works
or not.

I'll keep you informed.

Alessandro (alessandro-eterni) wrote :

(In reply to comment #7)
> (In reply to comment #5)
> > The fix for the no_timer_check bug went in after Colony 4. Please try removing it.
>
> I've tested my laptop without the no_timer_check option. It works well. Now I'd
> like to disable the crontab entry that I wrote, to see whether the CPU fan works
> or not.
>
> I'll keep you informed.

I removed my crontab entry and the CPU fan seems to work very well without it.

Alessandro (alessandro-eterni) wrote :

(In reply to comment #8)
> I removed my crontab entry and the CPU fan seems to work very well without it.

Today the CPU fan problem came back. If I don't work the laptop hard (for example, light web surfing) the CPU fan works properly. It starts cooling at 58°. But if I work the laptop hard
(for instance, using GIMP together with other applications) the CPU fan doesn't work well. It doesn't start cooling... and the laptop shuts itself down.

Now I've added the crontab entry again. It seems to work well... but I haven't solved the problem.

With the crontab entry (when it runs the acpi -t command) the laptop sometimes freezes for one or two seconds.

Any ideas?

Richard Mace (macer) wrote :

(In reply to comment #9)
> (In reply to comment #8)
> > I erased my crontab and the cpu-fan seems to work also without it very well.
>
> Today the CPU fan problem came back. If I don't work the laptop hard
> (for example, light web surfing) the CPU fan works properly. It starts cooling
> at 58°. But if I work the laptop hard (for instance, using GIMP together with
> other applications) the CPU fan doesn't work well. It doesn't start
> cooling... and the laptop shuts itself down.
>
> Now I've added the crontab entry again. It seems to work well... but I haven't
> solved the problem.
>
> With the crontab entry (when it runs the acpi -t command) the laptop sometimes
> freezes for one or two seconds.

I get the same behaviour with my HP nx6125, with Debian kernels as well as vanilla
kernels 2.6.13.4 and 2.6.14. I discovered it by using
cat /proc/acpi/thermal_zone/TZ?/temperature,
which is essentially what acpi -t does, I guess. Today I submitted a bug report
to bugzilla.kernel.org. The bug number is 5534.

> Any ideas?

Yung-Chin Oei (yungchin) wrote :

I have the same problem on the same hardware. Ubuntu 5.10, kernel version
2.6.12-10-amd64-generic.

It seems that on low cpu-load, the fan is regulated properly. If the cpu heating
occurs 'fast', e.g. when you run "glxgears", the cpu temperature rises with no
fan activity until I explicitly probe the temperature through "cat
/proc/acpi/thermal_zone/TZ1/temperature" (or "acpi -V").

Could it be that hardware alerts at crossing trip-points are not passed to the
acpi-module at higher cpu-loads (too low priority or something like that)?
If I can assist in checking out this bug, please instruct me. I'd like to learn
more about system internals in this way.

Alessandro (alessandro-eterni) wrote :

(In reply to comment #11)
> I have the same problem on the same hardware. Ubuntu 5.10, kernel version
> 2.6.12-10-amd64-generic.
>
> It seems that on low cpu-load, the fan is regulated properly. If the cpu heating
> occurs 'fast', e.g. when you run "glxgears", the cpu temperature rises with no
> fan activity until I explicitly probe the temperature through "cat
> /proc/acpi/thermal_zone/TZ1/temperature" (or "acpi -V").
>
> Could it be that hardware alerts at crossing trip-points are not passed to the
> acpi-module at higher cpu-loads (too low priority or something like that)?
> If I can assist in checking out this bug, please instruct me. I'd like to learn
> more about system internals in this way.

Last week I noticed that the acpi -t command I scheduled no longer
helps (it runs, but it has no influence on the CPU fan). Let me try to
explain. I cannot work on my laptop without acpi -t because of our fan problem.
I noticed that the temperature of my laptop was stable at 78° and that running acpi -t
did not start the fan.

I remember that, before that, the temperature was always around 58°.
Above this limit the fan started when I ran acpi -t.

For example, today I ran glxgears for 5 minutes and the temperature stayed
at around 78°. It neither rose nor fell.

After the next reboot the earlier situation returned: temperature at 58° and acpi -t
having an influence on the CPU fan.

Yesterday I noticed the same behaviour on my laptop that I described
in my last mail.

I worked all day without the scheduled acpi -t command, with:

On Workspace 1: OpenOffice
On Workspace 2: Opera and Firefox browsers
On Workspace 3: glxgears, running all day
On Workspace 4: nothing

The temperature was always between 70° and 73° and, as I said, without acpi -t.

Now I have to use the scheduled acpi -t command again to work with my laptop.

Yung-Chin Oei (yungchin) wrote :

perhaps a relevant fact: I am running a 32-bit chroot.
I'm not sure if the problem was there before I installed it, and now I wonder:
perhaps the hardware alerts get misrouted to the chroot environment? (this may
not make sense, because I don't really understand how all that works)

Ben Collins (ben-collins) wrote :

If possible, please upgrade to Dapper's 2.6.15-7 kernel. If you do not want to
upgrade to Dapper, then you can also wait for the Dapper Flight 2 CDs, which
are due out within the next few days.

Let me know if this bug still exists with this kernel.

none (ubuntu-bugs-nullinfinity-deactivatedaccount) wrote :

The 2.6.15-7 kernel does not improve things for me. Same behaviour as reported
above: I have a script that does "acpi -t" every few seconds, and without that
script no thermal events are processed.

Ben Collins (ben-collins) wrote :

Ok, do this for me please:

1) First, do "ps ax | grep acpid", and make sure acpid is running. If not, stop
there, because that's your problem.

2) If it is running, run "/etc/init.d/acpid stop" to stop it.

3) Now run this command: "cat /proc/acpi/event"

4) Stop your acpi -t cron script, and wait to see what events are coming out of
the above command. If you see nothing, and your machine overheats, then let me
know. If you do see events, send them to me (stop the script before the machine
halts and run your command again).

5) Send me the output of "cat /proc/acpi/thermal_zone/*/trip_points"

none (ubuntu-bugs-nullinfinity-deactivatedaccount) wrote :

I did have acpid running. I killed acpid and my script and did "cat
/proc/acpi/event" and ran glxgears.

No events at all appeared. I then restarted my script which reported a
temperature of 60 degrees for TZ1. When the script started the fan kicked in
immediately and I saw an event

thermal_zone TZ1 00000081 00000000

... and when the fan stopped it printed the same thing again.
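For readers decoding the event line above, here is a hedged Python sketch that splits it into its four fields. The field names (device_class, bus_id, type, data) and the meanings attached to the notify values are assumptions based on the ACPI specification's thermal zone notifications (0x80 temperature changed, 0x81 trip points changed, 0x82 device lists changed); nothing in this report confirms them for this machine.

```python
# Notify values for thermal zones as described in the ACPI specification
# (assumed here, not established anywhere in this bug report).
THERMAL_NOTIFY = {
    0x80: "temperature changed",
    0x81: "trip points changed",
    0x82: "device lists changed",
}


def parse_acpi_event(line):
    """Split one /proc/acpi/event line into its four whitespace-separated fields."""
    device_class, bus_id, event_type, data = line.split()
    return {
        "device_class": device_class,
        "bus_id": bus_id,
        "type": int(event_type, 16),  # hexadecimal notify value
        "data": int(data, 16),        # hexadecimal event data
    }


def describe(event):
    """Return a readable label for a thermal_zone event, or 'unknown'."""
    if event["device_class"] != "thermal_zone":
        return "unknown"
    return THERMAL_NOTIFY.get(event["type"], "unknown")


event = parse_acpi_event("thermal_zone TZ1 00000081 00000000")
print(event["bus_id"], hex(event["type"]), describe(event))
```

Under that reading, the line reported above would be a trip-point notification for TZ1 rather than a plain temperature-change event.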

# cat /proc/acpi/thermal_zone/*/trip_points
critical (S5): 95 C
passive: 88 C: tc1=1 tc2=2 tsp=100 devices=0xffff8100379e7bc0
active[0]: 80 C: devices=0xffff810036852d00
active[1]: 75 C: devices=0xffff810036852bc0
active[2]: 65 C: devices=0xffff810036852ac0
active[3]: 58 C: devices=0xffff8100368529c0
critical (S5): 100 C
passive: 90 C: tc1=1 tc2=2 tsp=300 devices=0xffff8100379e7bc0
critical (S5): 100 C
passive: 60 C: tc1=1 tc2=2 tsp=300 devices=0xffff8100379e7bc0

I don't know if you're aware of some of the patches that were posted by a couple
of Intel's ACPI guys in the bug at
http://bugzilla.kernel.org/show_bug.cgi?id=5534 . In any case, no one has
reported that the patches solved their problems.
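The trip point listing above can be checked against a reported temperature with a short Python sketch. It assumes the first block of the output belongs to TZ1 and that a trip point counts as crossed when the temperature is at or above its threshold; the function names are illustrative.

```python
import re

# Matches lines such as "critical (S5): 95 C" or "active[3]: 58 C: devices=...".
TRIP_LINE = re.compile(r"^(critical|passive|active\[\d+\])[^:]*:\s*(-?\d+) C")


def parse_trip_points(text):
    """Return {trip_point_name: threshold_in_celsius} for one thermal zone."""
    trips = {}
    for line in text.splitlines():
        match = TRIP_LINE.match(line.strip())
        if match:
            trips[match.group(1)] = int(match.group(2))
    return trips


def crossed_trip_points(temperature_c, trips):
    """List the trip points whose threshold is at or below the temperature."""
    return sorted(name for name, limit in trips.items() if temperature_c >= limit)


# First block of the output quoted above (assumed to be TZ1).
SAMPLE = """\
critical (S5): 95 C
passive: 88 C: tc1=1 tc2=2 tsp=100 devices=0xffff8100379e7bc0
active[0]: 80 C: devices=0xffff810036852d00
active[1]: 75 C: devices=0xffff810036852bc0
active[2]: 65 C: devices=0xffff810036852ac0
active[3]: 58 C: devices=0xffff8100368529c0
"""

trips = parse_trip_points(SAMPLE)
print(crossed_trip_points(60, trips))  # -> ['active[3]']
```

At the reported 60 degrees only the 58 C active[3] threshold is crossed, which is consistent with the fan starting near 58° in the earlier comments.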

Richard Mace (macer) wrote :

(In reply to comment #17)

>
> I don't know if you're aware of some of the patches that were posted by a couple
> of Intel's ACPI guys in the bug at
> http://bugzilla.kernel.org/show_bug.cgi?id=5534 . In any case, no one has
> reported that the patches solved their problems.
>

Those patches do nothing to solve the problem. They simply enable polling in the
event that the ACPI subsystem does not provide a _TZP method (which, apparently,
the nx 6125 DSDT does not). You can accomplish the same effect with much less
ado, by simply echoing 5 into each of
/proc/acpi/thermal_zone/TZ[1-3]/polling_frequency. Polling won't work, I tried
it a long time ago. It makes things worse than asynchronous mode.
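For anyone who wants to repeat that polling experiment, "echoing 5 into each polling_frequency file" could be sketched in Python as follows. This assumes the /proc/acpi layout named in this comment, root privileges, and that the value is interpreted as seconds; the function name is hypothetical.

```python
import glob

# Files named in the comment above; the glob only matches zones that exist.
POLLING_GLOB = "/proc/acpi/thermal_zone/TZ[1-3]/polling_frequency"


def set_polling_frequency(seconds, pattern=POLLING_GLOB):
    """Write the polling interval to every matching file; return the paths written."""
    written = []
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as handle:
            handle.write(f"{seconds}\n")  # same effect as: echo 5 > path
        written.append(path)
    return written
```

Calling set_polling_frequency(5) as root would reproduce the experiment. As noted above, polling did not help on this hardware, so this illustrates the test and is not a fix.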

The other patches are simply for diagnostic purposes. I have tried the kernel
patch and my results are posted at bugzilla.kernel.org (#5534). There is also a
DSDT diagnostic patch which will echo messages whenever thermal events are
generated. I haven't tried this. Optimally, the developers need to get their
hands on one of these machines and observe this behaviour first hand...

Richard Mace (macer) wrote :

(In reply to comment #13)
> perhaps a relevant fact: I am running a 32-bit chroot.
> I'm not sure if the problem was there before I installed it, and now I wonder:
> perhaps the hardware alerts get misrouted to the chroot environment? (this may
> not make sense, because I don't really understand how all that works)
>

The fact that you have a chroot is irrelevant. I do not have one bit of 32 bit
code on my system and my machine suffers from exactly the same problem.

Yung-Chin Oei (yungchin) wrote :

(In reply to comment #17)

> I did have acpid running. I killed acpid and my script and did "cat
> /proc/acpi/event" and ran glxgears.

I did the same things, however after booting with the 2.6.15-8 kernel.

> No events at all appeared. I then restarted my script which reported a
> temperature of 60 degrees for TZ1. When the script started the fan kicked in
> immediately and I saw an event
>
> thermal_zone TZ1 00000081 00000000

No events here either. However, when I ran 'acpi -t' subsequently, still no
event appeared and no fan started, even though a temp of 80C was reported (I
stopped glxgears immediately).

> # cat /proc/acpi/thermal_zone/*/trip_points
...

my output was exactly identical save for the device addresses.

Richard Mace (macer) wrote :

(In reply to comment #20)
> (In reply to comment #17)

> I did the same things, however after booting with the 2.6.15-8 kernel.
>
> > No events at all appeared. I then restarted my script which reported a
> > temperature of 60 degrees for TZ1. When the script started the fan kicked in
> > immediately and I saw an event
> >
> > thermal_zone TZ1 00000081 00000000
>
> No events here either. However, when I ran 'acpi -t' subsequently, still no
> event appeared and no fan started, even though a temp of 80C was reported (I
> stopped glxgears immediately).
>
> > # cat /proc/acpi/thermal_zone/*/trip_points
> ...
>
> my output was exactly identical save for the device addresses.

I reported all of this in detail over a month ago on the Debian lists and on the
kernel bugzilla (http://bugzilla.kernel.org/show_bug.cgi?id=5534). Please see my
detailed post at http://lists.debian.org/debian-amd64/2005/10/msg01002.html
for similar details. I cannot help but feel this is all just rehashing old news.
Developers, please see kernel bug #5534
(http://bugzilla.kernel.org/show_bug.cgi?id=5534), where there is a wealth of
information on this particular bug; this report is a duplicate of that kernel bug.

keeema (rut-tomas) wrote :

I have the same problem on a Fujitsu-Siemens L1310G. Since I started using Ubuntu Edgy Eft, my CPU fan hasn't worked. The CPU got very hot and the laptop automatically turned off. Previous versions of Ubuntu had no such problem.

Lee Willis (lwillis) wrote :

I have the same problem on an emachines 370 desktop. The CPU fan never appears to come on and the machine overheats when doing intensive work (last time, in the middle of regenerating an initramfs during an upgrade!). The fan works fine in Windows XP, and worked under previous versions of Ubuntu (I think Hoary, but I'm not too sure). Coincidentally, ACPI reports bogus temperatures for the CPU:

~$ cat /proc/acpi/thermal_zone/THRM/temperature
temperature: -267 C
~$ acpi -t
     Thermal 1: passive , 4294967040.0 degrees C
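One observation about the acpi -t figure, offered as arithmetic only and not as a confirmed diagnosis: 4294967040 equals 2^32 - 256, which is how -256 looks when stored in an unsigned 32-bit field. The Python sketch below checks this; the helper name is illustrative.

```python
def as_signed_32(value):
    """Reinterpret an unsigned 32-bit integer as a signed (two's complement) one."""
    return value - 2**32 if value >= 2**31 else value


reported = 4294967040  # value printed by acpi -t above, without the ".0"
print(hex(reported), as_signed_32(reported))  # -> 0xffffff00 -256
```

This suggests a negative reading wrapped around somewhere between the kernel and the acpi tool, but where that happens is not established in this report.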

I've also added this to bug 54554 since it seems similar [If not a dupe]

Changed in linux-source-2.6.15:
status: Unknown → Rejected
Changed in linux-source-2.6.15:
status: Invalid → Fix Released

Mike Burgener (mburgener) wrote :

Hi guys,
I'm encountering this problem too.

I think it could be related to a timing issue, since I have only seen this problem since the HZ options appeared in the kernel. Did Ubuntu change to 1000 HZ?

I mean that when I generate CPU load, the temperature sensors seem to be polled badly.

Greets

Mike

Mike Burgener (mburgener) wrote :

Just after starting a "make" for a kernel compilation, when I manually run

 cat /proc/acpi/thermal_zone/TZS0/temperature

repeatedly, I get the following results, with the temperature rising very quickly, within 1 or 1.5 seconds:
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 70 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 70 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 70 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 70 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 70 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 73 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 74 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 74 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 74 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 83 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 83 C
root@burginote:/home/mburgener# cat /proc/acpi/thermal_zone/TZS0/temperature
temperature: 83 C

seisen1 (seisen-deactivatedaccount-deactivatedaccount) wrote :

Can you please test this on the latest version of Ubuntu, Hardy Heron, so we can see if this is still a problem?

Pablo Castellano (pablocastellano) wrote :

We are closing this bug report because it lacks the information we need to investigate the problem, as described in the previous comments. Please reopen it if you can give us the missing information, and don't hesitate to submit bug reports in the future. To reopen the bug report you can click on the current status, under the Status column, and change the Status back to "New". Thanks again!

Changed in linux-source-2.6.15:
status: New → Invalid

W. Prins (wprins) wrote :

For the record, I believe this problem has been fixed somewhere along the line: I'm running Hardy 8.04.1 with no apparent CPU fan issues on the same machine as the original poster, an HP NX6125 with AMD Turion64 CPU, 64bit version. I'm not using any special kernel options or anything else for that matter.

Changed in linux-source-2.6.15 (Debian):
importance: Unknown → High