Comment 18 for bug 1543046

Colin Ian King (colin-king) wrote :

OK, I've spent the last 3 hours verifying this manually on 3 different machines that trigger this issue.

Without 1.4.3-5~14.04.3, I see:

[72761.740827] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73066.277771] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73070.281923] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73074.285950] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73078.290045] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73082.294180] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73086.278074] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73090.282163] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73090.302293] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73098.310360] powercap intel-rapl:0: package locked by BIOS, monitoring only

That is, the message is logged again roughly every 4 to 8 seconds.
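The repeat interval can be read off the timestamps above, but a small script makes the check repeatable across machines. The following is only a minimal sketch: the function name message_intervals is made up for illustration, and it assumes the default dmesg timestamp format of seconds since boot in square brackets.

```python
import re

# Matches the "[seconds.microseconds]" prefix the kernel puts on each log line.
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def message_intervals(lines, needle):
    """Return the gaps, in seconds, between successive lines containing needle.

    message_intervals is a hypothetical helper name, not part of thermald
    or of any kernel tooling.
    """
    stamps = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match and needle in line:
            stamps.append(float(match.group(1)))
    # Pair each timestamp with the next one and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]
```

Fed the lines of dmesg output and the string "package locked by BIOS", this returns a list of gaps of a few seconds each on an affected machine, and an empty list when the message was logged only once.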

I then installed thermald from -proposed (1.4.3-5~14.04.3) and it restarted. Now the message appears only once, at startup, and does not recur. This is the expected behaviour, so on my 3 machines it's working fine.

I also verified this by running it not as a service but in non-daemon mode:

sudo dmesg -c
sudo thermald --no-daemon --loglevel=info

I then loaded the machine with stress-ng --cpu 0 and waited a while for it to heat up. The kernel logs the following messages:

[73484.272942] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73516.305392] intel_powerclamp: Start idle injection to reduce power
[73520.325616] intel_powerclamp: Stop forced idle injection
[73524.449552] intel_powerclamp: Start idle injection to reduce power
[73528.465760] intel_powerclamp: Stop forced idle injection
[73532.590030] intel_powerclamp: Start idle injection to reduce power

And thermald is logging:

sysfs write failed constraint_0_power_limit_uw
Set : threshold:87000, temperature:89000, cdev:6(rapl_controller), curr_state:26250000, max_state:26250000
Set : threshold:87000, temperature:92000, cdev:7(intel_pstate), curr_state:1, max_state:10
cdev index:7 consecutive call, increment exponentially state 3
turbo disabled
Set : threshold:87000, temperature:87000, cdev:7(intel_pstate), curr_state:3, max_state:10
turbo enabled
Set : threshold:87000, temperature:86000, cdev:7(intel_pstate), curr_state:2, max_state:10
turbo disabled
Set : threshold:87000, temperature:89000, cdev:7(intel_pstate), curr_state:3, max_state:10
cdev index:7 consecutive call, increment exponentially state 5
Set : threshold:87000, temperature:89000, cdev:7(intel_pstate), curr_state:5, max_state:10
cdev index:7 consecutive call, increment exponentially state 7
Set : threshold:87000, temperature:93000, cdev:7(intel_pstate), curr_state:7, max_state:10
update_pid 1461351021 1461351021 500 250 92250
Read set point 93500
cdev index:7 consecutive call, increment exponentially state 11
Set : threshold:87000, temperature:92000, cdev:7(intel_pstate), curr_state:10, max_state:10
Set : threshold:87000, temperature:92000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:85000, cdev:5(intel_powerclamp), curr_state:0, max_state:50
Set : threshold:87000, temperature:90000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:82000, cdev:5(intel_powerclamp), curr_state:0, max_state:50
Set : threshold:87000, temperature:89000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
cdev index:5 consecutive call, increment exponentially state 15
Set : threshold:87000, temperature:89000, cdev:5(intel_powerclamp), curr_state:15, max_state:50
Set : threshold:87000, temperature:81000, cdev:5(intel_powerclamp), curr_state:10, max_state:50
Set : threshold:87000, temperature:83000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:89000, cdev:5(intel_powerclamp), curr_state:10, max_state:50
Set : threshold:87000, temperature:86000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:86000, cdev:5(intel_powerclamp), curr_state:0, max_state:50

So one can see that thermald detects the rapl write failure ("sysfs write failed constraint_0_power_limit_uw") and stops using the rapl controller afterwards, which is what the code is expected to do.
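To check the same thing on a longer log without reading every line, one could tally how often each cooling device appears in the "Set :" lines. This is a rough sketch, not anything thermald provides: cdev_usage is a hypothetical helper, and the line format is assumed to match the output shown above.

```python
import re
from collections import Counter

# Pulls the cooling-device name out of lines such as
# "Set : threshold:87000, temperature:89000, cdev:6(rapl_controller), ..."
CDEV = re.compile(r"cdev:\d+\((\w+)\)")


def cdev_usage(log_lines):
    """Count the "Set :" lines per cooling device in thermald log output.

    cdev_usage is a hypothetical helper name; the log format is assumed
    from the thermald --loglevel=info output quoted in this comment.
    """
    counts = Counter()
    for line in log_lines:
        if line.startswith("Set :"):
            match = CDEV.search(line)
            if match:
                counts[match.group(1)] += 1
    return counts
```

On the log above, rapl_controller shows up in a single "Set :" line right after the write failure, while intel_pstate and intel_powerclamp account for all the later ones.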

So from my exhaustive testing I believe the code is correct.