OK, I've spent the last 3 hours verifying this manually on 3 different machines that trigger this issue.
Without 1.4.3-5~14.04.3, I see:
[72761.740827] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73066.277771] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73070.281923] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73074.285950] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73078.290045] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73082.294180] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73086.278074] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73090.282163] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73090.302293] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73098.310360] powercap intel-rapl:0: package locked by BIOS, monitoring only
That is, the message is repeated every 4-8 seconds.
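For anyone repeating this check, the spacing between the messages can be read off the kernel timestamps. This is only a minimal sketch: the function name rapl_lock_intervals is made up for illustration, and it assumes dmesg output with the default [seconds.microseconds] prefix, as in the lines above.

```shell
#!/bin/sh
# Hypothetical helper (not part of thermald or the kernel tools).
# Reads kernel log text on stdin and prints the gap in seconds between
# successive "package locked by BIOS" messages, one gap per line.
rapl_lock_intervals() {
    grep 'package locked by BIOS' |
        awk -F'[][]' '{
            t = $2 + 0                      # timestamp between the square brackets
            if (NR > 1) printf "%.3f\n", t - prev
            prev = t
        }'
}
```

It would be used as: sudo dmesg | rapl_lock_intervals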
I then installed thermald from -proposed (1.4.3-5~14.04.3) and it restarted. Now one only gets the initial message once, and after that it does not occur again. This is the expected behaviour, so on my 3 machines it's working fine.
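The "only once" behaviour can also be checked by counting the messages rather than reading the log by eye. A minimal sketch, assuming the message text is exactly as shown in the kernel log above; the function name count_rapl_lock_messages is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: count kernel log lines that report the
# BIOS-locked RAPL package. Reads log text on stdin, prints the count.
count_rapl_lock_messages() {
    # grep -c still prints 0 when nothing matches but exits non-zero,
    # so the exit status is ignored and only the printed count is kept.
    grep -c 'powercap intel-rapl:[0-9]*: package locked by BIOS' || true
}
```

After clearing the log with sudo dmesg -c and restarting thermald, running sudo dmesg | count_rapl_lock_messages a minute or so later should print 1 with the fixed package and a steadily growing number without it.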
I also verified this by running it not as a service but in non-daemon mode:
sudo dmesg -c
sudo thermald --no-daemon --loglevel=info
Then I loaded the machine with stress-ng --cpu 0 and waited a while for it to heat up. The kernel logs the following messages:
[73484.272942] powercap intel-rapl:0: package locked by BIOS, monitoring only
[73516.305392] intel_powerclamp: Start idle injection to reduce power
[73520.325616] intel_powerclamp: Stop forced idle injection
[73524.449552] intel_powerclamp: Start idle injection to reduce power
[73528.465760] intel_powerclamp: Stop forced idle injection
[73532.590030] intel_powerclamp: Start idle injection to reduce power
And thermald is logging:
sysfs write failed constraint_0_power_limit_uw
Set : threshold:87000, temperature:89000, cdev:6(rapl_controller), curr_state:26250000, max_state:26250000
Set : threshold:87000, temperature:92000, cdev:7(intel_pstate), curr_state:1, max_state:10
cdev index:7 consecutive call, increment exponentially state 3
turbo disabled
Set : threshold:87000, temperature:87000, cdev:7(intel_pstate), curr_state:3, max_state:10
turbo enabled
Set : threshold:87000, temperature:86000, cdev:7(intel_pstate), curr_state:2, max_state:10
turbo disabled
Set : threshold:87000, temperature:89000, cdev:7(intel_pstate), curr_state:3, max_state:10
cdev index:7 consecutive call, increment exponentially state 5
Set : threshold:87000, temperature:89000, cdev:7(intel_pstate), curr_state:5, max_state:10
cdev index:7 consecutive call, increment exponentially state 7
Set : threshold:87000, temperature:93000, cdev:7(intel_pstate), curr_state:7, max_state:10
update_pid 1461351021 1461351021 500 250 92250
Read set point 93500
cdev index:7 consecutive call, increment exponentially state 11
Set : threshold:87000, temperature:92000, cdev:7(intel_pstate), curr_state:10, max_state:10
Set : threshold:87000, temperature:92000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:85000, cdev:5(intel_powerclamp), curr_state:0, max_state:50
Set : threshold:87000, temperature:90000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:82000, cdev:5(intel_powerclamp), curr_state:0, max_state:50
Set : threshold:87000, temperature:89000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
cdev index:5 consecutive call, increment exponentially state 15
Set : threshold:87000, temperature:89000, cdev:5(intel_powerclamp), curr_state:15, max_state:50
Set : threshold:87000, temperature:81000, cdev:5(intel_powerclamp), curr_state:10, max_state:50
Set : threshold:87000, temperature:83000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:89000, cdev:5(intel_powerclamp), curr_state:10, max_state:50
Set : threshold:87000, temperature:86000, cdev:5(intel_powerclamp), curr_state:5, max_state:50
Set : threshold:87000, temperature:86000, cdev:5(intel_powerclamp), curr_state:0, max_state:50
So one can see that thermald detects the rapl write failure "sysfs write failed constraint_0_power_limit_uw" and that the rapl controller is no longer used afterwards (as the code is expected to do).
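To make that easier to see in a long log, the "Set :" lines can be tallied per cooling device. This is a rough sketch that assumes the log line format shown above and that the thermald output was saved to a file, here hypothetically called thermald.log; the function name cdev_set_counts is also made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: summarise thermald "Set :" log lines per cooling device.
# Reads thermald log text on stdin and prints "<cdev name> <count>" lines,
# sorted by name.
cdev_set_counts() {
    # Pull the name out of "cdev:N(name)" on each "Set :" line,
    # then count how often each name appears.
    sed -n 's/^Set : .*cdev:[0-9][0-9]*(\([^)]*\)).*/\1/p' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

It would be run as cdev_set_counts < thermald.log. For a log like the one above, rapl_controller should show up with a single entry (the attempt that hit the write failure), while intel_pstate and intel_powerclamp account for all of the later entries.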
So from my exhaustive testing I believe the code is correct.