Comment 5 for bug 1995606

koba (kobako) wrote : Re: Upgrade thermald to 2.5.1

@Robie,
1. is it possible that users are using thermald on hardware not covered by
upstream tests?
[Koba] Per my test cases, machines older than Kaby Lake (kbl) are not covered.
However, thermald has been enabled since 2016, and I assume Intel may not fully support the older machines.
If there is a regression, we can ask users to report it on Launchpad and help fix it.

2. By "all the unit tests must pass in all the supported
Intel CPUs", who defines "supported"?
[Koba] There is a list of supported CPUs in the thermald source:
~~~
@src/thd_engine.cpp,
supported_ids_t id_table[] = {
...
        { 6, 0x97 }, // Alderlake
        { 6, 0x9a }, // Alderlake
        { 6, 0xb7 }, // Raptorlake
        { 6, 0xba }, // Raptorlake
        { 6, 0xbf }, // Raptorlake
...
}
~~~
thermald is maintained by Intel, so it is Intel that defines "supported".
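
To illustrate how a table like this acts as the definition of "supported", here is a minimal sketch of a family/model lookup. It is not thermald's actual code: the struct name, array name and function name are hypothetical, and the table holds only the five entries quoted above (the real list is longer).

```cpp
// Hypothetical entry shape, modelled on the quoted { family, model } pairs.
struct SupportedId {
    unsigned int family;
    unsigned int model;
};

// Only the entries quoted in this comment; the real table has more.
static const SupportedId kQuotedIds[] = {
    {6, 0x97}, // Alderlake
    {6, 0x9a}, // Alderlake
    {6, 0xb7}, // Raptorlake
    {6, 0xba}, // Raptorlake
    {6, 0xbf}, // Raptorlake
};

// Returns true when (family, model) matches one of the quoted entries.
bool is_quoted_supported(unsigned int family, unsigned int model) {
    for (const SupportedId &id : kQuotedIds) {
        if (id.family == family && id.model == model) {
            return true;
        }
    }
    return false;
}
```

With this sketch, is_quoted_supported(6, 0x97) is true, while a model that is not in the quoted entries returns false.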

3. Is it possible that Ubuntu users have hardware not covered by that definition of "supported"?
[Koba] I think that would not happen. If an Intel platform were missing from the supported list, HWE would find it during development, because thermald would complain first and HWE would then check with Intel.

4. Is there any risk to users of non-Intel hardware?
[Koba] Only if you pass '--ignore-cpuid-check'.
By default, thermald does not run on non-Intel hardware.
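
For anyone who wants to see which side of that check a machine falls on, the vendor string can be read from /proc/cpuinfo. The following is only a sketch under that assumption; the helper names are hypothetical and this is not how thermald itself performs the check.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the first vendor_id value found in
// /proc/cpuinfo-style text, or an empty string if there is none.
std::string read_vendor_id(std::istream &in) {
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 9, "vendor_id") != 0) {
            continue; // not a vendor_id line
        }
        std::size_t colon = line.find(':');
        if (colon == std::string::npos) {
            continue;
        }
        std::size_t start = line.find_first_not_of(" \t", colon + 1);
        if (start == std::string::npos) {
            return "";
        }
        std::size_t end = line.find_last_not_of(" \t\r");
        return line.substr(start, end - start + 1);
    }
    return "";
}

// Intel CPUs report this vendor string.
bool is_intel(const std::string &vendor) {
    return vendor == "GenuineIntel";
}
```

On a real system the stream would be a std::ifstream opened on /proc/cpuinfo; a std::istringstream works for trying it out.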

5. How complete is upstream's test coverage?
[Koba] It covers all the modules in use and the loaded policy tables:
a. Modules in use: rapl_control, intel_pstate, intel_powerclamp, cpufreq, processor.
b. Loading the policy table from an XML file or from ACPI tables.
c. Evaluating the temperature and checking that the rules act correctly after the cooling devices are activated, escalated, or de-escalated.

6. What assurance is there that there will be no feature
regressions?
[Koba] I can only say that there may be corner cases in the PL1 min/max feature.

---
For this commit: https://github.com/intel/thermal_daemon/commit/7e490fc79d784b3faf8314af98ec14981ba7fb75

1) Is this safe in relation to Ubuntu kernel versions?
[Koba] I would say it's safe on Jammy/Focal
~~~
Jammy,
~~~~~~
TCC adjustment has been offloaded to the kernel driver intel_tcc_cooling,
which is registered as a thermal cooling device.
2eb87d75f980 ("thermal/drivers/intel: Introduce tcc cooling driver")
This was merged into mainline in 5.13. Focal is using hwe-5.15.
Ref. https://www.phoronix.com/news/Linux-5.13-Intel-Cooling-Driver
~~~~~~

Timo has replied regarding Focal,
~~~~~~
commit fdf4f2fb8e8990c131b2b1a5a9c03681bb16e87a
Author: Srinivas Pandruvada <email address hidden>
Date: Mon Jul 22 18:03:02 2019 -0700

     drivers: thermal: processor_thermal_device: Export sysfs interface for TCC offset

so a backport to focal (which is planned) should be safe in that regard.
~~~~~~
~~~
2) Did this actually get checked before upload?
[Koba] I checked whether the related kernel commits have landed in Focal/Jammy.
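
The Jammy argument above rests on the running kernel being 5.13 or newer. A minimal sketch of that comparison against a `uname -r` style string follows; the function name is hypothetical, and a version check is only a heuristic, since a distro kernel may carry backports or omit a driver regardless of its version number.

```cpp
#include <cstdio>

// Hypothetical helper: true when a release string such as
// "5.15.0-56-generic" is at least major.minor.
bool kernel_at_least(const char *release, int major, int minor) {
    int maj = 0;
    int min = 0;
    // Parse the leading "X.Y"; anything after it is ignored.
    if (std::sscanf(release, "%d.%d", &maj, &min) != 2) {
        return false; // unparseable input is treated as "too old"
    }
    return maj > major || (maj == major && min >= minor);
}
```

For example, kernel_at_least("5.15.0-56-generic", 5, 13) is true, while a 5.4 release string returns false.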

3) What in your proposed QA process would catch this kind of change to ensure that
the specific requirements for each such deprecation is met in Ubuntu
[Koba] I have test cases, but they are generic unit tests plus a stress-and-monitoring test; see the bug description.
There may be edge cases I have not hit. If such an issue is triggered, we can ask the user to report it and help them fix it.