thermald prematurely throttling GPU

Bug #1981087 reported by Colette Kerr
This bug affects 1 person
Affects            Status        Importance  Assigned to  Milestone
thermald (Ubuntu)  Fix Released  Undecided   koba
  Jammy            Fix Released  Undecided   koba

Bug Description

[Impact]
 * thermald prematurely throttling GPU

[Fix]
This fix drops the code-refactoring part of the upstream change and keeps only what is necessary.

(patch: 0009-Install-passive-default.patch)
82609c7) Separate Adaptive engine and GDDV

[Test Plan]
Test 1:
 * Run a game on the target machine.
 * The FPS must not be significantly reduced.
Test 2:
 * Run on other platforms: ADL/TGL/CML/CFL/KBL.
 * Use a monitoring tool (e.g. s-tui) and stress-ng to verify that the machine runs normally.
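
One way to make "not significantly reduced" checkable is to compare the average FPS of a reference run (for example with thermald stopped) against a run with the package under test. The sketch below is only an illustration: the function name `fps_regressed` and the 10% tolerance are assumptions, not part of the test plan, and the sample numbers are invented to resemble the dips described in this report.

```python
from statistics import mean


def fps_regressed(baseline_fps, candidate_fps, tolerance=0.10):
    """Return True if the candidate's mean FPS is more than `tolerance`
    (a fraction; 10% is an assumed default) below the baseline's mean FPS."""
    if not baseline_fps or not candidate_fps:
        raise ValueError("both runs need at least one FPS sample")
    base = mean(baseline_fps)
    cand = mean(candidate_fps)
    return cand < base * (1.0 - tolerance)


# Illustrative samples only (not measurements from this bug):
# roughly 100 FPS normally, dips below 20 FPS when throttled.
steady_run = [101, 99, 100]
dipping_run = [100, 18, 19]
reference = [100, 102, 98]

print("steady run regressed:", fps_regressed(reference, steady_run))
print("dipping run regressed:", fps_regressed(reference, dipping_run))
```

Comparing means hides short dips in a long run; a stricter check could compare the minimum or a low percentile instead.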

[Where problems could occur]
 * Passive Policy support is still limited. Currently only passive policy 1 is supported, so a bug is likely if the machine only enables passive policy 2.

~~~
I got a new game and started playing it
It would run at over 100 FPS solidly some of the time and then cyclically dip down to below 20 FPS for a few minutes

I determined that it was thermald trying to keep my GPU below 70°C.
To determine this, I ran: sudo systemctl stop thermald
The game ran solidly and consistently with the GPU at 75°C
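
For anyone trying to spot this kind of throttling sooner, the temperatures the kernel exposes can be read from sysfs and compared with thermald running and stopped. This is a minimal sketch, assuming the standard Linux layout under /sys/class/thermal (each thermal_zone* directory has a `type` file and a `temp` file in millidegrees Celsius). The function name is made up for illustration, and an NVIDIA GPU's own temperature is usually not among these zones, so it would still need a tool such as nvidia-smi.

```python
from pathlib import Path


def read_thermal_zones(root="/sys/class/thermal"):
    """Return {zone_name: (zone_type, temp_celsius)} for each thermal_zone* under root."""
    zones = {}
    for zone in sorted(Path(root).glob("thermal_zone*")):
        try:
            zone_type = (zone / "type").read_text().strip()
            millideg = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue  # some zones cannot be read; skip them
        zones[zone.name] = (zone_type, millideg / 1000.0)
    return zones


if __name__ == "__main__":
    for name, (zone_type, temp) in read_thermal_zones().items():
        print(f"{name} {zone_type}: {temp:.1f} °C")
```

The `root` parameter exists only so the function can be pointed at a test directory; on a real system the default path is the one to use.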

This is well below the specs set by the manufacturer and perhaps unreasonably low for a laptop

But more importantly I was given no indication this was happening. I had to sleuth it out myself.

Perhaps it is impossible to determine good defaults for all hardware, I don't know. However, without an indication that this is happening there will be a lot of people with a mysteriously broken experience. This was extremely difficult for me to find, and I had several friends who are experts on Linux gaming and video drivers trying to track this down. I discovered it by luck and perseverance.

This absolutely needs some sort of indication and hopefully a way to remedy it from the GUI. Ideally it would set thermal limits that are more in line with what the device is designed for, and not a conservative default, if at all possible.

ProblemType: Bug
DistroRelease: Ubuntu 22.04
Package: thermald 2.4.9-1
ProcVersionSignature: Ubuntu 5.15.0-40.43-generic 5.15.35
Uname: Linux 5.15.0-40-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.11-0ubuntu82.1
Architecture: amd64
CasperMD5CheckResult: unknown
CurrentDesktop: XFCE
Date: Fri Jul 8 16:08:55 2022
InstallationDate: Installed on 2020-10-19 (626 days ago)
InstallationMedia: Xubuntu 20.04.1 LTS "Focal Fossa" - Release amd64 (20200731)
SourcePackage: thermald
UpgradeStatus: Upgraded to jammy on 2022-06-17 (20 days ago)

Revision history for this message
Colette Kerr (electrocutie85) wrote :
description: updated
Revision history for this message
koba (kobako) wrote :

@Colette, would you please try the upstream thermald(2.5.0)? thanks

Revision history for this message
Colette Kerr (electrocutie85) wrote :

@koba

I installed thermald_2.5.0-1_amd64 from Debian Sid (sha256 19840c0dffd4424996016293dcbd9402595cdb5c0b1c709f94f08a717a239b2e) just by grabbing the .deb and installing it

Started the service and ran a quick test. My graphics card reached a peak of 81°C with no in-game slowdown whatsoever

I hope that grabbing it from Debian was okay, I wanted to just keep it as a deb file with no hassle in case something wasn't right, ease of setup etc etc

Is that test valid?

Revision history for this message
koba (kobako) wrote :

@Colette, it's OK to grab the latest thermald from Debian.
Thanks, and I'm preparing 2.5.0 for Ubuntu.

Changed in thermald (Ubuntu):
assignee: nobody → koba (kobako)
status: New → In Progress
Revision history for this message
koba (kobako) wrote :

@Colette, could you please help me bisect thermald between 2.4.9 and 2.5.0? It will take several tries.

Revision history for this message
Colette Kerr (electrocutie85) wrote :

sure thing

the first commit that runs smoothly is
82609c7e017a0461eb20d66935979e399f024e0e

the last commit which chugs is
eaa77b41c1eddb6d0dc6ebbe8d3f903cf3029723

(There is no space between them I just put them both down to double check during testing)
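
The search that produced these two adjacent commits can be sketched as a binary search for the first commit where the problem is gone. This is only an illustration under assumptions: the commit list reuses the short hashes koba posts later in this thread as a stand-in for the real history (oldest first), and `stand_in_check` hard-codes the reported result where a real run would build each commit, play the game, and see whether it chugs.

```python
def first_fixed(commits, is_fixed):
    """Binary-search `commits` (ordered oldest to newest) for the first commit
    where is_fixed(commit) is True, assuming every later commit is also fixed.
    Returns None if even the newest commit is not fixed."""
    if not commits or not is_fixed(commits[-1]):
        return None
    lo, hi = 0, len(commits) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_fixed(commits[mid]):
            hi = mid          # the first fixed commit is at mid or earlier
        else:
            lo = mid + 1      # the first fixed commit is after mid
    return commits[lo]


# Short hashes taken from the commit list later in this thread, oldest first.
COMMITS = [
    "0dd53a0", "d736003", "abafe76", "1bc1105", "57c1fe9", "eaa77b4",
    "82609c7", "cbdd92b", "d385f20", "611ae0c", "7f1d8b2", "7486cdf",
]


def stand_in_check(commit):
    # Hypothetical placeholder for a manual test of one build.
    return COMMITS.index(commit) >= COMMITS.index("82609c7")


print(first_fixed(COMMITS, stand_in_check))
```

Each manual test halves the remaining range, which is why only "several tries" are needed even across a full release.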

Revision history for this message
koba (kobako) wrote (last edit ):

@Colette, that's really helpful, thanks a lot.
Could you please try this test build as well? I backported some patches from upstream thermald.
https://drive.google.com/drive/folders/1xM5MFlC1EldmJPpufJiFhPHyGhpgw-Jw?usp=sharing

Revision history for this message
Colette Kerr (electrocutie85) wrote :

Eek, sorry for missing this yesterday
The version in #7 did not exhibit the bug

That looks good for my use case

koba (kobako)
description: updated
Changed in thermald (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → koba (kobako)
Timo Aaltonen (tjaalton)
Changed in thermald (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Chris Halse Rogers (raof) wrote :

Ok.

Sorry for the delay in getting to this.

There's a lot of code churn between 2.4.9 in Jammy and the 2.5 release, a bunch of which appears to be refactoring rather than straight bugfixes or new hardware support. It also appears to drop support for setting something(?) via a msr and will instead complain about needing a newer kernel - is the Jammy kernel new enough to avoid that codepath?

What effort has been made to cherry-pick the necessary fixes for these bugs?

This is not a "no", but since there's a fair amount of code churn here I think we need a more holistic test plan - how can we ensure that these changes don't break existing systems? Would it be less work overall to cherry-pick only the necessary fixes and need less thorough testing?

Revision history for this message
koba (kobako) wrote :

@Chris,
82609c7) It looks like refactoring, but it also contains the code that makes passive policy 1 the default.
The kernel part has also been SRUed, LP1966089.
For this bug and 1982073, I cherry-picked some commits; these are essential and make jammy-2.4.9 and 2.5.0 almost identical.

~~~
| * 7486cdf - Ensure there is one trips element per zone (3 months ago) <Ludovico Cavedon>
| * 7f1d8b2 - Merge ITM and PSVT tables (3 months ago) <Srinivas Pandruvada>
| * 611ae0c - Parse GDDV before thd_engine init (3 months ago) <Srinivas Pandruvada>
| * d385f20 - Use PL1 max/min from PPCC when policies match (3 months ago) <Srinivas Pandruvada>
| * cbdd92b - Parse idsp and trips (3 months ago) <Srinivas Pandruvada>
| * 82609c7 - Separate Adaptive engine and GDDV (3 months ago) <Srinivas Pandruvada>
| * eaa77b4 - Move interface debug_mode_on to thd_engine (3 months ago) <Srinivas Pandruvada>
| * 57c1fe9 - Add INT3400 base path for Raptor Lake (3 months ago) <Srinivas Pandruvada>
| * 1bc1105 - Use per trip min max (3 months ago) <Srinivas Pandruvada>
| * abafe76 - Install ITMT target (3 months ago) <Srinivas Pandruvada>
| * d736003 - Add capability for min max per trip (3 months ago) <Srinivas Pandruvada>
| * 0dd53a0 - Parse ITMT Table (3 months ago) <Srinivas Pandruvada>
~~~

I have run some unit tests for rapl/powerclamp/intel_pstate.
I also asked Dell/User to verify 2.5.0 with a supported kernel.

Revision history for this message
Chris Halse Rogers (raof) wrote :

I'm just surprised that "make passive policy 1 default" isn't a cleanly cherry-pickable thing. That *sounds* like it should be one line of code :)

Unless it first needs to implement support for "passive policy 1", which it may well need to! But I don't know that unless someone tells me :)

Since the churn *is* necessary, then we should organise testing on a wide variety of laptops - both ones which are expected to have bugs fixed by this upload, and ones which aren't expected to be affected by this upload. The CI lab should have a reasonable selection; what sort of automated testing can be done for this?

Revision history for this message
koba (kobako) wrote :

@Chris, it can't easily be done in one line of code, but it is not too hard to implement.
The existing table must be cleaned up if the original policy is passive policy 2.
Then thermald parses and installs pp1.
I thought it better to follow the upstream code architecture.
~~~
# Here's a code snippet from 82609c7
+void cthd_engine_adaptive::install_passive_default() {
+ if (passive_installed)
+ return;
+
+ thd_log_info("IETM_D0 processed\n");
+
+ for (unsigned int i = 0; i < zones.size(); ++i) {
+ cthd_zone *_zone = zones[i];
+ _zone->zone_reset(1);
+ _zone->trip_delete_all();
+
+ if (_zone->zone_active_status())
+ _zone->set_zone_inactive();
+ }
+
+ struct psvt *psvt = gddv.find_def_psvt();
+ if (!psvt)
+ return;
+
+ std::vector<struct psv> psvs = psvt->psvs;
+
+ thd_log_info("Name :%s\n", psvt->name.c_str());
+ for (unsigned int j = 0; j < psvs.size(); ++j) {
+ install_passive(&psvs[j]);
+ }
+
+ psvt_consolidate();
+ thd_log_info("\n\n ZONE DUMP BEGIN\n");
+ for (unsigned int i = 0; i < zones.size(); ++i) {
+ zones[i]->zone_dump();
+ }
+ thd_log_info("\n\n ZONE DUMP END\n");
+ passive_installed = 1;
+}
+
~~~

I have done what you suggested: I tested one platform that is affected and one that is not,
e.g. the TGL and ADL platforms.

Revision history for this message
Chris Halse Rogers (raof) wrote :

When upstream mixes bugfixes and refactorings, SRUs are a balance between:
*) The exact upstream code has *probably* been tested more, so is probably safer
*) The minimal fix changes behaviour less, so is probably safer

Since *both* options are "probably safer", it mostly depends on what "the minimal fix" looks like. If the minimal fix looks sensible, we'll generally prefer it even if it's not a direct cherry-pick from upstream.

Here it looks like maybe the minimal fix *is* sensible? Could you confidently generate a package that fixes these bugs but doesn't include all the refactorings?

Revision history for this message
koba (kobako) wrote :

@Chris, if the code movement is skipped from this patch, more manual effort will be necessary.
I think that would be riskier.
~~~
| * 82609c7 - Separate Adaptive engine and GDDV (3 months ago) <Srinivas Pandruvada>
~~~

If you strongly suggest this, I am willing to do it.

Revision history for this message
Chris Halse Rogers (raof) wrote :

Ok, so we're back to:

Since the churn *is* necessary, then we should organise testing on a wide variety of systems - both ones which are expected to have bugs fixed by this upload, and ones which aren't expected to be affected by this upload. The CI lab should have a reasonable selection; what sort of automated testing can be done for this?

Revision history for this message
sheldonwang (shelw) wrote :

@koba,
Please help with this.
We got some platforms that need this version (2.5.1) of thermald on Jammy.
And I (originally) expected we could reach -updates no later than 11-04.
Thank you~~

Revision history for this message
Jay Chen (jay-ch) wrote (last edit ):

just to add on the wish list- one OEM platform can be the pilot carrying thermald 2.5.1 forward.
for this goal we need to have thermald 2.5.1 package reach -proposed at least, by 28-Oct

Revision history for this message
koba (kobako) wrote :

@Chris,
I ran thermald 2.5.1 on other platforms, e.g. CML/TGL.

description: updated
Revision history for this message
Colette Kerr (electrocutie85) wrote :

Hmm, I think that when i said that it throttled below 20 fps it was mistaken for some sort of par or expectation. That was meant to show the extent of the throttling I was experiencing not setting an expectation

Might want to change this line:
* the FPS must not be throttled below 20FPS.

To something more like:
* the FPS must not be significantly reduced

Or however you think best to phrase the idea that unless the cooling system can't keep up with the load that performance shouldn't be noticeably impacted

koba (kobako)
description: updated
koba (kobako)
description: updated
Revision history for this message
koba (kobako) wrote (last edit ):

@Chris, the oldest machine I found in the cert lab is Intel 6th gen.
All 6th gen machines are occupied verifying the proposed packages.
I have no idea when they will be freed.

I also found Intel 7th gen (KBL) and Intel 8th gen (CFL) machines, and on them I:
1. installed thermald 2.5.1.
2. started thermald in passive policy or adaptive policy mode.
3. ran stress-ng and used s-tui to monitor.
#stress-ng --matrix 0 -t 5m
#stress-ng --cpu 0 -t 5m

Machine 1, 201801-26082, i9-8950hk, passed.
Machine 2, 201606-22344, i5-7200u, passed.

Currently, Intel 7th, 8th (9th), 10th, 11th and 12th gen are all verified.

Revision history for this message
sheldonwang (shelw) wrote :

@Chris,
Could you advise what the next step we need to take is?
Koba already verified many old Intel systems (as described in #20).
Thanks.

Revision history for this message
Robie Basak (racb) wrote :

This SRU remains blocked on https://lists.ubuntu.com/archives/ubuntu-release/2022-October/005495.html.

For a straight backport to Jammy, if that is the conclusion, then the upload also needs an SRU meta-bug really to document the justification and regression risk mitigation for the backport.

Changed in thermald (Ubuntu Jammy):
status: In Progress → Incomplete
Revision history for this message
Chris Halse Rogers (raof) wrote : Proposed package upload rejected

An upload of thermald to jammy-proposed has been rejected from the upload queue for the following reason: "Please re-upload with an overall "upgrade to 2.5.1" process bug, detailing the testing to be done to detect regressions across hardware".

koba (kobako)
Changed in thermald (Ubuntu):
status: Fix Released → New
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

An upload of thermald to jammy-proposed has been rejected from the upload queue for the following reason: "refers to private bugs".

koba (kobako)
description: updated
description: updated
koba (kobako)
description: updated
Changed in thermald (Ubuntu Jammy):
status: Incomplete → In Progress
Revision history for this message
Chris Halse Rogers (raof) wrote :

This is fixed in v2.5, so Kinetic and above.

Changed in thermald (Ubuntu):
status: New → Fix Released
Changed in thermald (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-jammy
Revision history for this message
Chris Halse Rogers (raof) wrote : Please test proposed package

Hello Colette, or anyone else affected,

Accepted thermald into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/thermald/2.4.9-1ubuntu0.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Revision history for this message
koba (kobako) wrote :

@Colette, would you please enable -proposed and verify against thermald 2.4.9-1ubuntu0.3?
Thanks.

Revision history for this message
koba (kobako) wrote :

Verified with unigine-super of phoronix-test-suite
~~~
$phoronix-test-suite benchmark unigine-super
~~~

I didn't observe a significant difference compared to upstream thermald 2.5.3.
~~~
- thermald 2.4.9-1ubuntu0.3,

    Unigine Superposition 1.0:
    pts/unigine-super-1.0.8 [Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL]
    Test 1 of 1
    Estimated Trial Run Count: 3
    Estimated Time To Completion: 13 Minutes [03:40 CDT]
    Started Run 1 @ 03:28:08
    Started Run 2 @ 03:31:37
    Started Run 3 @ 03:35:03

    Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL:
        4.6
        4.6
        4.6

    Average: 4.6 Frames Per Second
    Maximum: 5.4
    Deviation: 0.00%

    Comparison of 1,233 OpenBenchmarking.org samples since 14 June 2018; median result: 70.5 Frames Per Second. Box plot of samples:
    [ *----------############!###*##*#*#*---------------*-----*--*-*| * ]
      ^ This Result (2nd Percentile): 4.6
                      Arc A770 DG2: 101 ^ RTX 3080: 147 ^ RX 6900 XT: 199 ^
                     Arc A750 DG2: 96 ^ Gigabyte RX 6800 XT: 177 ^
                 RTX 2070 SUPER: 91 ^ RTX 3090: 173 ^
                  RX 5700 XT: 81 ^ Gigabyte RX 6800: 163 ^
~~~
- upstream thermald 2.5.3,

    Unigine Superposition 1.0:
    pts/unigine-super-1.0.8 [Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL]
    Test 1 of 1
    Estimated Trial Run Count: 3
    Estimated Time To Completion: 11 Minutes [03:49 CDT]
    Started Run 1 @ 03:39:42
    Started Run 2 @ 03:43:08
    Started Run 3 @ 03:46:35

    Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL:
        4.7
        4.7
        4.7

    Average: 4.7 Frames Per Second
    Maximum: 5.4
    Deviation: 0.00%

    Comparison of 1,233 OpenBenchmarking.org samples since 14 June 2018; median result: 70.5 Frames Per Second. Box plot of samples:
    [ *----------############!###*##*#*#*---------------*-----*--*-*| * ]
      ^ This Result (2nd Percentile): 4.7
                      Arc A770 DG2: 101 ^ RTX 3080: 147 ^ RX 6900 XT: 199 ^
                     Arc A750 DG2: 96 ^ Gigabyte RX 6800 XT: 177 ^
                 RTX 2070 SUPER: 91 ^ RTX 3090: 173 ^
                  RX 5700 XT: 81 ^ Gigabyte RX 6800: 163 ^
~~~

tags: added: verification-done-jammy
removed: verification-needed-jammy
koba (kobako)
tags: added: verification-done
removed: verification-needed
Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Hi koba, thanks for your update.

I don't know how to interpret it, though, I'm not familiar with that benchmark you ran. The test plan asks for a game (let's call it workload) to be used, and to show that the game is not throttled, i.e., FPS is not reduced.

We could transpose that to "run benchmark and show that the performance is not reduced" I suppose, but then we would need numbers from a run with the previous thermald and the new one. You showed numbers between the proposed thermald and a later upstream one. And did thermald kick in? Was the GPU throttled? The [impact] section of this bug says that thermald was prematurely throttling the GPU. How can you show that now it's not doing it anymore?

And what about test 2?
" * Run on others platform, ADL/TGL/CML/CFL/KBL."

Revision history for this message
koba (kobako) wrote (last edit ):

@Andreas,
For test 1, I actually can't reproduce the issue on my side, so I just showed the benchmark comparison between the proposed and upstream versions.
I can run the previous version to compare.
I still need Colette's help to verify, but haven't received a reply.

For test 2, it is just to run thermald and check whether the CPU is throttled.
Please review this in LP#1995606, #25.

~~~
Verified; didn't observe anything abnormal.
~~~
RPL,
$ sudo apt policy thermald
thermald:
  Installed: 2.4.9-1ubuntu0.3
  Candidate: 2.4.9-1ubuntu0.3
  Version table:
 *** 2.4.9-1ubuntu0.3 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.9-1ubuntu0.2 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
     2.4.9-1 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
$ cat /proc/cpuinfo | head
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 186
model name : 13th Gen Intel(R) Core(TM) i7-13700H
stepping : 2
microcode : 0x4114
cpu MHz : 400.151
cache size : 24576 KB
physical id : 0
~~~
ADL
u@u:~$ sudo apt policy thermald
thermald:
  Installed: 2.4.9-1ubuntu0.3
  Candidate: 2.4.9-1ubuntu0.3
  Version table:
 *** 2.4.9-1ubuntu0.3 500
        500 http://archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.9-1ubuntu0.2 500
        500 http://archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
     2.4.9-1 500
        500 http://archive.ubuntu.com/ubuntu jammy/main amd64 Packages
u@u:~$ cat /proc/cpuinfo | head
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 154
model name : 12th Gen Intel(R) Core(TM) i5-1250P
stepping : 3
microcode : 0x429
cpu MHz : 1178.666
cache size : 12288 KB
physical id : 0

~~~
CML,
$ sudo apt policy thermald
thermald:
  Installed: 2.4.9-1ubuntu0.3
  Candidate: 2.4.9-1ubuntu0.3
  Version table:
 *** 2.4.9-1ubuntu0.3 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.9-1ubuntu0.2 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
     2.4.9-1 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
$ cat /proc/cpuinfo |head
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 165
model name : Intel(R) Core(TM) i9-10880H CPU @ 2.30GHz
stepping : 2
microcode : 0xf4
cpu MHz : 2300.000
cache size : 16384 KB
physical id : 0
~~~
CFL
$ sudo apt policy thermald
thermald:
  Installed: 2.4.9-1ubuntu0.3
  Candidate: 2.4.9-1ubuntu0.3
  Version table:
 *** 2.4.9-1ubuntu0.3 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.9-1ubuntu0.2 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
     2.4.9-1 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
u@u-G3-3779:~$ cat /proc/cpuinfo |head
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 158
model name : Intel(R) Core(TM) i5-8300H CPU @ 2.30GHz
stepping : 10
microcode : 0xf0
cpu MHz : 1000.040
cache size : 8192 KB
physical id : 0
~~~
K...
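
Checking each pasted `apt policy` block by eye is error-prone across many machines. The sketch below shows one way to confirm mechanically that the installed version is the one from jammy-proposed. It is an illustration only: the function names are made up, the parsing assumes the output layout shown in the RPL block above, and the sample text is copied from that block.

```python
import re


def parse_apt_policy(text):
    """Parse `apt policy <pkg>` output into installed/candidate versions and,
    for each version in the version table, the sources it is available from."""
    result = {"installed": None, "candidate": None, "sources": {}}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Installed:"):
            result["installed"] = line.split(":", 1)[1].strip()
        elif line.startswith("Candidate:"):
            result["candidate"] = line.split(":", 1)[1].strip()
        else:
            # Version lines look like "*** 2.4.9-1ubuntu0.3 500" or "2.4.9-1 500".
            version = re.match(r"^(?:\*\*\* )?(\S+) -?\d+$", line)
            if version:
                current = version.group(1)
                result["sources"][current] = []
            elif current and re.match(r"^-?\d+ \S", line):
                # Source lines start with a pin priority; keep the rest.
                result["sources"][current].append(line.split(" ", 1)[1])
    return result


def installed_from(text, pocket):
    """True if the installed version is offered by `pocket` (e.g. 'jammy-proposed')."""
    info = parse_apt_policy(text)
    sources = info["sources"].get(info["installed"], [])
    return any(part.startswith(pocket + "/") for src in sources for part in src.split())


SAMPLE = """thermald:
  Installed: 2.4.9-1ubuntu0.3
  Candidate: 2.4.9-1ubuntu0.3
  Version table:
 *** 2.4.9-1ubuntu0.3 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-proposed/main amd64 Packages
        100 /var/lib/dpkg/status
     2.4.9-1ubuntu0.2 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy-updates/main amd64 Packages
     2.4.9-1 500
        500 http://tw.archive.ubuntu.com/ubuntu jammy/main amd64 Packages
"""

print(installed_from(SAMPLE, "jammy-proposed"))
```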


Revision history for this message
Andreas Hasenack (ahasenack) wrote :

We talked OOB, and I think this would be enough for test 1:
- run some workload (the benchmark might be fine) that, with thermald from jammy (2.4.9-1ubuntu0.2), will throttle the GPU "too soon". I assume there will be some log message about this throttling. Note the benchmark should exercise the GPU, as that is what was reported here in this bug.
- run the same workload with thermald from proposed (2.4.9-1ubuntu0.3). If the throttling is now happening later, which is expected, then we should a) have better benchmark numbers; b) no throttling in the logs, or throttling happening at higher GPU temperatures

Test 2 results are in https://bugs.launchpad.net/ubuntu/+source/thermald/+bug/1995606/comments/25 ? Ok, I'll check, but for the next time please try to keep the bug verifications in the bugs they are meant to, as this upload has 7 associated bugs :)

Revision history for this message
koba (kobako) wrote (last edit ):

Put CPU load and GPU load on simultaneously:
1. CPU load: phoronix-test-suite benchmark compress-7zip
2. GPU load: phoronix-test-suite benchmark unigine-super
   resolution: 2560*, full-screen, ultra quality.

Use nvidia-smi to monitor GPU temperature, performance and utilization:
#sudo nvidia-smi -pm 1, watch -n 1 nvidia-smi

kernel,
~~~
$ uname -a
Linux u 6.0.0-1020-oem #20-Ubuntu SMP PREEMPT_DYNAMIC Fri Jul 14 13:12:17 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
~~~

With 2.4.9-1ubuntu0.2,
the GPU was put into P3 (performance state) and its max power was limited to under 30 W.
The GPU temperature stayed under 70°C.
Even after the CPU load finished, the GPU stayed in P3 and GPU performance remained limited.
(The GPU load was still running after the CPU load finished.)

With 2.4.9-1ubuntu0.3,
I didn't observe any throttling symptom while running the CPU and GPU loads.
The GPU stays in P0 at max power, 80 W.
The GPU temperature goes over 70°C.
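
Watching nvidia-smi by eye works, but the same observation can be recorded and judged from logged samples. The sketch below assumes samples were captured with `nvidia-smi --query-gpu=temperature.gpu,pstate,power.draw --format=csv,noheader,nounits` (that invocation is an assumption; the thread only used `watch -n 1 nvidia-smi`). The function names, the "most samples outside P0" heuristic, and the sample lines are all illustrative, shaped like the observations above rather than taken from real logs.

```python
def parse_sample(line):
    """Parse one CSV line of the form 'temperature, pstate, power'."""
    temp, pstate, power = [field.strip() for field in line.split(",")]
    # Fields a GPU does not support may be non-numeric; this sketch does not handle that.
    return {"temp_c": float(temp), "pstate": pstate, "power_w": float(power)}


def looks_throttled(lines, expected_pstate="P0"):
    """Heuristic: under sustained load, treat the GPU as throttled if more than
    half of the samples are outside the expected performance state."""
    samples = [parse_sample(line) for line in lines if line.strip()]
    if not samples:
        raise ValueError("no samples to evaluate")
    off_state = sum(1 for s in samples if s["pstate"] != expected_pstate)
    return off_state > len(samples) / 2


# Made-up samples resembling the two observations (not real measurements).
OLD_PACKAGE = ["68, P3, 29.10", "69, P3, 29.80", "67, P3, 28.90"]
NEW_PACKAGE = ["74, P0, 79.50", "76, P0, 80.10", "75, P0, 78.90"]

print("2.4.9-1ubuntu0.2 throttled:", looks_throttled(OLD_PACKAGE))
print("2.4.9-1ubuntu0.3 throttled:", looks_throttled(NEW_PACKAGE))
```

A performance state below P0 can also mean the GPU is simply idle, so the heuristic is only meaningful for samples taken while the GPU load is running.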

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for thermald has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package thermald - 2.4.9-1ubuntu0.3

---------------
thermald (2.4.9-1ubuntu0.3) jammy; urgency=medium

  * Cherry-pick following fixes from thermald 2.5.1 and 2.5.2 (LP: #1995606)
  * debian/patches/0013-Add-AlderLake-N.patch
    - Add support for Adler Lake N (LP: #2012260)
  * debian/patches/0007-Add-INT3400-base-path-for-Raptor-Lake.patch
    - Fix RPL: Add INT3400 base path(LP: #1989044)
  * debian/patches/0014-Process-ITMT-v2.patch
    - Support ITMTv2 for Raptor Lake (LP: #2007579)
  * debian/patches/0008-Install-passive-default.patch
    - Fix throttled GPU (LP: #1981087)
  * debian/patches/0012-Always-match-motion-0.patch
    - Fix in-motion function doesn't work (LP: #2018275)
  * debian/patches/0003-Parse-ITMT-Table.patch
  * debian/patches/0004-Add-capability-for-min-max-per-trip.patch
  * debian/patches/0005-Install-ITMT_target.patch
  * debian/patches/0006-Use-per-trip-min-max.patch
  * debian/patches/0009-Parse-idsp-and-trips.patch
  * debian/patches/0010-use-PL1-max-min-from-PPCC-when-policies-match.patch
  * debian/patches/0011-Parse-GDDV-before-thd_engine-init.patch
    - Fix i9-12900k shutdown when run Prime95 and stress-ng (LP: #2018236)

 -- Koba Ko <email address hidden> Wed, 05 Jul 2023 13:37:32 +0200

Changed in thermald (Ubuntu Jammy):
status: Fix Committed → Fix Released