Commit badf1f9

Merge branch 'thermal-intel'
Merge thermal control changes related to Intel platforms for 6.3-rc1:

 - Rework ACPI helper functions for thermal control to retrieve a trip
   point temperature instead of initializing a trip point object
   (Rafael Wysocki).

 - Clean up and improve the int340x thermal driver (Rafael Wysocki).

 - Simplify and clean up the intel_pch thermal driver (Rafael Wysocki).

 - Fix the Intel powerclamp thermal driver and make it use the common
   idle injection framework (Srinivas Pandruvada).

 - Add two module parameters, cpumask and max_idle, to the Intel
   powerclamp thermal driver to allow it to affect only a specific
   subset of CPUs instead of all of them (Srinivas Pandruvada).

 - Make the Intel quark_dts thermal driver use generic trip point
   objects instead of its own trip point representation (Daniel
   Lezcano).

 - Add toctree entry for thermal documents and fix two issues in the
   Intel powerclamp driver documentation (Bagas Sanjaya).

* thermal-intel: (25 commits)
  Documentation: powerclamp: Fix numbered lists formatting
  Documentation: powerclamp: Escape wildcard in cpumask description
  Documentation: admin-guide: Add toctree entry for thermal docs
  thermal: intel: powerclamp: Add two module parameters
  Documentation: admin-guide: Move intel_powerclamp documentation
  thermal: intel: powerclamp: Fix duration module parameter
  thermal: intel: powerclamp: Return last requested state as cur_state
  thermal: intel: quark_dts: Use generic trip points
  thermal: intel: powerclamp: Use powercap idle-inject feature
  powercap: idle_inject: Add update callback
  powercap: idle_inject: Export symbols
  thermal: intel: powerclamp: Fix cur_state for multi package system
  thermal: intel: intel_pch: Drop struct board_info
  thermal: intel: intel_pch: Rename board ID symbols
  thermal: intel: intel_pch: Fold suspend and resume routines into their callers
  thermal: intel: intel_pch: Fold two functions into their callers
  thermal: intel: intel_pch: Eliminate device operations object
  thermal: intel: intel_pch: Rename device operations callbacks
  thermal: intel: intel_pch: Eliminate redundant return pointers
  thermal: intel: intel_pch: Make pch_wpt_add_acpi_psv_trip() return int
  ...
2 parents c3bd6d5 + fef1f0b commit badf1f9

14 files changed: +680 -627 lines changed

Documentation/admin-guide/index.rst

Lines changed: 1 addition & 0 deletions
@@ -116,6 +116,7 @@ configure specific aspects of kernel behavior to your liking.
    svga
    syscall-user-dispatch
    sysrq
+   thermal/index
    thunderbolt
    ufs
    unicode

Documentation/admin-guide/thermal/index.rst

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+=================
+Thermal Subsystem
+=================
+
+.. toctree::
+   :maxdepth: 1
+
+   intel_powerclamp

Documentation/driver-api/thermal/intel_powerclamp.rst renamed to Documentation/admin-guide/thermal/intel_powerclamp.rst

Lines changed: 30 additions & 5 deletions
@@ -26,6 +26,8 @@ By:
 - Generic Thermal Layer (sysfs)
 - Kernel APIs (TBD)
 
+(*) Module Parameters
+
 INTRODUCTION
 ============
 
@@ -153,13 +155,15 @@ b) determine the amount of compensation needed at each target ratio
 Compensation to each target ratio consists of two parts:
 
 a) steady state error compensation
-This is to offset the error occurring when the system can
-enter idle without extra wakeups (such as external interrupts).
+
+This is to offset the error occurring when the system can
+enter idle without extra wakeups (such as external interrupts).
 
 b) dynamic error compensation
-When an excessive amount of wakeups occurs during idle, an
-additional idle ratio can be added to quiet interrupts, by
-slowing down CPU activities.
+
+When an excessive amount of wakeups occurs during idle, an
+additional idle ratio can be added to quiet interrupts, by
+slowing down CPU activities.
 
 A debugfs file is provided for the user to examine compensation
 progress and results, such as on a Westmere system::
@@ -281,6 +285,7 @@ cur_state returns value -1 instead of 0 which is to avoid confusing
 100% busy state with the disabled state.
 
 Example usage:
+
 - To inject 25% idle time::
 
 $ sudo sh -c "echo 25 > /sys/class/thermal/cooling_device80/cur_state
@@ -318,3 +323,23 @@ device, a PID based userspace thermal controller can manage to
 control CPU temperature effectively, when no other thermal influence
 is added. For example, a UltraBook user can compile the kernel under
 certain temperature (below most active trip points).
+
+Module Parameters
+=================
+
+``cpumask`` (RW)
+A bit mask of CPUs to inject idle. The format of the bitmask is same as
+used in other subsystems like in /proc/irq/\*/smp_affinity. The mask is
+comma separated 32 bit groups. Each CPU is one bit. For example for a 256
+CPU system the full mask is:
+ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff,ffffffff
+
+The rightmost mask is for CPU 0-32.
+
+``max_idle`` (RW)
+Maximum injected idle time to the total CPU time ratio in percent range
+from 1 to 100. Even if the cooling device max_state is always 100 (100%),
+this parameter allows to add a max idle percent limit. The default is 50,
+to match the current implementation of powerclamp driver. Also doesn't
+allow value more than 75, if the cpumask includes every CPU present in
+the system.
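
As a rough illustration of the two module parameters documented in this hunk, the userspace sketch below builds a full cpumask string in the comma-separated 32-bit group format and checks a max_idle value against the documented limits. The helper names format_full_cpumask() and max_idle_ok() are hypothetical and are not part of the powerclamp driver; this is not kernel code. Since each group holds 32 bits, the rightmost group corresponds to CPUs 0 through 31.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the mask selecting CPUs 0..ncpus-1 as
 * comma-separated 32-bit hex groups, most significant group first.
 * Returns 0 on success, -1 if ncpus is 0 or the buffer is too small.
 */
static int format_full_cpumask(unsigned int ncpus, char *buf, size_t len)
{
	unsigned int groups = (ncpus + 31) / 32;
	size_t used = 0;

	assert(buf != NULL);
	if (ncpus == 0 || len == 0)
		return -1;

	for (unsigned int g = groups; g-- > 0; ) {
		/* Only the topmost group can hold fewer than 32 CPUs. */
		unsigned int bits = ncpus - g * 32;
		unsigned int word;
		int n;

		if (bits > 32)
			bits = 32;
		word = (bits == 32) ? 0xffffffffu : (1u << bits) - 1;

		n = snprintf(buf + used, len - used, "%s%08x",
			     used ? "," : "", word);
		if (n < 0 || (size_t)n >= len - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}

/*
 * Hypothetical check mirroring the documented limits: 1..100 in general,
 * but no more than 75 when the mask includes every CPU present.
 */
static bool max_idle_ok(unsigned int max_idle, bool mask_covers_all_cpus)
{
	if (max_idle < 1 || max_idle > 100)
		return false;
	if (mask_covers_all_cpus && max_idle > 75)
		return false;
	return true;
}
```

For a 256-CPU system, format_full_cpumask() produces the eight-group string quoted in the documentation; for 40 CPUs it produces 000000ff,ffffffff, where the rightmost group covers CPUs 0 through 31 and the leftmost covers CPUs 32 through 39.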

Documentation/driver-api/thermal/index.rst

Lines changed: 0 additions & 1 deletion
@@ -14,7 +14,6 @@ Thermal
 
    exynos_thermal
    exynos_thermal_emulation
-   intel_powerclamp
    nouveau_thermal
    x86_pkg_temperature_thermal
    intel_dptf

MAINTAINERS

Lines changed: 1 addition & 0 deletions
@@ -20707,6 +20707,7 @@ S: Supported
 Q: https://patchwork.kernel.org/project/linux-pm/list/
 T: git git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git thermal
 F: Documentation/ABI/testing/sysfs-class-thermal
+F: Documentation/admin-guide/thermal/
 F: Documentation/devicetree/bindings/thermal/
 F: Documentation/driver-api/thermal/
 F: drivers/thermal/

drivers/powercap/idle_inject.c

Lines changed: 53 additions & 6 deletions
@@ -63,13 +63,29 @@ struct idle_inject_thread {
  * @idle_duration_us: duration of CPU idle time to inject
  * @run_duration_us: duration of CPU run time to allow
  * @latency_us: max allowed latency
+ * @update: Optional callback deciding whether or not to skip idle
+ *		injection in the given cycle.
  * @cpumask: mask of CPUs affected by idle injection
+ *
+ * This structure is used to define per instance idle inject device data. Each
+ * instance has an idle duration, a run duration and mask of CPUs to inject
+ * idle.
+ *
+ * Actual CPU idle time is injected by calling kernel scheduler interface
+ * play_idle_precise(). There is one optional callback that can be registered
+ * by calling idle_inject_register_full():
+ *
+ * update() - This callback is invoked just before waking up CPUs to inject
+ * idle. If it returns false, CPUs are not woken up to inject idle in the given
+ * cycle. It also allows the caller to readjust the idle and run duration by
+ * calling idle_inject_set_duration() for the next cycle.
  */
 struct idle_inject_device {
 	struct hrtimer timer;
 	unsigned int idle_duration_us;
 	unsigned int run_duration_us;
 	unsigned int latency_us;
+	bool (*update)(void);
 	unsigned long cpumask[];
 };
 
@@ -111,11 +127,12 @@ static enum hrtimer_restart idle_inject_timer_fn(struct hrtimer *timer)
 	struct idle_inject_device *ii_dev =
 		container_of(timer, struct idle_inject_device, timer);
 
+	if (!ii_dev->update || (ii_dev->update && ii_dev->update()))
+		idle_inject_wakeup(ii_dev);
+
 	duration_us = READ_ONCE(ii_dev->run_duration_us);
 	duration_us += READ_ONCE(ii_dev->idle_duration_us);
 
-	idle_inject_wakeup(ii_dev);
-
 	hrtimer_forward_now(timer, ns_to_ktime(duration_us * NSEC_PER_USEC));
 
 	return HRTIMER_RESTART;
@@ -160,6 +177,7 @@ void idle_inject_set_duration(struct idle_inject_device *ii_dev,
 		WRITE_ONCE(ii_dev->idle_duration_us, idle_duration_us);
 	}
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_set_duration, IDLE_INJECT);
 
 /**
  * idle_inject_get_duration - idle and run duration retrieval helper
@@ -174,6 +192,7 @@ void idle_inject_get_duration(struct idle_inject_device *ii_dev,
 	*run_duration_us = READ_ONCE(ii_dev->run_duration_us);
 	*idle_duration_us = READ_ONCE(ii_dev->idle_duration_us);
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_get_duration, IDLE_INJECT);
 
 /**
  * idle_inject_set_latency - set the maximum latency allowed
@@ -185,6 +204,7 @@ void idle_inject_set_latency(struct idle_inject_device *ii_dev,
 {
 	WRITE_ONCE(ii_dev->latency_us, latency_us);
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_set_latency, IDLE_INJECT);
 
 /**
  * idle_inject_start - start idle injections
@@ -216,6 +236,7 @@ int idle_inject_start(struct idle_inject_device *ii_dev)
 
 	return 0;
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_start, IDLE_INJECT);
 
 /**
  * idle_inject_stop - stops idle injections
@@ -262,6 +283,7 @@ void idle_inject_stop(struct idle_inject_device *ii_dev)
 
 	cpu_hotplug_enable();
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_stop, IDLE_INJECT);
 
 /**
  * idle_inject_setup - prepare the current task for idle injection
@@ -290,17 +312,22 @@ static int idle_inject_should_run(unsigned int cpu)
 }
 
 /**
- * idle_inject_register - initialize idle injection on a set of CPUs
+ * idle_inject_register_full - initialize idle injection on a set of CPUs
  * @cpumask: CPUs to be affected by idle injection
+ * @update: This callback is called just before waking up CPUs to inject
+ * idle
  *
  * This function creates an idle injection control device structure for the
- * given set of CPUs and initializes the timer associated with it. It does not
- * start any injection cycles.
+ * given set of CPUs and initializes the timer associated with it. This
+ * function also allows to register update()callback.
+ * It does not start any injection cycles.
  *
  * Return: NULL if memory allocation fails, idle injection control device
  * pointer on success.
  */
-struct idle_inject_device *idle_inject_register(struct cpumask *cpumask)
+
+struct idle_inject_device *idle_inject_register_full(struct cpumask *cpumask,
+						     bool (*update)(void))
 {
 	struct idle_inject_device *ii_dev;
 	int cpu, cpu_rb;
@@ -313,6 +340,7 @@ struct idle_inject_device *idle_inject_register(struct cpumask *cpumask)
 	hrtimer_init(&ii_dev->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	ii_dev->timer.function = idle_inject_timer_fn;
 	ii_dev->latency_us = UINT_MAX;
+	ii_dev->update = update;
 
 	for_each_cpu(cpu, to_cpumask(ii_dev->cpumask)) {
 
@@ -337,6 +365,24 @@ struct idle_inject_device *idle_inject_register(struct cpumask *cpumask)
 
 	return NULL;
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_register_full, IDLE_INJECT);
+
+/**
+ * idle_inject_register - initialize idle injection on a set of CPUs
+ * @cpumask: CPUs to be affected by idle injection
+ *
+ * This function creates an idle injection control device structure for the
+ * given set of CPUs and initializes the timer associated with it. It does not
+ * start any injection cycles.
+ *
+ * Return: NULL if memory allocation fails, idle injection control device
+ * pointer on success.
+ */
+struct idle_inject_device *idle_inject_register(struct cpumask *cpumask)
+{
+	return idle_inject_register_full(cpumask, NULL);
+}
+EXPORT_SYMBOL_NS_GPL(idle_inject_register, IDLE_INJECT);
 
 /**
  * idle_inject_unregister - unregister idle injection control device
@@ -357,6 +403,7 @@ void idle_inject_unregister(struct idle_inject_device *ii_dev)
 
 	kfree(ii_dev);
 }
+EXPORT_SYMBOL_NS_GPL(idle_inject_unregister, IDLE_INJECT);
 
 static struct smp_hotplug_thread idle_inject_threads = {
 	.store = &idle_inject_thread.tsk,
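
The optional update() callback added to idle_inject.c can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct ii_model and the ii_model_*() functions are hypothetical stand-ins for struct idle_inject_device, idle_inject_register_full(), idle_inject_register() and the check in idle_inject_timer_fn(), and a counter replaces the real CPU wakeup. It also shows that the condition in the diff is logically equivalent to the shorter `!update || update()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for struct idle_inject_device (fields reduced). */
struct ii_model {
	bool (*update)(void);	/* optional; NULL means "always inject" */
	unsigned int wakeups;	/* cycles in which CPUs would be woken */
};

/* Models idle_inject_register_full(): store the optional callback. */
static void ii_model_init_full(struct ii_model *ii, bool (*update)(void))
{
	assert(ii != NULL);
	ii->update = update;
	ii->wakeups = 0;
}

/* Models idle_inject_register(): the same setup with no callback. */
static void ii_model_init(struct ii_model *ii)
{
	ii_model_init_full(ii, NULL);
}

/*
 * Models the check in idle_inject_timer_fn(). The diff writes
 * "!update || (update && update())"; because the right-hand side of ||
 * is only evaluated when update is non-NULL, this reduces to the form
 * used here.
 */
static bool ii_model_tick(struct ii_model *ii)
{
	if (!ii->update || ii->update()) {
		ii->wakeups++;
		return true;
	}
	return false;
}

/* Example callback (hypothetical): skip every other cycle. */
static bool skip_odd_cycles(void)
{
	static unsigned int cycle;

	return (cycle++ % 2) == 0;
}
```

A device set up with ii_model_init() is woken on every tick, while one set up with ii_model_init_full() and skip_odd_cycles is woken on every second tick. In both cases the timer would still be re-armed, as in the diff, where hrtimer_forward_now() runs regardless of the callback's return value.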

drivers/thermal/intel/Kconfig

Lines changed: 3 additions & 0 deletions
@@ -3,6 +3,9 @@ config INTEL_POWERCLAMP
 	tristate "Intel PowerClamp idle injection driver"
 	depends on X86
 	depends on CPU_SUP_INTEL
+	depends on CPU_IDLE
+	select POWERCAP
+	select IDLE_INJECT
 	help
 	  Enable this to enable Intel PowerClamp idle injection driver. This
 	  enforce idle time which results in more package C-state residency. The
