Commit 02824a5

Merge tag 'pm-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:

 "By the number of new lines of code, the most visible change here is the
  addition of hybrid CPU capacity scaling support to the intel_pstate driver.

  Next are the amd-pstate driver changes related to the calculation of the
  AMD boost numerator and preferred core detection.

  As far as new hardware support is concerned, the intel_idle driver will now
  handle Granite Rapids Xeon processors natively, the intel_rapl power capping
  driver will recognize family 1Ah of AMD processors and Intel ArrowLake-U
  chips, and intel_pstate will handle Granite Rapids and Sierra Forest chips
  in the out-of-band (OOB) mode.

  Apart from the above, there is the usual collection of assorted fixes and
  code cleanups in many places and there are tooling updates.

  Specifics:

   - Remove LATENCY_MULTIPLIER from cpufreq (Qais Yousef)

   - Add support for Granite Rapids and Sierra Forest in OOB mode to the
     intel_pstate cpufreq driver (Srinivas Pandruvada)

   - Add basic support for CPU capacity scaling on x86 and make the
     intel_pstate driver set asymmetric CPU capacity on hybrid systems
     without SMT (Rafael Wysocki)

   - Add missing MODULE_DESCRIPTION() macros to the powerpc cpufreq driver
     (Jeff Johnson)

   - Several OF related cleanups in cpufreq drivers (Rob Herring)

   - Enable COMPILE_TEST for ARM drivers (Rob Herring)

   - Introduce quirks for syscon failures and use socinfo to get revision
     for TI cpufreq driver (Dhruva Gole, Nishanth Menon)

   - Minor cleanups in amd-pstate driver (Anastasia Belova, Dhananjay
     Ugwekar)

   - Minor cleanups for loongson, cpufreq-dt and powernv cpufreq drivers
     (Danila Tikhonov, Huacai Chen, and Liu Jing)

   - Make amd-pstate validate return of any attempt to update EPP limits,
     which fixes the masking of hardware problems (Mario Limonciello)

   - Move the calculation of the AMD boost numerator outside of amd-pstate,
     correcting acpi-cpufreq on systems with preferred cores (Mario
     Limonciello)

   - Harden preferred core detection in amd-pstate to avoid potential false
     positives (Mario Limonciello)

   - Add extra unit test coverage for mode state machine (Mario Limonciello)

   - Fix an "Uninitialized variables" issue in amd-pstate (Qianqiang Liu)

   - Add Granite Rapids Xeon support to intel_idle (Artem Bityutskiy)

   - Disable promotion to C1E on Jasper Lake and Elkhart Lake in intel_idle
     (Kai-Heng Feng)

   - Use scoped device node handling to fix missing of_node_put() and
     simplify walking OF children in the riscv-sbi cpuidle driver (Krzysztof
     Kozlowski)

   - Remove dead code from cpuidle_enter_state() (Dhruva Gole)

   - Change an error pointer to NULL to fix error handling in the intel_rapl
     power capping driver (Dan Carpenter)

   - Fix off by one in get_rpi() in the intel_rapl power capping driver (Dan
     Carpenter)

   - Add support for ArrowLake-U to the intel_rapl power capping driver
     (Sumeet Pawnikar)

   - Fix the energy-pkg event for AMD CPUs in the intel_rapl power capping
     driver (Dhananjay Ugwekar)

   - Add support for AMD family 1Ah processors to the intel_rapl power
     capping driver (Dhananjay Ugwekar)

   - Remove unused stub for saveable_highmem_page() and remove deprecated
     macros from power management documentation (Andy Shevchenko)

   - Use sysfs_emit() and sysfs_emit_at() in "show" functions in the PM
     sysfs interface (Xueqin Luo)

   - Update the maintainers information for the operating-points-v2-ti-cpu
     DT binding (Dhruva Gole)

   - Drop unnecessary of_match_ptr() from ti-opp-supply (Rob Herring)

   - Add missing MODULE_DESCRIPTION() macros to devfreq governors (Jeff
     Johnson)

   - Use devm_clk_get_enabled() in the exynos-bus devfreq driver (Anand Moon)

   - Use of_property_present() instead of of_get_property() in the imx-bus
     devfreq driver (Rob Herring)

   - Update directory handling and installation process in the pm-graph
     Makefile and add .gitignore to ignore sleepgraph.py artifacts to
     pm-graph (Amit Vadhavana, Yo-Jung Lin)

   - Make cpupower display residency value in idle-info (Aboorva Devarajan)

   - Add missing powercap_set_enabled() stub function to cpupower (John B.
     Wyatt IV)

   - Add SWIG support to cpupower (John B. Wyatt IV)"

* tag 'pm-6.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (62 commits)
  cpufreq/amd-pstate-ut: Fix an "Uninitialized variables" issue
  cpufreq/amd-pstate-ut: Add test case for mode switches
  cpufreq/amd-pstate: Export symbols for changing modes
  amd-pstate: Add missing documentation for `amd_pstate_prefcore_ranking`
  cpufreq: amd-pstate: Add documentation for `amd_pstate_hw_prefcore`
  cpufreq: amd-pstate: Optimize amd_pstate_update_limits()
  cpufreq: amd-pstate: Merge amd_pstate_highest_perf_set() into amd_get_boost_ratio_numerator()
  x86/amd: Detect preferred cores in amd_get_boost_ratio_numerator()
  x86/amd: Move amd_get_highest_perf() out of amd-pstate
  ACPI: CPPC: Adjust debug messages in amd_set_max_freq_ratio() to warn
  ACPI: CPPC: Drop check for non zero perf ratio
  x86/amd: Rename amd_get_highest_perf() to amd_get_boost_ratio_numerator()
  ACPI: CPPC: Adjust return code for inline functions in !CONFIG_ACPI_CPPC_LIB
  x86/amd: Move amd_get_highest_perf() from amd.c to cppc.c
  PM: hibernate: Remove unused stub for saveable_highmem_page()
  pm:cpupower: Add error warning when SWIG is not installed
  MAINTAINERS: Add Maintainers for SWIG Python bindings
  pm:cpupower: Include test_raw_pylibcpupower.py
  pm:cpupower: Add SWIG bindings files for libcpupower
  pm:cpupower: Add missing powercap_set_enabled() stub function
  ...
2 parents 11b3125 + 0a06811 commit 02824a5


63 files changed: +1416 / -407 lines

Documentation/admin-guide/pm/amd-pstate.rst

Lines changed: 14 additions & 1 deletion

@@ -251,7 +251,9 @@ performance supported in `AMD CPPC Performance Capability <perf_cap_>`_).
 In some ASICs, the highest CPPC performance is not the one in the ``_CPC``
 table, so we need to expose it to sysfs. If boost is not active, but
 still supported, this maximum frequency will be larger than the one in
-``cpuinfo``.
+``cpuinfo``. On systems that support preferred core, the driver will have
+different values for some cores than others and this will reflect the values
+advertised by the platform at bootup.
 This attribute is read-only.

 ``amd_pstate_lowest_nonlinear_freq``
@@ -262,6 +264,17 @@ lowest non-linear performance in `AMD CPPC Performance Capability
 <perf_cap_>`_.)
 This attribute is read-only.

+``amd_pstate_hw_prefcore``
+
+Whether the platform supports the preferred core feature and it has been
+enabled. This attribute is read-only.
+
+``amd_pstate_prefcore_ranking``
+
+The performance ranking of the core. This number doesn't have any unit, but
+larger numbers are preferred at the time of reading. This can change at
+runtime based on platform conditions. This attribute is read-only.
+
 ``energy_performance_available_preferences``

 A list of all the supported EPP preferences that could be used for
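
A minimal userspace sketch (not part of this commit) of how the new per-policy ranking attribute could be consumed; the sysfs path and the fixed CPU count are assumptions, and the attribute only exists on kernels carrying this series:

/* Hypothetical example: print amd_pstate_prefcore_ranking for the first 8 CPUs. */
#include <stdio.h>

int main(void)
{
	char path[128];
	unsigned int cpu, rank;

	for (cpu = 0; cpu < 8; cpu++) {		/* assumed CPU count */
		FILE *f;

		snprintf(path, sizeof(path),
			 "/sys/devices/system/cpu/cpu%u/cpufreq/amd_pstate_prefcore_ranking",
			 cpu);
		f = fopen(path, "r");
		if (!f)
			continue;	/* attribute absent on this kernel or CPU */
		if (fscanf(f, "%u", &rank) == 1)
			printf("cpu%u ranking: %u\n", cpu, rank);
		fclose(f);
	}
	return 0;
}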

Documentation/devicetree/bindings/opp/operating-points-v2-ti-cpu.yaml

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@ description:
   the hardware description for the scheme mentioned above.

 maintainers:
-  - Nishanth Menon <nm@ti.com>
+  - Dhruva Gole <d-gole@ti.com>

 allOf:
   - $ref: opp-v2-base.yaml#

Documentation/power/pci.rst

Lines changed: 5 additions & 6 deletions

@@ -979,18 +979,17 @@ subsections can be defined as a separate function, it often is convenient to
 point two or more members of struct dev_pm_ops to the same routine. There are
 a few convenience macros that can be used for this purpose.

-The SIMPLE_DEV_PM_OPS macro declares a struct dev_pm_ops object with one
+The DEFINE_SIMPLE_DEV_PM_OPS() declares a struct dev_pm_ops object with one
 suspend routine pointed to by the .suspend(), .freeze(), and .poweroff()
 members and one resume routine pointed to by the .resume(), .thaw(), and
 .restore() members. The other function pointers in this struct dev_pm_ops are
 unset.

-The UNIVERSAL_DEV_PM_OPS macro is similar to SIMPLE_DEV_PM_OPS, but it
-additionally sets the .runtime_resume() pointer to the same value as
-.resume() (and .thaw(), and .restore()) and the .runtime_suspend() pointer to
-the same value as .suspend() (and .freeze() and .poweroff()).
+The DEFINE_RUNTIME_DEV_PM_OPS() is similar to DEFINE_SIMPLE_DEV_PM_OPS(), but it
+additionally sets the .runtime_resume() pointer to pm_runtime_force_resume()
+and the .runtime_suspend() pointer to pm_runtime_force_suspend().

-The SET_SYSTEM_SLEEP_PM_OPS can be used inside of a declaration of struct
+The SYSTEM_SLEEP_PM_OPS() can be used inside of a declaration of struct
 dev_pm_ops to indicate that one suspend routine is to be pointed to by the
 .suspend(), .freeze(), and .poweroff() members and one resume routine is to
 be pointed to by the .resume(), .thaw(), and .restore() members.
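
A hedged sketch (not part of this patch) of a driver using the macro described above; foo_suspend(), foo_resume() and foo_driver are hypothetical names. Passing NULL as the last argument simply leaves the .runtime_idle() slot unset:

#include <linux/pm_runtime.h>
#include <linux/platform_device.h>

static int foo_suspend(struct device *dev)
{
	/* quiesce the device and save context */
	return 0;
}

static int foo_resume(struct device *dev)
{
	/* restore context and re-enable the device */
	return 0;
}

/*
 * System-sleep callbacks point to foo_suspend()/foo_resume(); the runtime-PM
 * slots get pm_runtime_force_suspend()/pm_runtime_force_resume().
 */
static DEFINE_RUNTIME_DEV_PM_OPS(foo_pm_ops, foo_suspend, foo_resume, NULL);

static struct platform_driver foo_driver = {
	.driver = {
		.name	= "foo",
		.pm	= pm_ptr(&foo_pm_ops),
	},
};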

Documentation/power/runtime_pm.rst

Lines changed: 2 additions & 2 deletions

@@ -811,8 +811,8 @@ subsystem-level dev_pm_ops structure.

 Device drivers that wish to use the same function as a system suspend, freeze,
 poweroff and runtime suspend callback, and similarly for system resume, thaw,
-restore, and runtime resume, can achieve this with the help of the
-UNIVERSAL_DEV_PM_OPS macro defined in include/linux/pm.h (possibly setting its
+restore, and runtime resume, can achieve similar behaviour with the help of the
+DEFINE_RUNTIME_DEV_PM_OPS() defined in include/linux/pm_runtime.h (possibly setting its
 last argument to NULL).

 8. "No-Callback" Devices

MAINTAINERS

Lines changed: 3 additions & 0 deletions

@@ -5851,6 +5851,9 @@ CPU POWER MONITORING SUBSYSTEM
 M: Thomas Renninger <[email protected]>
 M: Shuah Khan <[email protected]>
 M: Shuah Khan <[email protected]>
+M: John B. Wyatt IV <[email protected]>
+M: John B. Wyatt IV <[email protected]>
+M: John Kacur <[email protected]>

 S: Maintained
 F: tools/power/cpupower/

arch/x86/include/asm/processor.h

Lines changed: 0 additions & 3 deletions

@@ -691,8 +691,6 @@ static inline u32 per_cpu_l2c_id(unsigned int cpu)
 }

 #ifdef CONFIG_CPU_SUP_AMD
-extern u32 amd_get_highest_perf(void);
-
 /*
  * Issue a DIV 0/1 insn to clear any division data from previous DIV
  * operations.
@@ -705,7 +703,6 @@ static __always_inline void amd_clear_divider(void)

 extern void amd_check_microcode(void);
 #else
-static inline u32 amd_get_highest_perf(void) { return 0; }
 static inline void amd_clear_divider(void) { }
 static inline void amd_check_microcode(void) { }
 #endif

arch/x86/include/asm/topology.h

Lines changed: 13 additions & 0 deletions

@@ -282,9 +282,22 @@ static inline long arch_scale_freq_capacity(int cpu)
 }
 #define arch_scale_freq_capacity arch_scale_freq_capacity

+bool arch_enable_hybrid_capacity_scale(void);
+void arch_set_cpu_capacity(int cpu, unsigned long cap, unsigned long max_cap,
+			   unsigned long cap_freq, unsigned long base_freq);
+
+unsigned long arch_scale_cpu_capacity(int cpu);
+#define arch_scale_cpu_capacity arch_scale_cpu_capacity
+
 extern void arch_set_max_freq_ratio(bool turbo_disabled);
 extern void freq_invariance_set_perf_ratio(u64 ratio, bool turbo_disabled);
 #else
+static inline bool arch_enable_hybrid_capacity_scale(void) { return false; }
+static inline void arch_set_cpu_capacity(int cpu, unsigned long cap,
+					 unsigned long max_cap,
+					 unsigned long cap_freq,
+					 unsigned long base_freq) { }
+
 static inline void arch_set_max_freq_ratio(bool turbo_disabled) { }
 static inline void freq_invariance_set_perf_ratio(u64 ratio, bool turbo_disabled) { }
 #endif
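
An illustrative sketch (not taken from intel_pstate) of how a cpufreq driver might drive the hooks declared above; hybrid_max_perf() and hybrid_base_perf() are made-up helpers returning raw per-CPU performance levels:

#include <linux/cpumask.h>
#include <asm/topology.h>

static void fictional_set_hybrid_capacities(unsigned long max_cap_perf)
{
	int cpu;

	/* Bail out if the architecture cannot do per-CPU capacity scaling. */
	if (!arch_enable_hybrid_capacity_scale())
		return;

	for_each_online_cpu(cpu) {
		unsigned long cap_perf = hybrid_max_perf(cpu);		/* hypothetical */
		unsigned long base_perf = hybrid_base_perf(cpu);	/* hypothetical */

		/*
		 * The arch side presumably scales cap_perf against max_cap_perf
		 * into a SCHED_CAPACITY_SCALE-relative capacity and uses the
		 * cap_freq/base_freq pair for frequency invariance.
		 */
		arch_set_cpu_capacity(cpu, cap_perf, max_cap_perf,
				      cap_perf, base_perf);
	}
}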

arch/x86/kernel/acpi/cppc.c

Lines changed: 161 additions & 11 deletions

@@ -9,6 +9,17 @@
 #include <asm/processor.h>
 #include <asm/topology.h>

+#define CPPC_HIGHEST_PERF_PERFORMANCE	196
+#define CPPC_HIGHEST_PERF_PREFCORE	166
+
+enum amd_pref_core {
+	AMD_PREF_CORE_UNKNOWN = 0,
+	AMD_PREF_CORE_SUPPORTED,
+	AMD_PREF_CORE_UNSUPPORTED,
+};
+static enum amd_pref_core amd_pref_core_detected;
+static u64 boost_numerator;
+
 /* Refer to drivers/acpi/cppc_acpi.c for the description of functions */

 bool cpc_supported_by_cpu(void)
@@ -69,31 +80,30 @@ int cpc_write_ffh(int cpunum, struct cpc_reg *reg, u64 val)
 static void amd_set_max_freq_ratio(void)
 {
 	struct cppc_perf_caps perf_caps;
-	u64 highest_perf, nominal_perf;
+	u64 numerator, nominal_perf;
 	u64 perf_ratio;
 	int rc;

 	rc = cppc_get_perf_caps(0, &perf_caps);
 	if (rc) {
-		pr_debug("Could not retrieve perf counters (%d)\n", rc);
+		pr_warn("Could not retrieve perf counters (%d)\n", rc);
 		return;
 	}

-	highest_perf = amd_get_highest_perf();
+	rc = amd_get_boost_ratio_numerator(0, &numerator);
+	if (rc) {
+		pr_warn("Could not retrieve highest performance (%d)\n", rc);
+		return;
+	}
 	nominal_perf = perf_caps.nominal_perf;

-	if (!highest_perf || !nominal_perf) {
-		pr_debug("Could not retrieve highest or nominal performance\n");
+	if (!nominal_perf) {
+		pr_warn("Could not retrieve nominal performance\n");
 		return;
 	}

-	perf_ratio = div_u64(highest_perf * SCHED_CAPACITY_SCALE, nominal_perf);
 	/* midpoint between max_boost and max_P */
-	perf_ratio = (perf_ratio + SCHED_CAPACITY_SCALE) >> 1;
-	if (!perf_ratio) {
-		pr_debug("Non-zero highest/nominal perf values led to a 0 ratio\n");
-		return;
-	}
+	perf_ratio = (div_u64(numerator * SCHED_CAPACITY_SCALE, nominal_perf) + SCHED_CAPACITY_SCALE) >> 1;

 	freq_invariance_set_perf_ratio(perf_ratio, false);
 }
@@ -116,3 +126,143 @@ void init_freq_invariance_cppc(void)
 	init_done = true;
 	mutex_unlock(&freq_invariance_lock);
 }
+
+/*
+ * Get the highest performance register value.
+ * @cpu: CPU from which to get highest performance.
+ * @highest_perf: Return address for highest performance value.
+ *
+ * Return: 0 for success, negative error code otherwise.
+ */
+int amd_get_highest_perf(unsigned int cpu, u32 *highest_perf)
+{
+	u64 val;
+	int ret;
+
+	if (cpu_feature_enabled(X86_FEATURE_CPPC)) {
+		ret = rdmsrl_safe_on_cpu(cpu, MSR_AMD_CPPC_CAP1, &val);
+		if (ret)
+			goto out;
+
+		val = AMD_CPPC_HIGHEST_PERF(val);
+	} else {
+		ret = cppc_get_highest_perf(cpu, &val);
+		if (ret)
+			goto out;
+	}
+
+	WRITE_ONCE(*highest_perf, (u32)val);
+out:
+	return ret;
+}
+EXPORT_SYMBOL_GPL(amd_get_highest_perf);
+
+/**
+ * amd_detect_prefcore: Detect if CPUs in the system support preferred cores
+ * @detected: Output variable for the result of the detection.
+ *
+ * Determine whether CPUs in the system support preferred cores. On systems
+ * that support preferred cores, different highest perf values will be found
+ * on different cores. On other systems, the highest perf value will be the
+ * same on all cores.
+ *
+ * The result of the detection will be stored in the 'detected' parameter.
+ *
+ * Return: 0 for success, negative error code otherwise
+ */
+int amd_detect_prefcore(bool *detected)
+{
+	int cpu, count = 0;
+	u64 highest_perf[2] = {0};
+
+	if (WARN_ON(!detected))
+		return -EINVAL;
+
+	switch (amd_pref_core_detected) {
+	case AMD_PREF_CORE_SUPPORTED:
+		*detected = true;
+		return 0;
+	case AMD_PREF_CORE_UNSUPPORTED:
+		*detected = false;
+		return 0;
+	default:
+		break;
+	}
+
+	for_each_present_cpu(cpu) {
+		u32 tmp;
+		int ret;
+
+		ret = amd_get_highest_perf(cpu, &tmp);
+		if (ret)
+			return ret;
+
+		if (!count || (count == 1 && tmp != highest_perf[0]))
+			highest_perf[count++] = tmp;
+
+		if (count == 2)
+			break;
+	}
+
+	*detected = (count == 2);
+	boost_numerator = highest_perf[0];
+
+	amd_pref_core_detected = *detected ? AMD_PREF_CORE_SUPPORTED :
+					     AMD_PREF_CORE_UNSUPPORTED;
+
+	pr_debug("AMD CPPC preferred core is %ssupported (highest perf: 0x%llx)\n",
+		 *detected ? "" : "un", highest_perf[0]);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(amd_detect_prefcore);
+
+/**
+ * amd_get_boost_ratio_numerator: Get the numerator to use for boost ratio calculation
+ * @cpu: CPU to get numerator for.
+ * @numerator: Output variable for numerator.
+ *
+ * Determine the numerator to use for calculating the boost ratio on
+ * a CPU. On systems that support preferred cores, this will be a hardcoded
+ * value. On other systems this will the highest performance register value.
+ *
+ * If booting the system with amd-pstate enabled but preferred cores disabled then
+ * the correct boost numerator will be returned to match hardware capabilities
+ * even if the preferred cores scheduling hints are not enabled.
+ *
+ * Return: 0 for success, negative error code otherwise.
+ */
+int amd_get_boost_ratio_numerator(unsigned int cpu, u64 *numerator)
+{
+	bool prefcore;
+	int ret;
+
+	ret = amd_detect_prefcore(&prefcore);
+	if (ret)
+		return ret;
+
+	/* without preferred cores, return the highest perf register value */
+	if (!prefcore) {
+		*numerator = boost_numerator;
+		return 0;
+	}
+
+	/*
+	 * For AMD CPUs with Family ID 19H and Model ID range 0x70 to 0x7f,
+	 * the highest performance level is set to 196.
+	 * https://bugzilla.kernel.org/show_bug.cgi?id=218759
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_ZEN4)) {
+		switch (boot_cpu_data.x86_model) {
+		case 0x70 ... 0x7f:
+			*numerator = CPPC_HIGHEST_PERF_PERFORMANCE;
+			return 0;
+		default:
+			break;
+		}
+	}
+	*numerator = CPPC_HIGHEST_PERF_PREFCORE;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(amd_get_boost_ratio_numerator);
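
To make the midpoint calculation in amd_set_max_freq_ratio() above concrete, here is a standalone arithmetic sketch with made-up numbers (196 and 120 are examples only, not values taken from this commit):

#include <stdio.h>
#include <stdint.h>

#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(1ULL << SCHED_CAPACITY_SHIFT)	/* 1024 */

int main(void)
{
	uint64_t numerator = 196;	/* e.g. CPPC_HIGHEST_PERF_PERFORMANCE */
	uint64_t nominal_perf = 120;	/* example nominal perf from the _CPC table */

	/* boost-to-nominal ratio, scaled by 1024: 196 * 1024 / 120 = 1672 */
	uint64_t boost_ratio = numerator * SCHED_CAPACITY_SCALE / nominal_perf;

	/* midpoint between max boost and max P, as in the kernel code: 1348 */
	uint64_t perf_ratio = (boost_ratio + SCHED_CAPACITY_SCALE) >> 1;

	printf("boost_ratio=%llu perf_ratio=%llu\n",
	       (unsigned long long)boost_ratio, (unsigned long long)perf_ratio);
	return 0;
}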

arch/x86/kernel/cpu/amd.c

Lines changed: 0 additions & 16 deletions

@@ -1190,22 +1190,6 @@ unsigned long amd_get_dr_addr_mask(unsigned int dr)
 }
 EXPORT_SYMBOL_GPL(amd_get_dr_addr_mask);

-u32 amd_get_highest_perf(void)
-{
-	struct cpuinfo_x86 *c = &boot_cpu_data;
-
-	if (c->x86 == 0x17 && ((c->x86_model >= 0x30 && c->x86_model < 0x40) ||
-			       (c->x86_model >= 0x70 && c->x86_model < 0x80)))
-		return 166;
-
-	if (c->x86 == 0x19 && ((c->x86_model >= 0x20 && c->x86_model < 0x30) ||
-			       (c->x86_model >= 0x40 && c->x86_model < 0x70)))
-		return 166;
-
-	return 255;
-}
-EXPORT_SYMBOL_GPL(amd_get_highest_perf);
-
 static void zenbleed_check_cpu(void *unused)
 {
 	struct cpuinfo_x86 *c = &cpu_data(smp_processor_id());
