Commit 954a83f

Merge branches 'pm-core', 'pm-sleep', 'powercap', 'pm-domains' and 'pm-em'

Merge core device power management changes for v5.20-rc1:

 - Extend support for wakeirq to callback wrappers used during system
   suspend and resume (Ulf Hansson).

 - Defer waiting for device probe before loading a hibernation image
   till the first actual device access to avoid possible deadlocks
   reported by syzbot (Tetsuo Handa).

 - Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP (Bjorn
   Helgaas).

 - Add Raptor Lake-P to the list of processors supported by the Intel
   RAPL driver (George D Sworo).

 - Add Alder Lake-N and Raptor Lake-P to the list of processors for
   which Power Limit4 is supported in the Intel RAPL driver (Sumeet
   Pawnikar).

 - Make pm_genpd_remove() check genpd_debugfs_dir against NULL before
   attempting to remove it (Hsin-Yi Wang).

 - Change the Energy Model code to represent power in micro-Watts and
   adjust its users accordingly (Lukasz Luba).

* pm-core:
  PM: runtime: Extend support for wakeirq for force_suspend|resume

* pm-sleep:
  PM: hibernate: defer device probing when resuming from hibernation
  PM: wakeup: Unify device_init_wakeup() for PM_SLEEP and !PM_SLEEP

* powercap:
  powercap: RAPL: Add Power Limit4 support for Alder Lake-N and Raptor Lake-P
  powercap: intel_rapl: Add support for RAPTORLAKE_P

* pm-domains:
  PM: domains: Ensure genpd_debugfs_dir exists before remove

* pm-em:
  cpufreq: scmi: Support the power scale in micro-Watts in SCMI v3.1
  firmware: arm_scmi: Get detailed power scale from perf
  Documentation: EM: Switch to micro-Watts scale
  PM: EM: convert power field to micro-Watts precision and align drivers
6 parents 82b6c2e + c46a0d5 + 8386c41 + b08b95c + 37101d3 + f3ac888 commit 954a83f

18 files changed: +179 -99 lines changed

Documentation/power/energy-model.rst

Lines changed: 7 additions & 7 deletions
@@ -20,20 +20,20 @@ possible source of information on its own, the EM framework intervenes as an
 abstraction layer which standardizes the format of power cost tables in the
 kernel, hence enabling to avoid redundant work.

-The power values might be expressed in milli-Watts or in an 'abstract scale'.
+The power values might be expressed in micro-Watts or in an 'abstract scale'.
 Multiple subsystems might use the EM and it is up to the system integrator to
 check that the requirements for the power value scale types are met. An example
 can be found in the Energy-Aware Scheduler documentation
 Documentation/scheduler/sched-energy.rst. For some subsystems like thermal or
 powercap power values expressed in an 'abstract scale' might cause issues.
 These subsystems are more interested in estimation of power used in the past,
-thus the real milli-Watts might be needed. An example of these requirements can
+thus the real micro-Watts might be needed. An example of these requirements can
 be found in the Intelligent Power Allocation in
 Documentation/driver-api/thermal/power_allocator.rst.
 Kernel subsystems might implement automatic detection to check whether EM
 registered devices have inconsistent scale (based on EM internal flag).
 Important thing to keep in mind is that when the power values are expressed in
-an 'abstract scale' deriving real energy in milli-Joules would not be possible.
+an 'abstract scale' deriving real energy in micro-Joules would not be possible.

 The figure below depicts an example of drivers (Arm-specific here, but the
 approach is applicable to any architecture) providing power costs to the EM
@@ -98,18 +98,18 @@ Drivers are expected to register performance domains into the EM framework by
 calling the following API::

   int em_dev_register_perf_domain(struct device *dev, unsigned int nr_states,
-		struct em_data_callback *cb, cpumask_t *cpus, bool milliwatts);
+		struct em_data_callback *cb, cpumask_t *cpus, bool microwatts);

 Drivers must provide a callback function returning <frequency, power> tuples
 for each performance state. The callback function provided by the driver is free
 to fetch data from any relevant location (DT, firmware, ...), and by any mean
 deemed necessary. Only for CPU devices, drivers must specify the CPUs of the
 performance domains using cpumask. For other devices than CPUs the last
 argument must be set to NULL.
-The last argument 'milliwatts' is important to set with correct value. Kernel
+The last argument 'microwatts' is important to set with correct value. Kernel
 subsystems which use EM might rely on this flag to check if all EM devices use
 the same scale. If there are different scales, these subsystems might decide
-to: return warning/error, stop working or panic.
+to return warning/error, stop working or panic.
 See Section 3. for an example of driver implementing this
 callback, or Section 2.4 for further documentation on this API

@@ -137,7 +137,7 @@ The .get_cost() allows to provide the 'cost' values which reflect the
 efficiency of the CPUs. This would allow to provide EAS information which
 has different relation than what would be forced by the EM internal
 formulas calculating 'cost' values. To register an EM for such platform, the
-driver must set the flag 'milliwatts' to 0, provide .get_power() callback
+driver must set the flag 'microwatts' to 0, provide .get_power() callback
 and provide .get_cost() callback. The EM framework would handle such platform
 properly during registration. A flag EM_PERF_DOMAIN_ARTIFICIAL is set for such
 platform. Special care should be taken by other frameworks which are using EM
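
The documentation text in this diff says that subsystems using the EM may check whether all registered devices share one power scale. The userspace sketch below illustrates such a check. `mock_perf_domain` and `em_scales_consistent()` are hypothetical names made up for this example, not kernel APIs, and the kernel's own detection relies on an internal EM flag rather than on this structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered performance domain. */
struct mock_perf_domain {
	const char *name;
	bool microwatts;	/* true: real micro-Watts, false: abstract scale */
};

/*
 * Return true when every domain reports the same scale.
 * Treating an empty set as consistent is an assumption of this sketch.
 */
static bool em_scales_consistent(const struct mock_perf_domain *pd, size_t n)
{
	assert(pd != NULL || n == 0);

	for (size_t i = 1; i < n; i++) {
		if (pd[i].microwatts != pd[0].microwatts)
			return false;
	}
	return true;
}
```

A caller that finds mixed scales could then warn, refuse to operate, or fall back, which mirrors the options the documentation lists.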

drivers/base/power/domain.c

Lines changed: 3 additions & 0 deletions
@@ -222,6 +222,9 @@ static void genpd_debug_remove(struct generic_pm_domain *genpd)
 {
 	struct dentry *d;

+	if (!genpd_debugfs_dir)
+		return;
+
 	d = debugfs_lookup(genpd->name, genpd_debugfs_dir);
 	debugfs_remove(d);
 }
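
The guard added above can be illustrated with a small userspace mock. It assumes that a lookup with a NULL parent falls back to the root directory, so that an unguarded removal could hit an unrelated root-level entry of the same name. `mock_dir`, `mock_lookup()` and `mock_genpd_debug_remove()` are invented for this sketch and do not reproduce the debugfs API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature directory standing in for a debugfs directory. */
struct mock_dir {
	const char *entries[4];
	size_t count;
};

static struct mock_dir mock_root = { { "clk", "gpio" }, 2 };

/* Assumed semantics: a NULL parent means "search the root directory". */
static int mock_lookup(const char *name, struct mock_dir *parent)
{
	struct mock_dir *dir = parent ? parent : &mock_root;

	for (size_t i = 0; i < dir->count; i++) {
		if (strcmp(dir->entries[i], name) == 0)
			return (int)i;
	}
	return -1;
}

/* Remove 'name' below 'genpd_dir'; returns 1 if an entry was removed. */
static int mock_genpd_debug_remove(const char *name, struct mock_dir *genpd_dir)
{
	int idx;

	assert(name != NULL);

	if (!genpd_dir)		/* same shape as the guard in the patch */
		return 0;

	idx = mock_lookup(name, genpd_dir);
	if (idx < 0)
		return 0;

	/* Drop the entry by moving the last one into its slot. */
	genpd_dir->entries[idx] = genpd_dir->entries[--genpd_dir->count];
	return 1;
}
```

With the guard in place, a call made before the directory exists leaves the root untouched.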

drivers/base/power/runtime.c

Lines changed: 6 additions & 0 deletions
@@ -1862,10 +1862,13 @@ int pm_runtime_force_suspend(struct device *dev)

 	callback = RPM_GET_CALLBACK(dev, runtime_suspend);

+	dev_pm_enable_wake_irq_check(dev, true);
 	ret = callback ? callback(dev) : 0;
 	if (ret)
 		goto err;

+	dev_pm_enable_wake_irq_complete(dev);
+
 	/*
 	 * If the device can stay in suspend after the system-wide transition
 	 * to the working state that will follow, drop the children counter of
@@ -1882,6 +1885,7 @@ int pm_runtime_force_suspend(struct device *dev)
 	return 0;

 err:
+	dev_pm_disable_wake_irq_check(dev, true);
 	pm_runtime_enable(dev);
 	return ret;
 }
@@ -1915,9 +1919,11 @@ int pm_runtime_force_resume(struct device *dev)

 	callback = RPM_GET_CALLBACK(dev, runtime_resume);

+	dev_pm_disable_wake_irq_check(dev, false);
 	ret = callback ? callback(dev) : 0;
 	if (ret) {
 		pm_runtime_set_suspended(dev);
+		dev_pm_enable_wake_irq_check(dev, false);
 		goto out;
 	}
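
The ordering in pm_runtime_force_suspend() above is: arm the wake IRQ before the suspend callback runs, then undo that if the callback fails. The userspace sketch below mirrors only that control flow. `mock_dev` and the `mock_*` helpers are hypothetical and reduce the wake IRQ state to a single boolean, which the real helpers do not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state for this sketch. */
struct mock_dev {
	bool wakeirq_armed;
	int suspend_result;	/* value the mock suspend callback returns */
};

static void mock_enable_wake_irq(struct mock_dev *d)
{
	d->wakeirq_armed = true;
}

static void mock_disable_wake_irq(struct mock_dev *d)
{
	d->wakeirq_armed = false;
}

/* Arm the wake IRQ, run the callback, and roll back on failure. */
static int mock_force_suspend(struct mock_dev *d)
{
	int ret;

	assert(d != NULL);

	mock_enable_wake_irq(d);	/* arm before the callback runs */
	ret = d->suspend_result;	/* stands in for callback(dev) */
	if (ret) {
		mock_disable_wake_irq(d);	/* error path: undo the arming */
		return ret;
	}
	return 0;
}
```

The resume path in the diff is the mirror image: the wake IRQ is disabled before the resume callback and re-enabled if that callback fails.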

drivers/base/power/wakeup.c

Lines changed: 0 additions & 30 deletions
@@ -500,36 +500,6 @@ void device_set_wakeup_capable(struct device *dev, bool capable)
 }
 EXPORT_SYMBOL_GPL(device_set_wakeup_capable);

-/**
- * device_init_wakeup - Device wakeup initialization.
- * @dev: Device to handle.
- * @enable: Whether or not to enable @dev as a wakeup device.
- *
- * By default, most devices should leave wakeup disabled. The exceptions are
- * devices that everyone expects to be wakeup sources: keyboards, power buttons,
- * possibly network interfaces, etc. Also, devices that don't generate their
- * own wakeup requests but merely forward requests from one bus to another
- * (like PCI bridges) should have wakeup enabled by default.
- */
-int device_init_wakeup(struct device *dev, bool enable)
-{
-	int ret = 0;
-
-	if (!dev)
-		return -EINVAL;
-
-	if (enable) {
-		device_set_wakeup_capable(dev, true);
-		ret = device_wakeup_enable(dev);
-	} else {
-		device_wakeup_disable(dev);
-		device_set_wakeup_capable(dev, false);
-	}
-
-	return ret;
-}
-EXPORT_SYMBOL_GPL(device_init_wakeup);
-
 /**
  * device_set_wakeup_enable - Enable or disable a device to wake up the system.
  * @dev: Device to handle.

drivers/cpufreq/mediatek-cpufreq-hw.c

Lines changed: 4 additions & 3 deletions
@@ -51,7 +51,7 @@ static const u16 cpufreq_mtk_offsets[REG_ARRAY_SIZE] = {
 };

 static int __maybe_unused
-mtk_cpufreq_get_cpu_power(struct device *cpu_dev, unsigned long *mW,
+mtk_cpufreq_get_cpu_power(struct device *cpu_dev, unsigned long *uW,
 			  unsigned long *KHz)
 {
 	struct mtk_cpufreq_data *data;
@@ -71,8 +71,9 @@ mtk_cpufreq_get_cpu_power(struct device *cpu_dev, unsigned long *mW,
 	i--;

 	*KHz = data->table[i].frequency;
-	*mW = readl_relaxed(data->reg_bases[REG_EM_POWER_TBL] +
-			    i * LUT_ROW_SIZE) / 1000;
+	/* Provide micro-Watts value to the Energy Model */
+	*uW = readl_relaxed(data->reg_bases[REG_EM_POWER_TBL] +
+			    i * LUT_ROW_SIZE);

 	return 0;
 }

drivers/cpufreq/scmi-cpufreq.c

Lines changed: 13 additions & 2 deletions
@@ -19,6 +19,7 @@
 #include <linux/slab.h>
 #include <linux/scmi_protocol.h>
 #include <linux/types.h>
+#include <linux/units.h>

 struct scmi_data {
 	int domain_id;
@@ -99,6 +100,7 @@ static int __maybe_unused
 scmi_get_cpu_power(struct device *cpu_dev, unsigned long *power,
 		   unsigned long *KHz)
 {
+	enum scmi_power_scale power_scale = perf_ops->power_scale_get(ph);
 	unsigned long Hz;
 	int ret, domain;

@@ -112,6 +114,10 @@ scmi_get_cpu_power(struct device *cpu_dev, unsigned long *power,
 	if (ret)
 		return ret;

+	/* Convert the power to uW if it is mW (ignore bogoW) */
+	if (power_scale == SCMI_POWER_MILLIWATTS)
+		*power *= MICROWATT_PER_MILLIWATT;
+
 	/* The EM framework specifies the frequency in KHz. */
 	*KHz = Hz / 1000;

@@ -249,8 +255,9 @@ static int scmi_cpufreq_exit(struct cpufreq_policy *policy)
 static void scmi_cpufreq_register_em(struct cpufreq_policy *policy)
 {
 	struct em_data_callback em_cb = EM_DATA_CB(scmi_get_cpu_power);
-	bool power_scale_mw = perf_ops->power_scale_mw_get(ph);
+	enum scmi_power_scale power_scale = perf_ops->power_scale_get(ph);
 	struct scmi_data *priv = policy->driver_data;
+	bool em_power_scale = false;

 	/*
 	 * This callback will be called for each policy, but we don't need to
@@ -262,9 +269,13 @@ static void scmi_cpufreq_register_em(struct cpufreq_policy *policy)
 	if (!priv->nr_opp)
 		return;

+	if (power_scale == SCMI_POWER_MILLIWATTS
+	    || power_scale == SCMI_POWER_MICROWATTS)
+		em_power_scale = true;
+
 	em_dev_register_perf_domain(get_cpu_device(policy->cpu), priv->nr_opp,
 				    &em_cb, priv->opp_shared_cpus,
-				    power_scale_mw);
+				    em_power_scale);
 }

 static struct cpufreq_driver scmi_cpufreq_driver = {
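
The two decisions made in this file, scaling milli-Watts up to micro-Watts and telling the EM whether the values are real Watts at all, can be sketched in userspace as follows. The enum and helper names are local stand-ins invented for this example; the diff itself only shows SCMI_POWER_MILLIWATTS and SCMI_POWER_MICROWATTS, so the abstract-scale member name here is an assumption.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Local stand-ins for the firmware-reported power scale. */
enum mock_power_scale {
	MOCK_POWER_ABSTRACT,	/* "bogoWatts": no physical unit */
	MOCK_POWER_MILLIWATTS,
	MOCK_POWER_MICROWATTS,
};

#define MOCK_UW_PER_MW 1000UL

/* Convert a firmware power value to what the EM expects. */
static unsigned long mock_to_em_power(unsigned long power,
				      enum mock_power_scale scale)
{
	if (scale == MOCK_POWER_MILLIWATTS) {
		/* Guard against overflow in this sketch. */
		assert(power <= ULONG_MAX / MOCK_UW_PER_MW);
		power *= MOCK_UW_PER_MW;
	}
	/* Micro-Watts and abstract values pass through unchanged. */
	return power;
}

/* Value for the 'microwatts' flag at EM registration time. */
static bool mock_em_uses_microwatts(enum mock_power_scale scale)
{
	return scale == MOCK_POWER_MILLIWATTS ||
	       scale == MOCK_POWER_MICROWATTS;
}
```

Note that a milli-Watt platform still registers with the flag set, because its values are converted to micro-Watts before they reach the EM.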

drivers/firmware/arm_scmi/perf.c

Lines changed: 11 additions & 7 deletions
@@ -170,8 +170,7 @@ struct perf_dom_info {
 struct scmi_perf_info {
 	u32 version;
 	int num_domains;
-	bool power_scale_mw;
-	bool power_scale_uw;
+	enum scmi_power_scale power_scale;
 	u64 stats_addr;
 	u32 stats_size;
 	struct perf_dom_info *dom_info;
@@ -201,9 +200,13 @@ static int scmi_perf_attributes_get(const struct scmi_protocol_handle *ph,
 		u16 flags = le16_to_cpu(attr->flags);

 		pi->num_domains = le16_to_cpu(attr->num_domains);
-		pi->power_scale_mw = POWER_SCALE_IN_MILLIWATT(flags);
+
+		if (POWER_SCALE_IN_MILLIWATT(flags))
+			pi->power_scale = SCMI_POWER_MILLIWATTS;
 		if (PROTOCOL_REV_MAJOR(pi->version) >= 0x3)
-			pi->power_scale_uw = POWER_SCALE_IN_MICROWATT(flags);
+			if (POWER_SCALE_IN_MICROWATT(flags))
+				pi->power_scale = SCMI_POWER_MICROWATTS;
+
 		pi->stats_addr = le32_to_cpu(attr->stats_addr_low) |
 				(u64)le32_to_cpu(attr->stats_addr_high) << 32;
 		pi->stats_size = le32_to_cpu(attr->stats_size);
@@ -792,11 +795,12 @@ static bool scmi_fast_switch_possible(const struct scmi_protocol_handle *ph,
 	return dom->fc_info && dom->fc_info->level_set_addr;
 }

-static bool scmi_power_scale_mw_get(const struct scmi_protocol_handle *ph)
+static enum scmi_power_scale
+scmi_power_scale_get(const struct scmi_protocol_handle *ph)
 {
 	struct scmi_perf_info *pi = ph->get_priv(ph);

-	return pi->power_scale_mw;
+	return pi->power_scale;
 }

 static const struct scmi_perf_proto_ops perf_proto_ops = {
@@ -811,7 +815,7 @@ static const struct scmi_perf_proto_ops perf_proto_ops = {
 	.freq_get = scmi_dvfs_freq_get,
 	.est_power_get = scmi_dvfs_est_power_get,
 	.fast_switch_possible = scmi_fast_switch_possible,
-	.power_scale_mw_get = scmi_power_scale_mw_get,
+	.power_scale_get = scmi_power_scale_get,
 };

 static int scmi_perf_set_notify_enabled(const struct scmi_protocol_handle *ph,
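
The attribute decoding above replaces two booleans with one enum and only honours the micro-Watt flag for protocol major version 3 or later. The sketch below reproduces that decision in userspace. The bit positions of the two flags and the abstract-scale default are assumptions of this example, and all names are hypothetical rather than taken from the SCMI headers.

```c
#include <assert.h>
#include <stdint.h>

enum decoded_scale {
	DECODED_ABSTRACT,	/* assumed default when no flag is set */
	DECODED_MILLIWATTS,
	DECODED_MICROWATTS,
};

/* Assumed bit positions, for this sketch only. */
#define SKETCH_FLAG_MILLIWATT (1u << 0)
#define SKETCH_FLAG_MICROWATT (1u << 1)

/* Decode the power scale from attribute flags and protocol major version. */
static enum decoded_scale decode_power_scale(uint16_t flags, unsigned int major)
{
	enum decoded_scale scale = DECODED_ABSTRACT;

	/* Sanity check of the assumption that the two flags are distinct bits. */
	assert((SKETCH_FLAG_MILLIWATT & SKETCH_FLAG_MICROWATT) == 0);

	if (flags & SKETCH_FLAG_MILLIWATT)
		scale = DECODED_MILLIWATTS;
	/* The micro-Watt flag is only meaningful from major version 3 on. */
	if (major >= 3 && (flags & SKETCH_FLAG_MICROWATT))
		scale = DECODED_MICROWATTS;

	return scale;
}
```

As in the diff, the micro-Watt check comes last, so it takes precedence if both flags are set on a version 3 platform.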

drivers/opp/of.c

Lines changed: 8 additions & 7 deletions
@@ -1443,12 +1443,12 @@ EXPORT_SYMBOL_GPL(dev_pm_opp_get_of_node);
  * It provides the power used by @dev at @kHz if it is the frequency of an
  * existing OPP, or at the frequency of the first OPP above @kHz otherwise
  * (see dev_pm_opp_find_freq_ceil()). This function updates @kHz to the ceiled
- * frequency and @mW to the associated power.
+ * frequency and @uW to the associated power.
  *
  * Returns 0 on success or a proper -EINVAL value in case of error.
  */
 static int __maybe_unused
-_get_dt_power(struct device *dev, unsigned long *mW, unsigned long *kHz)
+_get_dt_power(struct device *dev, unsigned long *uW, unsigned long *kHz)
 {
 	struct dev_pm_opp *opp;
 	unsigned long opp_freq, opp_power;
@@ -1465,7 +1465,7 @@ _get_dt_power(struct device *dev, unsigned long *mW, unsigned long *kHz)
 		return -EINVAL;

 	*kHz = opp_freq / 1000;
-	*mW = opp_power / 1000;
+	*uW = opp_power;

 	return 0;
 }
@@ -1475,14 +1475,14 @@ _get_dt_power(struct device *dev, unsigned long *mW, unsigned long *kHz)
  * This computes the power estimated by @dev at @kHz if it is the frequency
  * of an existing OPP, or at the frequency of the first OPP above @kHz otherwise
  * (see dev_pm_opp_find_freq_ceil()). This function updates @kHz to the ceiled
- * frequency and @mW to the associated power. The power is estimated as
+ * frequency and @uW to the associated power. The power is estimated as
  * P = C * V^2 * f with C being the device's capacitance and V and f
  * respectively the voltage and frequency of the OPP.
  *
  * Returns -EINVAL if the power calculation failed because of missing
  * parameters, 0 otherwise.
  */
-static int __maybe_unused _get_power(struct device *dev, unsigned long *mW,
+static int __maybe_unused _get_power(struct device *dev, unsigned long *uW,
 				     unsigned long *kHz)
 {
 	struct dev_pm_opp *opp;
@@ -1512,9 +1512,10 @@ static int __maybe_unused _get_power(struct device *dev, unsigned long *mW,
 		return -EINVAL;

 	tmp = (u64)cap * mV * mV * (Hz / 1000000);
-	do_div(tmp, 1000000000);
+	/* Provide power in micro-Watts */
+	do_div(tmp, 1000000);

-	*mW = (unsigned long)tmp;
+	*uW = (unsigned long)tmp;
 	*kHz = Hz / 1000;

 	return 0;
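
The change of divisor in _get_power() above can be checked with a worked example of P = C * V^2 * f. The sketch assumes the capacitance coefficient is given in uW/MHz/V^2, the voltage in milli-Volts and the frequency in Hz; under those units the product is a million times too large because of the mV^2 term, so dividing by 1000000 yields micro-Watts and the former 1000000000 yielded milli-Watts. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Shared product: cap [uW/MHz/V^2] * mV^2 * f [MHz]. */
static uint64_t power_product(uint32_t cap, uint64_t mV, uint64_t Hz)
{
	assert(mV != 0 && Hz != 0);	/* sketch expects a real operating point */
	return (uint64_t)cap * mV * mV * (Hz / 1000000);
}

/* New behaviour: result in micro-Watts. */
static uint64_t estimate_power_uw(uint32_t cap, uint64_t mV, uint64_t Hz)
{
	return power_product(cap, mV, Hz) / 1000000;
}

/* Old behaviour: result in milli-Watts, truncating the fraction. */
static uint64_t estimate_power_mw_old(uint32_t cap, uint64_t mV, uint64_t Hz)
{
	return power_product(cap, mV, Hz) / 1000000000;
}
```

For cap = 100, 700 mV and 200 MHz the product is 9800000000, so the micro-Watt result is 9800 while the old milli-Watt result truncates to 9. That lost fraction is the precision the new scale keeps.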

drivers/powercap/dtpm_cpu.c

Lines changed: 2 additions & 3 deletions
@@ -53,7 +53,7 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)

 	for (i = 0; i < pd->nr_perf_states; i++) {

-		power = pd->table[i].power * MICROWATT_PER_MILLIWATT * nr_cpus;
+		power = pd->table[i].power * nr_cpus;

 		if (power > power_limit)
 			break;
@@ -63,8 +63,7 @@ static u64 set_pd_power_limit(struct dtpm *dtpm, u64 power_limit)

 	freq_qos_update_request(&dtpm_cpu->qos_req, freq);

-	power_limit = pd->table[i - 1].power *
-		MICROWATT_PER_MILLIWATT * nr_cpus;
+	power_limit = pd->table[i - 1].power * nr_cpus;

 	return power_limit;
 }
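
With the EM table already in micro-Watts, the limit search above no longer needs a unit conversion. A userspace sketch of the same search follows. `mock_perf_state` and `mock_set_power_limit()` are hypothetical, and falling back to the lowest state when even that exceeds the limit is a choice made for this example rather than something the diff shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical EM table entry: frequency in kHz, power in uW per CPU. */
struct mock_perf_state {
	unsigned long frequency;
	unsigned long power;
};

/*
 * Pick the highest state whose total power fits under limit_uw, store its
 * frequency in *freq and return the power actually granted in micro-Watts.
 */
static uint64_t mock_set_power_limit(const struct mock_perf_state *table,
				     size_t nr_states, unsigned int nr_cpus,
				     uint64_t limit_uw, unsigned long *freq)
{
	size_t i;

	assert(table != NULL && nr_states > 0 && freq != NULL);

	for (i = 0; i < nr_states; i++) {
		if ((uint64_t)table[i].power * nr_cpus > limit_uw)
			break;
	}

	/* Sketch-only fallback: never index below the first state. */
	if (i == 0)
		i = 1;

	*freq = table[i - 1].frequency;
	return (uint64_t)table[i - 1].power * nr_cpus;
}
```

The table is assumed to be sorted by ascending power, as the break on the first state over the limit implies.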

drivers/powercap/intel_rapl_common.c

Lines changed: 1 addition & 0 deletions
@@ -1109,6 +1109,7 @@ static const struct x86_cpu_id rapl_ids[] __initconst = {
 	X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, &rapl_defaults_core),
 	X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_N, &rapl_defaults_core),
 	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE, &rapl_defaults_core),
+	X86_MATCH_INTEL_FAM6_MODEL(RAPTORLAKE_P, &rapl_defaults_core),
 	X86_MATCH_INTEL_FAM6_MODEL(SAPPHIRERAPIDS_X, &rapl_defaults_spr_server),
 	X86_MATCH_INTEL_FAM6_MODEL(LAKEFIELD, &rapl_defaults_core),
