Commit c1f0fcd

Merge tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for 6.2.

  While it may seem backwards, the CXL update this time around includes
  some focus on CXL 1.x enabling where the work to date had been with
  CXL 2.0 (VH topologies) in mind.

  First generation CXL can mostly be supported via BIOS, similar to DDR,
  however it became clear there are use cases for OS native CXL error
  handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x
  hosts (Restricted CXL Host (RCH) topologies). So, this update brings
  RCH topologies into the Linux CXL device model.

  In support of the ongoing CXL 2.0+ enabling two new core kernel
  facilities are added.

  One is the ability for the kernel to flag collisions between userspace
  access to PCI configuration registers and kernel accesses. This is
  brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware
  mailbox over config-cycles.

  The other is a cpu_cache_invalidate_memregion() API that maps to
  wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest
  VMs and architectures that do not support it yet. The CXL paths that
  need it, dynamic memory region creation and security commands (erase /
  unlock), are disabled when it is not present.

  As for the CXL 2.0+ this cycle the subsystem gains support for
  Persistent Memory Security commands, error handling in response to
  PCIe AER notifications, and support for the "XOR" host bridge
  interleave algorithm.

  Summary:

   - Add the cpu_cache_invalidate_memregion() API for cache flushing in
     response to physical memory reconfiguration, or memory-side data
     invalidation from operations like secure erase or memory-device
     unlock.

   - Add a facility for the kernel to warn about collisions between
     kernel and userspace access to PCI configuration registers

   - Add support for Restricted CXL Host (RCH) topologies (formerly CXL
     1.1)

   - Add handling and reporting of CXL errors reported via the PCIe AER
     mechanism

   - Add support for CXL Persistent Memory Security commands

   - Add support for the "XOR" algorithm for CXL host bridge interleave

   - Rework / simplify CXL to NVDIMM interactions

   - Miscellaneous cleanups and fixes"

* tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits)
  cxl/region: Fix memdev reuse check
  cxl/pci: Remove endian confusion
  cxl/pci: Add some type-safety to the AER trace points
  cxl/security: Drop security command ioctl uapi
  cxl/mbox: Add variable output size validation for internal commands
  cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size
  cxl/security: Fix Get Security State output payload endian handling
  cxl: update names for interleave ways conversion macros
  cxl: update names for interleave granularity conversion macros
  cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry
  tools/testing/cxl: Require cache invalidation bypass
  cxl/acpi: Fail decoder add if CXIMS for HBIG is missing
  cxl/region: Fix spelling mistake "memergion" -> "memregion"
  cxl/regs: Fix sparse warning
  cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support
  tools/testing/cxl: Add an RCH topology
  cxl/port: Add RCD endpoint port enumeration
  cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem
  tools/testing/cxl: Add XOR Math support to cxl_test
  cxl/acpi: Support CXL XOR Interleave Math (CXIMS)
  ...

2 parents 691806e + f04facf commit c1f0fcd

49 files changed, 2649 insertions(+), 836 deletions(-)

Documentation/ABI/testing/sysfs-bus-nvdimm

Lines changed: 14 additions & 0 deletions
@@ -41,3 +41,17 @@ KernelVersion: 5.18
 Contact:	Kajol Jain <[email protected]>
 Description:	(RO) This sysfs file exposes the cpumask which is designated to
 		to retrieve nvdimm pmu event counter data.
+
+What:		/sys/bus/nd/devices/nmemX/cxl/id
+Date:		November 2022
+KernelVersion:	6.2
+Contact:	Dave Jiang <[email protected]>
+Description:	(RO) Show the id (serial) of the device. This is CXL specific.
+
+What:		/sys/bus/nd/devices/nmemX/cxl/provider
+Date:		November 2022
+KernelVersion:	6.2
+Contact:	Dave Jiang <[email protected]>
+Description:	(RO) Shows the CXL bridge device that ties to a CXL memory device
+		to this NVDIMM device. I.e. the parent of the device returned is
+		a /sys/bus/cxl/devices/memX instance.
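
As a rough illustration of how userspace might consume the two new read-only attributes, the sketch below builds the attribute path and reads its first line. The helper names `nmem_cxl_attr_path()` and `read_attr_line()` are hypothetical and not part of any kernel or ndctl API; only the sysfs path layout comes from the ABI text above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "/sys/bus/nd/devices/<nmem>/cxl/<attr>".
 * Returns 0 on success, -1 if the path does not fit in buf.
 */
static int nmem_cxl_attr_path(char *buf, size_t len,
			      const char *nmem, const char *attr)
{
	int n = snprintf(buf, len, "/sys/bus/nd/devices/%s/cxl/%s", nmem, attr);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: read the first line of a sysfs attribute into out
 * and strip the trailing newline. Returns 0 on success, -1 on any error
 * (for example when the attribute does not exist on this system).
 */
static int read_attr_line(const char *path, char *out, size_t len)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';
	return 0;
}
```

A caller would typically pass a device name such as `nmem0` with `id` or `provider` as the attribute, and treat a read failure as "not a CXL-backed NVDIMM" rather than as a hard error.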

Documentation/PCI/pci-error-recovery.rst

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -83,6 +83,7 @@ This structure has the form::
8383
int (*mmio_enabled)(struct pci_dev *dev);
8484
int (*slot_reset)(struct pci_dev *dev);
8585
void (*resume)(struct pci_dev *dev);
86+
void (*cor_error_detected)(struct pci_dev *dev);
8687
};
8788

8889
The possible channel states are::
@@ -422,5 +423,11 @@ That is, the recovery API only requires that:
422423
- drivers/net/cxgb3
423424
- drivers/net/s2io.c
424425

426+
The cor_error_detected() callback is invoked in handle_error_source() when
427+
the error severity is "correctable". The callback is optional and allows
428+
additional logging to be done if desired. See example:
429+
430+
- drivers/cxl/pci.c
431+
425432
The End
426433
-------
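
The optional-callback behaviour described in the added documentation can be sketched in plain userspace C. Everything prefixed `mock_` is a hypothetical stand-in for the kernel types and for the dispatch done in handle_error_source(); it is not the kernel implementation, only an illustration of "call the hook when the severity is correctable and the driver provided one".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; counts logged events. */
struct mock_pci_dev {
	int cor_errors_logged;
};

/* Hypothetical subset of the error handler table shown in the diff. */
struct mock_error_handlers {
	void (*resume)(struct mock_pci_dev *dev);
	void (*cor_error_detected)(struct mock_pci_dev *dev); /* optional */
};

enum mock_severity {
	MOCK_SEV_CORRECTABLE,
	MOCK_SEV_NONFATAL,
	MOCK_SEV_FATAL,
};

/*
 * Invoke the optional hook only for correctable errors.
 * Returns 1 if the callback ran, 0 otherwise.
 */
static int mock_handle_error(struct mock_pci_dev *dev,
			     const struct mock_error_handlers *h,
			     enum mock_severity sev)
{
	if (sev != MOCK_SEV_CORRECTABLE)
		return 0;
	if (!h || !h->cor_error_detected)
		return 0; /* driver did not opt in; nothing extra to log */
	h->cor_error_detected(dev);
	return 1;
}

/* Example driver hook: only records that a correctable error was seen. */
static void mock_log_cor_error(struct mock_pci_dev *dev)
{
	dev->cor_errors_logged++;
}
```

A driver that leaves `cor_error_detected` unset sees no change in behaviour, which is what makes the hook safe to add to an existing handler table.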

arch/x86/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ config X86
 	select ARCH_ENABLE_THP_MIGRATION if X86_64 && TRANSPARENT_HUGEPAGE
 	select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
 	select ARCH_HAS_CACHE_LINE_SIZE
+	select ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
 	select ARCH_HAS_CURRENT_STACK_POINTER
 	select ARCH_HAS_DEBUG_VIRTUAL
 	select ARCH_HAS_DEBUG_VM_PGTABLE if !X86_PAE

arch/x86/mm/pat/set_memory.c

Lines changed: 18 additions & 0 deletions
@@ -20,6 +20,7 @@
 #include <linux/kernel.h>
 #include <linux/cc_platform.h>
 #include <linux/set_memory.h>
+#include <linux/memregion.h>

 #include <asm/e820/api.h>
 #include <asm/processor.h>
@@ -330,6 +331,23 @@ void arch_invalidate_pmem(void *addr, size_t size)
 EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
 #endif

+#ifdef CONFIG_ARCH_HAS_CPU_CACHE_INVALIDATE_MEMREGION
+bool cpu_cache_has_invalidate_memregion(void)
+{
+	return !cpu_feature_enabled(X86_FEATURE_HYPERVISOR);
+}
+EXPORT_SYMBOL_NS_GPL(cpu_cache_has_invalidate_memregion, DEVMEM);
+
+int cpu_cache_invalidate_memregion(int res_desc)
+{
+	if (WARN_ON_ONCE(!cpu_cache_has_invalidate_memregion()))
+		return -ENXIO;
+	wbinvd_on_all_cpus();
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cpu_cache_invalidate_memregion, DEVMEM);
+#endif
+
 static void __cpa_flush_all(void *arg)
 {
 	unsigned long cache = (unsigned long)arg;
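
The contract of the two new functions (probe first, and expect -ENXIO when invalidation is unavailable, as in a guest) can be modelled in userspace. The sketch below is not kernel code: the `mock_` functions, the `mock_running_as_guest` flag standing in for X86_FEATURE_HYPERVISOR, and the flush counter standing in for wbinvd_on_all_cpus() are all assumptions made for illustration. The erase flow with a flush before and after mirrors the comments removed from drivers/acpi/nfit/intel.c in this commit; the choice of -ENXIO for the refused erase is this sketch's own.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for cpu_feature_enabled(X86_FEATURE_HYPERVISOR). */
static bool mock_running_as_guest;
/* Counts simulated wbinvd_on_all_cpus() calls. */
static int mock_flush_count;

/* Models cpu_cache_has_invalidate_memregion(): unavailable in a guest. */
static bool mock_has_invalidate_memregion(void)
{
	return !mock_running_as_guest;
}

/* Models cpu_cache_invalidate_memregion(): -ENXIO when unavailable. */
static int mock_invalidate_memregion(int res_desc)
{
	(void)res_desc; /* the x86 version in the diff also ignores it */
	if (!mock_has_invalidate_memregion())
		return -ENXIO;
	mock_flush_count++;
	return 0;
}

/*
 * Hypothetical caller: refuse the operation up front when caches cannot
 * be invalidated, otherwise flush before and after the (omitted) erase.
 */
static int mock_secure_erase(int res_desc)
{
	int rc;

	if (!mock_has_invalidate_memregion())
		return -ENXIO; /* assumption: error code chosen for the sketch */

	rc = mock_invalidate_memregion(res_desc); /* flush before erase */
	if (rc)
		return rc;
	/* ... device erase command would be issued here ... */
	return mock_invalidate_memregion(res_desc); /* drop stale lines */
}
```

Probing before issuing the device command matters because a failed invalidation after a successful erase would leave stale cache lines visible, so the caller in this sketch declines the whole operation instead.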

drivers/acpi/nfit/intel.c

Lines changed: 1 addition & 29 deletions
@@ -3,6 +3,7 @@
 #include <linux/libnvdimm.h>
 #include <linux/ndctl.h>
 #include <linux/acpi.h>
+#include <linux/memregion.h>
 #include <asm/smp.h>
 #include "intel.h"
 #include "nfit.h"
@@ -190,8 +191,6 @@ static int intel_security_change_key(struct nvdimm *nvdimm,
 	}
 }

-static void nvdimm_invalidate_cache(void);
-
 static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
 		const struct nvdimm_key_data *key_data)
 {
@@ -227,9 +226,6 @@ static int __maybe_unused intel_security_unlock(struct nvdimm *nvdimm,
 		return -EIO;
 	}

-	/* DIMM unlocked, invalidate all CPU caches before we read it */
-	nvdimm_invalidate_cache();
-
 	return 0;
 }

@@ -297,8 +293,6 @@ static int __maybe_unused intel_security_erase(struct nvdimm *nvdimm,
 	if (!test_bit(cmd, &nfit_mem->dsm_mask))
 		return -ENOTTY;

-	/* flush all cache before we erase DIMM */
-	nvdimm_invalidate_cache();
 	memcpy(nd_cmd.cmd.passphrase, key->data,
 			sizeof(nd_cmd.cmd.passphrase));
 	rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
@@ -317,8 +311,6 @@ static int __maybe_unused intel_security_erase(struct nvdimm *nvdimm,
 		return -ENXIO;
 	}

-	/* DIMM erased, invalidate all CPU caches before we read it */
-	nvdimm_invalidate_cache();
 	return 0;
 }

@@ -354,8 +346,6 @@ static int __maybe_unused intel_security_query_overwrite(struct nvdimm *nvdimm)
 		return -ENXIO;
 	}

-	/* flush all cache before we make the nvdimms available */
-	nvdimm_invalidate_cache();
 	return 0;
 }

@@ -380,8 +370,6 @@ static int __maybe_unused intel_security_overwrite(struct nvdimm *nvdimm,
 	if (!test_bit(NVDIMM_INTEL_OVERWRITE, &nfit_mem->dsm_mask))
 		return -ENOTTY;

-	/* flush all cache before we erase DIMM */
-	nvdimm_invalidate_cache();
 	memcpy(nd_cmd.cmd.passphrase, nkey->data,
 			sizeof(nd_cmd.cmd.passphrase));
 	rc = nvdimm_ctl(nvdimm, ND_CMD_CALL, &nd_cmd, sizeof(nd_cmd), NULL);
@@ -401,22 +389,6 @@ static int __maybe_unused intel_security_overwrite(struct nvdimm *nvdimm,
 	}
 }

-/*
- * TODO: define a cross arch wbinvd equivalent when/if
- * NVDIMM_FAMILY_INTEL command support arrives on another arch.
- */
-#ifdef CONFIG_X86
-static void nvdimm_invalidate_cache(void)
-{
-	wbinvd_on_all_cpus();
-}
-#else
-static void nvdimm_invalidate_cache(void)
-{
-	WARN_ON_ONCE("cache invalidation required after unlock\n");
-}
-#endif
-
 static const struct nvdimm_security_ops __intel_security_ops = {
 	.get_flags = intel_security_flags,
 	.freeze = intel_security_freeze,

drivers/acpi/pci_root.c

Lines changed: 1 addition & 0 deletions
@@ -493,6 +493,7 @@ static u32 calculate_cxl_support(void)
 	u32 support;

 	support = OSC_CXL_2_0_PORT_DEV_REG_ACCESS_SUPPORT;
+	support |= OSC_CXL_1_1_PORT_REG_ACCESS_SUPPORT;
 	if (pci_aer_available())
 		support |= OSC_CXL_PROTOCOL_ERR_REPORTING_SUPPORT;
 	if (IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE))

drivers/cxl/Kconfig

Lines changed: 18 additions & 0 deletions
@@ -111,4 +111,22 @@ config CXL_REGION
 	select MEMREGION
 	select GET_FREE_REGION

+config CXL_REGION_INVALIDATION_TEST
+	bool "CXL: Region Cache Management Bypass (TEST)"
+	depends on CXL_REGION
+	help
+	  CXL Region management and security operations potentially invalidate
+	  the content of CPU caches without notifiying those caches to
+	  invalidate the affected cachelines. The CXL Region driver attempts
+	  to invalidate caches when those events occur. If that invalidation
+	  fails the region will fail to enable. Reasons for cache
+	  invalidation failure are due to the CPU not providing a cache
+	  invalidation mechanism. For example usage of wbinvd is restricted to
+	  bare metal x86. However, for testing purposes toggling this option
+	  can disable that data integrity safety and proceed with enabling
+	  regions when there might be conflicting contents in the CPU cache.
+
+	  If unsure, or if this kernel is meant for production environments,
+	  say N.
+
 endif
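
The policy in the help text above (fail region enable when caches cannot be invalidated, unless the test bypass is selected) can be sketched as a small decision function. This is a userspace model under stated assumptions: `struct mock_platform`, its fields, and the -ENXIO return value are hypothetical, and the bypass is a runtime flag here, whereas in the kernel CONFIG_CXL_REGION_INVALIDATION_TEST is a build-time option.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical description of the platform and build configuration. */
struct mock_platform {
	bool has_cache_invalidate; /* e.g. false inside a guest VM */
	bool test_bypass;          /* models CXL_REGION_INVALIDATION_TEST=y */
};

/*
 * Decide whether a region may be enabled.
 * Returns 0 to proceed or -ENXIO (assumed error code) to fail the enable.
 * *flushed reports whether caches were actually invalidated, so a caller
 * can tell a real flush from a bypassed one.
 */
static int mock_region_invalidate(const struct mock_platform *p, bool *flushed)
{
	*flushed = false;

	if (!p->has_cache_invalidate) {
		if (p->test_bypass)
			return 0; /* proceed despite possibly stale cache lines */
		return -ENXIO;    /* production default: refuse to enable */
	}

	/* A real implementation would invalidate CPU caches here. */
	*flushed = true;
	return 0;
}
```

Reporting `flushed` separately keeps the bypass visible to the caller, which matches the help text's framing of the option as a testing aid that trades away a data integrity safety.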

drivers/cxl/Makefile

Lines changed: 1 addition & 1 deletion
@@ -9,5 +9,5 @@ obj-$(CONFIG_CXL_PORT) += cxl_port.o
 cxl_mem-y := mem.o
 cxl_pci-y := pci.o
 cxl_acpi-y := acpi.o
-cxl_pmem-y := pmem.o
+cxl_pmem-y := pmem.o security.o
 cxl_port-y := port.o
