path: root/drivers/cxl/core
Age | Commit message | Author | Files | Lines
7 days | Merge tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl | Linus Torvalds | 6 | -80/+100
Pull CXL updates from Dave Jiang:

"Core:
 - A CXL maturity map has been added to the documentation to detail the current state of CXL enabling. It provides the status of the current state of various CXL features to inform current and future contributors of where things are and which areas need contribution.
 - A notifier handler has been added in order for a newly created CXL memory region to trigger the abstract distance metrics calculation. This should bring parity for CXL memory to the same level vs hotplugged DRAM for NUMA abstract distance calculation. The abstract distance reflects relative performance used for memory tiering handling.
 - An addition for XOR math has been added to address the CXL DPA to SPA translation. CXL address translation did not support address interleave math with XOR prior to this change.

Fixes:
 - Fix to address race condition in the CXL memory hotplug notifier
 - Add missing MODULE_DESCRIPTION() for CXL modules
 - Fix incorrect vendor debug UUID define

Misc:
 - A warning has been added to inform users of an unsupported configuration when mixing CXL VH and RCH/RCD hierarchies
 - The ENXIO error code has been replaced with EBUSY for inject poison limit reached via debugfs and cxl-test support
 - Moving the PCI config read in cxl_dvsec_rr_decode() to avoid unnecessary PCI config reads
 - A refactor to a common struct for DRAM and general media CXL events"

* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl/core/pci: Move reading of control register to immediately before usage
  cxl: Remove defunct code calculating host bridge target positions
  cxl/region: Verify target positions using the ordered target list
  cxl: Restore XOR'd position bits during address translation
  cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
  cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
  cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
  cxl/core: Fix incorrect vendor debug UUID define
  Documentation: CXL Maturity Map
  cxl/region: Simplify cxl_region_nid()
  cxl/region: Support to calculate memory tier abstract distance
  cxl/region: Fix a race condition in memory hotplug notifier
  cxl: add missing MODULE_DESCRIPTION() macros
  cxl/events: Use a common struct for DRAM and General Media events
10 days | Merge tag 'driver-core-6.11-rc1' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. 
- minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) ARM: sa1100: make match function take a const pointer sysfs/cpu: Make crash_hotplug attribute world-readable dio: Have dio_bus_match() callback take a const * zorro: make match function take a const pointer driver core: module: make module_[add|remove]_driver take a const * driver core: make driver_find_device() take a const * driver core: make driver_[create|remove]_file take a const * firmware_loader: fix soundness issue in `request_internal` firmware_loader: annotate doctests as `no_run` devres: Correct code style for functions that return a pointer type devres: Initialize an uninitialized struct member devres: Fix memory leakage caused by driver API devm_free_percpu() devres: Fix devm_krealloc() wasting memory driver core: platform: Switch to use kmemdup_array() driver core: have match() callback in struct bus_type take a const * MAINTAINERS: add Rust device abstractions to DRIVER CORE device: rust: improve safety comments MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER firmware: rust: improve safety comments ...
2024-07-17 | cxl/core/pci: Move reading of control register to immediately before usage | Foryun Ma | 1 | -4/+4
Relocate the reading of the DVSEC control register to immediately before usage and avoid unnecessary PCI config access from the read if DVSEC capability check, hdm_count check, or device validity check results in failure. Signed-off-by: Foryun Ma <foryun.ma@jaguarmicro.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://patch.msgid.link/20240604032151.655-1-foryun.ma@jaguarmicro.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-12 | Merge branch 'for-6.11/xor_fixes' into cxl-for-next | Dave Jiang | 5 | -55/+38
Series to fix XOR math for DPA to SPA translation
 - Refactor and fold cxl_trace_hpa() into cxl_dpa_to_hpa()
 - Complete DPA->HPA->SPA translation and correct XOR translation issue
 - Add new method to verify a CXL target position
 - Remove old method of CXL target position verification
2024-07-12 | cxl: Remove defunct code calculating host bridge target positions | Alison Schofield | 1 | -19/+1
The CXL Spec 3.1 Table 9-22 requires that the BIOS populate the CFMWS target list in interleave target order. This means the calculations the CXL driver added to determine positions when XOR math is in use, along with the entire XOR vs Modulo call back setup is not needed. A prior patch added a common method to verify positions. Remove the now unused code related to the cxl_calc_hb_fn. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/2e2c32a2d0f1007e920b58712d15edad2e48d857.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-12 | cxl/region: Verify target positions using the ordered target list | Alison Schofield | 1 | -1/+4
When a root decoder is configured the interleave target list is read from the BIOS populated CFMWS structure. Per the CXL spec 3.1 Table 9-22 the target list is in interleave order. The CXL driver populates its decoder target list in the same order and stores it in 'struct cxl_switch_decoder' field "@target: active ordered target list in current decoder configuration" Given the promise of an ordered list, the driver can stop duplicating the work of BIOS and simply check target positions against the ordered list during region configuration. The simplified check against the ordered list is presented here. A follow-on patch will remove the unused code. For Modulo arithmetic this is not a fix, only a simplification. For XOR arithmetic this is a fix for HB IW of 3,6,12. Fixes: f9db85bfec0d ("cxl/acpi: Support CXL XOR Interleave Math (CXIMS)") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/35d08d3aba08fee0f9b86ab1cef0c25116ca8a55.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
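The simplified check described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct dport`, `struct switch_decoder`, and `target_position_ok()` are hypothetical stand-ins rather than the kernel's definitions, and the choice to wrap the endpoint position with a modulo over the number of host bridge targets is an assumption about how a region position maps onto the ordered list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a downstream port (e.g. a host bridge). */
struct dport {
	int id;
};

/* Hypothetical stand-in for a decoder holding the BIOS-ordered target list. */
struct switch_decoder {
	size_t nr_targets;
	const struct dport *target[16];
};

/*
 * Return true when the dport reached by an endpoint is the one the ordered
 * target list expects for that region position. Assumption: positions wrap
 * around the list, so position `pos` maps to target[pos % nr_targets].
 */
bool target_position_ok(const struct switch_decoder *sd,
			const struct dport *dport, size_t pos)
{
	assert(sd->nr_targets > 0 && sd->nr_targets <= 16);
	return sd->target[pos % sd->nr_targets] == dport;
}
```

Because the list is already in interleave order, no XOR or Modulo specific position calculation is needed in this sketch; a plain lookup and pointer comparison is enough.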
2024-07-12 | cxl: Restore XOR'd position bits during address translation | Alison Schofield | 1 | -9/+14
When a device reports a DPA in events like poison, general_media, and dram, the driver translates that DPA back to an HPA. Presently, the CXL driver translation only considers the Modulo position and will report the wrong HPA for XOR configured root decoders. Add a helper function that restores the XOR'd bits during DPA->HPA address translation. Plumb a root decoder callback to the new helper when XOR interleave arithmetic is in use. For Modulo arithmetic, just let the callback be NULL - as in no extra work required. Upon completion of a DPA->HPA translation a couple of checks are performed on the result. One simply confirms that the calculated HPA is within the address range of the region. That test is useful for both Modulo and XOR interleave arithmetic decodes. A second check confirms that the HPA is within an expected chunk based on the endpoints position in the region and the region granularity. An XOR decode disrupts the Modulo pattern making the chunk check useless. To align the checks with the proper decode, pull the region range check inline and use the helper to do the chunk check for Modulo decodes only. A cxl-test unit test is posted for upstream review here: https://lore.kernel.org/20240624210644.495563-1-alison.schofield@intel.com/ Fixes: 28a3ae4ff66c ("cxl/trace: Add an HPA to cxl_poison trace events") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Diego Garcia Rodriguez <diego.garcia.rodriguez@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/1a1ac880d9f889bd6384e657e810431b9a0a72e5.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
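A minimal userspace sketch of what "restoring the XOR'd bits" could look like follows. It is not the kernel helper: the function names are hypothetical, and it assumes that each XOR map's position bit is its lowest set bit and that the restored value of that bit is the XOR of all HPA bits selected by the map. The `__builtin_popcountll()` and `__builtin_ctzll()` builtins are GCC/Clang extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR of all bits in v: 1 when an odd number of bits are set, else 0. */
unsigned int xor_all_bits(uint64_t v)
{
	return (unsigned int)__builtin_popcountll(v) & 1u;
}

/*
 * For each non-zero map, recompute the position bit (assumed to be the
 * lowest set bit of the map) as the XOR of the HPA bits the map selects,
 * and write it back into the address. Zero maps are skipped.
 */
uint64_t restore_xor_pos(uint64_t hpa, const uint64_t *xormaps, size_t nr_maps)
{
	assert(xormaps || nr_maps == 0);

	for (size_t i = 0; i < nr_maps; i++) {
		uint64_t map = xormaps[i];
		unsigned int pos;
		uint64_t val;

		if (!map)
			continue;

		pos = (unsigned int)__builtin_ctzll(map);
		val = xor_all_bits(hpa & map);
		hpa = (hpa & ~(1ULL << pos)) | (val << pos);
	}

	return hpa;
}
```

With no maps the address passes through unchanged, which mirrors the commit's choice to leave the callback NULL for Modulo arithmetic.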
2024-07-12 | cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa() | Alison Schofield | 4 | -27/+20
Although cxl_trace_hpa() is used to populate TRACE EVENTs with HPA addresses the work it performs is a DPA to HPA translation not a trace. Tidy up this naming by moving the minimal work done in cxl_trace_hpa() into cxl_dpa_to_hpa() and use cxl_dpa_to_hpa() for trace event callbacks. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/452a9b0c525b774c72d9d5851515ffa928750132.1719980933.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-03 | driver core: have match() callback in struct bus_type take a const * | Greg Kroah-Hartman | 1 | -1/+1
In the match() callback, the struct device_driver * should not be changed, so change the function callback to be a const *. This is one step of many towards making the driver core safe to have struct device_driver in read-only memory. Because the match() callback is in all busses, all busses are modified to handle this properly. This does entail switching some container_of() calls to container_of_const() to properly handle the constant *. For some busses, like PCI and USB and HV, the const * is cast away in the match callback as those busses do want to modify those structures at this point in time (they have a local lock in the driver structure.) That will have to be changed in the future if they wish to have their struct device * in read-only-memory. Cc: Rafael J. Wysocki <rafael@kernel.org> Reviewed-by: Alex Elder <elder@kernel.org> Acked-by: Sumit Garg <sumit.garg@linaro.org> Link: https://lore.kernel.org/r/2024070136-wrongdoer-busily-01e8@gregkh Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-02 | cxl/region: Simplify cxl_region_nid() | Huang Ying | 1 | -6/+4
The node ID of the region can be gotten via resource start address directly. This simplifies the implementation of cxl_region_nid(). Signed-off-by: Huang Ying <ying.huang@intel.com> Suggested-by: Alison Schofield <alison.schofield@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240618084639.1419629-4-ying.huang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-02 | cxl/region: Support to calculate memory tier abstract distance | Huang Ying | 1 | -0/+27
An abstract distance value must be assigned by the driver that makes the memory available to the system. It reflects relative performance and is used to place memory nodes backed by CXL regions in the appropriate memory tiers allowing promotion/demotion within the existing memory tiering mechanism. The abstract distance is calculated based on the memory access latency and bandwidth of CXL regions. Signed-off-by: Huang, Ying <ying.huang@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240618084639.1419629-3-ying.huang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-02 | cxl/region: Fix a race condition in memory hotplug notifier | Huang Ying | 1 | -4/+15
In the memory hotplug notifier function of the CXL region, cxl_region_perf_attrs_callback(), the node ID is obtained by checking the host address range of the region. However, the address range information is not available when the region is registered in devm_cxl_add_region(). Additionally, this information may be removed or added under the protection of cxl_region_rwsem during runtime. If the memory notifier is called for nodes other than that backed by the region, a race condition may occur, potentially leading to a NULL dereference or an invalid address range. The race condition is addressed by checking the availability of the address range information under the protection of cxl_region_rwsem. To enhance code readability and use guard(), the relevant code has been moved into a newly added function: cxl_region_nid(). Fixes: 067353a46d8c ("cxl/region: Add memory hotplug notifier for cxl region") Signed-off-by: Huang, Ying <ying.huang@intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Bharata B Rao <bharata@amd.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240618084639.1419629-2-ying.huang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-02 | cxl: add missing MODULE_DESCRIPTION() macros | Jeff Johnson | 1 | -0/+1
make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cxl/core/cxl_core.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cxl/cxl_pci.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cxl/cxl_mem.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cxl/cxl_acpi.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cxl/cxl_pmem.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/cxl/cxl_port.o Add the missing invocations of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240607-md-drivers-cxl-v2-1-0c61d95ee7a7@quicinc.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-07-02 | cxl/events: Use a common struct for DRAM and General Media events | Fabio M. De Francesco | 2 | -17/+17
cxl_event_common was an unfortunate naming choice and caused confusion with the existing Common Event Record. Furthermore, its fields didn't map all the common information between DRAM and General Media Events. Remove cxl_event_common and introduce cxl_event_media_hdr to record common information between DRAM and General Media events. cxl_event_media_hdr, which is embedded in both cxl_event_gen_media and cxl_event_dram, leverages the commonalities between the two events to simplify their respective handling. Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240607144423.48681-1-fabio.m.de.francesco@linux.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-06-26 | cxl/region: check interleave capability | Yao Xingtao | 2 | -0/+95
Since interleave capability is not verified, if the interleave capability of a target does not match the region need, committing decoder should have failed at the device end. In order to checkout this error as quickly as possible, driver needs to check the interleave capability of target during attaching it to region. Per CXL specification r3.1(8.2.4.20.1 CXL HDM Decoder Capability Register), bits 11 and 12 indicate the capability to establish interleaving in 3, 6, 12 and 16 ways. If these bits are not set, the target cannot be attached to a region utilizing such interleave ways. Additionally, bits 8 and 9 represent the capability of the bits used for interleaving in the address, Linux tracks this in the cxl_port interleave_mask. Per CXL specification r3.1(8.2.4.20.13 Decoder Protection): eIW means encoded Interleave Ways. eIG means encoded Interleave Granularity. in HPA: if eIW is 0 or 8 (interleave ways: 1, 3), all the bits of HPA are used, the interleave bits are none, the following check is ignored. if eIW is less than 8 (interleave ways: 2, 4, 8, 16), the interleave bits start at bit position eIG + 8 and end at eIG + eIW + 8 - 1. if eIW is greater than 8 (interleave ways: 6, 12), the interleave bits start at bit position eIG + 8 and end at eIG + eIW - 1. if the interleave mask is insufficient to cover the required interleave bits, the target cannot be attached to the region. Fixes: 384e624bb211 ("cxl/region: Attach endpoint decoders") Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240614084755.59503-2-yaoxt.fnst@fujitsu.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
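The bit-range rules quoted above can be turned into a small userspace sketch. The helper names below are hypothetical and `supported_mask` merely stands in for whatever interleave mask the driver tracks for the target; only the eIW/eIG arithmetic is taken from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build a mask with bits lo..hi (inclusive) set. */
uint64_t bit_range_mask(unsigned int hi, unsigned int lo)
{
	assert(lo <= hi && hi < 64);
	return (~0ULL >> (63 - hi)) & (~0ULL << lo);
}

/* HPA bits consumed by interleaving for the given encoded ways/granularity. */
uint64_t required_interleave_mask(unsigned int eiw, unsigned int eig)
{
	unsigned int lo, hi;

	/* eIW 0 or 8 (1-way or 3-way): no HPA bits select the target. */
	if (eiw == 0 || eiw == 8)
		return 0;

	lo = eig + 8;
	if (eiw < 8)
		hi = eig + eiw + 8 - 1;	/* 2, 4, 8, 16 ways */
	else
		hi = eig + eiw - 1;	/* 6, 12 ways */

	return bit_range_mask(hi, lo);
}

/* True when the target's supported mask covers every required bit. */
bool interleave_supported(uint64_t supported_mask,
			  unsigned int eiw, unsigned int eig)
{
	return (required_interleave_mask(eiw, eig) & ~supported_mask) == 0;
}
```

For example, eIW = 4 (16 ways) with eIG = 0 needs HPA bits 8 through 11, so a target whose mask lacks any of those bits would be rejected at attach time in this sketch.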
2024-06-26 | cxl/region: Avoid null pointer dereference in region lookup | Alison Schofield | 1 | -4/+15
cxl_dpa_to_region() looks up a region based on a memdev and DPA. It wrongly assumes an endpoint found mapping the DPA is also of a fully assembled region. When not true it leads to a null pointer dereference looking up the region name. This appears during testing of region lookup after a failure to assemble a BIOS defined region or if the lookup raced with the assembly of the BIOS defined region. Failure to clean up BIOS defined regions that fail assembly is an issue in itself and a fix to that problem will alleviate some of the impact. It will not alleviate the race condition so let's harden this path. The behavior change is that the kernel oops due to a null pointer dereference is replaced with a dev_dbg() message noting that an endpoint was mapped. Additional comments are added so that future users of this function can more clearly understand what it provides. Fixes: 0a105ab28a4d ("cxl/memdev: Warn of poison inject or clear to a mapped region") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://patch.msgid.link/20240604003609.202682-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-06-19 | cxl/mem: Fix no cxl_nvd during pmem region auto-assembling | Li Ming | 2 | -6/+12
When the CXL subsystem auto-assembles a pmem region during cxl endpoint port probing, it always hits the call trace below.

    BUG: kernel NULL pointer dereference, address: 0000000000000078
    #PF: supervisor read access in kernel mode
    #PF: error_code(0x0000) - not-present page
    RIP: 0010:cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
    Call Trace:
     <TASK>
     ? __die+0x24/0x70
     ? page_fault_oops+0x82/0x160
     ? do_user_addr_fault+0x65/0x6b0
     ? exc_page_fault+0x7d/0x170
     ? asm_exc_page_fault+0x26/0x30
     ? cxl_pmem_region_probe+0x22e/0x360 [cxl_pmem]
     ? cxl_pmem_region_probe+0x1ac/0x360 [cxl_pmem]
     cxl_bus_probe+0x1b/0x60 [cxl_core]
     really_probe+0x173/0x410
     ? __pfx___device_attach_driver+0x10/0x10
     __driver_probe_device+0x80/0x170
     driver_probe_device+0x1e/0x90
     __device_attach_driver+0x90/0x120
     bus_for_each_drv+0x84/0xe0
     __device_attach+0xbc/0x1f0
     bus_probe_device+0x90/0xa0
     device_add+0x51c/0x710
     devm_cxl_add_pmem_region+0x1b5/0x380 [cxl_core]
     cxl_bus_probe+0x1b/0x60 [cxl_core]

The cxl_nvd of the memdev needs to be available during the pmem region probe. Currently the cxl_nvd is registered after the endpoint port probe. The endpoint probe, in the case of autoassembly of regions, can cause a pmem region probe requiring the not yet available cxl_nvd. Adjust the sequence so this dependency is met. This requires adding a port parameter to cxl_find_nvdimm_bridge() that can be used to query the ancestor root port. The endpoint port is not yet available, but will share a common ancestor with its parent, so start the query from there instead.

Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue")
Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Ming <ming4.li@intel.com>
Tested-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20240612064423.2567625-1-ming4.li@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-05-29 | cxl/region: Fix memregion leaks in devm_cxl_add_region() | Li Zhijian | 1 | -9/+9
Move the mode verification to __create_region() before allocating the memregion to avoid the memregion leaks. Fixes: 6e099264185d ("cxl/region: Add volatile region creation support") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240507053421.456439-1-lizhijian@fujitsu.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-05-23 | tracing/treewide: Remove second parameter of __assign_str() | Steven Rostedt (Google) | 1 | -16/+16
With the rework of how the __string() handles dynamic strings where it saves off the source string in field in the helper structure[1], the assignment of that value to the trace event field is stored in the helper value and does not need to be passed in again. This means that with: __string(field, mystring) Which use to be assigned with __assign_str(field, mystring), no longer needs the second parameter and it is unused. With this, __assign_str() will now only get a single parameter. There's over 700 users of __assign_str() and because coccinelle does not handle the TRACE_EVENT() macro I ended up using the following sed script: git grep -l __assign_str | while read a ; do sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file; mv /tmp/test-file $a; done I then searched for __assign_str() that did not end with ';' as those were multi line assignments that the sed script above would fail to catch. Note, the same updates will need to be done for: __assign_str_len() __assign_rel_str() __assign_rel_str_len() I tested this with both an allmodconfig and an allyesconfig (build only for both). [1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/ Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts. Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal Acked-by: Takashi Iwai <tiwai@suse.de> Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-21 | Merge tag 'pci-v6.10-changes' of ↵ | Linus Torvalds | 2 | -4/+33
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci updates from Bjorn Helgaas: "Enumeration: - Skip E820 checks for MCFG ECAM regions for new (2016+) machines, since there's no requirement to describe them in E820 and some platforms require ECAM to work (Bjorn Helgaas) - Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien Le Moal) - Remove last user and pci_enable_device_io() (Heiner Kallweit) - Wait for Link Training==0 to avoid possible race (Ilpo Järvinen) - Skip waiting for devices that have been disconnected while suspended (Ilpo Järvinen) - Clear Secondary Status errors after enumeration since Master Aborts and Unsupported Request errors are an expected part of enumeration (Vidya Sagar) MSI: - Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas) Error handling: - Mask Genesys GL975x SD host controller Replay Timer Timeout correctable errors caused by a hardware defect; the errors cause interrupts that prevent system suspend (Kai-Heng Feng) - Fix EDR-related _DSM support, which previously evaluated revision 5 but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan) ASPM: - Simplify link state definitions and mask calculation (Ilpo Järvinen) Power management: - Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS apparently doesn't know how to put them back in D0 (Mario Limonciello) CXL: - Support resetting CXL devices; special handling required because CXL Ports mask Secondary Bus Reset by default (Dave Jiang) DOE: - Support DOE Discovery Version 2 (Alexey Kardashevskiy) Endpoint framework: - Set endpoint BAR to be 64-bit if the driver says that's all the device supports, in addition to doing so if the size is >2GB (Niklas Cassel) - Simplify endpoint BAR allocation and setting interfaces (Niklas Cassel) Cadence PCIe controller driver: - Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof Kozlowski) Cadence PCIe endpoint driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not 
the BAR value (Niklas Cassel) Freescale Layerscape PCIe controller driver: - Convert DT binding to YAML (Frank Li) MediaTek MT7621 PCIe controller driver: - Add DT binding missing 'reg' property for child Root Ports (Krzysztof Kozlowski) - Fix theoretical string truncation in PHY name (Sergio Paracuellos) NVIDIA Tegra194 PCIe controller driver: - Return success for endpoint probe instead of falling through to the failure path (Vidya Sagar) Renesas R-Car PCIe controller driver: - Add DT binding missing IOMMU properties (Geert Uytterhoeven) - Add DT binding R-Car V4H compatible for host and endpoint mode (Yoshihiro Shimoda) Rockchip PCIe controller driver: - Configure endpoint BARs to be 64-bit based on the BAR type, not the BAR value (Niklas Cassel) - Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski) - Set the Subsystem Vendor ID, which was previously zero because it was masked incorrectly (Rick Wertenbroek) Synopsys DesignWare PCIe controller driver: - Restructure DBI register access to accommodate devices where this requires Refclk to be active (Manivannan Sadhasivam) - Remove the deinit() callback, which was only need by the pcie-rcar-gen4, and do it directly in that driver (Manivannan Sadhasivam) - Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean up things like eDMA (Manivannan Sadhasivam) - Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel to dw_pcie_ep_init() (Manivannan Sadhasivam) - Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to reflect the actual functionality (Manivannan Sadhasivam) - Call dw_pcie_ep_init_registers() directly from all the glue drivers, not just those that require active Refclk from the host (Manivannan Sadhasivam) - Remove the "core_init_notifier" flag, which was an obscure way for glue drivers to indicate that they depend on Refclk from the host (Manivannan Sadhasivam) TI J721E PCIe driver: - Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli) - Add DT binding 
J722S SoC support (Siddharth Vadapalli) TI Keystone PCIe controller driver: - Add DT binding missing num-viewport, phys and phy-name properties (Jan Kiszka) Miscellaneous: - Constify and annotate with __ro_after_init (Heiner Kallweit) - Convert DT bindings to YAML (Krzysztof Kozlowski) - Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming Zhou)" * tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits) PCI: Do not wait for disconnected devices when resuming x86/pci: Skip early E820 check for ECAM region PCI: Remove unused pci_enable_device_io() ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io() PCI: Update pci_find_capability() stub return types PCI: Remove PCI_IRQ_LEGACY scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios Revert "genirq/msi: Provide constants for PCI/IMS support" Revert "x86/apic/msi: Enable PCI/IMS" Revert "iommu/vt-d: Enable PCI/IMS" Revert "iommu/amd: Enable PCI/IMS" Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support" ...
2024-05-16 | Merge tag 'cxl-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl | Linus Torvalds | 7 | -184/+253
Pull CXL updates from Dave Jiang:

 - Three CXL mailbox passthrough commands are added to support the populating and clearing of vendor debug logs:
    - Get Log Capabilities
    - Get Supported Log Sub-List Commands
    - Clear Log
 - Add support of Device Physical Address (DPA) to Host Physical Address (HPA) translation for CXL events of cxl_dram and cxl_general_media. This allows user space to figure out, via the trace event, which CXL region the event occurred in.
 - Connect CXL to CPER reporting. If a device is configured for firmware first, CXL event records are not sent directly to the host. Those records are reported through EFI Common Platform Error Records (CPER). Add support to route the CPER records through the CXL sub-system in order to provide DPA to HPA translation and also event decoding and tracing. This is useful for users to determine which system issues may correspond to specific hardware events.
 - A number of misc cleanups and fixes:
    - Fix for compile warning of cxl_security_ops
    - Add debug message for invalid interleave granularity
    - Enhancement to cxl-test event testing
    - Add dev_warn() on unsupported mixed mode decoder
    - Fix use of phys_to_target_node() for x86
    - Use helper function for decoder enum instead of open coding
    - Include missing headers for cxl-event
    - Fix MAINTAINERS file entry
    - Fix cxlr_pmem memory leak
    - Cleanup __cxl_parse_cfmws via scope-based resource management
    - Convert cxl_pmem_region_alloc() to scope-based resource management

* tag 'cxl-for-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (21 commits)
  cxl/cper: Remove duplicated GUID defines
  cxl/cper: Fix non-ACPI-APEI-GHES build
  cxl/pci: Process CPER events
  acpi/ghes: Process CXL Component Events
  cxl/region: Convert cxl_pmem_region_alloc to scope-based resource management
  cxl/acpi: Cleanup __cxl_parse_cfmws()
  cxl/region: Fix cxlr_pmem leaks
  cxl/core: Add region info to cxl_general_media and cxl_dram events
  cxl/region: Move cxl_trace_hpa() work to the region driver
  cxl/region: Move cxl_dpa_to_region() work to the region driver
  cxl/trace: Correct DPA field masks for general_media & dram events
  MAINTAINERS: repair file entry in COMPUTE EXPRESS LINK
  cxl/cxl-event: include missing <linux/types.h> and <linux/uuid.h>
  cxl/hdm: Debug, use decoder name function
  cxl: Fix use of phys_to_target_node() for x86
  cxl/hdm: dev_warn() on unsupported mixed mode decoder
  cxl/test: Enhance event testing
  cxl/hdm: Add debug message for invalid interleave granularity
  cxl: Fix compile warning for cxl_security_ops extern
  cxl/mbox: Add Clear Log mailbox command
  ...
2024-05-08cxl: Add post-reset warning if reset results in loss of previously committed ↵Dave Jiang1-0/+29
HDM decoders
Secondary Bus Reset (SBR) is equivalent to a device being hot removed and inserted again. Doing an SBR on a CXL type 3 device is problematic if the exported device memory is part of system memory that cannot be offlined. The event is equivalent to violently ripping out that range of memory from the kernel. While the hardware requires the "Unmask SBR" bit set in the Port Control Extensions register and the kernel currently does not unmask it, a user can unmask this bit via setpci or a similar tool. The driver does not have a way to detect whether a reset coming from the PCI subsystem is a Function Level Reset (FLR) or SBR. The only way to detect this is to note if a decoder is marked as enabled in software but the decoder control register indicates it's not committed. Add a helper function to find discrepancies between the decoder software state and the hardware register state. Suggested-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240502165851.1948523-6-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
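The detection idea described above (enabled in software, not committed in hardware) can be sketched in plain C. This is a minimal userspace illustration, not the kernel helper: the struct, the flag, and the bit position are hypothetical stand-ins chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag and bit position, chosen for illustration only. */
#define SKETCH_DECODER_ENABLED (1u << 0)
#define SKETCH_CTRL_COMMITTED  (1u << 10)

struct sketch_decoder {
	unsigned int flags;   /* software view of the decoder */
	uint32_t ctrl;        /* value read back from the control register */
};

/* True when software believes the decoder is live but hardware lost the commit. */
static bool decoder_lost_commit(const struct sketch_decoder *d)
{
	bool sw_enabled = (d->flags & SKETCH_DECODER_ENABLED) != 0;
	bool hw_committed = (d->ctrl & SKETCH_CTRL_COMMITTED) != 0;

	return sw_enabled && !hw_committed;
}

/* Count decoders whose commit did not survive the reset. */
static size_t count_lost_decoders(const struct sketch_decoder *decs, size_t n)
{
	size_t lost = 0;

	assert(decs != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (decoder_lost_commit(&decs[i]))
			lost++;
	return lost;
}
```

A non-zero count after a reset is the condition that would justify a warning in this model; a decoder that was never enabled in software is ignored.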
2024-05-08PCI/CXL: Move CXL Vendor ID to pci_ids.hDave Jiang2-4/+4
Move PCI_DVSEC_VENDOR_ID_CXL in CXL private code to PCI_VENDOR_ID_CXL in pci_ids.h in order to be utilized in PCI subsystem. While the CXL Vendor ID (0x1e98) is not listed in the PCI SIG "Member Companies" database at https://pcisig.com/membership/member-companies, the SIG has confirmed that it is reserved by CXL. Link: https://lore.kernel.org/r/20240502165851.1948523-2-dave.jiang@intel.com Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Link: https://lore.kernel.org/linux-cxl/20240402172323.GA1818777@bhelgaas/ Signed-off-by: Dave Jiang <dave.jiang@intel.com> [bhelgaas: update commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-01cxl/region: Convert cxl_pmem_region_alloc to scope-based resource managementDan Williams1-26/+17
A recent bugfix to cxl_pmem_region_alloc() to fix an error-unwind-memleak [1] highlighted a use case for scope-based resource management. Delete the goto for releasing @cxl_region_rwsem, and return error codes directly from error condition paths. The caller, devm_cxl_add_pmem_region(), is no longer given @cxlr_pmem directly; it must retrieve it from @cxlr->cxlr_pmem. This retrieval from @cxlr was already in place for @cxlr->cxl_nvb, and converting cxl_pmem_region_alloc() to return an int makes it less awkward to handle no_free_ptr(). Cc: Li Zhijian <lizhijian@fujitsu.com> Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Closes: http://lore.kernel.org/r/20240430174540.000039ce@Huawei.com Link: http://lore.kernel.org/r/20240428030748.318985-1-lizhijian@fujitsu.com Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/171451430965.1147997.15782562063090960666.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
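The pattern this commit adopts (automatic cleanup on every return path, with an explicit ownership hand-off on success) can be approximated in userspace with the GCC/Clang cleanup attribute. The sketch below is an analogy only: the macro and function names are hypothetical, and it does not reproduce the kernel's own helpers or the real cxl_pmem_region_alloc() logic.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-ins for scope-based helpers; all names are hypothetical. */
static void sketch_free(void *p)
{
	free(*(void **)p);    /* p points at the annotated variable */
}
#define SKETCH_AUTO_FREE __attribute__((cleanup(sketch_free)))

/* Hand ownership to the caller: yield the pointer and disarm the cleanup. */
#define sketch_no_free_ptr(p) \
	({ __typeof__(p) tmp_ = (p); (p) = NULL; tmp_; })

struct sketch_region {
	int *pmem;            /* owned by the region once allocation succeeds */
};

/* Returns 0 or a negative errno; every early return frees buf automatically. */
static int sketch_region_alloc(struct sketch_region *r, int nr_targets)
{
	int *buf SKETCH_AUTO_FREE = calloc(4, sizeof(*buf));

	assert(r != NULL);
	if (!buf)
		return -ENOMEM;
	if (nr_targets <= 0 || nr_targets > 4)
		return -EINVAL;   /* no goto and no explicit free() needed */

	for (int i = 0; i < nr_targets; i++)
		buf[i] = i;

	r->pmem = sketch_no_free_ptr(buf);
	return 0;
}
```

Returning an int and publishing the result through the region structure, as the commit message describes, keeps the hand-off to a single statement at the end of the function.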
2024-05-01cxl/region: Fix cxlr_pmem leaksLi Zhijian1-0/+1
Before this error path, cxlr_pmem points to kzalloc()'d memory; free it to avoid leaking that memory. Fixes: f17b558d6663 ("cxl/pmem: Refactor nvdimm device registration, delete the workqueue") Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240428030748.318985-1-lizhijian@fujitsu.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30Merge remote-tracking branch 'cxl/for-6.10/dpa-to-hpa' into cxl-for-nextDave Jiang6-154/+216
Support for DPA to HPA translation for CXL events cxl_dram and cxl_general_media.
2024-04-30cxl/core: Add region info to cxl_general_media and cxl_dram eventsAlison Schofield2-15/+65
User space may need to know which region, if any, maps the DPAs (device physical addresses) reported in a cxl_general_media or cxl_dram event. Since the mapping can change, the kernel provides this information at the time the event occurs. This informs user space that at event <timestamp> this <region> mapped this <DPA> to this <HPA>. Add the same region info that is included in the cxl_poison trace event: the DPA->HPA translation, region name, and region uuid. The new fields are inserted in the trace event and no existing fields are modified. If the DPA is not mapped, user will see: hpa=ULLONG_MAX, region="", and uuid=0 This work must be protected by dpa_rwsem & region_rwsem since it is looking up region mappings. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/dd8d708b7a7ebfb64a27020a5eb338091336b34d.1714496730.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30cxl/region: Move cxl_trace_hpa() work to the region driverAlison Schofield4-93/+98
This work belongs in the region driver as it is only useful with CONFIG_CXL_REGION. Add a stub in core.h for when the region driver is not built. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/183222631f11a43c5e6debc42ec22fe1bd4b818a.1714496730.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30cxl/region: Move cxl_dpa_to_region() work to the region driverAlison Schofield3-44/+51
This helper belongs in the region driver as it is only useful with CONFIG_CXL_REGION. Add a stub in core.h for when the region driver is not built. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/05e30f788d62b3dd398aff2d2ea50a6aaa7c3313.1714496730.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30cxl/trace: Correct DPA field masks for general_media & dram eventsAlison Schofield1-2/+2
The length of Physical Address in General Media and DRAM event records is 64-bit, so the field mask for extracting the DPA should be 64-bit also, otherwise the trace event reports DPAs with the upper 32 bits of a DPA address masked off. If users do DPA-to-HPA translations this could lead to incorrect page retirement decisions. Use GENMASK_ULL() for CXL_DPA_MASK to get all the DPA address bits. Tidy up CXL_DPA_FLAGS_MASK by using GENMASK() to only mask the exact flag bits. These bits are defined as part of the event record physical address descriptions of General Media and DRAM events in CXL Spec 3.1 Section 8.2.9.2 Events. Fixes: d54a531a430b ("cxl/mem: Trace General Media Event Record") Co-developed-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/2867fc43c57720a4a15a3179431829b8dbd2dc16.1714496730.git.alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
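The effect of a too-narrow mask on a 64-bit address field can be shown with a small userspace sketch. The macros below are local equivalents written for this example, and the bit layout (two flag bits at the bottom, address bits from bit 6 upward) is an assumption made for illustration rather than a quotation of the kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Local mask builders for userspace; not the kernel's own macros. */
#define SKETCH_GENMASK_ULL(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))
#define SKETCH_GENMASK_U32(h, l) \
	((~UINT32_C(0) << (l)) & (~UINT32_C(0) >> (31 - (h))))

/* Assumed layout: flags in bits 1:0, address in bits 63:6. */
#define SKETCH_DPA_FLAGS_MASK  SKETCH_GENMASK_ULL(1, 0)
#define SKETCH_DPA_MASK        SKETCH_GENMASK_ULL(63, 6)
/* Same shape built as a 32-bit value; it cannot cover bits 63:32. */
#define SKETCH_DPA_MASK_NARROW SKETCH_GENMASK_U32(31, 6)

/* Extract the address with a full-width mask. */
static uint64_t dpa_from_record(uint64_t phys_addr)
{
	return phys_addr & SKETCH_DPA_MASK;
}

/* Extract the address with the narrow mask, which zero-extends to 64 bits. */
static uint64_t dpa_from_record_narrow(uint64_t phys_addr)
{
	return phys_addr & SKETCH_DPA_MASK_NARROW;
}

/* Extract only the low flag bits. */
static unsigned int flags_from_record(uint64_t phys_addr)
{
	return (unsigned int)(phys_addr & SKETCH_DPA_FLAGS_MASK);
}
```

For an address above 4 GiB the narrow variant returns a different, smaller value, which is the kind of silent truncation the commit message warns about.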
2024-04-30Merge remote-tracking branch 'cxl/for-6.10/add-log-mbox-cmds' into cxl-for-nextDave Jiang1-0/+12
Add CXL log related mailbox commands - Add Get Log Capabilities command - Add Get Supported Log Sub-List Commands command - Add Clear Log command
2024-04-30cxl/hdm: Debug, use decoder name functionIra Weiny1-2/+1
The decoder enum has a name conversion function defined now. Use that instead of open coding. Suggested-by: Navneet Singh <navneet.singh@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230604-dcd-type2-upstream-v2-1-f740c47e7916@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30cxl/hdm: dev_warn() on unsupported mixed mode decoderAlison Schofield1-2/+2
A mixed mode decoder is programmed with device physical addresses that span both ram and pmem partitions of a memdev. Linux does not support mixed mode decoders. The driver rejects sysfs writes that try to set decoder mode to mixed, and if a resource being allocated is not wholly contained in either the pmem or ram partition of a memdev, it is also rejected. Basically, the CXL region driver is not going to create regions with mixed mode decoders, but the BIOS could. If the kernel driver sees the mixed mode decoder, it will fail to enable the region, and emit a dev_dbg() message. A dev_dbg() is not noisy enough in this case. Change the message to be a dev_warn() that explicitly says mixed mode is not supported. Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230218013834.31237-1-alison.schofield@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
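The "wholly contained in either partition" rule above amounts to a range-containment check. The following is a minimal sketch under assumed types; the struct, enum, and function names are hypothetical and the ranges use inclusive end addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_range {
	uint64_t start;
	uint64_t end;         /* inclusive */
};

enum sketch_mode { SKETCH_RAM, SKETCH_PMEM, SKETCH_MIXED };

/* True when inner lies entirely within outer. */
static bool range_contains(const struct sketch_range *outer,
			   const struct sketch_range *inner)
{
	return outer->start <= inner->start && inner->end <= outer->end;
}

/* Classify a DPA allocation against the two partitions of a memdev. */
static enum sketch_mode classify_dpa(const struct sketch_range *ram,
				     const struct sketch_range *pmem,
				     const struct sketch_range *alloc)
{
	assert(alloc->start <= alloc->end);
	if (range_contains(ram, alloc))
		return SKETCH_RAM;
	if (range_contains(pmem, alloc))
		return SKETCH_PMEM;
	return SKETCH_MIXED;  /* straddles both partitions or fits neither */
}
```

In this model, a caller would treat SKETCH_MIXED as the unsupported case and report it loudly instead of at debug level.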
2024-04-30cxl/hdm: Add debug message for invalid interleave granularityHuang Ying1-1/+5
There's no debug message for invalid interleave granularity. This makes it hard to debug related bugs. So, this is added in this patch. Signed-off-by: Huang, Ying <ying.huang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240402061016.388408-1-ying.huang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30cxl/mbox: Add Clear Log mailbox commandSrinivasulu Thanneeru1-0/+10
Adding UAPI support for CXL r3.1 8.2.9.5.4 Clear Log command. This proposed patch will be useful for clearing and populating the Vendor debug log in certain scenarios, allowing for the aggregation of results over time. Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240313071218.729-3-sthanneeru.opensrc@micron.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-30cxl/mbox: Add Get Log Capabilities and Get Supported Logs Sub-List commandsSrinivasulu Thanneeru1-0/+2
Adding UAPI support for 1. CXL r3.1 8.2.9.5.3 Get Log Capabilities. 2. CXL r3.1 8.2.9.5.6 Get Supported Logs Sub-List. Signed-off-by: Srinivasulu Thanneeru <sthanneeru.opensrc@micron.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20240313071218.729-2-sthanneeru.opensrc@micron.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-29cxl: Fix cxl_endpoint_get_perf_coordinate() support for RCHDave Jiang1-1/+14
Robert reported the following when booting a CXL host with Restricted CXL Host (RCH) topology: [ 39.815379] cxl_acpi ACPI0017:00: not a cxl_port device [ 39.827123] WARNING: CPU: 46 PID: 1754 at drivers/cxl/core/port.c:592 to_cxl_port+0x56/0x70 [cxl_core] ... plus some related subsequent NULL pointer dereference: [ 40.718708] BUG: kernel NULL pointer dereference, address: 00000000000002d8 The iterator to walk the PCIe path did not account for RCH topology. However RCH does not support hotplug and the memory exported by the Restricted CXL Device (RCD) should be covered by HMAT and therefore no access_coordinate is needed. Add check to see if the endpoint device is RCD and skip calculation. Also add a call to cxl_endpoint_get_perf_coordinates() in cxl_test in order to exercise the topology iterator. The dev_is_pci() check added is to help with this test and should be harmless for normal operation. Reported-by: Robert Richter <rrichter@amd.com> Closes: https://lore.kernel.org/all/Ziv8GfSMSbvlBB0h@rric.localdomain/ Fixes: 592780b8391f ("cxl: Fix retrieving of access_coordinates in PCIe path") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Tested-by: Robert Richter <rrichter@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> Link: https://lore.kernel.org/r/20240426224913.1027420-1-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-22cxl/core: Fix potential payload size confusion in cxl_mem_get_poison()Dan Williams1-21/+17
A recent change to cxl_mem_get_records_log() [1] highlighted a subtle nuance of looping calls to cxl_internal_send_cmd(), i.e. that cxl_internal_send_cmd() modifies the 'size_out' member of the @mbox_cmd argument. That mechanism is useful for communicating underflow, but it is unwanted when reusing @mbox_cmd for a subsequent submission. It turns out that cxl_xfer_log() avoids this scenario by always redefining @mbox_cmd each iteration. Update cxl_mem_get_records_log() and cxl_mem_get_poison() to follow the same style as cxl_xfer_log(), i.e. re-define @mbox_cmd each iteration. The cxl_mem_get_records_log() change is just a style fixup, but the cxl_mem_get_poison() change is a potential fix, per Alison [2]: Poison list retrieval can hit this case if the MORE flag is set and a follow on read of the list delivers more records than the previous read. ie. device gives one record, sets the _MORE flag, then gives 5. Not an urgent fix since this behavior has not been seen in the wild, but worth tracking as a fix. Cc: Kwangjin Ko <kwangjin.ko@sk.com> Cc: Alison Schofield <alison.schofield@intel.com> Fixes: ed83f7ca398b ("cxl/mbox: Add GET_POISON_LIST mailbox command") Link: http://lore.kernel.org/r/20240402081404.1106-2-kwangjin.ko@sk.com [1] Link: http://lore.kernel.org/r/ZhAhAL/GOaWFrauw@aschofie-mobl2 [2] Signed-off-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/171235441633.2716581.12330082428680958635.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-08cxl: Add checks to access_coordinate calculation to fail missing dataDave Jiang1-1/+18
Jonathan noted that the coordinates for host bridge and switches can be 0s if no actual data are retrieved, and the calculation continues anyway. The resulting number would be inaccurate. Add checks to ensure that the calculation completes only if the numbers are valid. While not seen in the wild, the issue may show up with a BIOS that reports CXL root ports via Generic Ports (via a PCI handle in the SRAT entry). Fixes: 14a6960b3e92 ("cxl: Add helper function that calculate performance data for downstream ports") Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-6-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
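A sketch of the validity check described above, in standalone C. The struct is a simplified stand-in for an access coordinate, and the combination rule used here (latencies add along the path, the slowest hop bounds the bandwidth) as well as the -EINVAL return value are assumptions of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for an access coordinate; fields are assumed. */
struct sketch_coord {
	unsigned int bandwidth;  /* larger is better */
	unsigned int latency;    /* nanoseconds */
};

/* A hop with a zero field is treated as "no data was retrieved". */
static int coord_valid(const struct sketch_coord *c)
{
	return c->bandwidth != 0 && c->latency != 0;
}

/*
 * Fold the per-hop coordinates of a path into one result. Fails instead of
 * computing with a hop that never had data filled in, and leaves *out
 * untouched on failure.
 */
static int combine_path(const struct sketch_coord *hops, size_t n,
			struct sketch_coord *out)
{
	struct sketch_coord acc = { .bandwidth = ~0u, .latency = 0 };

	assert(out != NULL);
	if (n == 0)
		return -EINVAL;
	for (size_t i = 0; i < n; i++) {
		if (!coord_valid(&hops[i]))
			return -EINVAL;
		if (hops[i].bandwidth < acc.bandwidth)
			acc.bandwidth = hops[i].bandwidth;
		acc.latency += hops[i].latency;
	}
	*out = acc;
	return 0;
}
```

Without the validity check, a single all-zero hop would drag the combined bandwidth to 0 while the function still reported success.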
2024-04-08cxl: Consolidate dport access_coordinate ->hb_coord and ->sw_coord into ->coordDave Jiang2-38/+84
The driver stores access_coordinate for host bridge in ->hb_coord and switch CDAT access_coordinate in ->sw_coord. Since neither of these access_coordinate clobber each other, the variable name can be consolidated into ->coord to simplify the code. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-5-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-08cxl: Fix incorrect region perf data calculationDave Jiang2-86/+44
Current math in cxl_region_perf_data_calculate divides the latency by 1000 every time the function gets called. This causes the region latency to be divided by 1000 per memory device and the math is incorrect. This is user visible as the latency access_coordinate exposed via sysfs will show incorrect latency data. Normalize values from CDAT to nanoseconds. Adjust sub-nanosecond latency to at least 1. Remove adjustment of perf numbers from the generic target since hmat handling code has already normalized those numbers. Now all computation and stored numbers should be in nanoseconds. cxl_hb_get_perf_coordinates() is removed and HB coords are calculated in the port access_coordinate calculation path since they no longer need to be treated specially. Fixes: 3d9f4a197230 ("cxl/region: Calculate performance data for a region") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-4-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
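The normalize-once approach described above can be illustrated as follows. This sketch assumes the raw input is in picoseconds, rounds up so that any non-zero sub-nanosecond value becomes 1, and folds device values into a region by taking the worst case; the input unit, the rounding choice, and the worst-case rule are all assumptions of the example, and the names are hypothetical.

```c
#include <assert.h>

/*
 * Convert a raw latency (assumed picoseconds) to nanoseconds, rounding up
 * so a non-zero sub-nanosecond value is reported as 1. Zero stays zero.
 */
static unsigned int latency_ps_to_ns(unsigned long long ps)
{
	unsigned long long ns = ps / 1000 + (ps % 1000 != 0);

	assert(ns <= ~0u);
	return (unsigned int)ns;
}

struct sketch_region_perf {
	unsigned int latency_ns;  /* always stored in nanoseconds */
};

/* Normalize the device value exactly once, then fold it into the region. */
static void region_fold_latency(struct sketch_region_perf *r,
				unsigned long long dev_latency_ps)
{
	unsigned int ns = latency_ps_to_ns(dev_latency_ps);

	if (ns > r->latency_ns)
		r->latency_ns = ns;
}
```

Because the stored region value is already in nanoseconds, adding another device never rescales what was accumulated earlier.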
2024-04-08cxl: Fix retrieving of access_coordinates in PCIe pathDave Jiang1-13/+22
Current loop in cxl_endpoint_get_perf_coordinates() incorrectly assumes the Root Port (RP) dport is the one with generic port access_coordinate. However those coordinates are one level up in the Host Bridge (HB). Current code causes the computation code to pick up 0s as the coordinates and cause minimal bandwidth to result in 0. Add check to skip RP when combining coordinates. Fixes: 14a6960b3e92 ("cxl: Add helper function that calculate performance data for downstream ports") Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-3-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-05cxl: Remove checking of iter in cxl_endpoint_get_perf_coordinates()Dave Jiang1-1/+1
The while() loop in cxl_endpoint_get_perf_coordinates() checks to see if 'iter' is valid as part of the condition breaking out of the loop. is_cxl_root() will stop the loop before the next iteration could go NULL. Remove the iter check. The presence of the iter or removing the iter does not impact the behavior of the code. This is a code clean up and not a bug fix. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240403154844.3403859-2-dave.jiang@intel.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-04-04cxl/core: Fix initialization of mbox_cmd.size_out in get eventKwangjin Ko1-1/+2
Since mbox_cmd.size_out is overwritten with the actual output size in the function below, it needs to be initialized every time.
cxl_internal_send_cmd -> __cxl_pci_mbox_send_cmd
Problem scenario:
1) The size_out variable is initially set to the size of the mailbox.
2) Read an event.
   - size_out is set to 160 bytes (header 32B + one event 128B).
   - Two events are created while reading.
3) Read the new *two* events.
   - size_out is still set to 160 bytes.
   - Although the value of out_len is 288 bytes, only 160 bytes are copied from the mailbox register to the local variable.
   - record_count is set to 2.
   - Accessing records[1] will result in reading incorrect data.
Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load") Tested-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Kwangjin Ko <kwangjin.ko@sk.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
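The problem scenario above can be reproduced with a toy model in userspace C. The command struct and the fake transport are hypothetical stand-ins; the only behavior carried over from the description is that the send routine overwrites size_out with the number of bytes it actually copied.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MBOX_SIZE 1024u

struct sketch_mbox_cmd {
	void *payload_out;
	size_t size_out;      /* in: buffer capacity, out: bytes actually copied */
};

/* Fake transport: copies min(capacity, available) and reports it in size_out. */
static void sketch_send_cmd(struct sketch_mbox_cmd *cmd,
			    const unsigned char *dev_data, size_t dev_len)
{
	size_t n = dev_len < cmd->size_out ? dev_len : cmd->size_out;

	memcpy(cmd->payload_out, dev_data, n);
	cmd->size_out = n;
}

/*
 * Issue one read per entry in reply_lens and return the byte count copied by
 * the final read. With reinit set, the command is rebuilt before every
 * submission; without it, the shrunken size_out from the previous read is
 * reused as the capacity.
 */
static size_t sketch_read_events(const size_t *reply_lens, size_t nr_reads,
				 int reinit)
{
	static const unsigned char dev_data[SKETCH_MBOX_SIZE];
	unsigned char buf[SKETCH_MBOX_SIZE];
	struct sketch_mbox_cmd cmd = {
		.payload_out = buf, .size_out = sizeof(buf),
	};
	size_t last = 0;

	for (size_t i = 0; i < nr_reads; i++) {
		assert(reply_lens[i] <= SKETCH_MBOX_SIZE);
		if (reinit)
			cmd = (struct sketch_mbox_cmd){
				.payload_out = buf, .size_out = sizeof(buf),
			};
		sketch_send_cmd(&cmd, dev_data, reply_lens[i]);
		last = cmd.size_out;
	}
	return last;
}
```

With replies of 160 and then 288 bytes, the reused command copies only 160 bytes on the second read, while the rebuilt command copies all 288.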
2024-03-26cxl/core/regs: Fix usage of map->reg_type in cxl_decode_regblock() before ↵Dave Jiang1-2/+3
assigned
In the error path, map->reg_type is being used for a kernel warning before its value is set up. Found by code inspection. The user-visible exposure is a wrong reg_type being emitted via the kernel log. Use a local var for reg_type and retrieve the value for usage. Fixes: 6c7f4f1e51c2 ("cxl/core/regs: Make cxl_map_{component, device}_regs() device generic") Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-03-26cxl/mem: Fix for the index of Clear Event Record HandleYuquan Wang1-1/+1
The dev_dbg info for Clear Event Records mailbox command would report the handle of the next record to clear not the current one. This was because the index 'i' had incremented before printing the current handle value. Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load") Signed-off-by: Yuquan Wang <wangyuquan1236@phytium.com.cn> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2024-03-19Merge tag 'trace-v6.9-2' of ↵Linus Torvalds1-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing updates from Steven Rostedt: "Main user visible change: - User events can now have "multi formats" The current user events have a single format. If another event is created with a different format, it will fail to be created. That is, once an event name is used, it cannot be used again with a different format. This can cause issues if a library is using an event and updates its format. An application using the older format will prevent an application using the new library from registering its event. A task could also DOS another application if it knows the event names, and it creates events with different formats. The multi-format event is in a different name space from the single format. Both the event name and its format are the unique identifier. This will allow two different applications to use the same user event name but with different payloads. - Added support to have ftrace_dump_on_oops dump out instances and not just the main top level tracing buffer. Other changes: - Add eventfs_root_inode Only the root inode has a dentry that is static (never goes away) and stores it upon creation. There's no reason that the thousands of other eventfs inodes should have a pointer that never gets set in its descriptor. Create an eventfs_root_inode descriptor that has an eventfs_inode descriptor and a dentry pointer, and only the root inode will use this. - Added WARN_ON()s in eventfs There's some conditionals remaining in eventfs that should never be hit, but instead of removing them, add WARN_ON() around them to make sure that they are never hit. - Have saved_cmdlines allocation also include the map_cmdline_to_pid array The saved_cmdlines structure allocates a large amount of data to hold its mappings. Within it, it has three arrays. Two are already a part of it: map_pid_to_cmdline[] and saved_cmdlines[]. More memory can be saved by also including the map_cmdline_to_pid[] array as well. 
- Restructure __string() and __assign_str() macros used in TRACE_EVENT() Dynamic strings in TRACE_EVENT() are declared with: __string(name, source) And assigned with: __assign_str(name, source) In the tracepoint callback of the event, the __string() is used to get the size needed to allocate on the ring buffer and __assign_str() is used to copy the string into the ring buffer. There's a helper structure that is created in the TRACE_EVENT() macro logic that will hold the string length and its position in the ring buffer which is created by __string(). There are several trace events that have a function to create the string to save. This function is executed twice. Once for __string() and again for __assign_str(). There's no reason for this. The helper structure could also save the string it used in __string() and simply copy that into __assign_str() (it also already has its length). By using the structure to store the source string for the assignment, it means that the second argument to __assign_str() is no longer needed. It will be removed in the next merge window, but for now add a warning if the source string given to __string() is different than the source string given to __assign_str(), as the source to __assign_str() isn't even used and will be going away. - Added checks to make sure that the source of __string() is also the source of __assign_str() so that it can be safely removed in the next merge window. Included fixes that the above check found. 
- Other minor clean ups and fixes" * tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits) tracing: Add __string_src() helper to help compilers not to get confused tracing: Use strcmp() in __assign_str() WARN_ON() check tracepoints: Use WARN() and not WARN_ON() for warnings tracing: Use div64_u64() instead of do_div() tracing: Support to dump instance traces by ftrace_dump_on_oops tracing: Remove second parameter to __assign_rel_str() tracing: Add warning if string in __assign_str() does not match __string() tracing: Add __string_len() example tracing: Remove __assign_str_len() ftrace: Fix most kernel-doc warnings tracing: Decrement the snapshot if the snapshot trigger fails to register tracing: Fix snapshot counter going between two tracers that use it tracing: Use EVENT_NULL_STR macro instead of open coding "(null)" tracing: Use ? : shortcut in trace macros tracing: Do not calculate strlen() twice for __string() fields tracing: Rework __assign_str() and __string() to not duplicate getting the string cxl/trace: Properly initialize cxl_poison region name net: hns3: tracing: fix hclgevf trace event strings drm/i915: Add missing ; to __assign_str() macros in tracepoint code NFSD: Fix nfsd_clid_class use of __string_len() macro ...
2024-03-18cxl/trace: Properly initialize cxl_poison region nameAlison Schofield1-7/+7
The TP_STRUCT__entry that gets assigned the region name, or an empty string if no region is present, is erroneously initialized to the cxl_region pointer. It needs to be properly initialized otherwise its length is wrong and garbage chars can appear in the kernel trace output: /sys/kernel/tracing/trace The bad initialization was due in part to a naming conflict with the parameter: struct cxl_region *region. The field 'region' is already exposed externally as the region name, so changing that to something logical, like 'region_name' is not an option. Instead rename the internal only struct cxl_region to the commonly used 'cxlr'. Impact is that tooling depending on that trace data can miss picking up a valid event when searching by region name. The TP_printk() output, if enabled, does emit the correct region names in the dmesg log. This was found during testing of the cxl-list option to report media-errors for a region. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: stable@vger.kernel.org Fixes: ddf49d57b841 ("cxl/trace: Add TRACE support for CXL media-error records") Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-03-13Merge branch 'for-6.9/cxl-fixes' into for-6.9/cxlDan Williams1-15/+15
Pick up a parsing fix for the CDAT SSLBIS structure for v6.9.
2024-03-13Merge branch 'for-6.9/cxl-einj' into for-6.9/cxlDan Williams1-0/+41
Pick up support for injecting errors via ACPI EINJ into the CXL protocol for v6.9.