summaryrefslogtreecommitdiff
path: root/drivers/s390/crypto
AgeCommit message (Collapse)AuthorFilesLines
2024-03-13s390/zcrypt: fix reference counting on zcrypt card objectsHarald Freudenberger1-0/+2
Tests with hot-plugging crytpo cards on KVM guests with debug kernel build revealed an use after free for the load field of the struct zcrypt_card. The reason was an incorrect reference handling of the zcrypt card object which could lead to a free of the zcrypt card object while it was still in use. This is an example of the slab message: kernel: 0x00000000885a7512-0x00000000885a7513 @offset=1298. First byte 0x68 instead of 0x6b kernel: Allocated in zcrypt_card_alloc+0x36/0x70 [zcrypt] age=18046 cpu=3 pid=43 kernel: kmalloc_trace+0x3f2/0x470 kernel: zcrypt_card_alloc+0x36/0x70 [zcrypt] kernel: zcrypt_cex4_card_probe+0x26/0x380 [zcrypt_cex4] kernel: ap_device_probe+0x15c/0x290 kernel: really_probe+0xd2/0x468 kernel: driver_probe_device+0x40/0xf0 kernel: __device_attach_driver+0xc0/0x140 kernel: bus_for_each_drv+0x8c/0xd0 kernel: __device_attach+0x114/0x198 kernel: bus_probe_device+0xb4/0xc8 kernel: device_add+0x4d2/0x6e0 kernel: ap_scan_adapter+0x3d0/0x7c0 kernel: ap_scan_bus+0x5a/0x3b0 kernel: ap_scan_bus_wq_callback+0x40/0x60 kernel: process_one_work+0x26e/0x620 kernel: worker_thread+0x21c/0x440 kernel: Freed in zcrypt_card_put+0x54/0x80 [zcrypt] age=9024 cpu=3 pid=43 kernel: kfree+0x37e/0x418 kernel: zcrypt_card_put+0x54/0x80 [zcrypt] kernel: ap_device_remove+0x4c/0xe0 kernel: device_release_driver_internal+0x1c4/0x270 kernel: bus_remove_device+0x100/0x188 kernel: device_del+0x164/0x3c0 kernel: device_unregister+0x30/0x90 kernel: ap_scan_adapter+0xc8/0x7c0 kernel: ap_scan_bus+0x5a/0x3b0 kernel: ap_scan_bus_wq_callback+0x40/0x60 kernel: process_one_work+0x26e/0x620 kernel: worker_thread+0x21c/0x440 kernel: kthread+0x150/0x168 kernel: __ret_from_fork+0x3c/0x58 kernel: ret_from_fork+0xa/0x30 kernel: Slab 0x00000372022169c0 objects=20 used=18 fp=0x00000000885a7c88 flags=0x3ffff00000000a00(workingset|slab|node=0|zone=1|lastcpupid=0x1ffff) kernel: Object 0x00000000885a74b8 @offset=1208 fp=0x00000000885a7c88 kernel: Redzone 00000000885a74b0: bb bb bb bb bb bb bb bb ........ kernel: Object 00000000885a74b8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74c8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74d8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74e8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a74f8: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b kkkkkkkkkkkkkkkk kernel: Object 00000000885a7508: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 68 4b 6b 6b 6b a5 kkkkkkkkkkhKkkk. kernel: Redzone 00000000885a7518: bb bb bb bb bb bb bb bb ........ kernel: Padding 00000000885a756c: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZ kernel: CPU: 0 PID: 387 Comm: systemd-udevd Not tainted 6.8.0-HF #2 kernel: Hardware name: IBM 3931 A01 704 (KVM/Linux) kernel: Call Trace: kernel: [<00000000ca5ab5b8>] dump_stack_lvl+0x90/0x120 kernel: [<00000000c99d78bc>] check_bytes_and_report+0x114/0x140 kernel: [<00000000c99d53cc>] check_object+0x334/0x3f8 kernel: [<00000000c99d820c>] alloc_debug_processing+0xc4/0x1f8 kernel: [<00000000c99d852e>] get_partial_node.part.0+0x1ee/0x3e0 kernel: [<00000000c99d94ec>] ___slab_alloc+0xaf4/0x13c8 kernel: [<00000000c99d9e38>] __slab_alloc.constprop.0+0x78/0xb8 kernel: [<00000000c99dc8dc>] __kmalloc+0x434/0x590 kernel: [<00000000c9b4c0ce>] ext4_htree_store_dirent+0x4e/0x1c0 kernel: [<00000000c9b908a2>] htree_dirblock_to_tree+0x17a/0x3f0 kernel: [<00000000c9b919dc>] ext4_htree_fill_tree+0x134/0x400 kernel: [<00000000c9b4b3d0>] ext4_dx_readdir+0x160/0x2f0 kernel: [<00000000c9b4bedc>] ext4_readdir+0x5f4/0x760 kernel: [<00000000c9a7efc4>] iterate_dir+0xb4/0x280 kernel: [<00000000c9a7f1ea>] __do_sys_getdents64+0x5a/0x120 kernel: [<00000000ca5d6946>] __do_syscall+0x256/0x310 kernel: [<00000000ca5eea10>] system_call+0x70/0x98 kernel: INFO: lockdep is turned off. kernel: FIX kmalloc-96: Restoring Poison 0x00000000885a7512-0x00000000885a7513=0x6b kernel: FIX kmalloc-96: Marking all objects used The fix is simple: Before use of the queue not only the queue object but also the card object needs to increase it's reference count with a call to zcrypt_card_get(). Similar after use of the queue not only the queue but also the card object's reference count is decreased with zcrypt_card_put(). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/pkey: improve pkey retry behaviorHarald Freudenberger1-18/+21
This patch reworks and improves the pkey retry behavior for the pkey_ep11key2pkey() function. In contrast to the pkey_skey2pkey() function which is used to trigger a protected key derivation from an CCA secure data or cipher key the EP11 counterpart function had no proper retry loop implemented. This patch now introduces code which acts similar to the retry already done for CCA keys for this function used for EP11 keys. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/zcrypt: improve zcrypt retry behaviorHarald Freudenberger3-80/+58
This patch reworks and improves the zcrypt retry behavior: - The zcrypt_rescan_req counter has been removed. This counter variable has been increased on some transport errors and was used as a gatekeeper for AP bus rescans. - Rework of the zcrypt_process_rescan() function to not use the above counter variable any more. Instead now always the ap_bus_force_rescan() function is called (as this has been improved with a previous patch). - As the zcrpyt_process_rescan() function is called in all cprb send functions in case of the first attempt to send failed with ENODEV now before the next attempt to send an cprb is started. - Introduce a define ZCRYPT_WAIT_BINDINGS_COMPLETE_MS for the amount of milliseconds to have the zcrypt API wait for AP bindings complete. This amount has been reduced to 30s (was 60s). Some playing around showed that 30s is a really fair limit. The result of the above together with the patches to improve the AP scan bus functions is that after the first loop of cprb send retries when the result is a ENODEV the AP bus scan is always triggered (synchronous). If the AP bus scan detects changes in the configuration, all the send functions now retry when the first attempt was failing with ENODEV in the hope that now a suitable device has appeared. About concurrency: The ap_bus_force_rescan() uses a mutex to ensure only one active AP bus scan is running. Another caller of this function is blocked as long as the scan is running but does not cause yet another scan. Instead the result of the 'other' scan is used. This affects only tasks which run into an initial ENODEV. Tasks with successful delivery of cprbs will never invoke the bus scan and thus never get blocked by the mutex. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/zcrypt: introduce retries on in-kernel send CPRB functionsHarald Freudenberger1-2/+40
The both functions zcrypt_send_cprb() and zcrypt_send_ep11_cprb() are used to send CPRBs in-kernel from different sources. For example the pkey module may call one of the functions in zcrypt_ep11misc.c to trigger a derive of a protected key from a secure key blob via an existing crypto card. These both functions are then the internal API to send the CPRB and receive the response. All the ioctl functions to send an CPRB down to the addressed crypto card use some kind of retry mechanism. When the first attempt fails with ENODEV, a bus rescan is triggered and a loop with retries is carried out. For the both named internal functions there was never any retry attempt made. This patch now introduces the retry code even for this both internal functions to have effectively same behavior on sending an CPRB from an in-kernel source and sending an CPRB from userspace via ioctl. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/ap: introduce mutex to lock the AP bus scanHarald Freudenberger2-11/+58
Rework the invocations around ap_scan_bus(): - Protect ap_scan_bus() with a mutex to make sure only one scan at a time is running. - The workqueue invocation which is triggered by either the module init or via AP bus scan timer expiration uses this mutex and if there is already a scan running, the work is simple aborted (as the job is done by another task). - The ap_bus_force_rescan() which is invoked by higher level layers mostly on failures which indicate a bus scan may help is reworked to call ap_scan_bus() direct instead of enqueuing work into a system workqueue and waiting for that to finish. Of course the mutex is respected and in case of another task already running a bus scan the shortcut of waiting for this scan to finish and reusing the scan result is taken. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/ap: rework ap_scan_bus() to return true on config changeHarald Freudenberger1-7/+20
The AP scan bus function now returns true if there have been any config changes detected. This will become important in a follow up patch which will exploit this hint for further actions. This also required to have the AP scan bus timer callback reworked as the function signature has changed to bool ap_scan_bus(void). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/ap: clarify AP scan bus related functions and variablesHarald Freudenberger1-19/+24
This patch tries to clarify the functions and variables around the AP scan bus job. All these variables and functions start with ap_scan_bus and are declared in one place now. No functional changes in this patch - only renaming and move of code or declarations. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/ap: rearm APQNs bindings complete completionHarald Freudenberger3-21/+80
The APQN bindings complete completion was used to reflect that 1st the AP bus initial scan is done and 2nd all the detected APQNs have been bound to a device driver. This was a single-shot action. However, as the AP bus supports hot-plug it may be that new APQNs appear reflected as new AP queue and card devices which need to be bound to appropriate device drivers. So the condition that all existing AP queue devices are bound to device drivers may go away for a certain time. This patch now checks during AP bus scan for maybe new AP devices appearing and does a re-init of the internal completion variable. So the AP bus function ap_wait_apqn_bindings_complete() now may block on this condition variable even later after initial scan is through when new APQNs appear which need to get bound. This patch also moves the check for binding complete invocation from the probe function to the end of the AP bus scan function. This change also covers some weird scenarios where during a card hotplug the binding of the card device was sufficient for binding complete but the queue devices where still in the process of being discovered. As of now this change has no impact on existing code. The behavior change in the now later bindings complete should not impact any code (and has been tested so far). The only exploiter is the zcrypt function zcrypt_wait_api_operational() which only initial calls ap_wait_apqn_bindings_complete(). However, this new behavior of the AP bus wait for APQNs bindings complete function will be used in a later patch exploiting this for the zcrypt API layer. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-03-07s390/vfio-ap: handle hardware checkstop state on queue reset operationJason J. Herne1-17/+18
Update vfio_ap_mdev_reset_queue() to handle an unexpected checkstop (hardware error) the same as the deconfigured case. This prevents unexpected and unhelpful warnings in the event of a hardware error. We also stop lying about a queue's reset response code. This was originally done so we could force vfio_ap_mdev_filter_matrix to pass a deconfigured device through to the guest for the hotplug scenario. vfio_ap_mdev_filter_matrix is instead modified to allow passthrough for all queues with reset state normal, deconfigured, or checkstopped. In the checkstopped case we choose to pass the device through and let the error state be reflected at the guest level. Signed-off-by: "Jason J. Herne" <jjherne@linux.ibm.com> Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com> Link: https://lore.kernel.org/r/20240215153144.14747-1-jjherne@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-20s390/ap: explicitly include ultravisor headerHolger Dengler1-0/+1
The ap_bus is using inline functions of the ultravisor (uv) in-kernel API. The related header file is implicitly included via several other headers. Replace this by an explicit include of the ultravisor header in the ap_bus file. Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16s390/zcrypt: add debug possibility for CCA and EP11 messagesHarald Freudenberger1-0/+12
This patch introduces dynamic debug hexdump invocation possibilities to be able to: - dump an CCA or EP11 CPRB request as early as possible when received via ioctl from userspace but after the ap message has been collected together. - dump an CCA or EP11 CPRB reply short before it is transferred via ioctl into userspace. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16s390/ap: add debug possibility for AP messagesHarald Freudenberger1-0/+4
This patch introduces two dynamic debug hexdump invocation possibilities to be able to a) dump an AP message immediately before it goes into the firmware queue and b) dump a fresh from the firmware queue received AP message. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16s390/pkey: introduce dynamic debugging for pkeyHarald Freudenberger1-24/+23
This patch replaces all the s390 debug feature calls with debug level by dynamic debug calls pr_debug. These calls are much more flexible and each single invocation can get enabled/disabled at runtime wheres the s390 debug feature debug calls have only one knob - enable or disable all in one bunch. This patch follows a similar change for the AP bus and zcrypt device driver code. All this code uses dynamic debugging with pr_debug and friends for emitting debug traces now. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16s390/pkey: harmonize pkey s390 debug feature callsHarald Freudenberger1-91/+97
Cleanup and harmonize the s390 debug feature calls and defines for the pkey module to be similar to the debug feature as it is used in the zcrypt device driver and AP bus. More or less only renaming but no functional changes. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16s390/zcrypt: introduce dynamic debugging for AP and zcrypt codeHarald Freudenberger7-85/+89
This patch replaces all the s390 debug feature calls with debug level by dynamic debug calls pr_debug. These calls are much more flexible and each single invocation can get enabled/disabled at runtime wheres the s390 debug feature debug calls have only one knob - enable or disable all in one bunch. The benefit is especially significant with high frequency called functions like the AP bus scan. In most debugging scenarios you don't want and need them, but sometimes it is crucial to know exactly when and how long the AP bus scan took. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-16s390/zcrypt: harmonize debug feature calls and definesHarald Freudenberger6-193/+156
This patch harmonizes the calls and defines around the s390 debug feature as it is used in the AP bus and zcrypt device driver code. More or less cleanup and renaming, no functional changes. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/vfio-ap: make matrix_bus constRicardo B. Marliere1-1/+1
Now that the driver core can properly handle constant struct bus_type, move the matrix_bus variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Acked-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Anthony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: "Jason J. Herne" <jjherne@linux.ibm.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-s390-v1-6-ac891afc7282@marliere.net Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/ap: make ap_bus_type constRicardo B. Marliere1-2/+2
Now that the driver core can properly handle constant struct bus_type, move the ap_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-s390-v1-5-ac891afc7282@marliere.net Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-01-17s390/vfio-ap: do not reset queue removed from host configTony Krowiak1-4/+12
When a queue is unbound from the vfio_ap device driver, it is reset to ensure its crypto data is not leaked when it is bound to another device driver. If the queue is unbound due to the fact that the adapter or domain was removed from the host's AP configuration, then attempting to reset it will fail with response code 01 (APID not valid) getting returned from the reset command. Let's ensure that the queue is assigned to the host's configuration before resetting it. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: "Jason J. Herne" <jjherne@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Fixes: eeb386aeb5b7 ("s390/vfio-ap: handle config changed and scan complete notification") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-7-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-01-17s390/vfio-ap: reset queues associated with adapter for queue unbound from driverTony Krowiak1-35/+41
When a queue is unbound from the vfio_ap device driver, if that queue is assigned to a guest's AP configuration, its associated adapter is removed because queues are defined to a guest via a matrix of adapters and domains; so, it is not possible to remove a single queue. If an adapter is removed from the guest's AP configuration, all associated queues must be reset to prevent leaking crypto data should any of them be assigned to a different guest or device driver. The one caveat is that if the queue is being removed because the adapter or domain has been removed from the host's AP configuration, then an attempt to reset the queue will fail with response code 01, AP-queue number not valid; so resetting these queues should be skipped. Acked-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Fixes: 09d31ff78793 ("s390/vfio-ap: hot plug/unplug of AP devices when probed/removed") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-6-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-01-17s390/vfio-ap: reset queues filtered from the guest's AP configTony Krowiak2-45/+129
When filtering the adapters from the configuration profile for a guest to create or update a guest's AP configuration, if the APID of an adapter and the APQI of a domain identify a queue device that is not bound to the vfio_ap device driver, the APID of the adapter will be filtered because an individual APQN can not be filtered due to the fact the APQNs are assigned to an AP configuration as a matrix of APIDs and APQIs. Consequently, a guest will not have access to all of the queues associated with the filtered adapter. If the queues are subsequently made available again to the guest, they should re-appear in a reset state; so, let's make sure all queues associated with an adapter unplugged from the guest are reset. In order to identify the set of queues that need to be reset, let's allow a vfio_ap_queue object to be simultaneously stored in both a hashtable and a list: A hashtable used to store all of the queues assigned to a matrix mdev; and/or, a list used to store a subset of the queues that need to be reset. For example, when an adapter is hot unplugged from a guest, all guest queues associated with that adapter must be reset. Since that may be a subset of those assigned to the matrix mdev, they can be stored in a list that can be passed to the vfio_ap_mdev_reset_queues function. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-5-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-01-17s390/vfio-ap: let on_scan_complete() callback filter matrix and update ↵Tony Krowiak1-0/+13
guest's APCB When adapters and/or domains are added to the host's AP configuration, this may result in multiple queue devices getting created and probed by the vfio_ap device driver. For each queue device probed, the matrix of adapters and domains assigned to a matrix mdev will be filtered to update the guest's APCB. If any adapters or domains get added to or removed from the APCB, the guest's AP configuration will be dynamically updated (i.e., hot plug/unplug). To dynamically update the guest's configuration, its VCPUs must be taken out of SIE for the period of time it takes to make the update. This is disruptive to the guest's operation and if there are many queues probed due to a change in the host's AP configuration, this could be troublesome. The problem is exacerbated by the fact that the 'on_scan_complete' callback also filters the mdev's matrix and updates the guest's AP configuration. In order to reduce the potential amount of disruption to the guest that may result from a change to the host's AP configuration, let's bypass the filtering of the matrix and updating of the guest's AP configuration in the probe callback - if due to a host config change - and defer it until the 'on_scan_complete' callback is invoked after the AP bus finishes its device scan operation. This way the filtering and updating will be performed only once regardless of the number of queues added. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-4-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-01-17s390/vfio-ap: loop over the shadow APCB when filtering guest's AP configurationTony Krowiak1-2/+3
While filtering the mdev matrix, it doesn't make sense - and will have unexpected results - to filter an APID from the matrix if the APID or one of the associated APQIs is not in the host's AP configuration. There are two reasons for this: 1. An adapter or domain that is not in the host's AP configuration can be assigned to the matrix; this is known as over-provisioning. Queue devices, however, are only created for adapters and domains in the host's AP configuration, so there will be no queues associated with an over-provisioned adapter or domain to filter. 2. The adapter or domain may have been externally removed from the host's configuration via an SE or HMC attached to a DPM enabled LPAR. In this case, the vfio_ap device driver would have been notified by the AP bus via the on_config_changed callback and the adapter or domain would have already been filtered. Since the matrix_mdev->shadow_apcb.apm and matrix_mdev->shadow_apcb.aqm are copied from the mdev matrix sans the APIDs and APQIs not in the host's AP configuration, let's loop over those bitmaps instead of those assigned to the matrix. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-3-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-01-17s390/vfio-ap: always filter entire AP matrixTony Krowiak1-40/+17
The vfio_ap_mdev_filter_matrix function is called whenever a new adapter or domain is assigned to the mdev. The purpose of the function is to update the guest's AP configuration by filtering the matrix of adapters and domains assigned to the mdev. When an adapter or domain is assigned, only the APQNs associated with the APID of the new adapter or APQI of the new domain are inspected. If an APQN does not reference a queue device bound to the vfio_ap device driver, then it's APID will be filtered from the mdev's matrix when updating the guest's AP configuration. Inspecting only the APID of the new adapter or APQI of the new domain will result in passing AP queues through to a guest that are not bound to the vfio_ap device driver under certain circumstances. Consider the following: guest's AP configuration (all also assigned to the mdev's matrix): 14.0004 14.0005 14.0006 16.0004 16.0005 16.0006 unassign domain 4 unbind queue 16.0005 assign domain 4 When domain 4 is re-assigned, since only domain 4 will be inspected, the APQNs that will be examined will be: 14.0004 16.0004 Since both of those APQNs reference queue devices that are bound to the vfio_ap device driver, nothing will get filtered from the mdev's matrix when updating the guest's AP configuration. Consequently, queue 16.0005 will get passed through despite not being bound to the driver. This violates the linux device model requirement that a guest shall only be given access to devices bound to the device driver facilitating their pass-through. To resolve this problem, every adapter and domain assigned to the mdev will be inspected when filtering the mdev's matrix. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Fixes: 48cae940c31d ("s390/vfio-ap: refresh guest's APCB by filtering AP resources assigned to mdev") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115185441.31526-2-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-01-11Merge tag 's390-6.8-1' of ↵Linus Torvalds7-161/+228
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Add machine variable capacity information to /proc/sysinfo. - Limit the waste of page tables and always align vmalloc area size and base address on segment boundary. - Fix a memory leak when an attempt to register interruption sub class (ISC) for the adjunct-processor (AP) guest failed. - Reset response code AP_RESPONSE_INVALID_GISA to understandable by guest AP_RESPONSE_INVALID_ADDRESS in response to a failed interruption sub class (ISC) registration attempt. - Improve reaction to adjunct-processor (AP) AP_RESPONSE_OTHERWISE_CHANGED response code when enabling interrupts on behalf of a guest. - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP) queue device bound to the vfio_ap device driver when the mediated device is attached to a guest, but the queue device is not passed through. - Rework struct ap_card to hold the whole adjunct-processor (AP) card hardware information. As result, all the ugly bit checks are replaced by simple evaluations of the required bit fields. - Improve handling of some weird scenarios between service element (SE) host and SE guest with adjunct-processor (AP) pass-through support. - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return the previous value of the to be changed control register. This is useful if a bit is only changed temporarily and the previous content needs to be restored. - The kernel starts with machine checks disabled and is expected to enable it once trap_init() is called. However the implementation allows machine checks early. Consistently enable it in trap_init() only. - local_mcck_disable() and local_mcck_enable() assume that machine checks are always enabled. Instead implement and use local_mcck_save() and local_mcck_restore() to disable machine checks and restore the previous state. - Modification of floating point control (FPC) register of a traced process using ptrace interface may lead to corruption of the FPC register of the tracing process. Fix this. - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point control (FPC) register in vCPU, but may lead to corruption of the FPC register of the host process. Fix this. - Use READ_ONCE() to read a vCPU floating point register value from the memory mapped area. This avoids that, depending on code generation, a different value is tested for validity than the one that is used. - Get rid of test_fp_ctl(), since it is quite subtle to use it correctly. Instead copy a new floating point control register value into its save area and test the validity of the new value when loading it. - Remove superfluous save_fpu_regs() call. - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines provide the vector facility since many years and the need to make the task structure size dependent on the vector facility does not exist. - Remove the "novx" kernel command line option, as the vector code runs without any problems since many years. - Add the vector facility to the z13 architecture level set (ALS). All hypervisors support the vector facility since many years. This allows compile time optimizations of the kernel. - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As result, the compiled code will have less runtime checks and less code. - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C. - Convert the struct subchannel spinlock from pointer to member. * tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits) Revert "s390: update defconfigs" s390/cio: make sch->lock spinlock pointer a member s390: update defconfigs s390/mm: convert pgste locking functions to C s390/fpu: get rid of MACHINE_HAS_VX s390/als: add vector facility to z13 architecture level set s390/fpu: remove "novx" option s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support KVM: s390: remove superfluous save_fpu_regs() call s390/fpu: get rid of test_fp_ctl() KVM: s390: use READ_ONCE() to read fpc register value KVM: s390: fix setting of fpc register s390/ptrace: handle setting of fpc register correctly s390/nmi: implement and use local_mcck_save() / local_mcck_restore() s390/nmi: consistently enable machine checks in trap_init() s390/ctlreg: return old register contents when changing bits s390/ap: handle outband SE bind state change s390/ap: store TAPQ hwinfo in struct ap_card s390/vfio-ap: fix sysfs status attribute for AP queue devices s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) command ...
2023-11-30s390/ap: handle outband SE bind state changeHarald Freudenberger3-50/+122
This patch addresses some weird scenarios where an outband manipulation of the SE bind state of a queue assigned and maybe in use by an SE guest with AP pass-through support took place. So for example when the guest has bound and associated a queue and then this domain has been zeroed on the service element. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30s390/ap: store TAPQ hwinfo in struct ap_cardHarald Freudenberger7-106/+83
As of now the AP card struct held only part of the queue's hwinfo (that is the GR2 register content returned with an TAPQ invocation). This patch reworks struct ap_card to hold the whole hwinfo now. As there is a nice bit field union on top of this ap_tapq_hwinfo struct, all the ugly bit checkings can now get replaced by simple evaluations of the required bit field. Suggested-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30s390/vfio-ap: fix sysfs status attribute for AP queue devicesTony Krowiak1-1/+15
The 'status' attribute for AP queue devices bound to the vfio_ap device driver displays incorrect status when the mediated device is attached to a guest, but the queue device is not passed through. In the current implementation, the status displayed is 'in_use' which is not correct; it should be 'assigned'. This can happen if one of the queue devices associated with a given adapter is not bound to the vfio_ap device driver. For example: Queues listed in /sys/bus/ap/drivers/vfio_ap: 14.0005 14.0006 14.000d 16.0006 16.000d Queues listed in /sys/devices/vfio_ap/matrix/$UUID/matrix 14.0005 14.0006 14.000d 16.0005 16.0006 16.000d Queues listed in /sys/devices/vfio_ap/matrix/$UUID/guest_matrix 14.0005 14.0006 14.000d The reason no queues for adapter 0x16 are listed in the guest_matrix is because queue 16.0005 is not bound to the vfio_ap device driver, so no queue associated with the adapter is passed through to the guest; therefore, each queue device for adapter 0x16 should display 'assigned' instead of 'in_use', because those queues are not in use by a guest, but only assigned to the mediated device. Let's check the AP configuration for the guest to determine whether a queue device is passed through before displaying a status of 'in_use'. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Harald Freudenberger <freude@linux.ibm.com> Link: https://lore.kernel.org/r/20231108201135.351419-1-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) commandTony Krowiak1-1/+4
Let's improve the vfio_ap driver's reaction to reception of response code 07 from the PQAP(AQIC) command when enabling interrupts on behalf of a guest: * Unregister the guest's ISC before the pages containing the notification indicator bytes are unpinned. * Capture the return code from the kvm_s390_gisc_unregister function and log a DBF warning if it fails. Suggested-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20231109164427.460493-4-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30s390/vfio-ap: set status response code to 06 on gisc registration failureAnthony Krowiak1-3/+3
The interception handler for the PQAP(AQIC) command calls the kvm_s390_gisc_register function to register the guest ISC with the channel subsystem. If that call fails, the status response code 08 - indicating Invalid ZONE/GISA designation - is returned to the guest. This response code is not valid because setting the ZONE/GISA values is the responsibility of the hypervisor controlling the guest and there is nothing that can be done from the guest perspective to correct that problem. The likelihood of GISC registration failure is nil and there is no status response code to indicate an invalid ISC value, so let's set the response code to 06 indicating 'Invalid address of AP-queue notification byte'. While this is not entirely accurate, it is better than setting a response code which makes no sense for the guest. Signed-off-by: Anthony Krowiak <akrowiak@linux.ibm.com> Suggested-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Link: https://lore.kernel.org/r/20231109164427.460493-3-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-30s390/vfio-ap: unpin pages on gisc registration failureAnthony Krowiak1-0/+1
In the vfio_ap_irq_enable function, after the page containing the notification indicator byte (NIB) is pinned, the function attempts to register the guest ISC. If registration fails, the function sets the status response code and returns without unpinning the page containing the NIB. In order to avoid a memory leak, the NIB should be unpinned before returning from the vfio_ap_irq_enable function. Co-developed-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Anthony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Fixes: 783f0a3ccd79 ("s390/vfio-ap: add s390dbf logging to the vfio_ap_irq_enable function") Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20231109164427.460493-2-akrowiak@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-28eventfd: simplify eventfd_signal()Christian Brauner1-1/+1
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-06s390/ap: fix vanishing crypto cards in SE environmentHarald Freudenberger1-23/+20
A secure execution (SE, also known as confidential computing) guest may see asynchronous errors on a crypto firmware queue. The current implementation to gather information about cards and queues in ap_queue_info() simple returns if an asynchronous error is hanging on the firmware queue. If such a situation happened and it was the only queue visible for a crypto card within an SE guest, then the card vanished from sysfs as the AP bus scan function refuses to hold a card without any type information. As lszcrypt evaluates the sysfs such a card vanished from the lszcrypt card listing and the user is baffled and has no way to reset and thus clear the pending asynchronous error. This patch improves the named function to also evaluate GR2 of the TAPQ in case of asynchronous error pending. If there is a not-null value stored in, the info is processed now. In the end, a queue with pending asynchronous error does not lead to a vanishing card any more. Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-06s390/zcrypt: don't report online if card or queue is in check-stop stateIngo Franzki2-4/+5
If a card or a queue is in check-stop state, it can't be used by applications to perform crypto operations. Applications check the 'online' sysfs attribute to check if a card or queue is usable. Report a card or queue as offline in case it is in check-stop state. Furthermore, don't allow to set a card or queue online, if it is in check-stop state. This is similar to when the card or the queue is in deconfigured state, there it is also reported as being offline, and it can't be set online. Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-06s390/ap: fix AP bus crash on early config change callback invocationHarald Freudenberger1-0/+4
Fix kernel crash in AP bus code caused by very early invocation of the config change callback function via SCLP. After a fresh IML of the machine the crypto cards are still offline and will get switched online only with activation of any LPAR which has the card in it's configuration. A crypto card coming online is reported to the LPAR via SCLP and the AP bus offers a callback function to get this kind of information. However, it may happen that the callback is invoked before the AP bus init function is complete. As the callback triggers a synchronous AP bus scan, the scan may already run but some internal states are not initialized by the AP bus init function resulting in a crash like this: [ 11.635859] Unable to handle kernel pointer dereference in virtual kernel address space [ 11.635861] Failing address: 0000000000000000 TEID: 0000000000000887 [ 11.635862] Fault in home space mode while using kernel ASCE. [ 11.635864] AS:00000000894c4007 R3:00000001fece8007 S:00000001fece7800 P:000000000000013d [ 11.635879] Oops: 0004 ilc:1 [#1] SMP [ 11.635882] Modules linked in: [ 11.635884] CPU: 5 PID: 42 Comm: kworker/5:0 Not tainted 6.6.0-rc3-00003-g4dbf7cdc6b42 #12 [ 11.635886] Hardware name: IBM 3931 A01 751 (LPAR) [ 11.635887] Workqueue: events_long ap_scan_bus [ 11.635891] Krnl PSW : 0704c00180000000 0000000000000000 (0x0) [ 11.635895] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 11.635897] Krnl GPRS: 0000000001000a00 0000000000000000 0000000000000006 0000000089591940 [ 11.635899] 0000000080000000 0000000000000a00 0000000000000000 0000000000000000 [ 11.635901] 0000000081870c00 0000000089591000 000000008834e4e2 0000000002625a00 [ 11.635903] 0000000081734200 0000038000913c18 000000008834c6d6 0000038000913ac8 [ 11.635906] Krnl Code:>0000000000000000: 0000 illegal [ 11.635906] 0000000000000002: 0000 illegal [ 11.635906] 0000000000000004: 0000 illegal [ 11.635906] 0000000000000006: 0000 illegal [ 11.635906] 0000000000000008: 0000 illegal [ 11.635906] 000000000000000a: 0000 illegal [ 11.635906] 000000000000000c: 0000 illegal [ 11.635906] 000000000000000e: 0000 illegal [ 11.635915] Call Trace: [ 11.635916] [<0000000000000000>] 0x0 [ 11.635918] [<000000008834e4e2>] ap_queue_init_state+0x82/0xb8 [ 11.635921] [<000000008834ba1c>] ap_scan_domains+0x6fc/0x740 [ 11.635923] [<000000008834c092>] ap_scan_adapter+0x632/0x8b0 [ 11.635925] [<000000008834c3e4>] ap_scan_bus+0xd4/0x288 [ 11.635927] [<00000000879a33ba>] process_one_work+0x19a/0x410 [ 11.635930] Discipline DIAG cannot be used without z/VM [ 11.635930] [<00000000879a3a2c>] worker_thread+0x3fc/0x560 [ 11.635933] [<00000000879aea60>] kthread+0x120/0x128 [ 11.635936] [<000000008792afa4>] __ret_from_fork+0x3c/0x58 [ 11.635938] [<00000000885ebe62>] ret_from_fork+0xa/0x30 [ 11.635942] Last Breaking-Event-Address: [ 11.635942] [<000000008834c6d4>] ap_wait+0xcc/0x148 This patch improves the ap_bus_force_rescan() function which is invoked by the config change callback by checking if a first initial AP bus scan has been done. If not, the force rescan request is simple ignored. Anyhow it does not make sense to trigger AP bus re-scans even before the very first bus scan is complete. Cc: stable@vger.kernel.org Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-06s390/ap: re-enable interrupt for AP queuesHarald Freudenberger1-2/+12
This patch introduces some code lines which check for interrupt support enabled on an AP queue after a reply has been received. This invocation has been chosen as there is a good chance to have the queue empty at that time. As the enablement of the irq imples a state machine change the queue should not have any pending requests or unreceived replies. Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-11-06s390/ap: rework to use irq info from ap queue statusHarald Freudenberger2-11/+12
This patch reworks the irq handling and reporting code for the AP queue interrupt handling to always use the irq info from the queue status. Until now the interrupt status of an AP queue was stored into a bool variable within the ap_queue struct. This variable was set on a successful interrupt enablement and cleared with kicking a reset. However, it may be that the interrupt state is manipulated outband for example by a hypervisor. This patch removes this variable and instead the irq bit from the AP queue status which is always reflecting the current irq state is used. Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16s390/ap: show APFS value on error reply 0x8BHarald Freudenberger1-2/+16
With Secure Execution the error reply RX 0x8B now carries an APFS value indicating why the request has been filtered by a lower layer of the firmware. So display this value as a warning line in the s390 debug feature trace. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16s390/zcrypt: introduce new internal AP queue se_bound attributeHarald Freudenberger3-6/+55
This patch introduces a new AP queue internal attribute se_bound which reflects the bound state of an APQN within a Secure Execution environment. With introduction of Secure Execution guests now an AP firmware queue needs to be bound to the guest before usage. This patch introduces a new internal attribute reflecting this bound state and some glue code to handle this new field during lifetime of an AP queue device. Together with that now the zcrypt scheduler considers the state of the AP queues when a message is about to be distributed among the existing queues. There is a new function ap_queue_usable() which returns true only when all conditions for using this AP queue device are fulfilled. In details this means: the AP queue needs to be configured, not checkstopped and within an SE environment it needs to be bound. So the new function gives and indication if the AP queue device is ready to serve requests or not. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16s390/ap: re-init AP queues on config onHarald Freudenberger3-13/+18
On a state toggle from config off to config on and on the state toggle from checkstop to not checkstop the queue's internal states was set but the state machine was not nudged. This did not care as on the first enqueue of a request the state machine kick ran. However, within an Secure Execution guest a queue is only chosen by the scheduler when it has been bound. But to bind a queue, it needs to run through the initial states (reset, enable interrupts, ...). So this is like a chicken-and-egg problem and the result was in fact that a queue was unusable after a config off/on toggle. With some slight rework of the handling of these states now the new function _ap_queue_init_state() is called which is the core of the ap_queue_init_state() function but without locking handling. This has the benefit that it can be called on all the places where a (re-)init of the AP queue's state machine is needed. Fixes: 2d72eaf036d2 ("s390/ap: implement SE AP bind, unbind and associate") Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Reviewed-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390/zcrypt: update list of EP11 operation modesIngo Franzki1-0/+4
Add additional operation mode strings into the EP11 operation mode table. These strings are returned by sysfs entries /sys/devices/ap/cardxx/op_modes and /sys/devices/ap/cardxx/xx.yyyy/op_modes. Signed-off-by: Ingo Franzki <ifranzki@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-07Merge tag 's390-6.6-2' of ↵Linus Torvalds1-7/+4
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - A couple of virtual vs physical address confusion fixes - Rework locking in dcssblk driver to address a lockdep warning - Remove support for "noexec" kernel command line option since there is no use case where it would make sense - Simplify kernel mapping setup and get rid of quite a bit of code - Add architecture specific __set_memory_yy() functions which allow us to modify kernel mappings. Unlike the set_memory_xx() variants they take void pointer start and end parameters, which allows using them without the usual casts, and also to use them on areas larger than 8TB. Note that the set_memory_xx() family comes with an int num_pages parameter which overflows with 8TB. This could be addressed by changing the num_pages parameter to unsigned long, however requires to change all architectures, since the module code expects an int parameter (see module_set_memory()). This was indeed an issue since for debug_pagealloc() we call set_memory_4k() on the whole identity mapping. Therefore address this for now with the __set_memory_yy() variant, and address common code later - Use dev_set_name() and also fix memory leak in zcrypt driver error handling - Remove unused lsi_mask from airq_struct - Add warning for invalid kernel mapping requests * tag 's390-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vmem: do not silently ignore mapping limit s390/zcrypt: utilize dev_set_name() ability to use a formatted string s390/zcrypt: don't leak memory if dev_set_name() fails s390/mm: fix MAX_DMA_ADDRESS physical vs virtual confusion s390/airq: remove lsi_mask from airq_struct s390/mm: use __set_memory() variants where useful s390/set_memory: add __set_memory() variant s390/set_memory: generate all set_memory() functions s390/mm: improve description of mapping permissions of prefix pages s390/amode31: change type of __samode31, __eamode31, etc s390/mm: simplify kernel mapping setup s390: remove "noexec" option s390/vmem: fix virtual vs physical address confusion s390/dcssblk: fix lockdep warning s390/monreader: fix virtual vs physical address confusion
2023-09-05s390/zcrypt: utilize dev_set_name() ability to use a formatted stringAndy Shevchenko1-7/+3
With the dev_set_name() prototype it's not obvious that it takes a formatted string as a parameter. Use its facility instead of duplicating the same with strncpy()/snprintf() calls. With this, also prevent return error code to be shadowed. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230831110000.24279-2-andriy.shevchenko@linux.intel.com Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-09-05s390/zcrypt: don't leak memory if dev_set_name() failsAndy Shevchenko1-0/+1
When dev_set_name() fails, zcdn_create() doesn't free the newly allocated resources. Do it. Fixes: 00fab2350e6b ("s390/zcrypt: multiple zcrypt device nodes support") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230831110000.24279-1-andriy.shevchenko@linux.intel.com Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-31Merge tag 'vfio-v6.6-rc1' of https://github.com/awilliam/linux-vfioLinus Torvalds1-0/+1
Pull VFIO updates from Alex Williamson: - VFIO direct character device (cdev) interface support. This extracts the vfio device fd from the container and group model, and is intended to be the native uAPI for use with IOMMUFD (Yi Liu) - Enhancements to the PCI hot reset interface in support of cdev usage (Yi Liu) - Fix a potential race between registering and unregistering vfio files in the kvm-vfio interface and extend use of a lock to avoid extra drop and acquires (Dmitry Torokhov) - A new vfio-pci variant driver for the AMD/Pensando Distributed Services Card (PDS) Ethernet device, supporting live migration (Brett Creeley) - Cleanups to remove redundant owner setup in cdx and fsl bus drivers, and simplify driver init/exit in fsl code (Li Zetao) - Fix uninitialized hole in data structure and pad capability structures for alignment (Stefan Hajnoczi) * tag 'vfio-v6.6-rc1' of https://github.com/awilliam/linux-vfio: (53 commits) vfio/pds: Send type for SUSPEND_STATUS command vfio/pds: fix return value in pds_vfio_get_lm_file() pds_core: Fix function header descriptions vfio: align capability structures vfio/type1: fix cap_migration information leak vfio/fsl-mc: Use module_fsl_mc_driver macro to simplify the code vfio/cdx: Remove redundant initialization owner in vfio_cdx_driver vfio/pds: Add Kconfig and documentation vfio/pds: Add support for firmware recovery vfio/pds: Add support for dirty page tracking vfio/pds: Add VFIO live migration support vfio/pds: register with the pds_core PF pds_core: Require callers of register/unregister to pass PF drvdata vfio/pds: Initial support for pds VFIO driver vfio: Commonize combine_ranges for use in other VFIO drivers kvm/vfio: avoid bouncing the mutex when adding and deleting groups kvm/vfio: ensure kvg instance stays around in kvm_vfio_group_add() docs: vfio: Add vfio device cdev description vfio: Compile vfio_group infrastructure optionally vfio: Move the IOMMU_CAP_CACHE_COHERENCY check in __vfio_register_dev() ...
2023-08-23Merge branch 'vfio-ap' into featuresHeiko Carstens2-60/+110
Tony Krowiak says: =================== This patch series is for the changes required in the vfio_ap device driver to facilitate pass-through of crypto devices to a secure execution guest. In particular, it is critical that no data from the queues passed through to the SE guest is leaked when the guest is destroyed. There are also some new response codes returned from the PQAP(ZAPQ) and PQAP(TAPQ) commands that have been added to the architecture in support of pass-through of crypto devices to SE guests; these need to be accounted for when handling the reset of queues. =================== Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18s390/vfio-ap: make sure nib is sharedTony Krowiak1-0/+30
Since the NIB is visible by HW, KVM and the (PV) guest it needs to be in non-secure or secure but shared storage. Return code 6 is used to indicate to a PV guest that its NIB would be on secure, unshared storage and therefore the NIB address is invalid. Unfortunately we have no easy way to check if a page is unshared after vfio_pin_pages() since it will automatically export an unshared page if the UV pin shared call did not succeed due to a page being in unshared state. Therefore we use the fact that UV pinning it a second time is a nop but trying to pin an exported page is an error (0x102). If we encounter this error, we do a vfio unpin and import the page again, since vfio_pin_pages() exported it. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com> Link: https://lore.kernel.org/r/20230815184333.6554-13-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18s390/vfio-ap: check for TAPQ response codes 0x35 and 0x36Tony Krowiak1-1/+12
Check for response codes 0x35 and 0x36 which are asynchronous return codes indicating a failure of the guest to associate a secret with a queue. Since there can be no interaction with this queue from the guest (i.e., the vcpus are out of SIE for hot unplug, the guest is being shut down or an emulated subsystem reset of the guest is taking place), let's go ahead and re-issue the ZAPQ to reset and zeroize the queue. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com> Link: https://lore.kernel.org/r/20230815184333.6554-10-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18s390/vfio-ap: handle queue state change in progress on resetTony Krowiak1-1/+3
A new APQSW response code (0xA) indicating the designated queue is in the process of being bound or associated to a configuration may be returned from the PQAP(ZAPQ) command. This patch introduces code that will verify when the PQAP(ZAPQ) command can be re-issued after receiving response code 0xA. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Acked-by: Halil Pasic <pasic@linux.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com> Link: https://lore.kernel.org/r/20230815184333.6554-9-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-08-18s390/vfio-ap: use work struct to verify queue resetTony Krowiak2-25/+25
Instead of waiting to verify that a queue is reset in the vfio_ap_mdev_reset_queue function, let's use a wait queue to check the the state of the reset. This way, when resetting all of the queues assigned to a matrix mdev, we don't have to wait for each queue to be reset before initiating a reset on the next queue to be reset. Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Suggested-by: Halil Pasic <pasic@linux.ibm.com> Acked-by: Janosch Frank <frankja@linux.ibm.com> Tested-by: Viktor Mihajlovski <mihajlov@linux.ibm.com> Link: https://lore.kernel.org/r/20230815184333.6554-8-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>