summaryrefslogtreecommitdiff
path: root/drivers/s390/cio
AgeCommit message (Collapse)AuthorFilesLines
2024-03-12Merge tag 's390-6.9-1' of ↵Linus Torvalds7-23/+23
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Various virtual vs physical address usage fixes - Fix error handling in Processor Activity Instrumentation device driver, and export number of counters with a sysfs file - Allow for multiple events when Processor Activity Instrumentation counters are monitored in system wide sampling - Change multiplier and shift values of the Time-of-Day clock source to improve steering precision - Remove a couple of unneeded GFP_DMA flags from allocations - Disable mmap alignment if randomize_va_space is also disabled, to avoid a too small heap - Various changes to allow s390 to be compiled with LLVM=1, since ld.lld and llvm-objcopy will have proper s390 support witch clang 19 - Add __uninitialized macro to Compiler Attributes. This is helpful with s390's FPU code where some users have up to 520 byte stack frames. Clearing such stack frames (if INIT_STACK_ALL_PATTERN or INIT_STACK_ALL_ZERO is enabled) before they are used contradicts the intention (performance improvement) of such code sections. - Convert switch_to() to an out-of-line function, and use the generic switch_to header file - Replace the usage of s390's debug feature with pr_debug() calls within the zcrypt device driver - Improve hotplug support of the Adjunct Processor device driver - Improve retry handling in the zcrypt device driver - Various changes to the in-kernel FPU code: - Make in-kernel FPU sections preemptible - Convert various larger inline assemblies and assembler files to C, mainly by using singe instruction inline assemblies. This increases readability, but also allows makes it easier to add proper instrumentation hooks - Cleanup of the header files - Provide fast variants of csum_partial() and csum_partial_copy_nocheck() based on vector instructions - Introduce and use a lock to synchronize accesses to zpci device data structures to avoid inconsistent states caused by concurrent accesses - Compile the kernel without -fPIE. This addresses the following problems if the kernel is compiled with -fPIE: - It uses dynamic symbols (.dynsym), for which the linker refuses to allow more than 64k sections. This can break features which use '-ffunction-sections' and '-fdata-sections', including kpatch-build and function granular KASLR - It unnecessarily uses GOT relocations, adding an extra layer of indirection for many memory accesses - Fix shared_cpu_list for CPU private L2 caches, which incorrectly were reported as globally shared * tag 's390-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (117 commits) s390/tools: handle rela R_390_GOTPCDBL/R_390_GOTOFF64 s390/cache: prevent rebuild of shared_cpu_list s390/crypto: remove retry loop with sleep from PAES pkey invocation s390/pkey: improve pkey retry behavior s390/zcrypt: improve zcrypt retry behavior s390/zcrypt: introduce retries on in-kernel send CPRB functions s390/ap: introduce mutex to lock the AP bus scan s390/ap: rework ap_scan_bus() to return true on config change s390/ap: clarify AP scan bus related functions and variables s390/ap: rearm APQNs bindings complete completion s390/configs: increase number of LOCKDEP_BITS s390/vfio-ap: handle hardware checkstop state on queue reset operation s390/pai: change sampling event assignment for PMU device driver s390/boot: fix minor comment style damages s390/boot: do not check for zero-termination relocation entry s390/boot: make type of __vmlinux_relocs_64_start|end consistent s390/boot: sanitize kaslr_adjust_relocs() function prototype s390/boot: simplify GOT handling s390: vmlinux.lds.S: fix .got.plt assertion s390/boot: workaround current 'llvm-objdump -t -j ...' behavior ...
2024-02-22s390/cio: fix invalid -EBUSY on ccw_device_startPeter Oberparleiter1-3/+3
The s390 common I/O layer (CIO) returns an unexpected -EBUSY return code when drivers try to start I/O while a path-verification (PV) process is pending. This can lead to failed device initialization attempts with symptoms like broken network connectivity after boot. Fix this by replacing the -EBUSY return code with a deferred condition code 1 reply to make path-verification handling consistent from a driver's point of view. The problem can be reproduced semi-regularly using the following process, while repeating steps 2-3 as necessary (example assumes an OSA device with bus-IDs 0.0.a000-0.0.a002 on CHPID 0.02): 1. echo 0.0.a000,0.0.a001,0.0.a002 >/sys/bus/ccwgroup/drivers/qeth/group 2. echo 0 > /sys/bus/ccwgroup/devices/0.0.a000/online 3. echo 1 > /sys/bus/ccwgroup/devices/0.0.a000/online ; \ echo on > /sys/devices/css0/chp0.02/status Background information: The common I/O layer starts path-verification I/Os when it receives indications about changes in a device path's availability. This occurs for example when hardware events indicate a change in channel-path status, or when a manual operation such as a CHPID vary or configure operation is performed. If a driver attempts to start I/O while a PV is running, CIO reports a successful I/O start (ccw_device_start() return code 0). Then, after completion of PV, CIO synthesizes an interrupt response that indicates an asynchronous status condition that prevented the start of the I/O (deferred condition code 1). If a PV indication arrives while a device is busy with driver-owned I/O, PV is delayed until after I/O completion was reported to the driver's interrupt handler. To ensure that PV can be started eventually, CIO reports a device busy condition (ccw_device_start() return code -EBUSY) if a driver tries to start another I/O while PV is pending. In some cases this -EBUSY return code causes device drivers to consider a device not operational, resulting in failed device initialization. Note: The code that introduced the problem was added in 2003. Symptoms started appearing with the following CIO commit that causes a PV indication when a device is removed from the cio_ignore list after the associated parent subchannel device was probed, but before online processing of the CCW device has started: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") During boot, the cio_ignore list is modified by the cio_ignore dracut module [1] as well as Linux vendor-specific systemd service scripts[2]. When combined, this commit and boot scripts cause a frequent occurrence of the problem during boot. [1] https://github.com/dracutdevs/dracut/tree/master/modules.d/81cio_ignore [2] https://github.com/SUSE/s390-tools/blob/master/cio_ignore.service Cc: stable@vger.kernel.org # v5.15+ Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Tested-By: Thorsten Winkler <twinkler@linux.ibm.com> Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/cio: make scm_bus_type constRicardo B. Marliere1-1/+1
Now that the driver core can properly handle constant struct bus_type, move the scm_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-s390-v1-4-ac891afc7282@marliere.net Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/cio: make ccw_bus_type constRicardo B. Marliere1-2/+2
Now that the driver core can properly handle constant struct bus_type, move the ccw_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-s390-v1-3-ac891afc7282@marliere.net Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/cio: make css_bus_type constRicardo B. Marliere1-2/+2
Now that the driver core can properly handle constant struct bus_type, move the css_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-s390-v1-2-ac891afc7282@marliere.net Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/ccwgroup: make ccwgroup_bus_type constRicardo B. Marliere1-2/+2
Now that the driver core can properly handle constant struct bus_type, move the ccwgroup_bus_type variable to be a constant structure as well, placing it into read-only memory which can not be modified at runtime. Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: "Ricardo B. Marliere" <ricardo@marliere.net> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20240203-bus_cleanup-s390-v1-1-ac891afc7282@marliere.net Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/cmf: fix virtual vs physical address confusionHeiko Carstens1-1/+2
The measurement block origin address is an absolute address; therefore add a missing virt_to_phys() translation to the cmf_activate() inline assembly. This doesn't fix a bug, since virtual and physical addresses are currently identical. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/cmf: remove unneeded DMA zone allocationHeiko Carstens1-2/+1
The address of the measurement block can be anywhere in 64 bit absolute space. See description of the schm instruction in the Principles of Operation. Therefore remove the GFP_DMA flag when allocating the block. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-02-09s390/cio: remove unneeded DMA zone allocationPeter Oberparleiter3-13/+13
Remove GFP_DMA flag when allocating memory to be used for CHSC control blocks. The CHSC instruction can access memory beyond the DMA zone. Suggested-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2024-01-11Merge tag 's390-6.8-1' of ↵Linus Torvalds10-115/+99
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Add machine variable capacity information to /proc/sysinfo. - Limit the waste of page tables and always align vmalloc area size and base address on segment boundary. - Fix a memory leak when an attempt to register interruption sub class (ISC) for the adjunct-processor (AP) guest failed. - Reset response code AP_RESPONSE_INVALID_GISA to understandable by guest AP_RESPONSE_INVALID_ADDRESS in response to a failed interruption sub class (ISC) registration attempt. - Improve reaction to adjunct-processor (AP) AP_RESPONSE_OTHERWISE_CHANGED response code when enabling interrupts on behalf of a guest. - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP) queue device bound to the vfio_ap device driver when the mediated device is attached to a guest, but the queue device is not passed through. - Rework struct ap_card to hold the whole adjunct-processor (AP) card hardware information. As result, all the ugly bit checks are replaced by simple evaluations of the required bit fields. - Improve handling of some weird scenarios between service element (SE) host and SE guest with adjunct-processor (AP) pass-through support. - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return the previous value of the to be changed control register. This is useful if a bit is only changed temporarily and the previous content needs to be restored. - The kernel starts with machine checks disabled and is expected to enable it once trap_init() is called. However the implementation allows machine checks early. Consistently enable it in trap_init() only. - local_mcck_disable() and local_mcck_enable() assume that machine checks are always enabled. Instead implement and use local_mcck_save() and local_mcck_restore() to disable machine checks and restore the previous state. - Modification of floating point control (FPC) register of a traced process using ptrace interface may lead to corruption of the FPC register of the tracing process. Fix this. - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point control (FPC) register in vCPU, but may lead to corruption of the FPC register of the host process. Fix this. - Use READ_ONCE() to read a vCPU floating point register value from the memory mapped area. This avoids that, depending on code generation, a different value is tested for validity than the one that is used. - Get rid of test_fp_ctl(), since it is quite subtle to use it correctly. Instead copy a new floating point control register value into its save area and test the validity of the new value when loading it. - Remove superfluous save_fpu_regs() call. - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines provide the vector facility since many years and the need to make the task structure size dependent on the vector facility does not exist. - Remove the "novx" kernel command line option, as the vector code runs without any problems since many years. - Add the vector facility to the z13 architecture level set (ALS). All hypervisors support the vector facility since many years. This allows compile time optimizations of the kernel. - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As result, the compiled code will have less runtime checks and less code. - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C. - Convert the struct subchannel spinlock from pointer to member. * tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits) Revert "s390: update defconfigs" s390/cio: make sch->lock spinlock pointer a member s390: update defconfigs s390/mm: convert pgste locking functions to C s390/fpu: get rid of MACHINE_HAS_VX s390/als: add vector facility to z13 architecture level set s390/fpu: remove "novx" option s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support KVM: s390: remove superfluous save_fpu_regs() call s390/fpu: get rid of test_fp_ctl() KVM: s390: use READ_ONCE() to read fpc register value KVM: s390: fix setting of fpc register s390/ptrace: handle setting of fpc register correctly s390/nmi: implement and use local_mcck_save() / local_mcck_restore() s390/nmi: consistently enable machine checks in trap_init() s390/ctlreg: return old register contents when changing bits s390/ap: handle outband SE bind state change s390/ap: store TAPQ hwinfo in struct ap_card s390/vfio-ap: fix sysfs status attribute for AP queue devices s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) command ...
2023-12-12s390/cio: make sch->lock spinlock pointer a memberHalil Pasic10-115/+99
The lock member of struct subchannel used to be a spinlock, but became a pointer to a spinlock with commit 2ec2298412e1 ("[S390] subchannel lock conversion."). This might have been justified back then, but with the current state of affairs, there is no reason to manage a separate spinlock object. Let's simplify things and pull the spinlock back into struct subchannel. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Link: https://lore.kernel.org/r/20231101115751.2308307-1-pasic@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-11-28eventfd: simplify eventfd_signal()Christian Brauner3-6/+6
Ever since the eventfd type was introduced back in 2007 in commit e1ad7468c77d ("signal/timer/event: eventfd core") the eventfd_signal() function only ever passed 1 as a value for @n. There's no point in keeping that additional argument. Link: https://lore.kernel.org/r/20231122-vfs-eventfd-signal-v2-2-bd549b14ce0c@kernel.org Acked-by: Xu Yilun <yilun.xu@intel.com> Acked-by: Andrew Donnellan <ajd@linux.ibm.com> # ocxl Acked-by: Eric Farman <farman@linux.ibm.com> # s390 Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-03Merge tag 's390-6.7-1' of ↵Linus Torvalds4-9/+9
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Get rid of private VM_FAULT flags - Add word-at-a-time implementation - Add DCACHE_WORD_ACCESS support - Cleanup control register handling - Disallow CPU hotplug of CPU 0 to simplify its handling complexity, following a similar restriction in x86 - Optimize pai crypto map allocation - Update the list of crypto express EP11 coprocessor operation modes - Fixes and improvements for secure guests AP pass-through - Several fixes to address incorrect page marking for address translation with the "cmma no-dat" feature, preventing potential incorrect guest TLB flushes - Fix early IPI handling - Several virtual vs physical address confusion fixes - Various small fixes and improvements all over the code * tag 's390-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (74 commits) s390/cio: replace deprecated strncpy with strscpy s390/sclp: replace deprecated strncpy with strtomem s390/cio: fix virtual vs physical address confusion s390/cio: export CMG value as decimal s390: delete the unused store_prefix() function s390/cmma: fix handling of swapper_pg_dir and invalid_pg_dir s390/cmma: fix detection of DAT pages s390/sclp: handle default case in sclp memory notifier s390/pai_crypto: remove per-cpu variable assignement in event initialization s390/pai: initialize event count once at initialization s390/pai_crypto: use PERF_ATTACH_TASK define for per task detection s390/mm: add missing arch_set_page_dat() call to gmap allocations s390/mm: add missing arch_set_page_dat() call to vmem_crst_alloc() s390/cmma: fix initial kernel address space page table walk s390/diag: add missing virt_to_phys() translation to diag224() s390/mm,fault: move VM_FAULT_ERROR handling to do_exception() s390/mm,fault: remove VM_FAULT_BADMAP and VM_FAULT_BADACCESS s390/mm,fault: remove VM_FAULT_SIGNAL s390/mm,fault: remove VM_FAULT_BADCONTEXT s390/mm,fault: simplify kfence fault handling ...
2023-10-25s390/cio: replace deprecated strncpy with strscpyJustin Stitt1-2/+2
strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. We expect both `params` and `id` to be NUL-terminated based on their usage with format strings: format_node_data(iuparams, iunodeid, &lir->incident_node); format_node_data(auparams, aunodeid, &lir->attached_node); switch (lir->iq.class) { case LIR_IQ_CLASS_DEGRADED: pr_warn("Link degraded: RS=%02x RSID=%04x IC=%02x " "IUPARAMS=%s IUNODEID=%s AUPARAMS=%s AUNODEID=%s\n", sei_area->rs, sei_area->rsid, lir->ic, iuparams, iunodeid, auparams, aunodeid); NUL-padding is not required as both `params` and `id` have been memset to 0: memset(params, 0, PARAMS_LEN); memset(id, 0, NODEID_LEN); Considering the above, a suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer without unnecessarily NUL-padding. Note that there's no overread bugs in the current implementation as the string literal "n/a" has a size much smaller than PARAMS_LEN or NODEID_LEN. Nonetheless, let's favor strscpy(). Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20231023-strncpy-drivers-s390-cio-chsc-c-v1-1-8b76a7b83260@google.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-25s390/cio: fix virtual vs physical address confusionPeter Oberparleiter1-2/+2
Fix virtual vs physical address confusion (which currently are the same). Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-25s390/cio: export CMG value as decimalPeter Oberparleiter1-1/+1
Change format of the "cmg" sysfs attribute from hex to decimal to make it easier to consume. Note that this should not break any existing users since only values 2 and 3 are currently exported. Also the main user already assumes decimal notation [1]. [1] https://sourceforge.net/p/sblim/gather/ci/master/tree/plugin/metriczCH.c Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-10-16s390/cio: fix a memleak in css_alloc_subchannelDinghao Liu1-2/+4
When dma_set_coherent_mask() fails, sch->lock has not been freed, which is allocated in css_sch_create_locks(), leading to a memleak. Fixes: 4520a91a976e ("s390/cio: use dma helpers for setting masks") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Message-Id: <20230921071412.13806-1-dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/linux-s390/bd38baa8-7b9d-4d89-9422-7e943d626d6e@linux.ibm.com/ Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390: use control register bit definesHeiko Carstens1-1/+1
Use control register bit defines instead of plain numbers where possible. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390/ctlreg: add local and system prefix to some functionsHeiko Carstens2-3/+3
Add local and system prefix to some functions to clarify they change control register contents on either the local CPU or the on all CPUs. This results in the following API: Two defines which load and save multiple control registers. The defines correlate with the following C prototypes: void __local_ctl_load(unsigned long *, unsigned int cr_low, unsigned int cr_high); void __local_ctl_store(unsigned long *, unsigned int cr_low, unsigned int cr_high); Two functions which locally set or clear one bit for a specified control register: void local_ctl_set_bit(unsigned int cr, unsigned int bit); void local_ctl_clear_bit(unsigned int cr, unsigned int bit); Two functions which set or clear one bit for a specified control register on all CPUs: void system_ctl_set_bit(unsigned int cr, unsigned int bit); void system_ctl_clear_bit(unsigend int cr, unsigned int bit); Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-19s390/ctlreg: rename ctl_reg.h to ctlreg.hHeiko Carstens1-1/+1
Rename ctl_reg.h to ctlreg.h so it matches not only ctlreg.c but also other control register related function, union, and structure names, which all come with a ctlreg prefix. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-09-07Merge tag 's390-6.6-2' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Heiko Carstens: - A couple of virtual vs physical address confusion fixes - Rework locking in dcssblk driver to address a lockdep warning - Remove support for "noexec" kernel command line option since there is no use case where it would make sense - Simplify kernel mapping setup and get rid of quite a bit of code - Add architecture specific __set_memory_yy() functions which allow us to modify kernel mappings. Unlike the set_memory_xx() variants they take void pointer start and end parameters, which allows using them without the usual casts, and also to use them on areas larger than 8TB. Note that the set_memory_xx() family comes with an int num_pages parameter which overflows with 8TB. This could be addressed by changing the num_pages parameter to unsigned long, however requires to change all architectures, since the module code expects an int parameter (see module_set_memory()). This was indeed an issue since for debug_pagealloc() we call set_memory_4k() on the whole identity mapping. Therefore address this for now with the __set_memory_yy() variant, and address common code later - Use dev_set_name() and also fix memory leak in zcrypt driver error handling - Remove unused lsi_mask from airq_struct - Add warning for invalid kernel mapping requests * tag 's390-6.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/vmem: do not silently ignore mapping limit s390/zcrypt: utilize dev_set_name() ability to use a formatted string s390/zcrypt: don't leak memory if dev_set_name() fails s390/mm: fix MAX_DMA_ADDRESS physical vs virtual confusion s390/airq: remove lsi_mask from airq_struct s390/mm: use __set_memory() variants where useful s390/set_memory: add __set_memory() variant s390/set_memory: generate all set_memory() functions s390/mm: improve description of mapping permissions of prefix pages s390/amode31: change type of __samode31, __eamode31, etc s390/mm: simplify kernel mapping setup s390: remove "noexec" option s390/vmem: fix virtual vs physical address confusion s390/dcssblk: fix lockdep warning s390/monreader: fix virtual vs physical address confusion
2023-08-30s390/airq: remove lsi_mask from airq_structBenjamin Block1-3/+1
Remove the field `lsi_mask` from `struct airq_struct` as it is not utilized for any adapter interrupt, other than setting it to the default value of 0xff. Because nobody is using this functionality, all it does is cost a little bit of time with each delivered adapter interrupt. Reviewed-by: Michael Mueller <mimu@linux.ibm.com> Tested-by: Michael Mueller <mimu@linux.ibm.com> Acked-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-07-25vfio-iommufd: Add detach_ioas support for emulated VFIO devicesYi Liu1-0/+1
This prepares for adding DETACH ioctl for emulated VFIO devices. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Terrence Xu <terrence.xu@intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Matthew Rosato <mjrosato@linux.ibm.com> Tested-by: Yanting Jiang <yanting.jiang@intel.com> Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Tested-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20230718135551.6592-16-yi.l.liu@intel.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-07-03s390: fix various typosHeiko Carstens4-6/+6
Fix various typos found with codespell. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-28Merge tag 's390-6.5-1' of ↵Linus Torvalds2-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Alexander Gordeev: - Fix the style of protected key API driver source: use x-mas tree for all local variable declarations - Rework protected key API driver to not use the struct pkey_protkey and pkey_clrkey anymore. Both structures have a fixed size buffer, but with the support of ECC protected key these buffers are not big enough. Use dynamic buffers internally and transparently for userspace - Add support for a new 'non CCA clear key token' with ECC clear keys supported: ECC P256, ECC P384, ECC P521, ECC ED25519 and ECC ED448. This makes it possible to derive a protected key from the ECC clear key input via PKEY_KBLOB2PROTK3 ioctl, while currently the only way to derive is via PCKMO instruction - The s390 PMU of PAI crypto and extension 1 NNPA counters use atomic_t for reference counting. Replace this with the proper data type refcount_t - Select ARCH_SUPPORTS_INT128, but limit this to clang for now, since gcc generates inefficient code, which may lead to stack overflows - Replace one-element array with flexible-array member in struct vfio_ccw_parent and refactor the rest of the code accordingly. Also, prefer struct_size() over sizeof() open- coded versions - Introduce OS_INFO_FLAGS_ENTRY pointing to a flags field and OS_INFO_FLAG_REIPL_CLEAR flag that informs a dumper whether the system memory should be cleared or not once dumped - Fix a hang when a user attempts to remove a VFIO-AP mediated device attached to a guest: add VFIO_DEVICE_GET_IRQ_INFO and VFIO_DEVICE_SET_IRQS IOCTLs and wire up the VFIO bus driver callback to request a release of the device - Fix calculation for R_390_GOTENT relocations for modules - Allow any user space process with CAP_PERFMON capability read and display the CPU Measurement facility counter sets - Rework large statically-defined per-CPU cpu_cf_events data structure and replace it with dynamically allocated structures created when a perf_event_open() system call is invoked or /dev/hwctr device is accessed * tag 's390-6.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpum_cf: rework PER_CPU_DEFINE of struct cpu_cf_events s390/cpum_cf: open access to hwctr device for CAP_PERFMON privileged process s390/module: fix rela calculation for R_390_GOTENT s390/vfio-ap: wire in the vfio_device_ops request callback s390/vfio-ap: realize the VFIO_DEVICE_SET_IRQS ioctl s390/vfio-ap: realize the VFIO_DEVICE_GET_IRQ_INFO ioctl s390/pkey: add support for ecc clear key s390/pkey: do not use struct pkey_protkey s390/pkey: introduce reverse x-mas trees s390/zcore: conditionally clear memory on reipl s390/ipl: add REIPL_CLEAR flag to os_info vfio/ccw: use struct_size() helper vfio/ccw: replace one-element array with flexible-array member s390: select ARCH_SUPPORTS_INT128 s390/pai_ext: replace atomic_t with refcount_t s390/pai_crypto: replace atomic_t with refcount_t
2023-06-01vfio/ccw: use struct_size() helperGustavo A. R. Silva1-2/+1
Prefer struct_size() over open-coded versions. Link: https://github.com/KSPP/linux/issues/160 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/f657276073630e806e69726a40ad1cc85101448a.1684805398.git.gustavoars@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-06-01vfio/ccw: replace one-element array with flexible-array memberGustavo A. R. Silva2-2/+3
One-element arrays are deprecated, and we are replacing them with flexible array members instead. So, replace one-element array with flexible-array member in struct vfio_ccw_parent and refactor the rest of the code accordingly. Link: https://github.com/KSPP/linux/issues/79 Link: https://github.com/KSPP/linux/issues/297 Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/3c10549ebe1564eade68a2515bde233527376971.1684805398.git.gustavoars@kernel.org Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-05-22s390/cio: unregister device when the only path is goneVineeth Vijayan1-1/+4
Currently, if the device is offline and all the channel paths are either configured or varied offline, the associated subchannel gets unregistered. Don't unregister the subchannel, instead unregister offline device. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-05-17s390/qdio: fix do_sqbs() inline assembly constraintHeiko Carstens1-1/+1
Use "a" constraint instead of "d" constraint to pass the state parameter to the do_sqbs() inline assembly. This prevents that general purpose register zero is used for the state parameter. If the compiler would select general purpose register zero this would be problematic for the used instruction in rsy format: the register used for the state parameter is a base register. If the base register is general purpose register zero the contents of the register are unexpectedly ignored when the instruction is executed. This only applies to z/VM guests using QIOASSIST with dedicated (pass through) QDIO-based devices such as FCP [zfcp driver] as well as real OSA or HiperSockets [qeth driver]. A possible symptom for this case using zfcp is the following repeating kernel message pattern: zfcp <devbusid>: A QDIO problem occurred zfcp <devbusid>: A QDIO problem occurred zfcp <devbusid>: qdio: ZFCP on SC <sc> using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W zfcp <devbusid>: A QDIO problem occurred zfcp <devbusid>: A QDIO problem occurred Each of the qdio problem message can be accompanied by the following entries for the affected subchannel <sc> in /sys/kernel/debug/s390dbf/qdio_error/hex_ascii for zfcp or qeth: <sc> ccq: 69.... <sc> SQBS ERROR. Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Cc: Steffen Maier <maier@linux.ibm.com> Fixes: 8129ee164267 ("[PATCH] s390: qdio V=V pass-through") Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-05-15s390/cio: include subchannels without devices also for evaluationVineeth Vijayan1-0/+2
Currently when the new channel-path is enabled, we do evaluation only on the subchannels with a device connected on it. This is because, in the past, if the device in the subchannel is not working or not available, we used to unregister the subchannels. But, from the 'commit 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")' we allow subchannels with or without an active device connected on it. So, when we do the io_subchannel_verify, make sure that, we are evaluating the subchannels without any device too. Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2023-04-13s390/cio: replace zero-length array with flexible-array memberHeiko Carstens2-2/+2
There are numerous patches which convert zero-length arrays with a flexible-array member. Convert the remaining s390 occurrences. Suggested-by: Gustavo A. R. Silva <gustavoars@kernel.org> Link: https://github.com/KSPP/linux/issues/78 Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-02-24Merge tag 'driver-core-6.3-rc1' of ↵Linus Torvalds3-7/+7
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems" [ Geert Uytterhoeven points out that that last sentence isn't true, and that there's a pending report that has a fix that is queued up - Linus ] * tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits) debugfs: drop inline constant formatting for ERR_PTR(-ERROR) OPP: fix error checking in opp_migrate_dentry() debugfs: update comment of debugfs_rename() i3c: fix device.h kernel-doc warnings dma-mapping: no need to pass a bus_type into get_arch_dma_ops() driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place Revert "driver core: add error handling for devtmpfs_create_node()" Revert "devtmpfs: add debug info to handle()" Revert "devtmpfs: remove return value of devtmpfs_delete_node()" driver core: cpu: don't hand-override the uevent bus_type callback. devtmpfs: remove return value of devtmpfs_delete_node() devtmpfs: add debug info to handle() driver core: add error handling for devtmpfs_create_node() driver core: bus: update my copyright notice driver core: bus: add bus_get_dev_root() function driver core: bus: constify bus_unregister() driver core: bus: constify some internal functions driver core: bus: constify bus_get_kset() driver core: bus: constify bus_register/unregister_notifier() driver core: remove private pointer from struct bus_type ...
2023-02-14vfio/ccw: remove WARN_ON during shutdownEric Farman1-1/+1
The logic in vfio_ccw_sch_shutdown() always assumed that the input subchannel would point to a vfio_ccw_private struct, without checking that one exists. The blamed commit put in a check for this scenario, to prevent the possibility of a missing private. The trouble is that check was put alongside a WARN_ON(), presuming that such a scenario would be a cause for concern. But this can be triggered by binding a subchannel to vfio-ccw, and rebooting the system before starting the mdev (via "mdevctl start" or similar) or after stopping it. In those cases, shutdown doesn't need to worry because either the private was never allocated, or it was cleaned up by vfio_ccw_mdev_remove(). Remove the WARN_ON() piece of this check, since there are plausible scenarios where private would be NULL in this path. Fixes: 9e6f07cd1eaa ("vfio/ccw: create a parent struct") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20230210174227.2256424-1-farman@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-31s390/cio: introduce locking for register/unregister functionsVineeth Vijayan1-0/+9
Unbinding an I/O subchannel with a child-CCW device in disconnected state sometimes causes a kernel-panic. The race condition was seen mostly during testing, when setting all the CHPIDs of a device to offline and at the same time, the unbinding the I/O subchannel driver. The kernel-panic occurs because of double delete, the I/O subchannel driver calls device_del on the CCW device while another device_del invocation for the same device is in-flight. For instance, disabling all the CHPIDs will trigger the ccw_device_remove function, which will call a ccw_device_unregister(), which ends up calling the device_del() which is asynchronous via cdev's todo workqueue. And unbinding the I/O subchannel driver calls io_subchannel_remove() function which calls the ccw_device_unregister() and device_del(). This double delete can be prevented by serializing all CCW device registration/unregistration calls into the driver core. This patch introduces a mutex which will be used for this purpose. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reported-by: Boris Fiuczynski <fiuczy@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-27driver core: make struct bus_type.uevent() take a const *Greg Kroah-Hartman3-7/+7
The uevent() callback in struct bus_type should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-22s390/cio: evaluate devices with non-operational pathsVineeth Vijayan2-7/+16
css_schedule_reprobe() function calls the evaluation for CSS_EVAL_UNREG which is specific to the idset of unregistered subchannels. This evaluation was introduced because, previously, if the underlying device become not-accessible, the subchannel was unregistered. But, in the recent changes in cio,with the commit '2297791c92d0 s390/cio: dont unregister subchannel from child-drivers', we no longer unregister the subchannels just because of a non-operational device. This allows to have subchannels without any operational device connected on it. So, a css_schedule_reprobe function on unregistered subchannel does not have any effect. Change this functionality to evaluate the subchannels which does not have a working path to the device. This could be due the erroneous device or due to the erraneous path. Evaluate based on the values of OPM and PAM&POM. Here we introduced a new idset function,to keep I/O subchannels in the idset when the last seen status indicates that the device has no working path. A device has no working path if all available paths have been tried without success.A failed I/O attempt on a path is indicated as a 0 bit value in the POM mask. By looking at the POM mask bit values of available paths (1 in PAM) that Linux is supposed to use (1 in vary mask OPM), we can identify a non-working device as a device where the bit-wise and of the PAM, POM and OPM mask return 0. css_schedule_reprobe() is being used by dasd-driver and chsc-cio component. dasd driver, when it detects a change in the pathgroup, invokes the re-evaluation of the subchannel. And chsc-cio component upon a CRW event, (resource accessibility event). In both the cases, it makes much better sense to re-evalute the subchannel with no-valid path. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reported-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Tested-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: remove old IDA format restrictionsEric Farman1-8/+0
By this point, all the pieces are in place to properly support a 2K Format-2 IDAL, and to convert a guest Format-1 IDAL to the 2K Format-2 variety. Let's remove the fence that prohibits them, and allow a guest to submit them if desired. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: don't group contiguous pages on 2K IDAWsEric Farman1-10/+20
The vfio_pin_pages() interface allows contiguous pages to be pinned as a single request, which is great for the 4K pages that are normally processed. Old IDA formats operate on 2K chunks, which makes this logic more difficult. Since these formats are rare, let's just invoke the page pinning one-at-a-time, instead of trying to group them. We can rework this code at a later date if needed. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: handle a guest Format-1 IDALEric Farman1-3/+16
There are two scenarios that need to be addressed here. First, an ORB that does NOT have the Format-2 IDAL bit set could have both a direct-addressed CCW and an indirect-data-address CCW chained together. This means that the IDA CCW will contain a Format-1 IDAL, and can be easily converted to a 2K Format-2 IDAL. But it also means that the direct-addressed CCW needs to be converted to the same 2K Format-2 IDAL for consistency with the ORB settings. Secondly, a Format-1 IDAL is comprised of 31-bit addresses. Thus, we need to cast this IDAL to a pointer of ints while populating the list of addresses that are sent to vfio. Since the result of both of these is the use of the 2K IDAL variants, and the output of vfio-ccw is always a Format-2 IDAL (in order to use 64-bit addresses), make sure that the correct control bit gets set in the ORB when these scenarios occur. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: allocate/populate the guest idalEric Farman1-24/+52
Today, we allocate memory for a list of IDAWs, and if the CCW being processed contains an IDAL we read that data from the guest into that space. We then copy each IDAW into the pa_iova array, or fabricate that pa_iova array with a list of addresses based on a direct-addressed CCW. Combine the reading of the guest IDAL with the creation of a pseudo-IDAL for direct-addressed CCWs, so that both CCW types have a "guest" IDAL that can be populated straight into the pa_iova array. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: calculate number of IDAWs regardless of formatEric Farman1-0/+16
The idal_nr_words() routine works well for 4K IDAWs, but lost its ability to handle the old 2K formats with the removal of 31-bit builds in commit 5a79859ae0f3 ("s390: remove 31 bit support"). Since there's nothing preventing a guest from generating this IDAW format, let's re-introduce the math for them and use both when calculating the number of IDAWs based on the bits specified in the ORB. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: read only one Format-1 IDAWEric Farman1-3/+11
The intention is to read the first IDAW to determine the starting location of an I/O operation, knowing that the second and any/all subsequent IDAWs will be aligned per architecture. But, this read receives 64-bits of data, which is the size of a Format-2 IDAW. In the event that Format-1 IDAWs are presented, adjust the size of the read to 32-bits. The data will end up occupying the upper word of the target iova variable, so shift it down to the lower word for use as an address. (By definition, this IDAW format uses a 31-bit address, so the "sign" bit will always be off and there is no concern about sign extension.) Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: refactor the idaw counterEric Farman1-9/+30
The rules of an IDAW are fairly simple: Each one can move no more than a defined amount of data, must not cross the boundary defined by that length, and must be aligned to that length as well. The first IDAW in a list is special, in that it does not need to adhere to that alignment, but the other rules still apply. Thus, by reading the first IDAW in a list, the number of IDAWs that will comprise a data transfer of a particular size can be calculated. Let's factor out the reading of that first IDAW with the logic that calculates the length of the list, to simplify the rest of the routine that handles the individual IDAWs. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: populate page_array struct inlineEric Farman1-17/+5
There are two possible ways the list of addresses that get passed to vfio are calculated. One is from a guest IDAL, which would be an array of (probably) non-contiguous addresses. The other is built from contiguous pages that follow the starting address provided by ccw->cda. page_array_alloc() attempts to simplify things by pre-populating this array from the starting address, but that's not needed for a CCW with an IDAL anyway so doesn't need to be in the allocator. Move it to the caller in the non-IDAL case, since it will be overwritten when reading the guest IDAL. Remove the initialization of the pa_page output pointers, since it won't be explicitly needed for either case. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: pass page count to page_array structEric Farman1-10/+13
The allocation of our page_array struct calculates the number of 4K pages that would be needed to hold a certain number of bytes. But, since the number of pages that will be pinned is also calculated by the length of the IDAL, this logic is unnecessary. Let's pass that information in directly, and avoid the math within the allocator. Also, let's make this two allocations instead of one, to make it apparent what's happening within here. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: remove unnecessary malloc alignmentEric Farman1-20/+23
Everything about this allocation is harder than necessary, since the memory allocation is already aligned to our needs. Break them apart for readability, instead of doing the funky arithmetic. Of the structures that are involved, only ch_ccw needs the GFP_DMA flag, so the others can be allocated without it. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: simplify CCW chain fetch routinesEric Farman1-18/+15
The act of processing a fetched CCW has two components: 1) Process a Transfer-in-channel (TIC) CCW 2) Process any other CCW The former needs to look at whether the TIC jumps backwards into the current channel program or forwards into a new segment, while the latter just processes the CCW data address itself. Rather than passing the chain segment and index within it to the handlers for the above, and requiring each to calculate the elements it needs, simply pass the needed pointers directly. For the TIC, that means the CCW being processed and the location of the entire channel program which holds all segments. For the other CCWs, the page_array pointer is also needed to perform the page pinning, etc. While at it, rename ccwchain_fetch_direct to _ccw, to indicate what it is. The name "_direct" is historical, when it used to process a direct-addressed CCW, but IDAs are processed here too. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: replace copy_from_iova with vfio_dma_rwEric Farman1-51/+5
It was suggested [1] that we replace the old copy_from_iova() routine (which pins a page, does a memcpy, and unpins the page) with the newer vfio_dma_rw() interface. This has a modest improvement in the overall time spent through the fsm_io_request() path, and simplifies some of the code to boot. [1] https://lore.kernel.org/r/20220706170553.GK693670@nvidia.com/ Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: move where IDA flag is set in ORBEric Farman1-7/+6
The output of vfio_ccw is always a Format-2 IDAL, but the code that explicitly sets this is buried in cp_init(). In fact the input is often already a Format-2 IDAL, and would be rejected (via the check in ccwchain_calc_length()) if it weren't, so explicitly setting it doesn't do much. Setting it way down here only makes it impossible to make decisions in support of other IDAL formats. Let's move that to where the rest of the ORB is set up, so that the CCW processing in cp_prefetch() is performed according to the contents of the unmodified guest ORB. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-01-09vfio/ccw: allow non-zero storage keysEric Farman1-1/+0
Currently, vfio-ccw copies the ORB from the io_region to the channel_program struct being built. It then adjusts various pieces of that ORB to the values needed to be used by the SSCH issued by vfio-ccw in the host. This includes setting the subchannel key to the default, presumably because Linux doesn't do anything with non-zero storage keys itself. But it seems wrong to convert every I/O to the default key if the guest itself requested a non-zero subchannel (access) key. Any channel program that sets a non-zero key would expect the same key returned in the SCSW of the IRB, not zero, so best to allow that to occur unimpeded. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>