summaryrefslogtreecommitdiff
path: root/arch/s390
AgeCommit message (Collapse)AuthorFilesLines
2020-05-08crypto: s390/sha1 - prefix the "sha1_" functionsEric Biggers1-6/+6
Prefix the s390 SHA-1 functions with "s390_sha1_" rather than "sha1_". This allows us to rename the library function sha_init() to sha1_init() without causing a naming collision. Cc: linux-s390@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2020-05-08s390/module: Use s390_kernel_write() for late relocationsPeter Zijlstra1-59/+88
Because of late module patching, a livepatch module needs to be able to apply some of its relocations well after it has been loaded. Instead of playing games with module_{dis,en}able_ro(), use existing text poking mechanisms to apply relocations after module loading. So far only x86, s390 and Power have HAVE_LIVEPATCH but only the first two also have STRICT_MODULE_RWX. This will allow removal of the last module_disable_ro() usage in livepatch. The ultimate goal is to completely disallow making executable mappings writable. [ jpoimboe: Split up patches. Use mod state to determine whether memcpy() can be used. Test and add fixes. ] Cc: linux-s390@vger.kernel.org Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390 Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-05-08s390: Change s390_kernel_write() return type to match memcpy()Josh Poimboeuf2-4/+7
s390_kernel_write()'s function type is almost identical to memcpy(). Change its return type to "void *" so they can be used interchangeably. Cc: linux-s390@vger.kernel.org Cc: heiko.carstens@de.ibm.com Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Acked-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Miroslav Benes <mbenes@suse.cz> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> # s390 Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2020-05-07Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-1/+4
Pull kvm fixes from Paolo Bonzini: "Bugfixes, mostly for ARM and AMD, and more documentation. Slightly bigger than usual because I couldn't send out what was pending for rc4, but there is nothing worrisome going on. I have more fixes pending for guest debugging support (gdbstub) but I will send them next week" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (22 commits) KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly KVM: selftests: Fix build for evmcs.h kvm: x86: Use KVM CPU capabilities to determine CR4 reserved bits KVM: VMX: Explicitly clear RFLAGS.CF and RFLAGS.ZF in VM-Exit RSB path docs/virt/kvm: Document configuring and running nested guests KVM: s390: Remove false WARN_ON_ONCE for the PQAP instruction kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts KVM: x86: Fixes posted interrupt check for IRQs delivery modes KVM: SVM: fill in kvm_run->debug.arch.dr[67] KVM: nVMX: Replace a BUG_ON(1) with BUG() to squash clang warning KVM: arm64: Fix 32bit PC wrap-around KVM: arm64: vgic-v4: Initialize GICv4.1 even in the absence of a virtual ITS KVM: arm64: Save/restore sp_el0 as part of __guest_enter KVM: arm64: Delete duplicated label in invalid_vector KVM: arm64: vgic-its: Fix memory leak on the error path of vgic_add_lpi() KVM: arm64: vgic-v3: Retire all pending LPIs on vcpu destroy KVM: arm: vgic-v2: Only use the virtual state when userspace accesses pending bits KVM: arm: vgic: Only use the virtual state when userspace accesses enable bits KVM: arm: vgic: Synchronize the whole guest on GIC{D,R}_I{S,C}ACTIVER read KVM: arm64: PSCI: Forbid 64bit functions for 32bit guests ...
2020-05-07KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properlyPeter Xu1-0/+1
KVM_CAP_SET_GUEST_DEBUG should be supported for x86 however it's not declared as supported. My wild guess is that userspaces like QEMU are using "#ifdef KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be wrong because the compilation host may not be the runtime host. The userspace might still want to keep the old "#ifdef" though to not break the guest debug on old kernels. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200505154750.126300-1-peterx@redhat.com> [Do the same for PPC and s390. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller8-11/+27
Conflicts were all overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-06s390: nvme reiplJason J. Herne1-1/+147
Populate sysfs and structs with reipl entries for nvme ipl type. This allows specifying a target nvme device when rebooting/reipling. Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-06s390: nvme iplJason J. Herne3-0/+99
Recognize IPL Block's Ipl Type of "nvme". Populate related structs and sysfs entries. Signed-off-by: Jason J. Herne <jjherne@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-06s390/pci: removes wrong PCI multifunction assignmentPierre Morel1-3/+1
The assignment of the PCI device multifunction attribute is set during the PCI device probe. There is no need to set it here. Let's do it right and remove this assignment. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-06s390: ptrace: hard-code "s390x" instead of UTS_MACHINEMasahiro Yamada2-6/+1
s390 uses the UTS_MACHINE defined arch/s390/Makefile as follows: UTS_MACHINE := s390x We do not need to pass the fixed string from the command line. Hard-code user_regset_view::name, like many other architectures do. Link: https://lkml.kernel.org/r/20200413013113.8529-1-masahiroy@kernel.org Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-06Merge tag 'kvm-s390-master-5.7-3' of ↵Paolo Bonzini1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fix for running nested uner z/VM There are circumstances when running nested under z/VM that would trigger a WARN_ON_ONCE. Remove the WARN_ON_ONCE. Long term we certainly want to make this code more robust and flexible, but just returning instead of WARNING makes guest bootable again.
2020-05-06KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properlyPeter Xu1-0/+1
KVM_CAP_SET_GUEST_DEBUG should be supported for x86 however it's not declared as supported. My wild guess is that userspaces like QEMU are using "#ifdef KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be wrong because the compilation host may not be the runtime host. The userspace might still want to keep the old "#ifdef" though to not break the guest debug on old kernels. Signed-off-by: Peter Xu <peterx@redhat.com> Message-Id: <20200505154750.126300-1-peterx@redhat.com> [Do the same for PPC and s390. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-05mm: change pmdp_huge_get_and_clear_full take vm_area_struct as argAneesh Kumar K.V1-2/+2
We will use this in later patch to do tlb flush when clearing pmd entries. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200505071729.54912-22-aneesh.kumar@linux.ibm.com
2020-05-05KVM: s390: Remove false WARN_ON_ONCE for the PQAP instructionChristian Borntraeger1-1/+3
In LPAR we will only get an intercept for FC==3 for the PQAP instruction. Running nested under z/VM can result in other intercepts as well as ECA_APIE is an effective bit: If one hypervisor layer has turned this bit off, the end result will be that we will get intercepts for all function codes. Usually the first one will be a query like PQAP(QCI). So the WARN_ON_ONCE is not right. Let us simply remove it. Cc: Pierre Morel <pmorel@linux.ibm.com> Cc: Tony Krowiak <akrowiak@linux.ibm.com> Cc: stable@vger.kernel.org # v5.3+ Fixes: e5282de93105 ("s390: ap: kvm: add PQAP interception for AQIC") Link: https://lore.kernel.org/kvm/20200505083515.2720-1-borntraeger@de.ibm.com Reported-by: Qian Cai <cailca@icloud.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-28Merge branch 'work.sysctl' of ↵Daniel Borkmann4-15/+12
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull in Christoph Hellwig's series that changes the sysctl's ->proc_handler methods to take kernel pointers instead. It gets rid of the set_fs address space overrides used by BPF. As per discussion, pull in the feature branch into bpf-next as it relates to BPF sysctl progs. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200427071508.GV23230@ZenIV.linux.org.uk/T/
2020-04-28Merge tag 'cve-2020-11884' from emailed bundleLinus Torvalds2-2/+18
Pull s390 fix from Christian Borntraeger: "Fix a race between page table upgrade and uaccess on s390. This fixes CVE-2020-11884 which allows for a local kernel crash or code execution" * tag 'cve-2020-11884' from emailed bundle: s390/mm: fix page table upgrade vs 2ndary address mode accesses
2020-04-28s390/pci: Handling multifunctionsPierre Morel5-48/+156
We allow multiple functions on a single bus. We suppress the ZPCI_DEVFN definition and replace its occurences with zpci->devfn. We verify the number of device during the registration. There can never be more domains in use than existing devices, so we do not need to verify the count of domain after having verified the count of devices. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: Adding bus resourcePierre Morel2-0/+6
The current PCI implementation do not provide a bus resource. This leads to a notice being print at boot. Let's do it more nicely and provide the bus resource. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: adapt events for zbusPierre Morel1-16/+8
Simplify the event handling. Set the zpci state explicitly. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: create zPCI busPierre Morel7-114/+280
The zPCI bus is in charge to handle common zPCI resources for zPCI devices. Creating the zPCI bus, the PCI bus, the zPCI devices and the PCI devices and hotplug slots done in a specific order: - PCI hotplug slot creation needs a PCI bus - PCI bus needs a PCI domain which is reported by the pci_domain_nr() when setting up the host bridge - PCI domain is set from the zPCI with devfn 0 this is necessary to have a reproducible enumeration Therefore we can not create devices or hotplug slots for any PCI device associated with a zPCI device before having discovered the function zero of the bus. The discovery and initialization of devices can be done at several points in the code: - On Events, serialized in a thread context - On initialization, in the kernel init thread context - When powering on the hotplug slot, in a user thread context The removal of devices and their parent bus may also be done on events or for devices when powering down the slot. To guarantee the existence of the bus and devices until they are no more needed we use kref in zPCI bus and introduce a reference count in the zPCI devices. In this patch the zPCI bus still only accept a device with a devfn 0. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: define RID and RID availablePierre Morel3-2/+13
Firmware provides the bus/devfn part of the PCI addresses of a zPCI function inside the new field RID of the CLP query PCI function with a bit to know if this field is available to use. Let's add these fields to the clp_rsp_query_pci structure, add corresponding fields to zdev and initialize them. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: define kernel parameters for PCI multifunctionPierre Morel2-0/+7
Using PCI multifunctions in S390 is a new feature we may want to ignore to continue provide the same topology as in the past to userland even if the configuration supports exposing the topology of a multi-Function device. A new boolean parameters allows to overwrite the kernel pci configuration: - pci=norid when on, disallow the use a new firmware field, RID, which provides the PCI <bus>:<device>.<function> part of the PCI address. To be used in the following patches and satisfy the checkpatch.pl the variable is exposed in pci.h Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: adaptation of iommu to multifunctionPierre Morel1-0/+5
In the future the bus sysdata may not directly point to the zpci_dev. In preparation of upcoming patches let us abstract the access to the zpci_dev from the device inside the pci device. Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-28s390/pci: Expose new port attribute for PCIe functionsAlexander Schmidt4-1/+6
Add SysFS attribute that provides the port number for PCI functions representing a single port of a multi-port device. Signed-off-by: Alexander Schmidt <alexs@linux.ibm.com> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-27sysctl: pass kernel pointers to ->proc_handlerChristoph Hellwig4-15/+12
Instead of having all the sysctl handlers deal with user pointers, which is rather hairy in terms of the BPF interaction, copy the input to and from userspace in common code. This also means that the strings are always NUL-terminated by the common code, making the API a little bit safer. As most handler just pass through the data to one of the common handlers a lot of the changes are mechnical. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Andrey Ignatov <rdna@fb.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-04-26Merge tag 's390-5.7-3' of ↵Linus Torvalds6-9/+9
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Vasily Gorbik: - Add a few notrace annotations to avoid potential crashes when switching ftrace tracers. - Avoid setting affinity for floating irqs in pci code. - Fix build issue found by kbuild test robot. * tag 's390-5.7-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/protvirt: fix compilation issue s390/pci: do not set affinity for floating irqs s390/ftrace: fix potential crashes when switching tracers
2020-04-25s390/protvirt: fix compilation issueClaudio Imbrenda2-3/+2
The kernel fails to compile with CONFIG_PROTECTED_VIRTUALIZATION_GUEST set but CONFIG_KVM unset. This patch fixes the issue by making the needed variable always available. Link: https://lkml.kernel.org/r/20200423120114.2027410-1-imbrenda@linux.ibm.com Fixes: a0f60f843199 ("s390/protvirt: Add sysfs firmware interface for Ultravisor information") Reported-by: kbuild test robot <lkp@intel.com> Reported-by: Philipp Rudo <prudo@linux.ibm.com> Suggested-by: Philipp Rudo <prudo@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-22s390/pci: do not set affinity for floating irqsNiklas Schnelle1-2/+3
with the introduction of CPU directed interrupts the kernel parameter pci=force_floating was introduced to fall back to the previous behavior using floating irqs. However we were still setting the affinity in that case, both in __irq_alloc_descs() and via the irq_set_affinity callback in struct irq_chip. For the former only set the affinity in the directed case. The latter is explicitly set in zpci_directed_irq_init() so we can just leave it unset for the floating case. Fixes: e979ce7bced2 ("s390/pci: provide support for CPU directed interrupts") Co-developed-by: Alexander Schmidt <alexs@linux.ibm.com> Signed-off-by: Alexander Schmidt <alexs@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-22s390/ftrace: fix potential crashes when switching tracersPhilipp Rudo3-4/+4
Switching tracers include instruction patching. To prevent that a instruction is patched while it's read the instruction patching is done in stop_machine 'context'. This also means that any function called during stop_machine must not be traced. Thus add 'notrace' to all functions called within stop_machine. Fixes: 1ec2772e0c3c ("s390/diag: add a statistic for diagnose calls") Fixes: 38f2c691a4b3 ("s390: improve wait logic of stop_machine") Fixes: 4ecf0a43e729 ("processor: get rid of cpu_relax_yield") Signed-off-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-04-21Merge tag 'kvm-s390-master-5.7-2' of ↵Paolo Bonzini62-2197/+574
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master KVM: s390: Fix for 5.7 and maintainer update - Silence false positive lockdep warning - add Claudio as reviewer
2020-04-21s390/mm: fix page table upgrade vs 2ndary address mode accessesChristian Borntraeger2-2/+18
A page table upgrade in a kernel section that uses secondary address mode will mess up the kernel instructions as follows: Consider the following scenario: two threads are sharing memory. On CPU1 thread 1 does e.g. strnlen_user(). That gets to old_fs = enable_sacf_uaccess(); len = strnlen_user_srst(src, size); and " la %2,0(%1)\n" " la %3,0(%0,%1)\n" " slgr %0,%0\n" " sacf 256\n" "0: srst %3,%2\n" in strnlen_user_srst(). At that point we are in secondary space mode, control register 1 points to kernel page table and instruction fetching happens via c1, rather than usual c13. Interrupts are not disabled, for obvious reasons. On CPU2 thread 2 does MAP_FIXED mmap(), forcing the upgrade of page table from 3-level to e.g. 4-level one. We'd allocated new top-level table, set it up and now we hit this: notify = 1; spin_unlock_bh(&mm->page_table_lock); } if (notify) on_each_cpu(__crst_table_upgrade, mm, 0); OK, we need to actually change over to use of new page table and we need that to happen in all threads that are currently running. Which happens to include the thread 1. IPI is delivered and we have static void __crst_table_upgrade(void *arg) { struct mm_struct *mm = arg; if (current->active_mm == mm) set_user_asce(mm); __tlb_flush_local(); } run on CPU1. That does static inline void set_user_asce(struct mm_struct *mm) { S390_lowcore.user_asce = mm->context.asce; OK, user page table address updated... __ctl_load(S390_lowcore.user_asce, 1, 1); ... and control register 1 set to it. clear_cpu_flag(CIF_ASCE_PRIMARY); } IPI is run in home space mode, so it's fine - insns are fetched using c13, which always points to kernel page table. But as soon as we return from the interrupt, previous PSW is restored, putting CPU1 back into secondary space mode, at which point we no longer get the kernel instructions from the kernel mapping. The fix is to only fixup the control registers that are currently in use for user processes during the page table update. We must also disable interrupts in enable_sacf_uaccess to synchronize the cr and thread.mm_segment updates against the on_each-cpu. Fixes: 0aaba41b58bc ("s390: remove all code using the access register mode") Cc: stable@vger.kernel.org # 4.15+ Reported-by: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> References: CVE-2020-11884 Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-21KVM: Remove redundant argument to kvm_arch_vcpu_ioctl_runTianjia Zhang1-1/+2
In earlier versions of kvm, 'kvm_run' was an independent structure and was not included in the vcpu structure. At present, 'kvm_run' is already included in the vcpu structure, so the parameter 'kvm_run' is redundant. This patch simplifies the function definition, removes the extra 'kvm_run' parameter, and extracts it from the 'kvm_vcpu' structure if necessary. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Message-Id: <20200416051057.26526-1-tianjia.zhang@linux.alibaba.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21kvm_host: unify VM_STAT and VCPU_STAT definitions in a single placeEmanuele Giuseppe Esposito1-103/+100
The macros VM_STAT and VCPU_STAT are redundantly implemented in multiple files, each used by a different architecure to initialize the debugfs entries for statistics. Since they all have the same purpose, they can be unified in a single common definition in include/linux/kvm_host.h Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20200414155625.20559-1-eesposit@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-20KVM: s390: remove unneeded semicolon in gisa_vcpu_kicker()Jason Yan1-1/+1
Fix the following coccicheck warning: arch/s390/kvm/interrupt.c:3085:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200418081926.41666-1-yanaijie@huawei.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-20KVM: s390: vsie: gmap_table_walk() simplificationsDavid Hildenbrand1-5/+5
Let's use asce_type where applicable. Also, simplify our sanity check for valid table levels and convert it into a WARN_ON_ONCE(). Check if we even have a valid gmap shadow as the very first step. Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-6-david@redhat.com Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-20KVM: s390: vsie: Move conditional rescheduleDavid Hildenbrand1-2/+1
Let's move it to the outer loop, in case we ever run again into long loops, trying to map the prefix. While at it, convert it to cond_resched(). Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-5-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-20KVM: s390: Fix PV check in deliverable_irqs()Eric Farman1-1/+1
The diag 0x44 handler, which handles a directed yield, goes into a a codepath that does a kvm_for_each_vcpu() and ultimately deliverable_irqs(). The new check for kvm_s390_pv_cpu_is_protected() contains an assertion that the vcpu->mutex is held, which isn't going to be the case in this scenario. The result is a plethora of these messages if the lock debugging is enabled, and thus an implication that we have a problem. WARNING: CPU: 9 PID: 16167 at arch/s390/kvm/kvm-s390.h:239 deliverable_irqs+0x1c6/0x1d0 [kvm] ...snip... Call Trace: [<000003ff80429bf2>] deliverable_irqs+0x1ca/0x1d0 [kvm] ([<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm]) [<000003ff8042ba82>] kvm_s390_vcpu_has_irq+0x2a/0xa8 [kvm] [<000003ff804101e2>] kvm_arch_dy_runnable+0x22/0x38 [kvm] [<000003ff80410284>] kvm_vcpu_on_spin+0x8c/0x1d0 [kvm] [<000003ff80436888>] kvm_s390_handle_diag+0x3b0/0x768 [kvm] [<000003ff80425af4>] kvm_handle_sie_intercept+0x1cc/0xcd0 [kvm] [<000003ff80422bb0>] __vcpu_run+0x7b8/0xfd0 [kvm] [<000003ff80423de6>] kvm_arch_vcpu_ioctl_run+0xee/0x3e0 [kvm] [<000003ff8040ccd8>] kvm_vcpu_ioctl+0x2c8/0x8d0 [kvm] [<00000001504ced06>] ksys_ioctl+0xae/0xe8 [<00000001504cedaa>] __s390x_sys_ioctl+0x2a/0x38 [<0000000150cb9034>] system_call+0xd8/0x2d8 2 locks held by CPU 2/KVM/16167: #0: 00000001951980c0 (&vcpu->mutex){+.+.}, at: kvm_vcpu_ioctl+0x90/0x8d0 [kvm] #1: 000000019599c0f0 (&kvm->srcu){....}, at: __vcpu_run+0x4bc/0xfd0 [kvm] Last Breaking-Event-Address: [<000003ff80429b34>] deliverable_irqs+0x10c/0x1d0 [kvm] irq event stamp: 11967 hardirqs last enabled at (11975): [<00000001502992f2>] console_unlock+0x4ca/0x650 hardirqs last disabled at (11982): [<0000000150298ee8>] console_unlock+0xc0/0x650 softirqs last enabled at (7940): [<0000000150cba6ca>] __do_softirq+0x422/0x4d8 softirqs last disabled at (7929): [<00000001501cd688>] do_softirq_own_stack+0x70/0x80 Considering what's being done here, let's fix this by removing the mutex assertion rather than acquiring the mutex for every other vcpu. Fixes: 201ae986ead7 ("KVM: s390: protvirt: Implement interrupt injection") Signed-off-by: Eric Farman <farman@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20200415190353.63625-1-farman@linux.ibm.com Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-14KVM: s390: Return last valid slot if approx index is out-of-boundsSean Christopherson1-0/+3
Return the index of the last valid slot from gfn_to_memslot_approx() if its binary search loop yielded an out-of-bounds index. The index can be out-of-bounds if the specified gfn is less than the base of the lowest memslot (which is also the last valid memslot). Note, the sole caller, kvm_s390_get_cmma(), ensures used_slots is non-zero. Fixes: afdad61615cc3 ("KVM: s390: Fix storage attributes migration with memory slots") Cc: stable@vger.kernel.org # 4.19.x: 0774a964ef56: KVM: Fix out of range accesses to memslots Cc: stable@vger.kernel.org # 4.19.x Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Message-Id: <20200408064059.8957-3-sean.j.christopherson@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds3-6/+8
Merge yet more updates from Andrew Morton: - Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc, gup, hugetlb, pagemap, memremap) - Various other things (hfs, ocfs2, kmod, misc, seqfile) * akpm: (34 commits) ipc/util.c: sysvipc_find_ipc() should increase position index kernel/gcov/fs.c: gcov_seq_next() should increase position index fs/seq_file.c: seq_read(): add info message about buggy .next functions drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings change email address for Pali Rohár selftests: kmod: test disabling module autoloading selftests: kmod: fix handling test numbers above 9 docs: admin-guide: document the kernel.modprobe sysctl fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() kmod: make request_module() return an error when autoloading is disabled mm/memremap: set caching mode for PCI P2PDMA memory to WC mm/memory_hotplug: add pgprot_t to mhp_params powerpc/mm: thread pgprot_t through create_section_mapping() x86/mm: introduce __set_memory_prot() x86/mm: thread pgprot_t through init_memory_mapping() mm/memory_hotplug: rename mhp_restrictions to mhp_params mm/memory_hotplug: drop the flags field from struct mhp_restrictions mm/special: create generic fallbacks for pte_special() and pte_mkspecial() mm/vma: introduce VM_ACCESS_FLAGS mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS ...
2020-04-11mm/memory_hotplug: add pgprot_t to mhp_paramsLogan Gunthorpe1-0/+3
devm_memremap_pages() is currently used by the PCI P2PDMA code to create struct page mappings for IO memory. At present, these mappings are created with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on x86, an mtrr register will typically override this and force the cache type to be UC-. In the case firmware doesn't set this register it is effectively WB and will typically result in a machine check exception when it's accessed. Other arches are not currently likely to function correctly seeing they don't have any MTRR registers to fall back on. To solve this, provide a way to specify the pgprot value explicitly to arch_add_memory(). Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple change to pass the pgprot_t down to their respective functions which set up the page tables. For x86_32, set the page tables explicitly using _set_memory_prot() (seeing they are already mapped). For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this should be fine, for now, seeing these architectures don't support ZONE_DEVICE. A check in __add_pages() is also added to ensure the pgprot parameter was set for all arches. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/memory_hotplug: rename mhp_restrictions to mhp_paramsLogan Gunthorpe1-3/+3
The mhp_restrictions struct really doesn't specify anything resembling a restriction anymore so rename it to be mhp_params as it is a list of extended parameters. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/vma: introduce VM_ACCESS_FLAGSAnshuman Khandual1-1/+1
There are many places where all basic VMA access flags (read, write, exec) are initialized or checked against as a group. One such example is during page fault. Existing vma_is_accessible() wrapper already creates the notion of VMA accessibility as a group access permissions. Hence lets just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC) which will not only reduce code duplication but also extend the VMA accessibility concept in general. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Springer <rspringer@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/vma: define a default value for VM_DATA_DEFAULT_FLAGSAnshuman Khandual1-2/+1
There are many platforms with exact same value for VM_DATA_DEFAULT_FLAGS This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the existing VM_STACK_DEFAULT_FLAGS. While here, also define some more macros with standard VMA access flag combinations that are used frequently across many platforms. Apart from simplification, this reduces code duplication as well. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Brian Cain <bcain@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paulburton@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: Chris Zankel <chris@zankel.net> Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10Merge tag 's390-5.7-2' of ↵Linus Torvalds2-10/+8
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull more s390 updates from Vasily Gorbik: "Second round of s390 fixes and features for 5.7: - The rest of fallthrough; annotations conversion - Couple of fixes for ADD uevents in the common I/O layer - Minor refactoring of the queued direct I/O code" * tag 's390-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cio: generate delayed uevent for vfio-ccw subchannels s390/cio: avoid duplicated 'ADD' uevents s390/qdio: clear DSCI early for polling drivers s390/qdio: inline shared_ind() s390/qdio: remove cdev from init_data s390/qdio: allow for non-contiguous SBAL array in init_data zfcp: inline zfcp_qdio_setup_init_data() s390/qdio: cleanly split alloc and establish s390/mm: use fallthrough;
2020-04-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds2-1/+7
Pull more kvm updates from Paolo Bonzini: "s390: - nested virtualization fixes x86: - split svm.c - miscellaneous fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: fix crash cleanup when KVM wasn't used KVM: X86: Filter out the broadcast dest for IPI fastpath KVM: s390: vsie: Fix possible race when shadowing region 3 tables KVM: s390: vsie: Fix delivery of addressing exceptions KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checks KVM: nVMX: don't clear mtf_pending when nested events are blocked KVM: VMX: Remove unnecessary exception trampoline in vmx_vmenter KVM: SVM: Split svm_vcpu_run inline assembly to separate file KVM: SVM: Move SEV code to separate file KVM: SVM: Move AVIC code to separate file KVM: SVM: Move Nested SVM Implementation to nested.c kVM SVM: Move SVM related files to own sub-directory
2020-04-08Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds1-4/+0
Pull virtio updates from Michael Tsirkin: - Some bug fixes - The new vdpa subsystem with two first drivers * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio-balloon: Revert "virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM" vdpa: move to drivers/vdpa virtio: Intel IFC VF driver for VDPA vdpasim: vDPA device simulator vhost: introduce vDPA-based backend virtio: introduce a vDPA based transport vDPA: introduce vDPA bus vringh: IOTLB support vhost: factor out IOTLB vhost: allow per device message handler vhost: refine vhost and vringh kconfig virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM virtio-net: Introduce hash report feature virtio-net: Introduce RSS receive steering feature virtio-net: Introduce extended RSC feature tools/virtio: option to build an out of tree module
2020-04-07KVM: s390: vsie: Fix possible race when shadowing region 3 tablesDavid Hildenbrand1-0/+1
We have to properly retry again by returning -EINVAL immediately in case somebody else instantiated the table concurrently. We missed to add the goto in this function only. The code now matches the other, similar shadowing functions. We are overwriting an existing region 2 table entry. All allocated pages are added to the crst_list to be freed later, so they are not lost forever. However, when unshadowing the region 2 table, we wouldn't trigger unshadowing of the original shadowed region 3 table that we replaced. It would get unshadowed when the original region 3 table is modified. As it's not connected to the page table hierarchy anymore, it's not going to get used anymore. However, for a limited time, this page table will stick around, so it's in some sense a temporary memory leak. Identified by manual code inspection. I don't think this classifies as stable material. Fixes: 998f637cc4b9 ("s390/mm: avoid races on region/segment/page table shadowing") Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-4-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-07KVM: s390: vsie: Fix delivery of addressing exceptionsDavid Hildenbrand1-0/+1
Whenever we get an -EFAULT, we failed to read in guest 2 physical address space. Such addressing exceptions are reported via a program intercept to the nested hypervisor. We faked the intercept, we have to return to guest 2. Instead, right now we would be returning -EFAULT from the intercept handler, eventually crashing the VM. the correct thing to do is to return 1 as rc == 1 is the internal representation of "we have to go back into g2". Addressing exceptions can only happen if the g2->g3 page tables reference invalid g2 addresses (say, either a table or the final page is not accessible - so something that basically never happens in sane environments. Identified by manual code inspection. Fixes: a3508fbe9dc6 ("KVM: s390: vsie: initial support for nested virtualization") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-3-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [borntraeger@de.ibm.com: fix patch description] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-07KVM: s390: vsie: Fix region 1 ASCE sanity shadow address checksDavid Hildenbrand1-1/+5
In case we have a region 1 the following calculation (31 + ((gmap->asce & _ASCE_TYPE_MASK) >> 2)*11) results in 64. As shifts beyond the size are undefined the compiler is free to use instructions like sllg. sllg will only use 6 bits of the shift value (here 64) resulting in no shift at all. That means that ALL addresses will be rejected. The can result in endless loops, e.g. when prefix cannot get mapped. Fixes: 4be130a08420 ("s390/mm: add shadow gmap support") Tested-by: Janosch Frank <frankja@linux.ibm.com> Reported-by: Janosch Frank <frankja@linux.ibm.com> Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/20200403153050.20569-2-david@redhat.com Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [borntraeger@de.ibm.com: fix patch description, remove WARN_ON_ONCE] Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-04-06s390/qdio: remove cdev from init_dataJulian Wiedmann1-3/+2
It's no longer needed. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>