summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-02-16kvm/x86: Rename Hyper-V long spin wait hypercallAndrey Smetanin2-2/+2
Rename HV_X64_HV_NOTIFY_LONG_SPIN_WAIT by HVCALL_NOTIFY_LONG_SPIN_WAIT, so the name is more consistent with the other hypercalls. Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> CC: Gleb Natapov <gleb@kernel.org> CC: Paolo Bonzini <pbonzini@redhat.com> CC: Joerg Roedel <joro@8bytes.org> CC: "K. Y. Srinivasan" <kys@microsoft.com> CC: Haiyang Zhang <haiyangz@microsoft.com> CC: Roman Kagan <rkagan@virtuozzo.com> CC: Denis V. Lunev <den@openvz.org> CC: qemu-devel@nongnu.org [Change name, Andrey used HV_X64_HCALL_NOTIFY_LONG_SPIN_WAIT. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: fix missed hardware breakpointsPaolo Bonzini1-0/+1
Sometimes when setting a breakpoint a process doesn't stop on it. This is because the debug registers are not loaded correctly on VCPU load. The following simple reproducer from Oleg Nesterov tries using debug registers in both the host and the guest, for example by running "./bp 0 1" on the host and "./bp 14 15" under QEMU. #include <unistd.h> #include <signal.h> #include <stdlib.h> #include <stdio.h> #include <sys/wait.h> #include <sys/ptrace.h> #include <sys/user.h> #include <asm/debugreg.h> #include <assert.h> #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) unsigned long encode_dr7(int drnum, int enable, unsigned int type, unsigned int len) { unsigned long dr7; dr7 = ((len | type) & 0xf) << (DR_CONTROL_SHIFT + drnum * DR_CONTROL_SIZE); if (enable) dr7 |= (DR_GLOBAL_ENABLE << (drnum * DR_ENABLE_SIZE)); return dr7; } int write_dr(int pid, int dr, unsigned long val) { return ptrace(PTRACE_POKEUSER, pid, offsetof (struct user, u_debugreg[dr]), val); } void set_bp(pid_t pid, void *addr) { unsigned long dr7; assert(write_dr(pid, 0, (long)addr) == 0); dr7 = encode_dr7(0, 1, DR_RW_EXECUTE, DR_LEN_1); assert(write_dr(pid, 7, dr7) == 0); } void *get_rip(int pid) { return (void*)ptrace(PTRACE_PEEKUSER, pid, offsetof(struct user, regs.rip), 0); } void test(int nr) { void *bp_addr = &&label + nr, *bp_hit; int pid; printf("test bp %d\n", nr); assert(nr < 16); // see 16 asm nops below pid = fork(); if (!pid) { assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0); kill(getpid(), SIGSTOP); for (;;) { label: asm ( "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" "nop; nop; nop; nop;" ); } } assert(pid == wait(NULL)); set_bp(pid, bp_addr); for (;;) { assert(ptrace(PTRACE_CONT, pid, 0, 0) == 0); assert(pid == wait(NULL)); bp_hit = get_rip(pid); if (bp_hit != bp_addr) fprintf(stderr, "ERR!! hit wrong bp %ld != %d\n", bp_hit - &&label, nr); } } int main(int argc, const char *argv[]) { while (--argc) { int nr = atoi(*++argv); if (!fork()) test(nr); } while (wait(NULL) > 0) ; return 0; } Cc: stable@vger.kernel.org Suggested-by: Nadadv Amit <namit@cs.technion.ac.il> Reported-by: Andrey Wagin <avagin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: fix *NULL on invalid low-prio irqRadim Krčmář1-18/+13
Smatch noticed a NULL dereference in kvm_intr_is_single_vcpu_fast that happens if VM already warned about invalid lowest-priority interrupt. Create a function for common code while fixing it. Fixes: 6228a0da8057 ("KVM: x86: Add lowest-priority support for vt-d posted-interrupts") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: rewrite handling of scaled TSC for kvmclockPaolo Bonzini1-7/+8
This is the same as before: kvm_scale_tsc(tgt_tsc_khz) = tgt_tsc_khz * ratio = tgt_tsc_khz * user_tsc_khz / tsc_khz (see set_tsc_khz) = user_tsc_khz (see kvm_guest_time_update) = vcpu->arch.virtual_tsc_khz (see kvm_set_tsc_khz) However, computing it through kvm_scale_tsc will make it possible to include the NTP correction in tgt_tsc_khz. Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: x86: rename argument to kvm_set_tsc_khzPaolo Bonzini1-7/+7
This refers to the desired (scaled) frequency, which is called user_tsc_khz in the rest of the file. Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: VMX: Fix guest debugging while in L2Jan Kiszka1-0/+17
When we take a #DB or #BP vmexit while in guest mode, we first of all need to check if there is ongoing guest debugging that might be interested in the event. Currently, we unconditionally leave L2 and inject the event into L1 if it is intercepting the exceptions. That breaks things marvelously. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: VMX: Factor out is_exception_n helperJan Kiszka1-8/+9
There is quite some common code in all these is_<exception>() helpers. Factor it out before adding even more of them. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16KVM: halt_polling: improve grow/shrink settingsChristian Borntraeger2-12/+15
Right now halt_poll_ns can be change during runtime. The grow and shrink factors can only be set during module load. Lets fix several aspects of grow shrink: - make grow/shrink changeable by root - make all variables unsigned int - read the variables once to prevent races Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16Merge tag 'kvm-s390-next-4.6-1' of ↵Paolo Bonzini12-122/+254
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD KVM: s390: Fixes and features for kvm/next (4.6) 1. also provide the floating point registers via sync regs 2. Separate out intruction vs. data accesses 3. Fix program interrupts in some cases 4. Documentation fixes 5. dirty log improvements for huge guests
2016-02-10KVM: s390: bail out early on fatal signal in dirty loggingChristian Borntraeger1-0/+2
A KVM_GET_DIRTY_LOG ioctl might take a long time. This can result in fatal signals seemingly being ignored. Lets bail out during the dirty bit sync, if a fatal signal is pending. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: do not block CPU on dirty loggingChristian Borntraeger1-0/+1
When doing dirty logging on huge guests (e.g.600GB) we sometimes get rcu stall timeouts with backtraces like [ 2753.194083] ([<0000000000112fb2>] show_trace+0x12a/0x130) [ 2753.194092] [<0000000000113024>] show_stack+0x6c/0xe8 [ 2753.194094] [<00000000001ee6a8>] rcu_pending+0x358/0xa48 [ 2753.194099] [<00000000001f20cc>] rcu_check_callbacks+0x84/0x168 [ 2753.194102] [<0000000000167654>] update_process_times+0x54/0x80 [ 2753.194107] [<00000000001bdb5c>] tick_sched_handle.isra.16+0x4c/0x60 [ 2753.194113] [<00000000001bdbd8>] tick_sched_timer+0x68/0x90 [ 2753.194115] [<0000000000182a88>] __run_hrtimer+0x88/0x1f8 [ 2753.194119] [<00000000001838ba>] hrtimer_interrupt+0x122/0x2b0 [ 2753.194121] [<000000000010d034>] do_extint+0x16c/0x170 [ 2753.194123] [<00000000005e206e>] ext_skip+0x38/0x3e [ 2753.194129] [<000000000012157c>] gmap_test_and_clear_dirty+0xcc/0x118 [ 2753.194134] ([<00000000001214ea>] gmap_test_and_clear_dirty+0x3a/0x118) [ 2753.194137] [<0000000000132da4>] kvm_vm_ioctl_get_dirty_log+0xd4/0x1b0 [ 2753.194143] [<000000000012ac12>] kvm_vm_ioctl+0x21a/0x548 [ 2753.194146] [<00000000002b57f6>] do_vfs_ioctl+0x30e/0x518 [ 2753.194149] [<00000000002b5a9c>] SyS_ioctl+0x9c/0xb0 [ 2753.194151] [<00000000005e1ae6>] sysc_tracego+0x14/0x1a [ 2753.194153] [<000003ffb75f3972>] 0x3ffb75f3972 We should do a cond_resched in here. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: do not take mmap_sem on dirty log queryChristian Borntraeger1-2/+0
Dirty log query can take a long time for huge guests. Holding the mmap_sem for very long times can cause some unwanted latencies. Turns out that we do not need to hold the mmap semaphore. We hold the slots_lock for gfn->hva translation and walk the page tables with that address, so no need to look at the VMAs. KVM also holds a reference to the mm, which should prevent other things going away. During the walk we take the necessary ptl locks. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: usage hint for adapter mappingsCornelia Huck1-0/+2
The interface for adapter mappings was designed with code in mind that maps each address only once; let's document this. Otherwise, duplicate mappings are added to the list, which makes the code ineffective and uses up the limited amount of mapping needlessly. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: add documentation of KVM_S390_VM_CRYPTODavid Hildenbrand1-0/+33
Let's properly document KVM_S390_VM_CRYPTO and its attributes. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: add documentation of KVM_S390_VM_TODDavid Hildenbrand1-0/+19
Let's properly document KVM_S390_VM_TOD and its attributes. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: remove old fragment of vector registersDavid Hildenbrand1-7/+1
Since commit 9977e886cbbc ("s390/kernel: lazy restore fpu registers"), vregs in struct sie_page is unsed. We can safely remove the field and the definition. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: instruction-fetching exceptions on SIE faultsDavid Hildenbrand1-2/+10
On instruction-fetch exceptions, we have to forward the PSW by any valid ilc and correctly use that ilc when injecting the irq. Injection will already take care of rewinding the PSW if we injected a nullifying program irq, so we don't need special handling prior to injection. Until now, autodetection would have guessed an ilc of 0. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: provide prog irq ilc on SIE faultsDavid Hildenbrand1-4/+8
On SIE faults, the ilc cannot be detected automatically, as the icptcode is 0. The ilc indicated in the program irq will always be 0. Therefore we have to manually specify the ilc in order to tell the guest which ilen was used when forwarding the PSW. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: irq delivery should not rely on icptcodeDavid Hildenbrand3-1/+4
Program irq injection during program irq intercepts is the last candidates that injects nullifying irqs and relies on delivery to do the right thing. As we should not rely on the icptcode during any delivery (because that value will not be migrated), let's add a flag, telling prog IRQ delivery to not rewind the PSW in case of nullifying prog IRQs. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: clean up prog irq injection on prog irq icptsDavid Hildenbrand1-21/+20
__extract_prog_irq() is used only once for getting the program check data in one place. Let's combine it with an injection function to avoid a memset and to prevent misuse on injection by simplifying the interface to only have the VCPU as parameter. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: read the correct opcode on SIE faultsDavid Hildenbrand1-2/+1
Let's use our fresh new function read_guest_instr() to access guest storage via the correct addressing schema. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: gaccess: implement instruction fetching modeDavid Hildenbrand2-4/+28
When an instruction is to be fetched, special handling applies to secondary-space mode and access-register mode. The instruction is to be fetched from primary space. We can easily support this by selecting the right asce for translation. Access registers will never be used during translation, so don't include them in the interface. As we only want to read from the current PSW address for now, let's also hide that detail. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: gaccess: introduce access modesDavid Hildenbrand5-35/+43
We will need special handling when fetching instructions, so let's introduce new guest access modes GACC_FETCH and GACC_STORE instead of a write flag. An additional patch will then introduce GACC_IFETCH. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: migration / injection of prog irq ilcDavid Hildenbrand2-2/+17
We have to migrate the program irq ilc and someday we will have to specify the ilc without KVM trying to autodetect the value. Let's reuse one of the spare fields in our program irq that should always be set to 0 by user space. Because we also want to make use of 0 ilcs ("not available"), we need a validity indicator. If no valid ilc is given, we try to autodetect the ilc via the current icptcode and icptstatus + parameter and store the valid ilc in the irq structure. This has a nice effect: QEMU's making use of KVM_S390_IRQ / KVM_S390_SET_IRQ_STATE / KVM_S390_GET_IRQ_STATE for migration will directly migrate the ilc without any changes. Please note that we use bit 0 as validity and bit 1,2 for the ilc, so by applying the ilc mask we directly get the ilen which is usually what we work with. Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: PSW forwarding / rewinding / ilc reworkDavid Hildenbrand5-36/+50
We have some confusion about ilc vs. ilen in our current code. So let's correctly use the term ilen when dealing with (ilc << 1). Program irq injection didn't take care of the correct ilc in case of irqs triggered by EXECUTE functions, let's provide one function kvm_s390_get_ilen() to take care of all that. Also, manually specifying in intercept handlers the size of the instruction (and sometimes overwriting that value for EXECUTE internally) doesn't make too much sense. So also provide the functions: - kvm_s390_retry_instr to retry the currently intercepted instruction - kvm_s390_rewind_psw to rewind the PSW without internal overwrites - kvm_s390_forward_psw to forward the PSW Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: sync of fp registers via kvm_runDavid Hildenbrand2-7/+13
As we already store the floating point registers in the vector save area in floating point register format when we don't have MACHINE_HAS_VX, we can directly expose them to user space using a new sync flag. The floating point registers will be valid when KVM_SYNC_FPRS is set. The fpc will also be valid when KVM_SYNC_FPRS is set. Either KVM_SYNC_FPRS or KVM_SYNC_VRS will be enabled, never both. Let's also change two positions where we access vrs, making the code easier to read and one comment superfluous. Suggested-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-10KVM: s390: allow sync of fp registers via vregsDavid Hildenbrand1-1/+4
If we have MACHINE_HAS_VX, the floating point registers are stored in the vector register format, event if the guest isn't enabled for vector registers. So we can allow KVM_SYNC_VRS as soon as MACHINE_HAS_VX is available. This can in return be used by user space to support floating point registers via struct kvm_run when the machine has vector registers. Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2016-02-09KVM: x86: consolidate different ways to test for in-kernel LAPICPaolo Bonzini5-29/+22
Different pieces of code checked for vcpu->arch.apic being (non-)NULL, or used kvm_vcpu_has_lapic (more optimized) or lapic_in_kernel. Replace everything with lapic_in_kernel's name and kvm_vcpu_has_lapic's implementation. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: consolidate "has lapic" checks into irq.cPaolo Bonzini2-8/+7
Do for kvm_cpu_has_pending_timer and kvm_inject_pending_timer_irqs what the other irq.c routines have been doing. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: APIC: remove unnecessary double checks on APIC existencePaolo Bonzini1-16/+3
Usually the in-kernel APIC's existence is checked in the caller. Do not bother checking it again in lapic.c. Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM/VMX: Add host irq information in trace event when updating IRTE for ↵Feng Wu2-5/+9
posted interrupts Add host irq information in trace event, so we can better understand which irq is in posted mode. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krcmar <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: Add lowest-priority support for vt-d posted-interruptsFeng Wu1-7/+49
Use vector-hashing to deliver lowest-priority interrupts for VT-d posted-interrupts. This patch extends kvm_intr_is_single_vcpu() to support lowest-priority handling. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: Use vector-hashing to deliver lowest-priority interruptsFeng Wu6-7/+82
Use vector-hashing to deliver lowest-priority interrupts, As an example, modern Intel CPUs in server platform use this method to handle lowest-priority interrupts. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: Recover IRTE to remapped mode if the interrupt is not single-destinationFeng Wu1-1/+14
When the interrupt is not single destination any more, we need to change back IRTE to remapped mode explicitly. Signed-off-by: Feng Wu <feng.wu@intel.com> Reviewed-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09KVM: x86: introduce do_shl32_div32Paolo Bonzini2-8/+17
This is similar to the existing div_frac function, but it returns the remainder too. Unlike div_frac, it can be used to implement long division, e.g. (a << 64) / b for 32-bit a and b. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-08Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds5-12/+52
Pull KVM fixes from Paolo Bonzini: "KVM-ARM fixes, mostly coming from the PMU work" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: arm64: KVM: Fix guest dead loop when register accessor returns false arm64: KVM: Fix comments of the CP handler arm64: KVM: Fix wrong use of the CPSR MODE mask for 32bit guests arm64: KVM: Obey RES0/1 reserved bits when setting CPTR_EL2 arm64: KVM: Fix AArch64 guest userspace exception injection
2016-02-08Merge tag 'regmap-fix-v4.5-big-endian' of ↵Linus Torvalds10-8/+17
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap Pull regmap fix from Mark Brown: "A single revert back to v4.4 endianness handling. Commit 29bb45f25ff3 ("regmap-mmio: Use native endianness for read/write") attempted to fix some long standing bugs in the MMIO implementation for big endian systems caused by duplicate byte swapping in both regmap and readl()/writel(). Sadly the fix makes things worse rather than better, so revert it for now" * tag 'regmap-fix-v4.5-big-endian' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: regmap: mmio: Revert to v4.4 endianness handling
2016-02-08scatterlist: fix a typo in comment block of sg_miter_stop()Masahiro Yamada1-3/+3
Fix the doubled "started" and tidy up the following sentences. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-08Merge tag 'kvm-arm-for-4.5-rc2' of ↵Paolo Bonzini5-12/+52
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/ARM fixes for v4.5-rc2 A few random fixes, mostly coming from the PMU work by Shannon: - fix for injecting faults coming from the guest's userspace - cleanup for our CPTR_EL2 accessors (reserved bits) - fix for a bug impacting perf (user/kernel discrimination) - fix for a 32bit sysreg handling bug
2016-02-08Linux 4.5-rc3Linus Torvalds1-1/+1
2016-02-08Merge tag 'armsoc-fixes' of ↵Linus Torvalds33-149/+230
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "The first real batch of fixes for this release cycle, so there are a few more than usual. Most of these are fixes and tweaks to board support (DT bugfixes, etc). I've also picked up a couple of small cleanups that seemed innocent enough that there was little reason to wait (const/ __initconst and Kconfig deps). Quite a bit of the changes on OMAP were due to fixes to no longer write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but there were also other fixes. Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC fixes on OMAP5, and Nomadik had changes to MMC parameters in DT. All in all, mostly the usual mix of various fixes" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits) ARM: multi_v7_defconfig: enable DW_WATCHDOG ARM: nomadik: fix up SD/MMC DT settings ARM64: tegra: Add chosen node for tegra132 norrin ARM: realview: use "depends on" instead of "if" after prompt ARM: tango: use "depends on" instead of "if" after prompt ARM: tango: use const and __initconst for smp_operations ARM: realview: use const and __initconst for smp_operations bus: uniphier-system-bus: revive tristate prompt arm64: dts: Add missing DMA Abort interrupt to Juno bus: vexpress-config: Add missing of_node_put ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2 ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address ARM: dts: LogicPD Torpedo: Revert Duplicative Entries ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types ARM: dts: am4372: fix irq type for arm twd and global timer ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type ...
2016-02-08Merge branch 'mailbox-devel' of ↵Linus Torvalds2-7/+2
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox fixes from Jassi Brar: - fix getting element from the pcc-channels array by simply indexing into it - prevent building mailbox-test driver for archs that don't have IOMEM * 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: Fix dependencies for !HAS_IOMEM archs mailbox: pcc: fix channel calculation in get_pcc_channel()
2016-02-07Merge tag 'usb-4.5-rc3' of ↵Linus Torvalds15-65/+131
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB fixes for 4.5-rc3. The usual, xhci fixes for reported issues, combined with some small gadget driver fixes, and a MAINTAINERS file update. All have been in linux-next with no reported issues" * tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: harden xhci_find_next_ext_cap against device removal xhci: Fix list corruption in urb dequeue at host removal usb: host: xhci-plat: fix NULL pointer in probe for device tree case usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms usb: xhci: set SSIC port unused only if xhci_suspend succeeds usb: xhci: add a quirk bit for ssic port unused usb: xhci: handle both SSIC ports in PME stuck quirk usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver. Revert "xhci: don't finish a TD if we get a short-transfer event mid TD" MAINTAINERS: fix my email address usb: dwc2: Fix probe problem on bcm2835 Revert "usb: dwc2: Move reset into dwc2_get_hwparams()" usb: musb: ux500: Fix NULL pointer dereference at system PM usb: phy: mxs: declare variable with initialized value usb: phy: msm: fix error handling in probe.
2016-02-07Merge tag 'staging-4.5-rc3' of ↵Linus Torvalds14-14/+32
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Here are some IIO and staging driver fixes for 4.5-rc3. All of them, except one, are for IIO drivers, and one is for a speakup driver fix caused by some earlier patches, to resolve a reported build failure" * tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: Staging: speakup: Fix allyesconfig build on mn10300 iio: dht11: Use boottime iio: ade7753: avoid uninitialized data iio: pressure: mpl115: fix temperature offset sign iio: imu: Fix dependencies for !HAS_IOMEM archs staging: iio: Fix dependencies for !HAS_IOMEM archs iio: adc: Fix dependencies for !HAS_IOMEM archs iio: inkern: fix a NULL dereference on error iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer. iio: light: acpi-als: Report data as processed iio: dac: mcp4725: set iio name property in sysfs iio: add HAS_IOMEM dependency to VF610_ADC iio: add IIO_TRIGGER dependency to STK8BA50 iio: proximity: lidar: correct return value iio-light: Use a signed return type for ltr501_match_samp_freq()
2016-02-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds23-126/+166
Merge fixes from Andrew Morton: "22 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits) epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT radix-tree: fix oops after radix_tree_iter_retry MAINTAINERS: trim the file triggers for ABI/API dax: dirty inode only if required thp: make deferred_split_scan() work again mm: replace vma_lock_anon_vma with anon_vma_lock_read/write ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup um: asm/page.h: remove the pte_high member from struct pte_t mm, hugetlb: don't require CMA for runtime gigantic pages mm/hugetlb: fix gigantic page initialization/allocation mm: downgrade VM_BUG in isolate_lru_page() to warning mempolicy: do not try to queue pages from !vma_migratable() mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress vmstat: make vmstat_update deferrable mm, vmstat: make quiet_vmstat lighter mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT memblock: don't mark memblock_phys_mem_size() as __init dump_stack: avoid potential deadlocks mm: validate_mm browse_rb SMP race condition m32r: fix build failure due to SMP and MMU ...
2016-02-06Merge branch 'for-linus' of ↵Linus Torvalds6-17/+75
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "We have a few wire protocol compatibility fixes, ports of a few recent CRUSH mapping changes, and a couple error path fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: MOSDOpReply v7 encoding libceph: advertise support for TUNABLES5 crush: decode and initialize chooseleaf_stable crush: add chooseleaf_stable tunable crush: ensure take bucket value is valid crush: ensure bucket id is valid before indexing buckets array ceph: fix snap context leak in error path ceph: checking for IS_ERR instead of NULL
2016-02-06Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds28-252/+461
Pull drm fixes from Dave Airlie: "Fixes all over the place: - amdkfd: two static checker fixes - mst: a bunch of static checker and spec/hw interaction fixes - amdgpu: fix Iceland hw properly, and some fiji bugs, along with some write-combining fixes. - exynos: some regression fixes - adv7511: fix some EDID reading issues" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits) drm/dp/mst: deallocate payload on port destruction drm/dp/mst: Reverse order of MST enable and clearing VC payload table. drm/dp/mst: move GUID storage from mgr, port to only mst branch drm/dp/mst: change MST detection scheme drm/dp/mst: Calculate MST PBN with 31.32 fixed point drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil drm/mst: Add range check for max_payloads during init drm/mst: Don't ignore the MST PBN self-test result drm: fix missing reference counting decrease drm/amdgpu: disable uvd and vce clockgating on Fiji drm/amdgpu: remove exp hardware support from iceland drm/amdgpu: load MEC ucode manually on iceland drm/amdgpu: don't load MEC2 on topaz drm/amdgpu: drop topaz support from gmc8 module drm/amdgpu: pull topaz gmc bits into gmc_v7 drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above drm/amdgpu: iceland use CI based MC IP drm/amdgpu: move gmc7 support out of CIK dependency drm/amdgpu/gfx7: enable cp inst/reg error interrupts drm/amdgpu/gfx8: enable cp inst/reg error interrupts ...
2016-02-06Merge tag 'pm+acpi-4.5-rc3' of ↵Linus Torvalds5-23/+10
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are: a fix for a recently introduced false-positive warnings about PM domain pointers being changed inappropriately (harmless but annoying), an MCH size workaround quirk for one more platform, a compiler warning fix (generic power domains framework), an ACPI LPSS (Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code. Specifics: - PM core fix to avoid false-positive warnings generated when the pm_domain field is cleared for a device that appears to be bound to a driver (Rafael Wysocki). - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer). - Fix for an "unused function" compiler warning in the generic power domains framework (Ulf Hansson). - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM domain pointer of a device properly in one place that was overlooked by a recent PM core update (Andy Shevchenko). - Removal of a redundant function declaration in the ACPI CPPC core code (Timur Tabi)" * tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: Avoid false-positive warnings in dev_pm_domain_set() PM / Domains: Silence compiler warning for an unused function ACPI / CPPC: remove redundant mbox_send_message() declaration ACPI / LPSS: set PM domain via helper setter PNP: Add Haswell-ULT to Intel MCH size workaround
2016-02-06epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUTJason Baron1-6/+32
In the current implementation of the EPOLLEXCLUSIVE flag (added for 4.5-rc1), if epoll waiters create different POLL* sets and register them as exclusive against the same target fd, the current implementation will stop waking any further waiters once it finds the first idle waiter. This means that waiters could miss wakeups in certain cases. For example, when we wake up a pipe for reading we do: wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if one epoll set or epfd is added to pipe p with POLLIN and a second set epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the wakeup since the current implementation will stop after it finds any intersection of events with a waiter that is blocked in epoll_wait(). We could potentially address this by requiring all epoll waiters that are added to p be required to pass the same set of POLL* events. IE the first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL* flags to be used by any other epfds that are added as EPOLLEXCLUSIVE. However, I think it might be somewhat confusing interface as we would have to reference count the number of users for that set, and so userspace would have to keep track of that count, or we would need a more involved interface. It also adds some shared state that we'd have store somewhere. I don't think anybody will want to bloat __wait_queue_head for this. I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE such that it can only be specified with EPOLLIN and/or EPOLLOUT. So that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop once we hit the first idle waiter that specifies the EPOLLIN bit, since any remaining waiters that only have 'POLLOUT' set wouldn't need to be woken. Likewise, we can do the same thing if 'POLLOUT' is in the wakeup bit set and not 'POLLIN'. If both 'POLLOUT' and 'POLLIN' are set in the wake bit set (there is at least one example of this I saw in fs/pipe.c), then we just wake the entire exclusive list. Having both 'POLLOUT' and 'POLLIN' both set should not be on any performance critical path, so I think that's ok (in fs/pipe.c its in pipe_release()). We also continue to include EPOLLERR and EPOLLHUP by default in any exclusive set. Thus, the user can specify EPOLLERR and/or EPOLLHUP but is not required to do so. Since epoll waiters may be interested in other events as well besides EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by doing a 'dup' call on the target fd and adding that as one normally would with EPOLL_CTL_ADD. Since I think that the POLLIN and POLLOUT events are what we are interest in balancing, I think that the 'dup' thing could perhaps be added to only one of the waiter threads. However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be sufficient for the majority of use-cases. Since EPOLLEXCLUSIVE is intended to be used with a target fd shared among multiple epfds, where between 1 and n of the epfds may receive an event, it does not satisfy the semantics of EPOLLONESHOT where only 1 epfd would get an event. Thus, it is not allowed to be specified in conjunction with EPOLLEXCLUSIVE. EPOLL_CTL_MOD is also not allowed if the fd was previously added as EPOLLEXCLUSIVE. It seems with the limited number of flags to not be as interesting, but this could be relaxed at some further point. Signed-off-by: Jason Baron <jbaron@akamai.com> Tested-by: Madars Vitolins <m@silodev.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Eric Wong <normalperson@yhbt.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-02-06radix-tree: fix oops after radix_tree_iter_retryKonstantin Khlebnikov1-3/+3
Helper radix_tree_iter_retry() resets next_index to the current index. In following radix_tree_next_slot current chunk size becomes zero. This isn't checked and it tries to dereference null pointer in slot. Tagged iterator is fine because retry happens only at slot 0 where tag bitmask in iter->tags is filled with single bit. Fixes: 46437f9a554f ("radix-tree: fix race in gang lookup") Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ohad Ben-Cohen <ohad@wizery.com> Cc: Jeremiah Mahler <jmmahler@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>