summaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2018-01-31Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds1-0/+2
Pull RDMA subsystem updates from Jason Gunthorpe: "Overall this cycle did not have any major excitement, and did not require any shared branch with netdev. Lots of driver updates, particularly of the scale-up and performance variety. The largest body of core work was Parav's patches fixing and restructing some of the core code to make way for future RDMA containerization. Summary: - misc small driver fixups to bnxt_re/hfi1/qib/hns/ocrdma/rdmavt/vmw_pvrdma/nes - several major feature adds to bnxt_re driver: SRIOV VF RoCE support, HugePages support, extended hardware stats support, and SRQ support - a notable number of fixes to the i40iw driver from debugging scale up testing - more work to enable the new hip08 chip in the hns driver - misc small ULP fixups to srp/srpt//ipoib - preparation for srp initiator and target to support the RDMA-CM protocol for connections - add RDMA-CM support to srp initiator, srp target is still a WIP - fixes for a couple of places where ipoib could spam the dmesg log - fix encode/decode of FDR/EDR data rates in the core - many patches from Parav with ongoing work to clean up inconsistencies and bugs in RoCE support around the rdma_cm - mlx5 driver support for the userspace features 'thread domain', 'wallclock timestamps' and 'DV Direct Connected transport'. Support for the firmware dual port rocee capability - core support for more than 32 rdma devices in the char dev allocation - kernel doc updates from Randy Dunlap - new netlink uAPI for inspecting RDMA objects similar in spirit to 'ss' - one minor change to the kobject code acked by Greg KH" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (259 commits) RDMA/nldev: Provide detailed QP information RDMA/nldev: Provide global resource utilization RDMA/core: Add resource tracking for create and destroy PDs RDMA/core: Add resource tracking for create and destroy CQs RDMA/core: Add resource tracking for create and destroy QPs RDMA/restrack: Add general infrastructure to track RDMA resources RDMA/core: Save kernel caller name when creating PD and CQ objects RDMA/core: Use the MODNAME instead of the function name for pd callers RDMA: Move enum ib_cq_creation_flags to uapi headers IB/rxe: Change RDMA_RXE kconfig to use select IB/qib: remove qib_keys.c IB/mthca: remove mthca_user.h RDMA/cm: Fix access to uninitialized variable RDMA/cma: Use existing netif_is_bond_master function IB/core: Avoid SGID attributes query while converting GID from OPA to IB RDMA/mlx5: Avoid memory leak in case of XRCD dealloc failure IB/umad: Fix use of unprotected device pointer IB/iser: Combine substrings for three messages IB/iser: Delete an unnecessary variable initialisation in iser_send_data_out() IB/iser: Delete an error message for a failed memory allocation in iser_send_data_out() ...
2018-01-31Merge tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds5-148/+285
Pull dma mapping updates from Christoph Hellwig: "Except for a runtime warning fix from Christian this is all about consolidation of the generic no-IOMMU code, a well as the glue code for swiotlb. All the code is based on the x86 implementation with hooks to allow all architectures that aren't cache coherent to use it. The x86 conversion itself has been deferred because the x86 maintainers were a little busy in the last months" * tag 'dma-mapping-4.16' of git://git.infradead.org/users/hch/dma-mapping: (57 commits) MAINTAINERS: add the iommu list for swiotlb and xen-swiotlb arm64: use swiotlb_alloc and swiotlb_free arm64: replace ZONE_DMA with ZONE_DMA32 mips: use swiotlb_{alloc,free} mips/netlogic: remove swiotlb support tile: use generic swiotlb_ops tile: replace ZONE_DMA with ZONE_DMA32 unicore32: use generic swiotlb_ops ia64: remove an ifdef around the content of pci-dma.c ia64: clean up swiotlb support ia64: use generic swiotlb_ops ia64: replace ZONE_DMA with ZONE_DMA32 swiotlb: remove various exports swiotlb: refactor coherent buffer allocation swiotlb: refactor coherent buffer freeing swiotlb: wire up ->dma_supported in swiotlb_dma_ops swiotlb: add common swiotlb_map_ops swiotlb: rename swiotlb_free to swiotlb_exit x86: rename swiotlb_dma_ops powerpc: rename swiotlb_dma_ops ...
2018-01-31Merge branch 'work.misc' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull misc vfs updates from Al Viro: "All kinds of misc stuff, without any unifying topic, from various people. Neil's d_anon patch, several bugfixes, introduction of kvmalloc analogue of kmemdup_user(), extending bitfield.h to deal with fixed-endians, assorted cleanups all over the place..." * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (28 commits) alpha: osf_sys.c: use timespec64 where appropriate alpha: osf_sys.c: fix put_tv32 regression jffs2: Fix use-after-free bug in jffs2_iget()'s error handling path dcache: delete unused d_hash_mask dcache: subtract d_hash_shift from 32 in advance fs/buffer.c: fold init_buffer() into init_page_buffers() fs: fold __inode_permission() into inode_permission() fs: add RWF_APPEND sctp: use vmemdup_user() rather than badly open-coding memdup_user() snd_ctl_elem_init_enum_names(): switch to vmemdup_user() replace_user_tlv(): switch to vmemdup_user() new primitive: vmemdup_user() memdup_user(): switch to GFP_USER eventfd: fold eventfd_ctx_get() into eventfd_ctx_fileget() eventfd: fold eventfd_ctx_read() into eventfd_read() eventfd: convert to use anon_inode_getfd() nfs4file: get rid of pointless include of btrfs.h uvc_v4l2: clean copyin/copyout up vme_user: don't use __copy_..._user() usx2y: don't bother with memdup_user() for 16-byte structure ...
2018-01-30Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds2-29/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: "The main RCU changes in this cycle were: - Updates to use cond_resched() instead of cond_resched_rcu_qs() where feasible (currently everywhere except in kernel/rcu and in kernel/torture.c). Also a couple of fixes to avoid sending IPIs to offline CPUs. - Updates to simplify RCU's dyntick-idle handling. - Updates to remove almost all uses of smp_read_barrier_depends() and read_barrier_depends(). - Torture-test updates. - Miscellaneous fixes" * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (72 commits) torture: Save a line in stutter_wait(): while -> for torture: Eliminate torture_runnable and perf_runnable torture: Make stutter less vulnerable to compilers and races locking/locktorture: Fix num reader/writer corner cases locking/locktorture: Fix rwsem reader_delay torture: Place all torture-test modules in one MAINTAINERS group rcutorture/kvm-build.sh: Skip build directory check rcutorture: Simplify functions.sh include path rcutorture: Simplify logging rcutorture/kvm-recheck-*: Improve result directory readability check rcutorture/kvm.sh: Support execution from any directory rcutorture/kvm.sh: Use consistent help text for --qemu-args rcutorture/kvm.sh: Remove unused variable, `alldone` rcutorture: Remove unused script, config2frag.sh rcutorture/configinit: Fix build directory error message rcutorture: Preempt RCU-preempt readers more vigorously torture: Reduce #ifdefs for preempt_schedule() rcu: Remove have_rcu_nocb_mask from tree_plugin.h rcu: Add comment giving debug strategy for double call_rcu() tracing, rcu: Hide trace event rcu_nocb_wake when not used ...
2018-01-30Merge branch 'core-debug-for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull STRICT_DEVMEM default from Ingo Molnar: "Make CONFIG_STRICT_DEVMEM default-y on x86 and arm64 as well, to follow the distro status quo" * 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64
2018-01-30Merge tag v4.15 of ↵Jason Gunthorpe9-78/+148
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git To resolve conflicts in: drivers/infiniband/hw/mlx5/main.c drivers/infiniband/hw/mlx5/qp.c From patches merged into the -rc cycle. The conflict resolution matches what linux-next has been carrying. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-29Merge branch 'for-4.16/block' of git://git.kernel.dk/linux-blockLinus Torvalds3-1/+132
Pull block updates from Jens Axboe: "This is the main pull request for block IO related changes for the 4.16 kernel. Nothing major in this pull request, but a good amount of improvements and fixes all over the map. This contains: - BFQ improvements, fixes, and cleanups from Angelo, Chiara, and Paolo. - Support for SMR zones for deadline and mq-deadline from Damien and Christoph. - Set of fixes for bcache by way of Michael Lyle, including fixes from himself, Kent, Rui, Tang, and Coly. - Series from Matias for lightnvm with fixes from Hans Holmberg, Javier, and Matias. Mostly centered around pblk, and the removing rrpc 1.2 in preparation for supporting 2.0. - A couple of NVMe pull requests from Christoph. Nothing major in here, just fixes and cleanups, and support for command tracing from Johannes. - Support for blk-throttle for tracking reads and writes separately. From Joseph Qi. A few cleanups/fixes also for blk-throttle from Weiping. - Series from Mike Snitzer that enables dm to register its queue more logically, something that's alwways been problematic on dm since it's a stacked device. - Series from Ming cleaning up some of the bio accessor use, in preparation for supporting multipage bvecs. - Various fixes from Ming closing up holes around queue mapping and quiescing. - BSD partition fix from Richard Narron, fixing a problem where we can't mount newer (10/11) FreeBSD partitions. - Series from Tejun reworking blk-mq timeout handling. The previous scheme relied on atomic bits, but it had races where we would think a request had timed out if it to reused at the wrong time. - null_blk now supports faking timeouts, to enable us to better exercise and test that functionality separately. From me. - Kill the separate atomic poll bit in the request struct. After this, we don't use the atomic bits on blk-mq anymore at all. From me. - sgl_alloc/free helpers from Bart. - Heavily contended tag case scalability improvement from me. - Various little fixes and cleanups from Arnd, Bart, Corentin, Douglas, Eryu, Goldwyn, and myself" * 'for-4.16/block' of git://git.kernel.dk/linux-block: (186 commits) block: remove smart1,2.h nvme: add tracepoint for nvme_complete_rq nvme: add tracepoint for nvme_setup_cmd nvme-pci: introduce RECONNECTING state to mark initializing procedure nvme-rdma: remove redundant boolean for inline_data nvme: don't free uuid pointer before printing it nvme-pci: Suspend queues after deleting them bsg: use pr_debug instead of hand crafted macros blk-mq-debugfs: don't allow write on attributes with seq_operations set nvme-pci: Fix queue double allocations block: Set BIO_TRACE_COMPLETION on new bio during split blk-throttle: use queue_is_rq_based block: Remove kblockd_schedule_delayed_work{,_on}() blk-mq: Avoid that blk_mq_delay_run_hw_queue() introduces unintended delays blk-mq: Rename blk_mq_request_direct_issue() into blk_mq_request_issue_directly() lib/scatterlist: Fix chaining support in sgl_alloc_order() blk-throttle: track read and write request individually block: add bdev_read_only() checks to common helpers block: fail op_is_write() requests to read-only partitions blk-throttle: export io_serviced_recursive, io_service_bytes_recursive ...
2018-01-29Merge tag 'mfd-next-4.16' of ↵Linus Torvalds1-1/+57
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD updates from Lee Jones: "New Drivers: - Add support for RAVE Supervisory Processor Moved drivers: - Move Realtek Card Reader Driver to Misc New Device Support: - Add support for Pinctrl to axp20x New Functionality: - Add resume support to atmel-flexcom Fix-ups: - Split MFD (mfd) and userspace handlers (platform) in cros_ec - Fix trivial (whitespace, spelling) issue(s) in pcf50633-core - Clean-up error handling in ab8500-debugfs - General tidying up in tmio_core - Kconfig fix-ups for qcom-pm8xxx - Licensing changes (SPDX) to stm32-lptimer, stm32-timers - Device Tree fixups in mc13xxx - Simplify/remove unused code in cros_ec_spi, axp20x, ti_am335x_tscadc, kempld-core, intel_soc_pmic_core.c, ab8500-debugfs" * tag 'mfd-next-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: (32 commits) mfd: lpc_ich: Do not touch SPI-NOR write protection bit on Apollo Lake mfd: axp20x: Mark axp288 CHRG_BAK_CTRL register volatile mfd: ab8500: Introduce DEFINE_SHOW_ATTRIBUTE() macro atmel_flexcom: Support resuming after a chip reset mfd: Remove duplicate includes dt-bindings: mfd: mc13xxx: Add the unit address to sysled mfd: stm32: Adopt SPDX identifier mfd: axp20x: Add pinctrl cell for AXP813 mfd: pm8xxx: Make elegible for COMPILE_TEST mfd: kempld-core: Use resource_size function on resource object mfd: tmio: Move register macros to tmio_core.c mfd: cros ec: spi: Simplify delay handling between SPI messages mfd: palmas: Assign the right powerhold mask for tps65917 mfd: ab8500-debugfs: Use common error handling code in ab8500_print_modem_registers() mfd: ti_am335x_tscadc: Remove redundant assignment to node mfd: pcf50633: Fix spelling mistake: 'Falied' -> 'Failed' dt-bindings: watchdog: Add bindings for RAVE SP watchdog driver watchdog: Add RAVE SP watchdog driver mfd: Add driver for RAVE Supervisory Processor serdev: Introduce devm_serdev_device_open() ...
2018-01-23kobject: Export kobj_ns_grab_current() and kobj_ns_drop()Bart Van Assche1-0/+2
Make it possible to call these two functions from a kernel module. Note: despite their name, these two functions can be used meaningfully independent of kobjects. A later patch will add calls to these functions from the SRP driver because this patch series modifies the SRP driver such that it can hold a reference to a namespace that can last longer than the lifetime of the process through which the namespace reference was obtained. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-19lib/scatterlist: Fix chaining support in sgl_alloc_order()Bart Van Assche1-5/+27
This patch avoids that workloads with large block sizes (megabytes) can trigger the following call stack with the ib_srpt driver (that driver is the only driver that chains scatterlists allocated by sgl_alloc_order()): BUG: Bad page state in process kworker/0:1H pfn:2423a78 page:fffffb03d08e9e00 count:-3 mapcount:0 mapping: (null) index:0x0 flags: 0x57ffffc0000000() raw: 0057ffffc0000000 0000000000000000 0000000000000000 fffffffdffffffff raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000 page dumped because: nonzero _count CPU: 0 PID: 733 Comm: kworker/0:1H Tainted: G I 4.15.0-rc7.bart+ #1 Hardware name: HP ProLiant DL380 G7, BIOS P67 08/16/2015 Workqueue: ib-comp-wq ib_cq_poll_work [ib_core] Call Trace: dump_stack+0x5c/0x83 bad_page+0xf5/0x10f get_page_from_freelist+0xa46/0x11b0 __alloc_pages_nodemask+0x103/0x290 sgl_alloc_order+0x101/0x180 target_alloc_sgl+0x2c/0x40 [target_core_mod] srpt_alloc_rw_ctxs+0x173/0x2d0 [ib_srpt] srpt_handle_new_iu+0x61e/0x7f0 [ib_srpt] __ib_process_cq+0x55/0xa0 [ib_core] ib_cq_poll_work+0x1b/0x60 [ib_core] process_one_work+0x141/0x340 worker_thread+0x47/0x3e0 kthread+0xf5/0x130 ret_from_fork+0x1f/0x30 Fixes: e80a0af4759a ("lib/scatterlist: Introduce sgl_alloc() and sgl_free()") Reported-by: Laurence Oberman <loberman@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Cc: Nicholas A. Bellinger <nab@linux-iscsi.org> Cc: Laurence Oberman <loberman@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-15swiotlb: remove various exportsChristoph Hellwig1-13/+0
All these symbols are only used by arch dma_ops implementations or xen-swiotlb. None of which can be modular. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15swiotlb: refactor coherent buffer allocationChristoph Hellwig1-57/+65
Factor out a new swiotlb_alloc_buffer helper that allocates DMA coherent memory from the swiotlb bounce buffer. This allows to simplify the swiotlb_alloc implemenation that uses dma_direct_alloc to try to allocate a reachable buffer first. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15swiotlb: refactor coherent buffer freeingChristoph Hellwig1-14/+21
Factor out a new swiotlb_free_buffer helper that checks if an address is allocated from the swiotlb bounce buffer, and if yes frees it. This allows to simplify the swiotlb_free implemenation that uses dma_direct_free to free the non-bounce buffer allocations. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15swiotlb: wire up ->dma_supported in swiotlb_dma_opsChristoph Hellwig1-0/+1
To properly reject too small DMA masks based on the addressability of the bounce buffer. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15swiotlb: add common swiotlb_map_opsChristoph Hellwig1-0/+43
Currently all architectures that want to use swiotlb have to implement their own dma_map_ops instances. Provide a generic one based on the x86 implementation which first calls into dma_direct to try a full blown direct mapping implementation (including e.g. CMA) before falling back allocating from the swiotlb buffer. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com>
2018-01-15swiotlb: rename swiotlb_free to swiotlb_exitChristoph Hellwig1-1/+1
Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2018-01-15swiotlb: suppress warning when __GFP_NOWARN is setChristian König1-6/+9
TTM tries to allocate coherent memory in chunks of 2MB first to improve TLB efficiency and falls back to allocating 4K pages if that fails. Suppress the warning when the 2MB allocations fails since there is a valid fall back path. Signed-off-by: Christian König <christian.koenig@amd.com> Reported-by: Mike Galbraith <efault@gmx.de> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082 CC: stable@vger.kernel.org Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15dma-direct: reject too small dma masksChristoph Hellwig1-0/+19
Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-01-15dma-direct: make dma_direct_{alloc,free} available to other implementationsChristoph Hellwig1-3/+3
So that they don't need to indirect through the operation vector. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
2018-01-15dma-direct: retry allocations using GFP_DMA for small masksChristoph Hellwig1-1/+24
If an attempt to allocate memory succeeded, but isn't inside the supported DMA mask, retry the allocation with GFP_DMA set as a last resort. Based on the x86 code, but an off by one error in what is now dma_coherent_ok has been fixed vs the x86 code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15dma-direct: add support for allocation from ZONE_DMA and ZONE_DMA32Christoph Hellwig1-0/+14
This allows to dip into zones for lower memory if they are available. If one of the zones is not available the corresponding GFP_* flag will evaluate to 0 so they won't change anything. We provide an arch tunable for those architectures that do not use GFP_DMA for the lowest 24-bits, given that there are a few. Roughly based on the x86 code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15dma-direct: use node local allocations for coherent memoryChristoph Hellwig1-1/+1
To preserve the x86 behavior. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-01-15dma-direct: add support for CMA allocationChristoph Hellwig1-6/+18
Try the CMA allocator for coherent allocations if supported. Roughly modelled after the x86 code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15dma-direct: add dma address sanity checksChristoph Hellwig1-1/+30
Roughly based on the x86 pci-nommu implementation. Signed-off-by: Christoph Hellwig <hch@lst.de>
2018-01-15dma-direct: use phys_to_dmaChristoph Hellwig1-11/+7
This means it uses whatever linear remapping scheme that the architecture provides is used in the generic dma_direct ops. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
2018-01-15dma-direct: rename dma_noop to dma_directChristoph Hellwig3-22/+17
The trivial direct mapping implementation already does a virtual to physical translation which isn't strictly a noop, and will soon learn to do non-direct but linear physical to dma translations through the device offset and a few small tricks. Rename it to a better fitting name. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Vladimir Murzin <vladimir.murzin@arm.com>
2018-01-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-4/+7
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-10dma-mapping: move swiotlb arch helpers to a new headerChristoph Hellwig1-1/+1
phys_to_dma, dma_to_phys and dma_capable are helpers published by architecture code for use of swiotlb and xen-swiotlb only. Drivers are not supposed to use these directly, but use the DMA API instead. Move these to a new asm/dma-direct.h helper, included by a linux/dma-direct.h wrapper that provides the default linear mapping unless the architecture wants to override it. In the MIPS case the existing dma-coherent.h is reused for now as untangling it will take a bit of work. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Robin Murphy <robin.murphy@arm.com>
2018-01-10bpf: introduce BPF_JIT_ALWAYS_ON configAlexei Starovoitov1-4/+7
The BPF interpreter has been used as part of the spectre 2 attack CVE-2017-5715. A quote from goolge project zero blog: "At this point, it would normally be necessary to locate gadgets in the host kernel code that can be used to actually leak data by reading from an attacker-controlled location, shifting and masking the result appropriately and then using the result of that as offset to an attacker-controlled address for a load. But piecing gadgets together and figuring out which ones work in a speculation context seems annoying. So instead, we decided to use the eBPF interpreter, which is built into the host kernel - while there is no legitimate way to invoke it from inside a VM, the presence of the code in the host kernel's text section is sufficient to make it usable for the attack, just like with ordinary ROP gadgets." To make attacker job harder introduce BPF_JIT_ALWAYS_ON config option that removes interpreter from the kernel in favor of JIT-only mode. So far eBPF JIT is supported by: x64, arm64, arm32, sparc64, s390, powerpc64, mips64 The start of JITed program is randomized and code page is marked as read-only. In addition "constant blinding" can be turned on with net.core.bpf_jit_harden v2->v3: - move __bpf_prog_ret0 under ifdef (Daniel) v1->v2: - fix init order, test_bpf and cBPF (Daniel's feedback) - fix offloaded bpf (Jakub's feedback) - add 'return 0' dummy in case something can invoke prog->bpf_func - retarget bpf tree. For bpf-next the patch would need one extra hunk. It will be sent when the trees are merged back to net-next Considered doing: int bpf_jit_enable __read_mostly = BPF_EBPF_JIT_DEFAULT; but it seems better to land the patch as-is and in bpf-next remove bpf_jit_enable global variable from all JITs, consolidate in one place and remove this jit_init() function. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-08lib/crc-ccitt: Add CCITT-FALSE CRC16 variantAndrew Morton1-1/+57
In support of a soon to be published MFD driver using serdev to talk to a supervisory processor that uses the CCITT-FALSE CRC16 variant in it's protocol, this patch was tested successfully on an i.MX6 ARM platform. Link: http://lkml.kernel.org/r/20170413142932.27287-1-andrew.smirnov@gmail.com Signed-off-by: Andrey Vostrikov <andrey.vostrikov@cogentembedded.com> Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com> Tested-by: Chris Healy <cphealy@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2018-01-06lib/scatterlist: Introduce sgl_alloc() and sgl_free()Bart Van Assche2-0/+109
Many kernel drivers contain code that allocates and frees both a scatterlist and the pages that populate that scatterlist. Introduce functions in lib/scatterlist.c that perform these tasks instead of duplicating this functionality in multiple drivers. Only include these functions in the build if CONFIG_SGL_ALLOC=y to avoid that the kernel size increases if this functionality is not used. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-01-05Merge branch 'linus' of ↵Linus Torvalds1-1/+17
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "This fixes the following issues: - racy use of ctx->rcvused in af_alg - algif_aead crash in chacha20poly1305 - freeing bogus pointer in pcrypt - build error on MIPS in mpi - memory leak in inside-secure - memory overwrite in inside-secure - NULL pointer dereference in inside-secure - state corruption in inside-secure - build error without CRYPTO_GF128MUL in chelsio - use after free in n2" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: inside-secure - do not use areq->result for partial results crypto: inside-secure - fix request allocations in invalidation path crypto: inside-secure - free requests even if their handling failed crypto: inside-secure - per request invalidation lib/mpi: Fix umul_ppmm() for MIPS64r6 crypto: pcrypt - fix freeing pcrypt instances crypto: n2 - cure use after free crypto: af_alg - Fix race around ctx->rcvused by making it atomic_t crypto: chacha20poly1305 - validate the digest size crypto: chelsio - select CRYPTO_GF128MUL
2018-01-03Merge branch 'for-mingo' of ↵Ingo Molnar2-29/+16
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Updates to use cond_resched() instead of cond_resched_rcu_qs() where feasible (currently everywhere except in kernel/rcu and in kernel/torture.c). Also a couple of fixes to avoid sending IPIs to offline CPUs. - Updates to simplify RCU's dyntick-idle handling. - Updates to remove almost all uses of smp_read_barrier_depends() and read_barrier_depends(). - Miscellaneous fixes. - Torture-test updates. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-31Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds1-3/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A pile of fixes for long standing issues with the timer wheel and the NOHZ code: - Prevent timer base confusion accross the nohz switch, which can cause unlocked access and data corruption - Reinitialize the stale base clock on cpu hotplug to prevent subtle side effects including rollovers on 32bit - Prevent an interrupt storm when the timer softirq is already pending caused by tick_nohz_stop_sched_tick() - Move the timer start tracepoint to a place where it actually makes sense - Add documentation to timerqueue functions as they caused confusion several times now" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timerqueue: Document return values of timerqueue_add/del() timers: Invoke timer_start_debug() where it makes sense nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick() timers: Reinitialize per cpu bases on hotplug timers: Use deferrable base independent of base::nohz_active
2017-12-31Merge tag 'driver-core-4.15-rc6' of ↵Linus Torvalds1-4/+12
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver core fixes for 4.15-rc6, resolving some reported issues. The first is a cacheinfo fix for DT based systems to resolve a reported issue that has been around for a while, and the other is to resolve a regression in the kobject uevent code that showed up in 4.15-rc1. Both have been in linux-next for a while with no reported issues" * tag 'driver-core-4.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: kobject: fix suppressing modalias in uevents delivered over netlink drivers: base: cacheinfo: fix cache type for non-architected system cache
2017-12-30timerqueue: Document return values of timerqueue_add/del()Thomas Gleixner1-3/+5
The return values of timerqueue_add/del() are not documented in the kernel doc comment. Add proper documentation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sebastian Siewior <bigeasy@linutronix.de> Cc: rt@linutronix.de Cc: Paul McKenney <paulmck@linux.vnet.ibm.com> Cc: Anna-Maria Gleixner <anna-maria@linutronix.de> Link: https://lkml.kernel.org/r/20171222145337.872681338@linutronix.de
2017-12-22blk-mq: improve heavily contended tag caseJens Axboe1-1/+1
Even with a number of waitqueues, we can get into a situation where we are heavily contended on the waitqueue lock. I got a report on spc1 where we're spending seconds doing this. Arguably the use case is nasty, I reproduce it with one device and 1000 threads banging on the device. But that doesn't mean we shouldn't be handling it better. What ends up happening is that a thread will fail to get a tag, add itself to the waitqueue, and subsequently get woken up when a tag is freed - only to find itself going back to sleep on the waitqueue. Instead of waking all threads, use an exclusive wait and wake up our sbitmap batch count instead. This seems to work well for me (massive improvement for this use case), and it survives basic testing. But I haven't fully verified it yet. An additional improvement is running the queue and checking for a new tag BEFORE needing to add ourselves to the waitqueue. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2017-12-22lib/mpi: Fix umul_ppmm() for MIPS64r6James Hogan1-1/+17
Current MIPS64r6 toolchains aren't able to generate efficient DMULU/DMUHU based code for the C implementation of umul_ppmm(), which performs an unsigned 64 x 64 bit multiply and returns the upper and lower 64-bit halves of the 128-bit result. Instead it widens the 64-bit inputs to 128-bits and emits a __multi3 intrinsic call to perform a 128 x 128 multiply. This is both inefficient, and it results in a link error since we don't include __multi3 in MIPS linux. For example commit 90a53e4432b1 ("cfg80211: implement regdb signature checking") merged in v4.15-rc1 recently broke the 64r6_defconfig and 64r6el_defconfig builds by indirectly selecting MPILIB. The same build errors can be reproduced on older kernels by enabling e.g. CRYPTO_RSA: lib/mpi/generic_mpih-mul1.o: In function `mpihelp_mul_1': lib/mpi/generic_mpih-mul1.c:50: undefined reference to `__multi3' lib/mpi/generic_mpih-mul2.o: In function `mpihelp_addmul_1': lib/mpi/generic_mpih-mul2.c:49: undefined reference to `__multi3' lib/mpi/generic_mpih-mul3.o: In function `mpihelp_submul_1': lib/mpi/generic_mpih-mul3.c:49: undefined reference to `__multi3' lib/mpi/mpih-div.o In function `mpihelp_divrem': lib/mpi/mpih-div.c:205: undefined reference to `__multi3' lib/mpi/mpih-div.c:142: undefined reference to `__multi3' Therefore add an efficient MIPS64r6 implementation of umul_ppmm() using inline assembly and the DMULU/DMUHU instructions, to prevent __multi3 calls being emitted. Fixes: 7fd08ca58ae6 ("MIPS: Add build support for the MIPS R6 ISA") Signed-off-by: James Hogan <jhogan@kernel.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: "David S. Miller" <davem@davemloft.net> Cc: linux-mips@linux-mips.org Cc: linux-crypto@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2017-12-21kobject: fix suppressing modalias in uevents delivered over netlinkDmitry Torokhov1-4/+12
The commit 4a336a23d619 ("kobject: copy env blob in one go") optimized constructing uevent data for delivery over netlink by using the raw environment buffer, instead of reconstructing it from individual environment pointers. Unfortunately in doing so it broke suppressing MODALIAS attribute for KOBJ_UNBIND events, as the code that suppressed this attribute only adjusted the environment pointers, but left the buffer itself alone. Let's fix it by making sure the offending attribute is obliterated form the buffer as well. Reported-by: Tariq Toukan <tariqt@mellanox.com> Reported-by: Casey Leedom <leedom@chelsio.com> Fixes: 4a336a23d619 ("kobject: copy env blob in one go") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-0/+43
Daniel Borkmann says: ==================== pull-request: bpf 2017-12-17 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a corner case in generic XDP where we have non-linear skbs but enough tailroom in the skb to not miss to linearizing there, from Song. 2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when BPF context is not skb, from Daniel. 3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper call would use the wrong register for the skb, from Daniel. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15Merge branch 'locking-urgent-for-linus' of ↵Linus Torvalds1-33/+0
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Ingo Molnar: "Misc fixes: - Fix a S390 boot hang that was caused by the lock-break logic. Remove lock-break to begin with, as review suggested it was unreasonably fragile and our confidence in its continued good health is lower than our confidence in its removal. - Remove the lockdep cross-release checking code for now, because of unresolved false positive warnings. This should make lockdep work well everywhere again. - Get rid of the final (and single) ACCESS_ONCE() straggler and remove the API from v4.15. - Fix a liblockdep build warning" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/lib/lockdep: Add missing declaration of 'pr_cont()' checkpatch: Remove ACCESS_ONCE() warning compiler.h: Remove ACCESS_ONCE() tools/include: Remove ACCESS_ONCE() tools/perf: Convert ACCESS_ONCE() to READ_ONCE() locking/lockdep: Remove the cross-release locking checks locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK
2017-12-15bpf: add test case for ld_abs and helper changing pkt dataDaniel Borkmann1-0/+43
Add a test that i) uses LD_ABS, ii) zeroing R6 before call, iii) calls a helper that triggers reload of cached skb data, iv) uses LD_ABS again. It's added for test_bpf in order to do runtime testing after JITing as well as test_verifier to test that the sequence is allowed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15lib/rbtree,drm/mm: add rbtree_replace_node_cached()Chris Wilson1-0/+10
Add a variant of rbtree_replace_node() that maintains the leftmost cache of struct rbtree_root_cached when replacing nodes within the rbtree. As drm_mm is the only rb_replace_node() being used on an interval tree, the mistake looks fairly self-contained. Furthermore the only user of drm_mm_replace_node() is its testsuite... Testcase: igt/drm_mm/replace Link: http://lkml.kernel.org/r/20171122100729.3742-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20171109212435.9265-1-chris@chris-wilson.co.uk Fixes: f808c13fd373 ("lib/interval_tree: fast overlap detection") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Davidlohr Bueso <dbueso@suse.de> Cc: Jérôme Glisse <jglisse@redhat.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-12locking/lockdep: Remove the cross-release locking checksIngo Molnar1-33/+0
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y), while it found a number of old bugs initially, was also causing too many false positives that caused people to disable lockdep - which is arguably a worse overall outcome. If we disable cross-release by default but keep the code upstream then in practice the most likely outcome is that we'll allow the situation to degrade gradually, by allowing entropy to introduce more and more false positives, until it overwhelms maintenance capacity. Another bad side effect was that people were trying to work around the false positives by uglifying/complicating unrelated code. There's a marked difference between annotating locking operations and uglifying good code just due to bad lock debugging code ... This gradual decrease in quality happened to a number of debugging facilities in the kernel, and lockdep is pretty complex already, so we cannot risk this outcome. Either cross-release checking can be done right with no false positives, or it should not be included in the upstream kernel. ( Note that it might make sense to maintain it out of tree and go through the false positives every now and then and see whether new bugs were introduced. ) Cc: Byungchul Park <byungchul.park@lge.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11Kconfig: Make STRICT_DEVMEM default-y on x86 and arm64Kees Cook1-1/+1
Distros have been shipping with CONFIG_STRICT_DEVMEM=y for years now. It is probably time to flip this default for x86 and arm64. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Laura Abbott <labbott@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will.deacon@arm.com> Cc: kernel-hardening@lists.openwall.com Link: http://lkml.kernel.org/r/20171201201000.GA44539@beast Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-11Fix misannotated out-of-line _copy_to_user()Christophe Leroy1-1/+1
Destination is a kernel pointer and source - a userland one in _copy_from_user(); _copy_to_user() is the other way round. Fixes: d597580d37377 ("generic ...copy_..._user primitives") Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-12-09Merge tag 'keys-fixes-20171208' of ↵James Morris2-27/+38
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs into keys-for-linus Assorted fixes for keyrings, ASN.1, X.509 and PKCS#7.
2017-12-08509: fix printing uninitialized stack memory when OID is emptyEric Biggers1-2/+6
Callers of sprint_oid() do not check its return value before printing the result. In the case where the OID is zero-length, -EBADMSG was being returned without anything being written to the buffer, resulting in uninitialized stack memory being printed. Fix this by writing "(bad)" to the buffer in the cases where -EBADMSG is returned. Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com>
2017-12-08X.509: fix buffer overflow detection in sprint_oid()Eric Biggers1-4/+4
In sprint_oid(), if the input buffer were to be more than 1 byte too small for the first snprintf(), 'bufsize' would underflow, causing a buffer overflow when printing the remainder of the OID. Fortunately this cannot actually happen currently, because no users pass in a buffer that can be too small for the first snprintf(). Regardless, fix it by checking the snprintf() return value correctly. For consistency also tweak the second snprintf() check to look the same. Fixes: 4f73175d0375 ("X.509: Add utility functions to render OIDs as strings") Cc: Takashi Iwai <tiwai@suse.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com>
2017-12-08ASN.1: check for error from ASN1_OP_END__ACT actionsEric Biggers1-0/+2
asn1_ber_decoder() was ignoring errors from actions associated with the opcodes ASN1_OP_END_SEQ_ACT, ASN1_OP_END_SET_ACT, ASN1_OP_END_SEQ_OF_ACT, and ASN1_OP_END_SET_OF_ACT. In practice, this meant the pkcs7_note_signed_info() action (since that was the only user of those opcodes). Fix it by checking for the error, just like the decoder does for actions associated with the other opcodes. This bug allowed users to leak slab memory by repeatedly trying to add a specially crafted "pkcs7_test" key (requires CONFIG_PKCS7_TEST_KEY). In theory, this bug could also be used to bypass module signature verification, by providing a PKCS#7 message that is misparsed such that a signature's ->authattrs do not contain its ->msgdigest. But it doesn't seem practical in normal cases, due to restrictions on the format of the ->authattrs. Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder") Cc: <stable@vger.kernel.org> # v3.7+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: James Morris <james.l.morris@oracle.com>