summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2014-12-13memcg: turn memcg_kmem_skip_account into a bit fieldVladimir Davydov1-2/+5
It isn't supposed to stack, so turn it into a bit-field to save 4 bytes on the task_struct. Also, remove the memcg_stop/resume_kmem_account helpers - it is clearer to set/clear the flag inline. Regarding the overwhelming comment to the helpers, which is removed by this patch too, we already have a compact yet accurate explanation in memcg_schedule_cache_create, no need in yet another one. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13lib: bitmap: add alignment offset for bitmap_find_next_zero_area()Michal Nazarewicz1-5/+31
Add a bitmap_find_next_zero_area_off() function which works like bitmap_find_next_zero_area() function except it allows an offset to be specified when alignment is checked. This lets caller request a bit such that its number plus the offset is aligned according to the mask. [gregory.0xf0@gmail.com: Retrieved from https://patchwork.linuxtv.org/patch/6254/ and updated documentation] Signed-off-by: Michal Nazarewicz <mina86@mina86.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Cc: Laura Abbott <lauraa@codeaurora.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13mm/rmap: share the i_mmap_rwsemDavidlohr Bueso1-0/+10
Similarly to the anon memory counterpart, we can share the mapping's lock ownership as the interval tree is not modified when doing doing the walk, only the file page. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name> Acked-by: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13mm: convert i_mmap_mutex to rwsemDavidlohr Bueso2-4/+5
The i_mmap_mutex is a close cousin of the anon vma lock, both protecting similar data, one for file backed pages and the other for anon memory. To this end, this lock can also be a rwsem. In addition, there are some important opportunities to share the lock when there are no tree modifications. This conversion is straightforward. For now, all users take the write lock. [sfr@canb.auug.org.au: update fremap.c] Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name> Acked-by: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13mm,fs: introduce helpers around the i_mmap_mutexDavidlohr Bueso1-0/+10
This series is a continuation of the conversion of the i_mmap_mutex to rwsem, following what we have for the anon memory counterpart. With Hugh's feedback from the first iteration. Ultimately, the most obvious paths that require exclusive ownership of the lock is when we modify the VMA interval tree, via vma_interval_tree_insert() and vma_interval_tree_remove() families. Cases such as unmapping, where the ptes content is changed but the tree remains untouched should make it safe to share the i_mmap_rwsem. As such, the code of course is straightforward, however the devil is very much in the details. While its been tested on a number of workloads without anything exploding, I would not be surprised if there are some less documented/known assumptions about the lock that could suffer from these changes. Or maybe I'm just missing something, but either way I believe its at the point where it could use more eyes and hopefully some time in linux-next. Because the lock type conversion is the heart of this patchset, its worth noting a few comparisons between mutex vs rwsem (xadd): (i) Same size, no extra footprint. (ii) Both have CONFIG_XXX_SPIN_ON_OWNER capabilities for exclusive lock ownership. (iii) Both can be slightly unfair wrt exclusive ownership, with writer lock stealing properties, not necessarily respecting FIFO order for granting the lock when contended. (iv) Mutexes can be slightly faster than rwsems when the lock is non-contended. (v) Both suck at performance for debug (slowpaths), which shouldn't matter anyway. Sharing the lock is obviously beneficial, and sem writer ownership is close enough to mutexes. The biggest winner of these changes is migration. As for concrete numbers, the following performance results are for a 4-socket 60-core IvyBridge-EX with 130Gb of RAM. Both alltests and disk (xfs+ramdisk) workloads of aim7 suite do quite well with this set, with a steady ~60% throughput (jpm) increase for alltests and up to ~30% for disk for high amounts of concurrency. Lower counts of workload users (< 100) does not show much difference at all, so at least no regressions. 3.18-rc1 3.18-rc1-i_mmap_rwsem alltests-100 17918.72 ( 0.00%) 28417.97 ( 58.59%) alltests-200 16529.39 ( 0.00%) 26807.92 ( 62.18%) alltests-300 16591.17 ( 0.00%) 26878.08 ( 62.00%) alltests-400 16490.37 ( 0.00%) 26664.63 ( 61.70%) alltests-500 16593.17 ( 0.00%) 26433.72 ( 59.30%) alltests-600 16508.56 ( 0.00%) 26409.20 ( 59.97%) alltests-700 16508.19 ( 0.00%) 26298.58 ( 59.31%) alltests-800 16437.58 ( 0.00%) 26433.02 ( 60.81%) alltests-900 16418.35 ( 0.00%) 26241.61 ( 59.83%) alltests-1000 16369.00 ( 0.00%) 26195.76 ( 60.03%) alltests-1100 16330.11 ( 0.00%) 26133.46 ( 60.03%) alltests-1200 16341.30 ( 0.00%) 26084.03 ( 59.62%) alltests-1300 16304.75 ( 0.00%) 26024.74 ( 59.61%) alltests-1400 16231.08 ( 0.00%) 25952.35 ( 59.89%) alltests-1500 16168.06 ( 0.00%) 25850.58 ( 59.89%) alltests-1600 16142.56 ( 0.00%) 25767.42 ( 59.62%) alltests-1700 16118.91 ( 0.00%) 25689.58 ( 59.38%) alltests-1800 16068.06 ( 0.00%) 25599.71 ( 59.32%) alltests-1900 16046.94 ( 0.00%) 25525.92 ( 59.07%) alltests-2000 16007.26 ( 0.00%) 25513.07 ( 59.38%) disk-100 7582.14 ( 0.00%) 7257.48 ( -4.28%) disk-200 6962.44 ( 0.00%) 7109.15 ( 2.11%) disk-300 6435.93 ( 0.00%) 6904.75 ( 7.28%) disk-400 6370.84 ( 0.00%) 6861.26 ( 7.70%) disk-500 6353.42 ( 0.00%) 6846.71 ( 7.76%) disk-600 6368.82 ( 0.00%) 6806.75 ( 6.88%) disk-700 6331.37 ( 0.00%) 6796.01 ( 7.34%) disk-800 6324.22 ( 0.00%) 6788.00 ( 7.33%) disk-900 6253.52 ( 0.00%) 6750.43 ( 7.95%) disk-1000 6242.53 ( 0.00%) 6855.11 ( 9.81%) disk-1100 6234.75 ( 0.00%) 6858.47 ( 10.00%) disk-1200 6312.76 ( 0.00%) 6845.13 ( 8.43%) disk-1300 6309.95 ( 0.00%) 6834.51 ( 8.31%) disk-1400 6171.76 ( 0.00%) 6787.09 ( 9.97%) disk-1500 6139.81 ( 0.00%) 6761.09 ( 10.12%) disk-1600 4807.12 ( 0.00%) 6725.33 ( 39.90%) disk-1700 4669.50 ( 0.00%) 5985.38 ( 28.18%) disk-1800 4663.51 ( 0.00%) 5972.99 ( 28.08%) disk-1900 4674.31 ( 0.00%) 5949.94 ( 27.29%) disk-2000 4668.36 ( 0.00%) 5834.93 ( 24.99%) In addition, a 67.5% increase in successfully migrated NUMA pages, thus improving node locality. The patch layout is simple but designed for bisection (in case reversion is needed if the changes break upstream) and easier review: o Patches 1-4 convert the i_mmap lock from mutex to rwsem. o Patches 5-10 share the lock in specific paths, each patch details the rationale behind why it should be safe. This patchset has been tested with: postgres 9.4 (with brand new hugetlb support), hugetlbfs test suite (all tests pass, in fact more tests pass with these changes than with an upstream kernel), ltp, aim7 benchmarks, memcached and iozone with the -B option for mmap'ing. *Untested* paths are nommu, memory-failure, uprobes and xip. This patch (of 8): Various parts of the kernel acquire and release this mutex, so add i_mmap_lock_write() and immap_unlock_write() helper functions that will encapsulate this logic. The next patch will make use of these. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: "Kirill A. Shutemov" <kirill@shutemov.name> Acked-by: Hugh Dickins <hughd@google.com> Cc: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-12Merge tag 'please-pull-morepstore' of ↵Linus Torvalds1-1/+3
git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux Pull pstore update #2 from Tony Luck: "Couple of pstore-ram enhancements to allow use of different memory attributes" * tag 'please-pull-morepstore' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: pstore-ram: Allow optional mapping with pgprot_noncached pstore-ram: Fix hangs by using write-combine mappings
2014-12-12Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs update from Chris Mason: "From a feature point of view, most of the code here comes from Miao Xie and others at Fujitsu to implement scrubbing and replacing devices on raid56. This has been in development for a while, and it's a big improvement. Filipe and Josef have a great assortment of fixes, many of which solve problems corruptions either after a crash or in error conditions. I still have a round two from Filipe for next week that solves corruptions with discard and block group removal" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (62 commits) Btrfs: make get_caching_control unconditionally return the ctl Btrfs: fix unprotected deletion from pending_chunks list Btrfs: fix fs mapping extent map leak Btrfs: fix memory leak after block remove + trimming Btrfs: make btrfs_abort_transaction consider existence of new block groups Btrfs: fix race between writing free space cache and trimming Btrfs: fix race between fs trimming and block group remove/allocation Btrfs, replace: enable dev-replace for raid56 Btrfs: fix freeing used extents after removing empty block group Btrfs: fix crash caused by block group removal Btrfs: fix invalid block group rbtree access after bg is removed Btrfs, raid56: fix use-after-free problem in the final device replace procedure on raid56 Btrfs, replace: write raid56 parity into the replace target device Btrfs, replace: write dirty pages into the replace target device Btrfs, raid56: support parity scrub on raid56 Btrfs, raid56: use a variant to record the operation type Btrfs, scrub: repair the common data on RAID5/6 if it is corrupted Btrfs, raid56: don't change bbio and raid_map Btrfs: remove unnecessary code of stripe_index assignment in __btrfs_map_block Btrfs: remove noused bbio_ret in __btrfs_map_block in condition ...
2014-12-12Merge branch 'for-linus' of ↵Linus Torvalds1-4/+39
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID updates from Jiri Kosina: - i2c-hid race condition fix from Jean-Baptiste Maneyrol - Logitech driver now supports vendor-specific HID++ protocol, allowing us to deliver a full multitouch support on wider range of Logitech touchpads. Written by Benjamin Tissoires - MS Surface Pro 3 Type Cover support added by Alan Wu - RMI touchpad support improvements from Andrew Duggan - a lot of updates to Wacom driver from Jason Gerecke and Ping Cheng - various small fixes all over the place * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (56 commits) HID: rmi: The address of query8 must be calculated based on which query registers are present HID: rmi: Check for additional ACM registers appended to F11 data report HID: i2c-hid: prevent buffer overflow in early IRQ HID: logitech-hidpp: disable io in probe error path HID: logitech-hidpp: add boundary check for name retrieval HID: logitech-hidpp: check name retrieval return code HID: logitech-hidpp: do not return the name length HID: wacom: Report input events for each finger on generic devices HID: wacom: Initialize MT slots for generic devices at post_parse_hid HID: wacom: Update maximum X/Y accounding to outbound offset HID: wacom: Add support for DTU-1031X HID: wacom: add defines for new Cintiq and DTU outbound tracking HID: wacom: fix freeze on open when autosuspend is on HID: wacom: re-add accidentally dropped Lenovo PID HID: make hid_report_len as a static inline function in hid.h HID: wacom: Consult the application usage when determining field type HID: wacom: PAD is independent with pen/touch HID: multitouch: Add quirk for VTL touch panels HID: i2c-hid: fix race condition reading reports HID: wacom: Add angular resolution data to some ABS axes ...
2014-12-12Merge branch 'for-linus' of ↵Linus Torvalds5-28/+28
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree update from Jiri Kosina: "Usual stuff: documentation updates, printk() fixes, etc" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (24 commits) intel_ips: fix a type in error message cpufreq: cpufreq-dt: Move newline to end of error message ps3rom: fix error return code treewide: fix typo in printk and Kconfig ARM: dts: bcm63138: change "interupts" to "interrupts" Replace mentions of "list_struct" to "list_head" kernel: trace: fix printk message scsi: mpt2sas: fix ioctl in comment zbud, zswap: change module author email clocksource: Fix 'clcoksource' typo in comment arm: fix wording of "Crotex" in CONFIG_ARCH_EXYNOS3 help gpio: msm-v1: make boolean argument more obvious usb: Fix typo in usb-serial-simple.c PCI: Fix comment typo 'COMFIG_PM_OPS' powerpc: Fix comment typo 'CONIFG_8xx' powerpc: Fix comment typos 'CONFiG_ALTIVEC' clk: st: Spelling s/stucture/structure/ isci: Spelling s/stucture/structure/ usb: gadget: zero: Spelling s/infrastucture/infrastructure/ treewide: Fix company name in module descriptions ...
2014-12-12Merge tag 'ext4_for_linus' of ↵Linus Torvalds1-11/+6
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Lots of bugs fixes, including Zheng and Jan's extent status shrinker fixes, which should improve CPU utilization and potential soft lockups under heavy memory pressure, and Eric Whitney's bigalloc fixes" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (26 commits) ext4: ext4_da_convert_inline_data_to_extent drop locked page after error ext4: fix suboptimal seek_{data,hole} extents traversial ext4: ext4_inline_data_fiemap should respect callers argument ext4: prevent fsreentrance deadlock for inline_data ext4: forbid journal_async_commit in data=ordered mode jbd2: remove unnecessary NULL check before iput() ext4: Remove an unnecessary check for NULL before iput() ext4: remove unneeded code in ext4_unlink ext4: don't count external journal blocks as overhead ext4: remove never taken branch from ext4_ext_shift_path_extents() ext4: create nojournal_checksum mount option ext4: update comments regarding ext4_delete_inode() ext4: cleanup GFP flags inside resize path ext4: introduce aging to extent status tree ext4: cleanup flag definitions for extent status tree ext4: limit number of scanned extents in status tree shrinker ext4: move handling of list of shrinkable inodes into extent status code ext4: change LRU to round-robin in extent status tree shrinker ext4: cache extent hole in extent status tree for ext4_da_map_blocks() ext4: fix block reservation for bigalloc filesystems ...
2014-12-12Merge branches 'for-3.19/hid-report-len', 'for-3.19/i2c-hid', ↵Jiri Kosina1-0/+39
'for-3.19/lenovo', 'for-3.19/logitech', 'for-3.19/microsoft', 'for-3.19/plantronics', 'for-3.19/rmi', 'for-3.19/sony' and 'for-3.19/wacom' into for-linus
2014-12-12Merge branches 'for-3.18/upstream-fixes' and 'for-3.19/upstream' into for-linusJiri Kosina1-4/+0
Conflicts: drivers/hid/hid-input.c
2014-12-12Merge branch 'for-3.19' of ↵Linus Torvalds2-30/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup update from Tejun Heo: "cpuset got simplified a bit. cgroup core got a fix on unified hierarchy and grew some effective css related interfaces which will be used for blkio support for writeback IO traffic which is currently being worked on" * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: implement cgroup_get_e_css() cgroup: add cgroup_subsys->css_e_css_changed() cgroup: add cgroup_subsys->css_released() cgroup: fix the async css offline wait logic in cgroup_subtree_control_write() cgroup: restructure child_subsys_mask handling in cgroup_subtree_control_write() cgroup: separate out cgroup_calc_child_subsys_mask() from cgroup_refresh_child_subsys_mask() cpuset: lock vs unlock typo cpuset: simplify cpuset_node_allowed API cpuset: convert callback_mutex to a spinlock
2014-12-12Merge branch 'for-3.19' of ↵Linus Torvalds2-10/+7
git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata changes from Tejun Heo: "The only interesting piece is the support for shingled drives. The changes in libata layer are minimal. All it does is identifying the new class of device and report upwards accordingly" * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: Remove FIXME comment in atapi_request_sense() sata_rcar: Document deprecated "renesas,rcar-sata" sata_rcar: Add clocks to sata_rcar bindings ahci_sunxi: Make AHCI_HFLAG_NO_PMP flag configurable with a module option libata-scsi: Update SATL for ZAC drives libata: Implement ATA_DEV_ZAC libsas: use ata_dev_classify()
2014-12-12Merge branch 'for-3.19' of ↵Linus Torvalds1-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu updates from Tejun Heo: "Nothing interesting. A patch to convert the remaining __get_cpu_var() users, another to fix non-critical off-by-one in an assertion and a cosmetic conversion to lockless_dereference() in percpu-ref. The back-merge from mainline is to receive lockless_dereference()" * 'for-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: Replace smp_read_barrier_depends() with lockless_dereference() percpu: Convert remaining __get_cpu_var uses in 3.18-rcX percpu: off by one in BUG_ON()
2014-12-12Merge tag 'stable/for-linus-3.19-rc0-tag' of ↵Linus Torvalds4-3/+26
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen features and fixes from David Vrabel: - Fully support non-coherent devices on ARM by introducing the mechanisms to request the hypervisor to perform the required cache maintainance operations. - A number of pciback bug fixes and cleanups. Notably a deadlock fix if a PCI device was manually uunbound and a fix for incorrectly restoring state after a function reset. - In x86 PVHVM guests, use the APIC for interrupts if this has been virtualized by the hardware. This reduces the number of interrupt- related VM exits on such hardware. * tag 'stable/for-linus-3.19-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (26 commits) Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single" xen/pci: Use APIC directly when APIC virtualization hardware is available xen/pci: Defer initialization of MSI ops on HVM guests xen-pciback: drop SR-IOV VFs when PF driver unloads xen/pciback: Restore configuration space when detaching from a guest. PCI: Expose pci_load_saved_state for public consumption. xen/pciback: Remove tons of dereferences xen/pciback: Print out the domain owning the device. xen/pciback: Include the domain id if removing the device whilst still in use driver core: Provide an wrapper around the mutex to do lockdep warnings xen/pciback: Don't deadlock when unbinding. swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single swiotlb-xen: call xen_dma_sync_single_for_device when appropriate swiotlb-xen: remove BUG_ON in xen_bus_to_phys swiotlb-xen: pass dev_addr to xen_dma_unmap_page and xen_dma_sync_single_for_cpu xen/arm: introduce GNTTABOP_cache_flush xen/arm/arm64: introduce xen_arch_need_swiotlb xen/arm/arm64: merge xen/mm32.c into xen/mm.c xen/arm: use hypercall to flush caches in map_page xen: add a dma_addr_t dev_addr argument to xen_dma_map_page ...
2014-12-12Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linusLinus Torvalds3-0/+263
Pull MIPS updates from Ralf Baechle: "This is an unusually large pull request for MIPS - in parts because lots of patches missed the 3.18 deadline but primarily because some folks opened the flood gates. - Retire the MIPS-specific phys_t with the generic phys_addr_t. - Improvments for the backtrace code used by oprofile. - Better backtraces on SMP systems. - Cleanups for the Octeon platform code. - Cleanups and fixes for the Loongson platform code. - Cleanups and fixes to the firmware library. - Switch ATH79 platform to use the firmware library. - Grand overhault to the SEAD3 and Malta interrupt code. - Move the GIC interrupt code to drivers/irqchip - Lots of GIC cleanups and updates to the GIC code to use modern IRQ infrastructures and features of the kernel. - OF documentation updates for the GIC bindings - Move GIC clocksource driver to drivers/clocksource - Merge GIC clocksource driver with clockevent driver. - Further updates to bring the GIC clocksource driver up to date. - R3000 TLB code cleanups - Improvments to the Loongson 3 platform code. - Convert pr_warning to pr_warn. - Merge a bunch of small lantiq and ralink fixes that have been staged/lingering inside the openwrt tree for a while. - Update archhelp for IP22/IP32 - Fix a number of issues for Loongson 1B. - New clocksource and clockevent driver for Loongson 1B. - Further work on clk handling for Loongson 1B. - Platform work for Broadcom BMIPS. - Error handling cleanups for TurboChannel. - Fixes and optimization to the microMIPS support. - Option to disable the FTLB. - Dump more relevant information on machine check exception - Change binfmt to allow arch to examine PT_*PROC headers - Support for new style FPU register model in O32 - VDSO randomization. - BCM47xx cleanups - BCM47xx reimplement the way the kernel accesses NVRAM information. - Random cleanups - Add support for ATH25 platforms - Remove pointless locking code in some PCI platforms. - Some improvments to EVA support - Minor Alchemy cleanup" * 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (185 commits) MIPS: Add MFHC0 and MTHC0 instructions to uasm. MIPS: Cosmetic cleanups of page table headers. MIPS: Add CP0 macros for extended EntryLo registers MIPS: Remove now unused definition of phys_t. MIPS: Replace use of phys_t with phys_addr_t. MIPS: Replace MIPS-specific 64BIT_PHYS_ADDR with generic PHYS_ADDR_T_64BIT PCMCIA: Alchemy Don't select 64BIT_PHYS_ADDR in Kconfig. MIPS: lib: memset: Clean up some MIPS{EL,EB} ifdefery MIPS: iomap: Use __mem_{read,write}{b,w,l} for MMIO MIPS: <asm/types.h> fix indentation. MAINTAINERS: Add entry for BMIPS multiplatform kernel MIPS: Enable VDSO randomization MIPS: Remove a temporary hack for debugging cache flushes in SMTC configuration MIPS: Remove declaration of obsolete arch_init_clk_ops() MIPS: atomic.h: Reformat to fit in 79 columns MIPS: Apply `.insn' to fixup labels throughout MIPS: Fix microMIPS LL/SC immediate offsets MIPS: Kconfig: Only allow 32-bit microMIPS builds MIPS: signal.c: Fix an invalid cast in ISA mode bit handling MIPS: mm: Only build one microassembler that is suitable ...
2014-12-12Merge tag 'powerpc-3.19-1' of ↵Linus Torvalds1-0/+46
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux Pull powerpc updates from Michael Ellerman: "Some nice cleanups like removing bootmem, and removal of __get_cpu_var(). There is one patch to mm/gup.c. This is the generic GUP implementation, but is only used by us and arm(64). We have an ack from Steve Capper, and although we didn't get an ack from Andrew he told us to take the patch through the powerpc tree. There's one cxl patch. This is in drivers/misc, but Greg said he was happy for us to manage fixes for it. There is an infrastructure patch to support an IPMI driver for OPAL. There is also an RTC driver for OPAL. We weren't able to get any response from the RTC maintainer, Alessandro Zummo, so in the end we just merged the driver. The usual batch of Freescale updates from Scott" * tag 'powerpc-3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: (101 commits) powerpc/powernv: Return to cpu offline loop when finished in KVM guest powerpc/book3s: Fix partial invalidation of TLBs in MCE code. powerpc/mm: don't do tlbie for updatepp request with NO HPTE fault powerpc/xmon: Cleanup the breakpoint flags powerpc/xmon: Enable HW instruction breakpoint on POWER8 powerpc/mm/thp: Use tlbiel if possible powerpc/mm/thp: Remove code duplication powerpc/mm/hugetlb: Sanity check gigantic hugepage count powerpc/oprofile: Disable pagefaults during user stack read powerpc/mm: Check for matching hpte without taking hpte lock powerpc: Drop useless warning in eeh_init() powerpc/powernv: Cleanup unused MCE definitions/declarations. powerpc/eeh: Dump PHB diag-data early powerpc/eeh: Recover EEH error on ownership change for BCM5719 powerpc/eeh: Set EEH_PE_RESET on PE reset powerpc/eeh: Refactor eeh_reset_pe() powerpc: Remove more traces of bootmem powerpc/pseries: Initialise nvram_pstore_info's buf_lock cxl: Name interrupts in /proc/interrupt cxl: Return error to PSL if IRQ demultiplexing fails & print clearer warning ...
2014-12-12Merge branch 'for-linus' of ↵Linus Torvalds3-0/+23
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Martin Schwidefsky: "The most notable change for this pull request is the ftrace rework from Heiko. It brings a small performance improvement and the ground work to support a new gcc option to replace the mcount blocks with a single nop. Two new s390 specific system calls are added to emulate user space mmio for PCI, an artifact of the how PCI memory is accessed. Two patches for the memory management with changes to common code. For KVM mm_forbids_zeropage is added which disables the empty zero page for an mm that is used by a KVM process. And an optimization, pmdp_get_and_clear_full is added analog to ptep_get_and_clear_full. Some micro optimization for the cmpxchg and the spinlock code. And as usual bug fixes and cleanups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (46 commits) s390/cputime: fix 31-bit compile s390/scm_block: make the number of reqs per HW req configurable s390/scm_block: handle multiple requests in one HW request s390/scm_block: allocate aidaw pages only when necessary s390/scm_block: use mempool to manage aidaw requests s390/eadm: change timeout value s390/mm: fix memory leak of ptlock in pmd_free_tlb s390: use local symbol names in entry[64].S s390/ptrace: always include vector registers in core files s390/simd: clear vector register pointer on fork/clone s390: translate cputime magic constants to macros s390/idle: convert open coded idle time seqcount s390/idle: add missing irq off lockdep annotation s390/debug: avoid function call for debug_sprintf_* s390/kprobes: fix instruction copy for out of line execution s390: remove diag 44 calls from cpu_relax() s390/dasd: retry partition detection s390/dasd: fix list corruption for sleep_on requests s390/dasd: fix infinite term I/O loop s390/dasd: remove unused code ...
2014-12-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds118-898/+4186
Pull networking updates from David Miller: 1) New offloading infrastructure and example 'rocker' driver for offloading of switching and routing to hardware. This work was done by a large group of dedicated individuals, not limited to: Scott Feldman, Jiri Pirko, Thomas Graf, John Fastabend, Jamal Hadi Salim, Andy Gospodarek, Florian Fainelli, Roopa Prabhu 2) Start making the networking operate on IOV iterators instead of modifying iov objects in-situ during transfers. Thanks to Al Viro and Herbert Xu. 3) A set of new netlink interfaces for the TIPC stack, from Richard Alpe. 4) Remove unnecessary looping during ipv6 routing lookups, from Martin KaFai Lau. 5) Add PAUSE frame generation support to gianfar driver, from Matei Pavaluca. 6) Allow for larger reordering levels in TCP, which are easily achievable in the real world right now, from Eric Dumazet. 7) Add a variable of napi_schedule that doesn't need to disable cpu interrupts, from Eric Dumazet. 8) Use a doubly linked list to optimize neigh_parms_release(), from Nicolas Dichtel. 9) Various enhancements to the kernel BPF verifier, and allow eBPF programs to actually be attached to sockets. From Alexei Starovoitov. 10) Support TSO/LSO in sunvnet driver, from David L Stevens. 11) Allow controlling ECN usage via routing metrics, from Florian Westphal. 12) Remote checksum offload, from Tom Herbert. 13) Add split-header receive, BQL, and xmit_more support to amd-xgbe driver, from Thomas Lendacky. 14) Add MPLS support to openvswitch, from Simon Horman. 15) Support wildcard tunnel endpoints in ipv6 tunnels, from Steffen Klassert. 16) Do gro flushes on a per-device basis using a timer, from Eric Dumazet. This tries to resolve the conflicting goals between the desired handling of bulk vs. RPC-like traffic. 17) Allow userspace to ask for the CPU upon what a packet was received/steered, via SO_INCOMING_CPU. From Eric Dumazet. 18) Limit GSO packets to half the current congestion window, from Eric Dumazet. 19) Add a generic helper so that all drivers set their RSS keys in a consistent way, from Eric Dumazet. 20) Add xmit_more support to enic driver, from Govindarajulu Varadarajan. 21) Add VLAN packet scheduler action, from Jiri Pirko. 22) Support configurable RSS hash functions via ethtool, from Eyal Perry. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1820 commits) Fix race condition between vxlan_sock_add and vxlan_sock_release net/macb: fix compilation warning for print_hex_dump() called with skb->mac_header net/mlx4: Add support for A0 steering net/mlx4: Refactor QUERY_PORT net/mlx4_core: Add explicit error message when rule doesn't meet configuration net/mlx4: Add A0 hybrid steering net/mlx4: Add mlx4_bitmap zone allocator net/mlx4: Add a check if there are too many reserved QPs net/mlx4: Change QP allocation scheme net/mlx4_core: Use tasklet for user-space CQ completion events net/mlx4_core: Mask out host side virtualization features for guests net/mlx4_en: Set csum level for encapsulated packets be2net: Export tunnel offloads only when a VxLAN tunnel is created gianfar: Fix dma check map error when DMA_API_DEBUG is enabled cxgb4/csiostor: Don't use MASTER_MUST for fw_hello call net: fec: only enable mdio interrupt before phy device link up net: fec: clear all interrupt events to support i.MX6SX net: fec: reset fep link status in suspend function net: sock: fix access via invalid file descriptor net: introduce helper macro for_each_cmsghdr ...
2014-12-12pstore-ram: Allow optional mapping with pgprot_noncachedTony Lindgren1-1/+3
On some ARMs the memory can be mapped pgprot_noncached() and still be working for atomic operations. As pointed out by Colin Cross <ccross@android.com>, in some cases you do want to use pgprot_noncached() if the SoC supports it to see a debug printk just before a write hanging the system. On ARMs, the atomic operations on strongly ordered memory are implementation defined. So let's provide an optional kernel parameter for configuring pgprot_noncached(), and use pgprot_writecombine() by default. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Rob Herring <robherring2@gmail.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Anton Vorontsov <anton@enomsg.org> Cc: Colin Cross <ccross@android.com> Cc: Olof Johansson <olof@lixom.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: stable@vger.kernel.org Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-12-12Merge tag 'sound-3.19-rc1' of ↵Linus Torvalds19-141/+383
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "This became a fairly large pull request. In addition to the usual driver updates / fixes, there have been a high amount of cleanups in ASoC area, as well as control API helpers and kernel documentations fixes touching through the whole tree. In the driver side, the biggest changes are the support for new Intel SoC found on new x86 machines, and the updates of FireWire dice and oxfw drivers. Some remarkable items are below: ALSA core: - PCM mmap code cleanup, removal of arch-dependent codes - PCM xrun injection support - PCM hwptr tracepoint support - Refactoring of snd_pcm_action(), simplification of PCM locking - Robustified sequecner auto-load functionality - New control API helpers and lots of cleanups along with them - Lots of kerneldoc fixes and cleanups USB-audio: - The mixer resume code was largely rewritten, and the devices with quirks are resumed properly. - New hardware support: Focusrite Scarlett, Digidesign Mbox1, Denon/Marantz DACs, Zoom R16/24 FireWire: - DICE driver updates with better duplex and sync support, including MIDI support - New OXFW driver for Oxford Semiconductor FW970/971 chipset, including the previous LaCie Speakers device. Fullduplex and MIDI support included as well as DICE driver. HD-audio: - Refactoring the driver-caps quirk handling in snd-hda-intel - More consistent control names representing the topology better - Fixups: HP mute LED with ALC268 codec, Ideapad S210 built-in mic fix, ASUS Z99He laptop EAPD ASoC: - Conversion of AC'97 drivers to use regmap, bringing us closer to the removal of the ASoC level I/O code - Clean up a lot of old drivers that were open coding things that have subsequently been implemented in the core - Some DAPM performance improvements - Removal of the now seldom used CODEC mutex - Lots of updates for the newer Intel SoC support, including support for the DSP and some Cherrytrail and Braswell machine drivers - Support for Samsung boards using rt5631 as the CODEC - Removal of the obsolete AFEB9260 machine driver - Driver support for the TI TS3A227E headset driver used in some Chrombeooks Others: - ASIHPI driver update and cleanups - Lots of dev_*() printk conversions - Lots of trivial cleanups for the codes spotted by Coccinelle" * tag 'sound-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (594 commits) ALSA: pcxhr: NULL dereference on probe failure ALSA: lola: NULL dereference on probe failure ALSA: hda - Add "eapd" model string for AD1986A codec ALSA: hda - Add EAPD fixup for ASUS Z99He laptop ALSA: oxfw: Add hwdep interface ALSA: oxfw: Add support for capture/playback MIDI messages ALSA: oxfw: add support for capturing PCM samples ALSA: oxfw: Add support AMDTP in-stream ALSA: oxfw: Add support for Behringer/Mackie devices ALSA: oxfw: Change the way to start stream ALSA: oxfw: Add proc interface for debugging purpose ALSA: oxfw: Change the way to make PCM rules/constraints ALSA: oxfw: Add support for AV/C stream format command to get/set supported stream formation ALSA: oxfw: Change the way to name card ALSA: dice: Add support for MIDI capture/playback ALSA: dice: Add support for capturing PCM samples ALSA: dice: Support for non SYT-Match sampling clock source mode ALSA: dice: Add support for duplex streams with synchronization ALSA: dice: Change the way to start stream ALSA: jack: Add dummy snd_jack_set_key() definition ...
2014-12-12Merge tag 'devicetree-for-linus' of ↵Linus Torvalds4-26/+108
git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux Pull devicetree changes from Grant Likely: "Lots of activity in the devicetree code for v3.18. Most of it is related to getting all of the overlay support code in place, but there are other important things in there. Highlights: - OF_RECONFIG notifiers for SPI, I2C and Platform devices. Those subsystems can now respond to live changes to the device tree. - CONFIG_OF_OVERLAY method for applying live changes to the device tree - Removal of the of_allnodes list. This used to be used to iterate over all the nodes in the device tree, but it is unnecessary because the same thing can be done by iterating over the list of child pointers. Getting rid of of_allnodes saves some memory and avoids the possibility of of_allnodes being sorted differently from the child lists. - Support for retrieving original DTB blob via sysfs. Needed by kexec. - More unittests - Documentation and minor bug fixes" * tag 'devicetree-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/glikely/linux: (42 commits) of: Delete unnecessary check before calling "of_node_put()" of: Drop ->next pointer from struct device_node spi: Check for spi_of_notifier when CONFIG_OF_DYNAMIC=y of: support passing console options with stdout-path of: add optional options parameter to of_find_node_by_path() of: Add bindings for chosen node, stdout-path of: Remove unneeded and incorrect MODULE_DEVICE_TABLE ARM: dt: fix up PL011 device tree bindings of: base, fix of_property_read_string_helper kernel-doc of: remove select of non-existant OF_DEVICE config symbol spi/of: Add OF notifier handler spi/of: Create new device registration method and accessors i2c/of: Add OF_RECONFIG notifier handler i2c/of: Factor out Devicetree registration code of/overlay: Add overlay unittests of/overlay: Introduce DT overlay support of/reconfig: Add OF_DYNAMIC notifier for platform_bus_type of/reconfig: Always use the same structure for notifiers of/reconfig: Add debug output for OF_RECONFIG notifiers of/reconfig: Add empty stubs for the of_reconfig methods ...
2014-12-11Merge tag 'fbdev-3.19' of ↵Linus Torvalds3-43/+48
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux Pull fbdev updates from Tomi Valkeinen: - support for mx6sl and mx6sx - OMAP HDMI audio rewrite to make it finally work - OMAP video PLL work to prepare for new DRA7xx SoCs - simplefb DT related improvements * tag 'fbdev-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (81 commits) video: uvesafb: Deletion of an unnecessary check before the function call "platform_device_put" video: fbdev-VIA: Deletion of an unnecessary check before the function call "framebuffer_release" video: fbdev-MMP: Deletion of an unnecessary check before the function call "mmp_unregister_path" video: mx3fb: Deletion of an unnecessary check before the function call "backlight_device_unregister" video: fbdev-OMAP2: Deletion of unnecessary checks before the function call "i2c_put_adapter" video: fbdev-SIS: Deletion of unnecessary checks before the function call "pci_dev_put" video: smscufx: Deletion of unnecessary checks before the function call "vfree" video: udlfb: Deletion of unnecessary checks before the function call "vfree" video: uvesafb: Deletion of an unnecessary check before the function call "uvesafb_free" video: fbdev-LCDC: Deletion of an unnecessary check before the function call "vfree" video: fbdev: arkfb: suppress build warning video: fbdev: s3fb: suppress build warning video: fbdev: vt8623fb: suppress build warning OMAPDSS: hdmi5: Fix bit field for IEC958_AES2_CON_SOURCE OMAPDSS: hdmi: Remove __exit qualifier from hdmi_uninit_output() OMAPDSS: hdmi5: Change hdmi_wp idlemode to to no_idle for audio playback OMAPDSS: Remove all references to obsolete HDMI audio callbacks ASoC: omap: Remove obsolete HDMI audio code and Kconfig options OMAPDSS: hdmi5: Register ASoC platform device for omap hdmi audio OMAPDSS: hdmi5: Remove callbacks for the old ASoC DAI driver ...
2014-12-11Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds12-118/+318
Pull virtio updates from Michael Tsirkin: "virtio: virtio 1.0 support, misc patches This adds a lot of infrastructure for virtio 1.0 support. Notable missing pieces: virtio pci, virtio balloon (needs spec extension), vhost scsi. Plus, there are some minor fixes in a couple of places. Note: some net drivers are affected by these patches. David said he's fine with merging these patches through my tree. Rusty's on vacation, he acked using my tree for these, too" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (70 commits) virtio_ccw: finalize_features error handling virtio_ccw: future-proof finalize_features virtio_pci: rename virtio_pci -> virtio_pci_common virtio_pci: update file descriptions and copyright virtio_pci: split out legacy device support virtio_pci: setup config vector indirectly virtio_pci: setup vqs indirectly virtio_pci: delete vqs indirectly virtio_pci: use priv for vq notification virtio_pci: free up vq->priv virtio_pci: fix coding style for structs virtio_pci: add isr field virtio: drop legacy_only driver flag virtio_balloon: drop legacy_only driver flag virtio_ccw: rev 1 devices set VIRTIO_F_VERSION_1 virtio: allow finalize_features to fail virtio_ccw: legacy: don't negotiate rev 1/features virtio: add API to detect legacy devices virtio_console: fix sparse warnings vhost: remove unnecessary forward declarations in vhost.h ...
2014-12-11Merge branch 'mailbox-devel' of ↵Linus Torvalds2-8/+11
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox framework updates from Jassi Brar. * 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration: Mailbox: Add support for Platform Communication Channel mailbox/omap: adapt to the new mailbox framework mailbox: add tx_prepare client callback mailbox: Don't unnecessarily re-arm the polling timer
2014-12-11Merge tag 'spi-v3.19' of ↵Linus Torvalds2-0/+26
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi updates from Mark Brown: "Not a huge amount going on this release, mainly new drivers (there's a couple more waiting that didn't quite make the cut for this release too): - An interface for querying if the current transfer is the last in a message, allowing controllers that need special handling for the final transfer to use the core message parsing. - Support for Amlogic Meson SPIFC, Imagination Technologies SFPI, Intel Quark X1000 and Samsung Exynos 7 controllers" * tag 'spi-v3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (38 commits) spi/s3c64xx: Remove redundant runtime PM management spi: fsl-spi: remove unused variable assignment spi: spi-fsl-spi: Return an error code in fsl_spi_do_one_msg() spi: core: Do not mangle error code from kthread_run() spi: fsl-espi: add (un)prepare_transfer_hardware calls to save power if SPI is not in use spi: fsl-(e)spi: migrate to generic master queueing spi/txx9: Deletion of an unnecessary check before the function call "clk_disable" spi: cadence: Fix 3-to-8 mux mode spi: cadence: Init HW after reading devicetree attributes spi: meson: Select REGMAP_MMIO spi: s3c64xx: add support for exynos7 SPI controller spi: spi-pxa2xx: SPI support for Intel Quark X1000 spi: meson: meson_spifc_setup_speed() can be static spi: spi-pxa2xx: Add helpers for regiseters' accessing spi: spi-mxs: Fix mapping from vmalloc-ed buffer to scatter list spi: atmel: introduce probe deferring spi: atmel: remove compat for non DT board when requesting dma chan spi: meson: Add support for Amlogic Meson SPIFC spi: meson: Add device tree bindings documentation for SPIFC spi: core: Add spi_transfer_is_last() helper ...
2014-12-11Merge tag 'media/v3.19-rc1' of ↵Linus Torvalds22-232/+406
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media updates from Mauro Carvalho Chehab: - Two new dvb frontend drivers: mn88472 and mn88473 - A new driver for some PCIe DVBSky cards - A new remote controller driver: meson-ir - One LIRC staging driver got rewritten and promoted to mainstream: igorplugusb - A new tuner driver (m88rs6000t) - The old omap2 media driver got removed from staging. This driver uses an old DMA API and it is likely broken on recent kernels. Nobody cared enough to fix it - Media bus format moved to a separate header, as DRM will also use the definitions there - mem2mem_testdev were renamed to vim2m, in order to use the same naming convention taken by the other virtual test driver (vivid) - Added a new driver for coda SoC (coda-jpeg) - The cx88 driver got converted to use videobuf2 core - Make DMABUF export buffer to work with DMA Scatter/Gather and Vmalloc cores - Lots of other fixes, improvements and cleanups on the drivers. * tag 'media/v3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (384 commits) [media] mn88473: One function call less in mn88473_init() after error [media] mn88473: Remove uneeded check before release_firmware() [media] lirc_zilog: Deletion of unnecessary checks before vfree() [media] MAINTAINERS: Add myself as img-ir maintainer [media] img-ir: Don't set driver's module owner [media] img-ir: Depend on METAG or MIPS or COMPILE_TEST [media] img-ir/hw: Drop [un]register_decoder declarations [media] img-ir/hw: Fix potential deadlock stopping timer [media] img-ir/hw: Always read data to clear buffer [media] redrat3: ensure dma is setup properly [media] ddbridge: remove unneeded check before dvb_unregister_device() [media] si2157: One function call less in si2157_init() after error [media] tuners: remove uneeded checks before release_firmware() [media] arm: omap2: rx51-peripherals: fix build warning [media] stv090x: add an extra protetion against buffer overflow [media] stv090x: Remove an unreachable code [media] stv090x: Some whitespace cleanups [media] em28xx: checkpatch cleanup: whitespaces/new lines cleanups [media] si2168: add support for firmware files in new format [media] si2168: debug printout for firmware version ...
2014-12-11net/mlx4: Add support for A0 steeringMatan Barak1-2/+15
Add the required firmware commands for A0 steering and a way to enable that. The firmware support focuses on INIT_HCA, QUERY_HCA, QUERY_PORT, QUERY_DEV_CAP and QUERY_FUNC_CAP commands. Those commands are used to configure and query the device. The different A0 DMFS (steering) modes are: Static - optimized performance, but flow steering rules are limited. This mode should be choosed explicitly by the user in order to be used. Dynamic - this mode should be explicitly choosed by the user. In this mode, the FW works in optimized steering mode as long as it can and afterwards automatically drops to classic (full) DMFS. Disable - this mode should be explicitly choosed by the user. The user instructs the system not to use optimized steering, even if the FW supports Dynamic A0 DMFS (and thus will be able to use optimized steering in Default A0 DMFS mode). Default - this mode is implicitly choosed. In this mode, if the FW supports Dynamic A0 DMFS, it'll work in this mode. Otherwise, it'll work at Disable A0 DMFS mode. Under SRIOV configuration, when the A0 steering mode is enabled, older guest VF drivers who aren't using the RX QP allocation flag (MLX4_RESERVE_A0_QP) will get a QP from the general range and fail when attempting to register a steering rule. To avoid that, the PF context behaviour is changed once on A0 static mode, to require support for the allocation flag in VF drivers too. In order to enable A0 steering, we use log_num_mgm_entry_size param. If the value of the parameter is not positive, we treat the absolute value of log_num_mgm_entry_size as a bit field. Setting bit 2 of this bit field enables static A0 steering. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11net/mlx4: Add A0 hybrid steeringMatan Barak1-2/+8
A0 hybrid steering is a form of high performance flow steering. By using this mode, mlx4 cards use a fast limited table based steering, in order to enable fast steering of unicast packets to a QP. In order to implement A0 hybrid steering we allocate resources from different zones: (1) General range (2) Special MAC-assigned QPs [RSS, Raw-Ethernet] each has its own region. When we create a rss QP or a raw ethernet (A0 steerable and BF ready) QP, we try hard to allocate the QP from range (2). Otherwise, we try hard not to allocate from this range. However, when the system is pushed to its limits and one needs every resource, the allocator uses every region it can. Meaning, when we run out of raw-eth qps, the allocator allocates from the general range (and the special-A0 area is no longer active). If we run out of RSS qps, the mechanism tries to allocate from the raw-eth QP zone. If that is also exhausted, the allocator will allocate from the general range (and the A0 region is no longer active). Note that if a raw-eth qp is allocated from the general range, it attempts to allocate the range such that bits 6 and 7 (blueflame bits) in the QP number are not set. When the feature is used in SRIOV, the VF has to notify the PF what kind of QP attributes it needs. In order to do that, along with the "Eth QP blueflame" bit, we reserve a new "A0 steerable QP". According to the combination of these bits, the PF tries to allocate a suitable QP. In order to maintain backward compatibility (with older PFs), the PF notifies which QP attributes it supports via QUERY_FUNC_CAP command. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11net/mlx4: Change QP allocation schemeEugenia Emantayev1-2/+19
When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset. The current Ethernet driver code reserves a Tx QP range with 256b alignment. This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 65 will have bits 6/7 set. This problem is not specific for the Ethernet driver, any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful. The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is: 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function 2. In the ALLOC_RES FW command, change param1 to: a. param1[23:0] - number of QPs b. param1[31-24] - flags controlling QPs reservation Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet. Bits 24-30 of the flags are currently reserved. When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation. In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11net/mlx4_core: Use tasklet for user-space CQ completion eventsMatan Barak1-0/+5
Previously, we've fired all our completion callbacks straight from our ISR. Some of those callbacks were lightweight (for example, mlx4_en's and IPoIB napi callbacks), but some of them did more work (for example, the user-space RDMA stack uverbs' completion handler). Besides that, doing more than the minimal work in ISR is generally considered wrong, it could even lead to a hard lockup of the system. Since when a lot of completion events are generated by the hardware, the loop over those events could be so long, that we'll get into a hard lockup by the system watchdog. In order to avoid that, add a new way of invoking completion events callbacks. In the interrupt itself, we add the CQs which receive completion event to a per-EQ list and schedule a tasklet. In the tasklet context we loop over all the CQs in the list and invoke the user callback. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11Merge tag 'backlight-for-linus-3.19' of ↵Linus Torvalds1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight Pull backlight updates from Lee Jones: - Clean-up leaky resources; pwm_bl - Simplify Device Tree initialisation; lp855x_bl - Add Regulator support; lp855x - Remove Bryan from the Maintainer list -- new baby, no time :) * tag 'backlight-for-linus-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight: MAINTAINERS: Remove my name from Backlight subsystem backlight: lp855x: Add supply regulator to lp855x backlight: lp855x: Refactor DT parsing code backlight: pwm: Clean-up pwm requested using legacy API
2014-12-11Merge tag 'pinctrl-v3.19-1' of ↵Linus Torvalds2-0/+186
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control changes from Linus Walleij: "Here is a stash of pin control changes I have collected for the v3.19 series. Mainly new hardware support, with Intels new embedded SoC as the especially interesting thing standing out, fully using the subsystem. - Force conversion of the ux500 pin control device trees and parsers to use the generic pin control bindings. - New driver and device tree bindings for the Qualcomm PMIC MPP pin controller and GPIO. - Some ACPI infrastructure for pin controllers. - New driver for the Intel CherryView/Braswell pin controller, the first Intel pin controller to fully take advantage of the pin control subsystem. - Support the Freescale i.MX VF610 variant. - Support the sunxi A80 variant. - Support the Samsung Exynos 4415 and Exynos 7 variants. - Split out Intel pin controllers to their own subdirectory. - A large slew of rockchip pin control updates, including suspend/resume support. - A large slew of Samsung Exynos pin controller updates. - Various minor updates and fixes" * tag 'pinctrl-v3.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (49 commits) pinctrl: at91: enhance (debugfs) at91_gpio_dbg_show pinctrl: meson: add device tree bindings documentation gpio: tz1090: Fix error handling of irq_of_parse_and_map pinctrl: tz1090-pinctrl.txt: Fix typo in binding pinctrl: pinconf-generic: Declare dt_params/conf_items const pinctrl: exynos: Add support for Exynos4415 pinctrl: exynos: Add initial driver data for Exynos7 pinctrl: exynos: Add irq_chip instance for Exynos7 wakeup interrupts pinctrl: exynos: Consolidate irq domain callbacks pinctrl: exynos: Generalize the eint16_31 demux code pinctrl: samsung: Separate per-bank init and runtime data pinctrl: samsung: Constify samsung_pin_ctrl struct pinctrl: samsung: Constify samsung_pin_bank_type struct pinctrl: samsung: Drop unused label field in samsung_pin_ctrl struct pinctrl: samsung: Make samsung_pinctrl_get_soc_data use ERR_PTR() pinctrl: Add Intel Cherryview/Braswell pin controller support gpio / ACPI: Add knowledge about pin controllers to acpi_get_gpiod() pinctrl: Fix path error in documentation pinctrl: rockchip: save and restore gpio6_c6 pinmux in suspend/resume pinctrl: rockchip: add suspend/resume functions ...
2014-12-11Merge tag 'pm+acpi-3.19-rc1' of ↵Linus Torvalds22-125/+480
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael Wysocki: "This time we have some more new material than we used to have during the last couple of development cycles. The most important part of it to me is the introduction of a unified interface for accessing device properties provided by platform firmware. It works with Device Trees and ACPI in a uniform way and drivers using it need not worry about where the properties come from as long as the platform firmware (either DT or ACPI) makes them available. It covers both devices and "bare" device node objects without struct device representation as that turns out to be necessary in some cases. This has been in the works for quite a few months (and development cycles) and has been approved by all of the relevant maintainers. On top of that, some drivers are switched over to the new interface (at25, leds-gpio, gpio_keys_polled) and some additional changes are made to the core GPIO subsystem to allow device drivers to manipulate GPIOs in the "canonical" way on platforms that provide GPIO information in their ACPI tables, but don't assign names to GPIO lines (in which case the driver needs to do that on the basis of what it knows about the device in question). That also has been approved by the GPIO core maintainers and the rfkill driver is now going to use it. Second is support for hardware P-states in the intel_pstate driver. It uses CPUID to detect whether or not the feature is supported by the processor in which case it will be enabled by default. However, it can be disabled entirely from the kernel command line if necessary. Next is support for a platform firmware interface based on ACPI operation regions used by the PMIC (Power Management Integrated Circuit) chips on the Intel Baytrail-T and Baytrail-T-CR platforms. That interface is used for manipulating power resources and for thermal management: sensor temperature reporting, trip point setting and so on. Also the ACPI core is now going to support the _DEP configuration information in a limited way. Basically, _DEP it supposed to reflect off-the-hierarchy dependencies between devices which may be very indirect, like when AML for one device accesses locations in an operation region handled by another device's driver (usually, the device depended on this way is a serial bus or GPIO controller). The support added this time is sufficient to make the ACPI battery driver work on Asus T100A, but it is general enough to be able to cover some other use cases in the future. Finally, we have a new cpufreq driver for the Loongson1B processor. In addition to the above, there are fixes and cleanups all over the place as usual and a traditional ACPICA update to a recent upstream release. As far as the fixes go, the ACPI LPSS (Low-power Subsystem) driver for Intel platforms should be able to handle power management of the DMA engine correctly, the cpufreq-dt driver should interact with the thermal subsystem in a better way and the ACPI backlight driver should handle some more corner cases, among other things. On top of the ACPICA update there are fixes for race conditions in the ACPICA's interrupt handling code which might lead to some random and strange looking failures on some systems. In the cleanups department the most visible part is the series of commits targeted at getting rid of the CONFIG_PM_RUNTIME configuration option. That was triggered by a discussion regarding the generic power domains code during which we realized that trying to support certain combinations of PM config options was painful and not really worth it, because nobody would use them in production anyway. For this reason, we decided to make CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and that lead to the conclusion that the latter became redundant and CONFIG_PM could be used instead of it. The material here makes that replacement in a major part of the tree, but there will be at least one more batch of that in the second part of the merge window. Specifics: - Support for retrieving device properties information from ACPI _DSD device configuration objects and a unified device properties interface for device drivers (and subsystems) on top of that. As stated above, this works with Device Trees and ACPI and allows device drivers to be written in a platform firmware (DT or ACPI) agnostic way. The at25, leds-gpio and gpio_keys_polled drivers are now going to use this new interface and the GPIO subsystem is additionally modified to allow device drivers to assign names to GPIO resources returned by ACPI _CRS objects (in case _DSD is not present or does not provide the expected data). The changes in this set are mostly from Mika Westerberg, Rafael J Wysocki, Aaron Lu, and Darren Hart with some fixes from others (Fabio Estevam, Geert Uytterhoeven). - Support for Hardware Managed Performance States (HWP) as described in Volume 3, section 14.4, of the Intel SDM in the intel_pstate driver. CPUID is used to detect whether or not the feature is supported by the processor. If supported, it will be enabled automatically unless the intel_pstate=no_hwp switch is present in the kernel command line. From Dirk Brandewie. - New Intel Broadwell-H ID for intel_pstate (Dirk Brandewie). - Support for firmware interface based on ACPI operation regions used by the PMIC chips on the Intel Baytrail-T and Baytrail-T-CR platforms for power resource control and thermal management (Aaron Lu). - Limited support for retrieving off-the-hierarchy dependencies between devices from ACPI _DEP device configuration objects and deferred probing support for the ACPI battery driver based on the _DEP information to make that driver work on Asus T100A (Lan Tianyu). - New cpufreq driver for the Loongson1B processor (Kelvin Cheung). - ACPICA update to upstream revision 20141107 which only affects tools (Bob Moore). - Fixes for race conditions in the ACPICA's interrupt handling code and in the ACPI code related to system suspend and resume (Lv Zheng and Rafael J Wysocki). - ACPI core fix for an RCU-related issue in the ioremap() regions management code that slowed down significantly after CPUs had been allowed to enter idle states even if they'd had RCU callbakcs queued and triggered some problems in certain proprietary graphics driver (and elsewhere). The fix replaces synchronize_rcu() in that code with synchronize_rcu_expedited() which makes the issue go away. From Konstantin Khlebnikov. - ACPI LPSS (Low-Power Subsystem) driver fix to handle power management of the DMA engine included into the LPSS correctly. The problem is that the DMA engine doesn't have ACPI PM support of its own and it simply is turned off when the last LPSS device having ACPI PM support goes into D3cold. To work around that, the PM domain used by the ACPI LPSS driver is redesigned so at least one device with ACPI PM support will be on as long as the DMA engine is in use. From Andy Shevchenko. - ACPI backlight driver fix to avoid using it on "Win8-compatible" systems where it doesn't work and where it was used by default by mistake (Aaron Lu). - Assorted minor ACPI core fixes and cleanups from Tomasz Nowicki, Sudeep Holla, Huang Rui, Hanjun Guo, Fabian Frederick, and Ashwin Chaugule (mostly related to the upcoming ARM64 support). - Intel RAPL (Running Average Power Limit) power capping driver fixes and improvements including new processor IDs (Jacob Pan). - Generic power domains modification to power up domains after attaching devices to them to meet the expectations of device drivers and bus types assuming devices to be accessible at probe time (Ulf Hansson). - Preliminary support for controlling device clocks from the generic power domains core code and modifications of the ARM/shmobile platform to use that feature (Ulf Hansson). - Assorted minor fixes and cleanups of the generic power domains core code (Ulf Hansson, Geert Uytterhoeven). - Assorted minor fixes and cleanups of the device clocks control code in the PM core (Geert Uytterhoeven, Grygorii Strashko). - Consolidation of device power management Kconfig options by making CONFIG_PM_SLEEP select CONFIG_PM_RUNTIME and removing the latter which is now redundant (Rafael J Wysocki and Kevin Hilman). That is the first batch of the changes needed for this purpose. - Core device runtime power management support code cleanup related to the execution of callbacks (Andrzej Hajda). - cpuidle ARM support improvements (Lorenzo Pieralisi). - cpuidle cleanup related to the CPUIDLE_FLAG_TIME_VALID flag and a new MAINTAINERS entry for ARM Exynos cpuidle (Daniel Lezcano and Bartlomiej Zolnierkiewicz). - New cpufreq driver callback (->ready) to be executed when the cpufreq core is ready to use a given policy object and cpufreq-dt driver modification to use that callback for cooling device registration (Viresh Kumar). - cpufreq core fixes and cleanups (Viresh Kumar, Vince Hsu, James Geboski, Tomeu Vizoso). - Assorted fixes and cleanups in the cpufreq-pcc, intel_pstate, cpufreq-dt, pxa2xx cpufreq drivers (Lenny Szubowicz, Ethan Zhao, Stefan Wahren, Petr Cvek). - OPP (Operating Performance Points) framework modification to allow OPPs to be removed too and update of a few cpufreq drivers (cpufreq-dt, exynos5440, imx6q, cpufreq) to remove OPPs (added during initialization) on driver removal (Viresh Kumar). - Hibernation core fixes and cleanups (Tina Ruchandani and Markus Elfring). - PM Kconfig fix related to CPU power management (Pankaj Dubey). - cpupower tool fix (Prarit Bhargava)" * tag 'pm+acpi-3.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (120 commits) i2c-omap / PM: Drop CONFIG_PM_RUNTIME from i2c-omap.c dmaengine / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM tools: cpupower: fix return checks for sysfs_get_idlestate_count() drivers: sh / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM e1000e / igb / PM: Eliminate CONFIG_PM_RUNTIME MMC / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM MFD / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM misc / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM media / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM input / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM leds: leds-gpio: Fix multiple instances registration without 'label' property iio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM hsi / OMAP / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM i2c-hid / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM drm / exynos / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM gpio / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM hwrandom / exynos / PM: Use CONFIG_PM in #ifdef block / PM: Replace CONFIG_PM_RUNTIME with CONFIG_PM USB / PM: Drop CONFIG_PM_RUNTIME from the USB core PM: Merge the SET*_RUNTIME_PM_OPS() macros ...
2014-12-11Merge tag 'pci-v3.19-changes' of ↵Linus Torvalds1-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI changes from Bjorn Helgaas: "Here are the PCI changes intended for v3.19. I don't think there's anything very exciting here, but there was a lot of MSI-related stuff coming via Thomas. Details: NUMA - Allow numa_node override via sysfs (Prarit Bhargava) Resource management - Restore detection of read-only BARs (Myron Stowe) - Shrink decoding-disabled window while sizing BARs (Myron Stowe) - Add informational printk for invalid BARs (Myron Stowe) - Remove fixed parameter in pci_iov_resource_bar() (Myron Stowe) MSI - Add pci_msi_ignore_mask to prevent writes to MSI/MSI-X Mask Bits (Yijing Wang) - Revert "PCI: Add x86_msi.msi_mask_irq() and msix_mask_irq()" (Yijing Wang) - s390/MSI: Use __msi_mask_irq() instead of default_msi_mask_irq() (Yijing Wang) Virtualization - xen: Process failure for pcifront_(re)scan_root() (Chen Gang) - Make FLR and AF FLR reset warning messages different (Gavin Shan) Generic host bridge driver - Allocate config space windows after limiting bus number range (Lorenzo Pieralisi) - Convert to DT resource parsing API (Lorenzo Pieralisi) Freescale Layerscape - Add Freescale Layerscape PCIe driver (Minghuan Lian) NVIDIA Tegra - Do not build on 64-bit ARM (Thierry Reding) - Add Kconfig help text (Thierry Reding) Renesas R-Car - Make rcar_pci static (Jingoo Han) Samsung Exynos - Add exynos prefix to add_pcie_port(), pcie_init() (Jingoo Han) ST Microelectronics SPEAr13xx - Add spear prefix to add_pcie_port(), pcie_init() (Jingoo Han) - Make spear13xx_add_pcie_port() __init (Jingoo Han) - Remove unnecessary OOM message (Jingoo Han) TI DRA7xx - Add dra7xx prefix to add_pcie_port() (Jingoo Han) - Make dra7xx_add_pcie_port() __init (Jingoo Han) TI Keystone - Make ks_dw_pcie_msi_domain_ops static (Jingoo Han) - Remove unnecessary OOM message (Jingoo Han) Miscellaneous - Delete unnecessary NULL pointer checks (Markus Elfring) - Remove unused to_hotplug_slot() (Gavin Shan) - Whitespace cleanup (Jingoo Han) - Simplify if-return sequences (Quentin Lambert)" * tag 'pci-v3.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (28 commits) PCI: Remove fixed parameter in pci_iov_resource_bar() PCI: Add informational printk for invalid BARs PCI: tegra: Add Kconfig help text PCI: tegra: Do not build on 64-bit ARM PCI: spear: Remove unnecessary OOM message PCI: mvebu: Add a blank line after declarations PCI: designware: Add a blank line after declarations PCI: exynos: Remove unnecessary return statement PCI: imx6: Use tabs for indentation PCI: keystone: Remove unnecessary OOM message PCI: Remove unused and broken to_hotplug_slot() PCI: Make FLR and AF FLR reset warning messages different PCI: dra7xx: Add __init annotation to dra7xx_add_pcie_port() PCI: spear: Add __init annotation to spear13xx_add_pcie_port() PCI: spear: Rename add_pcie_port(), pcie_init() to spear13xx_add_pcie_port(), etc. PCI: dra7xx: Rename add_pcie_port() to dra7xx_add_pcie_port() PCI: layerscape: Add Freescale Layerscape PCIe driver PCI: Simplify if-return sequences PCI: Delete unnecessary NULL pointer checks PCI: Shrink decoding-disabled window while sizing BARs ...
2014-12-11Merge tag 'trace-seq-buf-3.19' of ↵Linus Torvalds4-7/+165
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull nmi-safe seq_buf printk update from Steven Rostedt: "This code is a fork from the trace-3.19 pull as it needed the trace_seq clean ups from that branch. This code solves the issue of performing stack dumps from NMI context. The issue is that printk() is not safe from NMI context as if the NMI were to trigger when a printk() was being performed, the NMI could deadlock from the printk() internal locks. This has been seen in practice. With lots of review from Petr Mladek, this code went through several iterations, and we feel that it is now at a point of quality to be accepted into mainline. Here's what is contained in this patch set: - Creates a "seq_buf" generic buffer utility that allows a descriptor to be passed around where functions can write their own "printk()" formatted strings into it. The generic version was pulled out of the trace_seq() code that was made specifically for tracing. - The seq_buf code was change to model the seq_file code. I have a patch (not included for 3.19) that converts the seq_file.c code over to use seq_buf.c like the trace_seq.c code does. This was done to make sure that seq_buf.c is compatible with seq_file.c. I may try to get that patch in for 3.20. - The seq_buf.c file was moved to lib/ to remove it from being dependent on CONFIG_TRACING. - The printk() was updated to allow for a per_cpu "override" of the internal calls. That is, instead of writing to the console, a call to printk() may do something else. This made it easier to allow the NMI to change what printk() does in order to call dump_stack() without needing to update that code as well. - Finally, the dump_stack from all CPUs via NMI code was converted to use the seq_buf code. The caller to trigger the NMI code would wait till all the NMIs finished, and then it would print the seq_buf data to the console safely from a non NMI context One added bonus is that this code also makes the NMI dump stack work on PREEMPT_RT kernels. As printk() includes sleeping locks on PREEMPT_RT, printk() only writes to console if the console does not use any rt_mutex converted spin locks. Which a lot do" * tag 'trace-seq-buf-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: x86/nmi: Fix use of unallocated cpumask_var_t printk/percpu: Define printk_func when printk is not defined x86/nmi: Perform a safe NMI stack trace on all CPUs printk: Add per_cpu printk func to allow printk to be diverted seq_buf: Move the seq_buf code to lib/ seq-buf: Make seq_buf_bprintf() conditional on CONFIG_BINARY_PRINTF tracing: Add seq_buf_get_buf() and seq_buf_commit() helper functions tracing: Have seq_buf use full buffer seq_buf: Add seq_buf_can_fit() helper function tracing: Add paranoid size check in trace_printk_seq() tracing: Use trace_seq_used() and seq_buf_used() instead of len tracing: Clean up tracing_fill_pipe_page() seq_buf: Create seq_buf_used() to find out how much was written tracing: Add a seq_buf_clear() helper and clear len and readpos in init tracing: Convert seq_buf fields to be like seq_file fields tracing: Convert seq_buf_path() to be like seq_path() tracing: Create seq_buf layer in trace_seq
2014-12-11Merge tag 'trace-3.19' of ↵Linus Torvalds4-30/+74
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "There was a lot of clean ups and minor fixes. One of those clean ups was to the trace_seq code. It also removed the return values to the trace_seq_*() functions and use trace_seq_has_overflowed() to see if the buffer filled up or not. This is similar to work being done to the seq_file code as well in another tree. Some of the other goodies include: - Added some "!" (NOT) logic to the tracing filter. - Fixed the frame pointer logic to the x86_64 mcount trampolines - Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems. That is, the ftrace trampoline can be dynamically allocated and be called directly by functions that only have a single hook to them" * tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (55 commits) tracing: Truncated output is better than nothing tracing: Add additional marks to signal very large time deltas Documentation: describe trace_buf_size parameter more accurately tracing: Allow NOT to filter AND and OR clauses tracing: Add NOT to filtering logic ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter ftrace/x86: Get rid of ftrace_caller_setup ftrace/x86: Have save_mcount_regs macro also save stack frames if needed ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs ftrace/x86: Simplify save_mcount_regs on getting RIP ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file ftrace/x86: Have static tracing also use ftrace_caller_setup ftrace/x86: Have static function tracing always test for function graph kprobes: Add IPMODIFY flag to kprobe_ftrace_ops ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict kprobes/ftrace: Recover original IP if pre_handler doesn't change it tracing/trivial: Fix typos and make an int into a bool tracing: Deletion of an unnecessary check before iput() ...
2014-12-11net: introduce helper macro for_each_cmsghdrGu Zheng1-0/+4
Introduce helper macro for_each_cmsghdr as a wrapper of the enumerating cmsghdr from msghdr, just cleanup. Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-11Merge branch 'akpm' (patchbomb from Andrew)Linus Torvalds21-417/+211
Merge first patchbomb from Andrew Morton: - a few minor cifs fixes - dma-debug upadtes - ocfs2 - slab - about half of MM - procfs - kernel/exit.c - panic.c tweaks - printk upates - lib/ updates - checkpatch updates - fs/binfmt updates - the drivers/rtc tree - nilfs - kmod fixes - more kernel/exit.c - various other misc tweaks and fixes * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits) exit: pidns: fix/update the comments in zap_pid_ns_processes() exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting exit: exit_notify: re-use "dead" list to autoreap current exit: reparent: call forget_original_parent() under tasklist_lock exit: reparent: avoid find_new_reaper() if no children exit: reparent: introduce find_alive_thread() exit: reparent: introduce find_child_reaper() exit: reparent: document the ->has_child_subreaper checks exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper() exit: reparent: fix the cross-namespace PR_SET_CHILD_SUBREAPER reparenting exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting exit: proc: don't try to flush /proc/tgid/task/tgid exit: release_task: fix the comment about group leader accounting exit: wait: drop tasklist_lock before psig->c* accounting exit: wait: don't use zombie->real_parent exit: wait: cleanup the ptrace_reparented() checks usermodehelper: kill the kmod_thread_locker logic usermodehelper: don't use CLONE_VFORK for ____call_usermodehelper() fs/hfs/catalog.c: fix comparison bug in hfs_cat_keycmp nilfs2: fix the nilfs_iget() vs. nilfs_new_inode() races ...
2014-12-11printk: add and use LOGLEVEL_<level> defines for KERN_<LEVEL> equivalentsJoe Perches1-0/+13
Use #defines instead of magic values. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jason Baron <jbaron@akamai.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11printk: remove used-once early_vprintkJoe Perches1-1/+0
Eliminate the unlikely possibility of message interleaving for early_printk/early_vprintk use. early_vprintk can be done via the %pV extension so remove this unnecessary function and change early_printk to have the equivalent vprintk code. All uses of early_printk already end with a newline so also remove the unnecessary newline from the early_printk function. Signed-off-by: Joe Perches <joe@perches.com> Acked-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11kernel: add panic_on_warnPrarit Bhargava2-0/+2
There have been several times where I have had to rebuild a kernel to cause a panic when hitting a WARN() in the code in order to get a crash dump from a system. Sometimes this is easy to do, other times (such as in the case of a remote admin) it is not trivial to send new images to the user. A much easier method would be a switch to change the WARN() over to a panic. This makes debugging easier in that I can now test the actual image the WARN() was seen on and I do not have to engage in remote debugging. This patch adds a panic_on_warn kernel parameter and /proc/sys/kernel/panic_on_warn calls panic() in the warn_slowpath_common() path. The function will still print out the location of the warning. An example of the panic_on_warn output: The first line below is from the WARN_ON() to output the WARN_ON()'s location. After that the panic() output is displayed. WARNING: CPU: 30 PID: 11698 at /home/prarit/dummy_module/dummy-module.c:25 init_dummy+0x1f/0x30 [dummy_module]() Kernel panic - not syncing: panic_on_warn set ... CPU: 30 PID: 11698 Comm: insmod Tainted: G W OE 3.17.0+ #57 Hardware name: Intel Corporation S2600CP/S2600CP, BIOS RMLSDP.86I.00.29.D696.1311111329 11/11/2013 0000000000000000 000000008e3f87df ffff88080f093c38 ffffffff81665190 0000000000000000 ffffffff818aea3d ffff88080f093cb8 ffffffff8165e2ec ffffffff00000008 ffff88080f093cc8 ffff88080f093c68 000000008e3f87df Call Trace: [<ffffffff81665190>] dump_stack+0x46/0x58 [<ffffffff8165e2ec>] panic+0xd0/0x204 [<ffffffffa038e05f>] ? init_dummy+0x1f/0x30 [dummy_module] [<ffffffff81076b90>] warn_slowpath_common+0xd0/0xd0 [<ffffffffa038e040>] ? dummy_greetings+0x40/0x40 [dummy_module] [<ffffffff81076c8a>] warn_slowpath_null+0x1a/0x20 [<ffffffffa038e05f>] init_dummy+0x1f/0x30 [dummy_module] [<ffffffff81002144>] do_one_initcall+0xd4/0x210 [<ffffffff811b52c2>] ? __vunmap+0xc2/0x110 [<ffffffff810f8889>] load_module+0x16a9/0x1b30 [<ffffffff810f3d30>] ? store_uevent+0x70/0x70 [<ffffffff810f49b9>] ? copy_module_from_fd.isra.44+0x129/0x180 [<ffffffff810f8ec6>] SyS_finit_module+0xa6/0xd0 [<ffffffff8166cf29>] system_call_fastpath+0x12/0x17 Successfully tested by me. hpa said: There is another very valid use for this: many operators would rather a machine shuts down than being potentially compromised either functionally or security-wise. Signed-off-by: Prarit Bhargava <prarit@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Fabian Frederick <fabf@skynet.be> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11include/linux/file.h: remove get_unused_fd() macroYann Droneaud1-1/+0
Macro get_unused_fd() is used to allocate a file descriptor with default flags. Those default flags (0) don't enable close-on-exec. This can be seen as an unsafe default: in most case close-on-exec should be enabled to not leak file descriptor across exec(). It would be better to have a "safer" default set of flags, eg. O_CLOEXEC must be used to enable close-on-exec. Instead this patch removes get_unused_fd() so that out of tree modules won't be affect by a runtime behavor change which might introduce other kind of bugs: it's better to catch the change at build time, making it easier to fix. Removing the macro will also promote use of get_unused_fd_flags() (or anon_inode_getfd()) with flags provided by userspace. Or, if flags cannot be given by userspace, with flags set to O_CLOEXEC by default. Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11exit: ptrace: shift "reap dead" code from exit_ptrace() to ↵Oleg Nesterov1-1/+1
forget_original_parent() Now that forget_original_parent() uses ->ptrace_entry for EXIT_DEAD tasks, we can simply pass "dead_children" list to exit_ptrace() and remove another release_task() loop. Plus this way we do not need to drop and reacquire tasklist_lock. Also shift the list_empty(ptraced) check, if we want this optimization it makes sense to eliminate the function call altogether. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com>, Cc: Sterling Alexander <stalexan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Roland McGrath <roland@hack.frob.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11mm: move page->mem_cgroup bad page handling into generic codeJohannes Weiner1-17/+0
Now that the external page_cgroup data structure and its lookup is gone, let the generic bad_page() check for page->mem_cgroup sanity. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: David S. Miller <davem@davemloft.net> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Tejun Heo <tj@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11mm: page_cgroup: rename file to mm/swap_cgroup.cJohannes Weiner1-3/+5
Now that the external page_cgroup data structure and its lookup is gone, the only code remaining in there is swap slot accounting. Rename it and move the conditional compilation into mm/Makefile. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Tejun Heo <tj@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11mm: embed the memcg pointer directly into struct pageJohannes Weiner4-70/+6
Memory cgroups used to have 5 per-page pointers. To allow users to disable that amount of overhead during runtime, those pointers were allocated in a separate array, with a translation layer between them and struct page. There is now only one page pointer remaining: the memcg pointer, that indicates which cgroup the page is associated with when charged. The complexity of runtime allocation and the runtime translation overhead is no longer justified to save that *potential* 0.19% of memory. With CONFIG_SLUB, page->mem_cgroup actually sits in the doubleword padding after the page->private member and doesn't even increase struct page, and then this patch actually saves space. Remaining users that care can still compile their kernels without CONFIG_MEMCG. text data bss dec hex filename 8828345 1725264 983040 11536649 b00909 vmlinux.old 8827425 1725264 966656 11519345 afc571 vmlinux.new [mhocko@suse.cz: update Documentation/cgroups/memory.txt] Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Hocko <mhocko@suse.cz> Acked-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Cc: Michal Hocko <mhocko@suse.cz> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Tejun Heo <tj@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Konstantin Khlebnikov <koct9i@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11mm, memcg: fix potential undefined behaviour in page stat accountingMichal Hocko1-3/+3
Since commit d7365e783edb ("mm: memcontrol: fix missed end-writeback page accounting") mem_cgroup_end_page_stat consumes locked and flags variables directly rather than via pointers which might trigger C undefined behavior as those variables are initialized only in the slow path of mem_cgroup_begin_page_stat. Although mem_cgroup_end_page_stat handles parameters correctly and touches them only when they hold a sensible value it is caller which loads a potentially uninitialized value which then might allow compiler to do crazy things. I haven't seen any warning from gcc and it seems that the current version (4.9) doesn't exploit this type undefined behavior but Sasha has reported the following: UBSan: Undefined behaviour in mm/rmap.c:1084:2 load of value 255 is not a valid value for type '_Bool' CPU: 4 PID: 8304 Comm: rngd Not tainted 3.18.0-rc2-next-20141029-sasha-00039-g77ed13d-dirty #1427 Call Trace: dump_stack (lib/dump_stack.c:52) ubsan_epilogue (lib/ubsan.c:159) __ubsan_handle_load_invalid_value (lib/ubsan.c:482) page_remove_rmap (mm/rmap.c:1084 mm/rmap.c:1096) unmap_page_range (./arch/x86/include/asm/atomic.h:27 include/linux/mm.h:463 mm/memory.c:1146 mm/memory.c:1258 mm/memory.c:1279 mm/memory.c:1303) unmap_single_vma (mm/memory.c:1348) unmap_vmas (mm/memory.c:1377 (discriminator 3)) exit_mmap (mm/mmap.c:2837) mmput (kernel/fork.c:659) do_exit (./arch/x86/include/asm/thread_info.h:168 kernel/exit.c:462 kernel/exit.c:747) do_group_exit (include/linux/sched.h:775 kernel/exit.c:873) SyS_exit_group (kernel/exit.c:901) tracesys_phase2 (arch/x86/kernel/entry_64.S:529) Fix this by using pointer parameters for both locked and flags and be more robust for future compiler changes even though the current code is implemented correctly. Signed-off-by: Michal Hocko <mhocko@suse.cz> Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-11mm: memcontrol: drop bogus RCU locking from mem_cgroup_same_or_subtree()Johannes Weiner1-7/+6
None of the mem_cgroup_same_or_subtree() callers actually require it to take the RCU lock, either because they hold it themselves or they have css references. Remove it. To make the API change clear, rename the leftover helper to mem_cgroup_is_descendant() to match cgroup_is_descendant(). Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>