summaryrefslogtreecommitdiff
path: root/include
AgeCommit message (Collapse)AuthorFilesLines
2020-04-23net: napi: add hard irqs deferral featureEric Dumazet1-0/+2
Back in commit 3b47d30396ba ("net: gro: add a per device gro flush timer") we added the ability to arm one high resolution timer, that we used to keep not-complete packets in GRO engine a bit longer, hoping that further frames might be added to them. Since then, we added the napi_complete_done() interface, and commit 364b6055738b ("net: busy-poll: return busypolling status to drivers") allowed drivers to avoid re-arming NIC interrupts if we made a promise that their NAPI poll() handler would be called in the near future. This infrastructure can be leveraged, thanks to a new device parameter, which allows to arm the napi hrtimer, instead of re-arming the device hard IRQ. We have noticed that on some servers with 32 RX queues or more, the chit-chat between the NIC and the host caused by IRQ delivery and re-arming could hurt throughput by ~20% on 100Gbit NIC. In contrast, hrtimers are using local (percpu) resources and might have lower cost. The new tunable, named napi_defer_hard_irqs, is placed in the same hierarchy than gro_flush_timeout (/sys/class/net/ethX/) By default, both gro_flush_timeout and napi_defer_hard_irqs are zero. This patch does not change the prior behavior of gro_flush_timeout if used alone : NIC hard irqs should be rearmed as before. One concrete usage can be : echo 20000 >/sys/class/net/eth1/gro_flush_timeout echo 10 >/sys/class/net/eth1/napi_defer_hard_irqs If at least one packet is retired, then we will reset napi counter to 10 (napi_defer_hard_irqs), ensuring at least 10 periodic scans of the queue. On busy queues, this should avoid NIC hard IRQ, while before this patch IRQ avoidance was only possible if napi->poll() was exhausting its budget and not call napi_complete_done(). This feature also can be used to work around some non-optimal NIC irq coalescing strategies. Having the ability to insert XX usec delays between each napi->poll() can increase cache efficiency, since we increase batch sizes. It also keeps serving cpus not idle too long, reducing tail latencies. Co-developed-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-23ipv6: Honor all IPv6 PIO Valid Lifetime valuesFernando Gont1-2/+0
RFC4862 5.5.3 e) prevents received Router Advertisements from reducing the Valid Lifetime of configured addresses to less than two hours, thus preventing hosts from reacting to the information provided by a router that has positive knowledge that a prefix has become invalid. This patch makes hosts honor all Valid Lifetime values, as per draft-gont-6man-slaac-renum-06, Section 4.2. This is meant to help mitigate the problem discussed in draft-ietf-v6ops-slaac-renum. Note: Attacks aiming at disabling an advertised prefix via a Valid Lifetime of 0 are not really more harmful than other attacks that can be performed via forged RA messages, such as those aiming at completely disabling a next-hop router via an RA that advertises a Router Lifetime of 0, or performing a Denial of Service (DoS) attack by advertising illegitimate prefixes via forged PIOs. In scenarios where RA-based attacks are of concern, proper mitigations such as RA-Guard [RFC6105] [RFC7113] should be implemented. Signed-off-by: Fernando Gont <fgont@si6networks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-23xdp: export the DEV_MAP_BULK_SIZE macroIoana Ciornei1-0/+2
Export the DEV_MAP_BULK_SIZE macro to the header file so that drivers can directly use it as the maximum number of xdp_frames received in the .ndo_xdp_xmit() callback. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-23net: mdio: of: export part of of_mdiobus_register_phy()Oleksij Rempel1-1/+10
This function will be needed in tja11xx driver for secondary PHY support. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: qrtr: Add tracepoint supportManivannan Sadhasivam1-0/+115
Add tracepoint support for QRTR with NS as the first candidate. Later on this can be extended to core QRTR and transport drivers. The trace_printk() used in NS has been replaced by tracepoints. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: phy: add device-managed devm_mdiobus_registerHeiner Kallweit1-0/+17
If there's no special ordering requirement for mdiobus_unregister(), then driver code can be simplified by using a device-managed version of mdiobus_register(). Prerequisite is that bus allocation has been done device-managed too. Else mdiobus_free() may be called whilst bus is still registered, resulting in a BUG_ON(). Therefore let devm_mdiobus_register() return -EPERM if bus was allocated non-managed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: phy: add Broadcom BCM54140 supportMichael Walle1-0/+1
The Broadcom BCM54140 is a Quad SGMII/QSGMII Copper/Fiber Gigabit Ethernet transceiver. This also adds support for tunables to set and get downshift and energy detect auto power-down. The PHY has four ports and each port has its own PHY address. There are per-port registers as well as global registers. Unfortunately, the global registers can only be accessed by reading and writing from/to the PHY address of the first port. Further, there is no way to find out what port you actually are by just reading the per-port registers. We therefore, have to scan the bus on the PHY probe to determine the port and thus what address we need to access the global registers. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: phy: broadcom: add helper to write/read RDB registersMichael Walle1-0/+3
RDB (Register Data Base) registers are used on newer Broadcom PHYs. Add helper to read, write and modify these registers. Signed-off-by: Michael Walle <michael@walle.cc> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: mscc: ocelot: support 4 PTP programmable pinsYangbo Lu2-0/+7
Support 4 PTP programmable pins with only PTP_PF_PEROUT function for now. The PTP_PF_EXTTS function will be supported in the future, and it should be implemented separately for Felix and Ocelot, because of different hardware interrupt implementation in them. Since the hardware is not able to support absolute start time, the periodic clock request only allows start time 0 0. But nsec could be accepted for PPS case for phase adjustment. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: mscc: ocelot: add wave programming registers definitionsYangbo Lu2-0/+4
Add wave programming registers definitions for Ocelot platforms. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: mscc: ocelot: redefine PTP pinsYangbo Lu1-4/+5
There are 5 PTP_PINS register groups on Ocelot switch. Except the one used for TOD operations, there are still 4 register groups for programmable pins. So redefine the 4 programmable pins. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-22net: mscc: ocelot: move ocelot ptp clock code out of ocelot.cYangbo Lu2-1/+52
The Ocelot PTP clock driver had been embedded into ocelot.c driver. It had supported basic gettime64/settime64/adjtime/adjfine functions by now which were used by both Ocelot switch and Felix switch. This patch is to move current ptp clock code out of ocelot.c driver maintaining as a single ocelot_ptp.c. For futher new features implementation, the common code could be put in ocelot_ptp.c and the switch specific code should be in specific switch driver. The interrupt implementation in SoC is different between Ocelot and Felix. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21kernel/module: Hide vermagic header file from general useLeon Romanovsky1-0/+5
VERMAGIC* definitions are not supposed to be used by the drivers, see this [1] bug report, so introduce special define to guard inclusion of this header file and define it in kernel/modules.h and in internal script that generates *.mod.c files. In-tree module build: ➜ kernel git:(vermagic) ✗ make clean ➜ kernel git:(vermagic) ✗ make M=drivers/infiniband/hw/mlx5 ➜ kernel git:(vermagic) ✗ modinfo drivers/infiniband/hw/mlx5/mlx5_ib.ko filename: /images/leonro/src/kernel/drivers/infiniband/hw/mlx5/mlx5_ib.ko <...> vermagic: 5.6.0+ SMP mod_unload modversions Out-of-tree module build: ➜ mlx5 make -C /images/leonro/src/kernel clean M=/tmp/mlx5 ➜ mlx5 make -C /images/leonro/src/kernel M=/tmp/mlx5 ➜ mlx5 modinfo /tmp/mlx5/mlx5_ib.ko filename: /tmp/mlx5/mlx5_ib.ko <...> vermagic: 5.6.0+ SMP mod_unload modversions [1] https://lore.kernel.org/lkml/20200411155623.GA22175@zn.tnic Reported-by: Borislav Petkov <bp@suse.de> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Jessica Yu <jeyu@kernel.org> Co-developed-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21Merge tag 'mlx5-updates-2020-04-20' of ↵David S. Miller1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-04-20 This series includes misc updates and clean ups to mlx5 driver: 1) improve some comments from Hu Haowen. 2) Handles errors of netif_set_real_num_{tx,rx}_queues, from Maxim 3) IPsec and FPGA related code cleanup to prepare for ASIC devices IPsec offloads, from Raed 4) Allow partial mask for tunnel options, from Roi. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-21net/mlx5: Refactor mlx5_accel_esp_create_hw_context parameter listRaed Salem1-0/+12
Currently the FPGA IPsec is the only hw implementation of the IPsec acceleration api, and so the mlx5_accel_esp_create_hw_context was wrongly made to suit this HW api, among other in its parameter list and some of its parameter endianness. This implementation might not be suitable for different HW. Refactor by group and pass all function arguments of mlx5_accel_esp_create_hw_context in common mlx5_accel_esp_xfrm_attrs struct field of mlx5_accel_esp_xfrm struct and correct the endianness according to the HW being called. Signed-off-by: Raed Salem <raeds@mellanox.com> Reviewed-by: Boris Pismenny <borisp@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-04-20net: Add IF_OPER_TESTINGAndrew Lunn2-0/+42
RFC 2863 defines the operational state testing. Add support for this state, both as a IF_LINK_MODE_ and __LINK_STATE_. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-19net: phy: broadcom: Add support for BCM53125 internal PHYsFlorian Fainelli1-0/+1
BCM53125 has internal Gigabit PHYs which support interrupts as well as statistics, make it possible to configure both of those features with a PHY driver entry. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-17Merge branch 'for-upstream' of ↵David S. Miller4-3/+45
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2020-04-17 Here's the first bluetooth-next pull request for the 5.8 kernel: - Added debugfs option to control MITM flag usage during pairing - Added new BT_MODE socket option - Added support for Qualcom QCA6390 device - Added support for Realtek RTL8761B device - Added support for mSBC audio codec over USB endpoints - Added framework for Microsoft HCI vendor extensions - Added new Read Security Information management command - Fixes/cleanup to link layer privacy related code - Various other smaller cleanups & fixes ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds7-5/+21
Pull networking fixes from David Miller: 1) Disable RISCV BPF JIT builds when !MMU, from Björn Töpel. 2) nf_tables leaves dangling pointer after free, fix from Eric Dumazet. 3) Out of boundary write in __xsk_rcv_memcpy(), fix from Li RongQing. 4) Adjust icmp6 message source address selection when routes have a preferred source address set, from Tim Stallard. 5) Be sure to validate HSR protocol version when creating new links, from Taehee Yoo. 6) CAP_NET_ADMIN should be sufficient to manage l2tp tunnels even in non-initial namespaces, from Michael Weiß. 7) Missing release firmware call in mlx5, from Eran Ben Elisha. 8) Fix variable type in macsec_changelink(), caught by KASAN. Fix from Taehee Yoo. 9) Fix pause frame negotiation in marvell phy driver, from Clemens Gruber. 10) Record RX queue early enough in tun packet paths such that XDP programs will see the correct RX queue index, from Gilberto Bertin. 11) Fix double unlock in mptcp, from Florian Westphal. 12) Fix offset overflow in ARM bpf JIT, from Luke Nelson. 13) marvell10g needs to soft reset PHY when coming out of low power mode, from Russell King. 14) Fix MTU setting regression in stmmac for some chip types, from Florian Fainelli. * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits) amd-xgbe: Use __napi_schedule() in BH context mISDN: make dmril and dmrim static net: stmmac: dwmac-sunxi: Provide TX and RX fifo sizes net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode tipc: fix incorrect increasing of link window Documentation: Fix tcp_challenge_ack_limit default value net: tulip: make early_486_chipsets static dt-bindings: net: ethernet-phy: add desciption for ethernet-phy-id1234.d400 ipv6: remove redundant assignment to variable err net/rds: Use ERR_PTR for rds_message_alloc_sgs() net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridge selftests/bpf: Check for correct program attach/detach in xdp_attach test libbpf: Fix type of old_fd in bpf_xdp_set_link_opts libbpf: Always specify expected_attach_type on program load if supported xsk: Add missing check on user supplied headroom size mac80211: fix channel switch trigger from unknown mesh peer mac80211: fix race in ieee80211_register_hw() net: marvell10g: soft-reset the PHY when coming out of low power net: marvell10g: report firmware version net/cxgb4: Check the return from t4_query_params properly ...
2020-04-15net: mscc: ocelot: fix untagged packet drops when enslaving to vlan aware bridgeVladimir Oltean1-1/+3
To rehash a previous explanation given in commit 1c44ce560b4d ("net: mscc: ocelot: fix vlan_filtering when enslaving to bridge before link is up"), the switch driver operates the in a mode where a single VLAN can be transmitted as untagged on a particular egress port. That is the "native VLAN on trunk port" use case. The configuration for this native VLAN is driven in 2 ways: - Set the egress port rewriter to strip the VLAN tag for the native VID (as it is egress-untagged, after all). - Configure the ingress port to drop untagged and priority-tagged traffic, if there is no native VLAN. The intention of this setting is that a trunk port with no native VLAN should not accept untagged traffic. Since both of the above configurations for the native VLAN should only be done if VLAN awareness is requested, they are actually done from the ocelot_port_vlan_filtering function, after the basic procedure of toggling the VLAN awareness flag of the port. But there's a problem with that simplistic approach: we are trying to juggle with 2 independent variables from a single function: - Native VLAN of the port - its value is held in port->vid. - VLAN awareness state of the port - currently there are some issues here, more on that later*. The actual problem can be seen when enslaving the switch ports to a VLAN filtering bridge: 0. The driver configures a pvid of zero for each port, when in standalone mode. While the bridge configures a default_pvid of 1 for each port that gets added as a slave to it. 1. The bridge calls ocelot_port_vlan_filtering with vlan_aware=true. The VLAN-filtering-dependent portion of the native VLAN configuration is done, considering that the native VLAN is 0. 2. The bridge calls ocelot_vlan_add with vid=1, pvid=true, untagged=true. The native VLAN changes to 1 (change which gets propagated to hardware). 3. ??? - nobody calls ocelot_port_vlan_filtering again, to reapply the VLAN-filtering-dependent portion of the native VLAN configuration, for the new native VLAN of 1. One can notice that after toggling "ip link set dev br0 type bridge vlan_filtering 0 && ip link set dev br0 type bridge vlan_filtering 1", the new native VLAN finally makes it through and untagged traffic finally starts flowing again. But obviously that shouldn't be needed. So it is clear that 2 independent variables need to both re-trigger the native VLAN configuration. So we introduce the second variable as ocelot_port->vlan_aware. *Actually both the DSA Felix driver and the Ocelot driver already had each its own variable: - Ocelot: ocelot_port_private->vlan_aware - Felix: dsa_port->vlan_filtering but the common Ocelot library needs to work with a single, common, variable, so there is some refactoring done to move the vlan_aware property from the private structure into the common ocelot_port structure. Fixes: 97bb69e1e36e ("net: mscc: ocelot: break apart ocelot_vlan_port_apply") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-04-15Bluetooth: Clear HCI_LL_RPA_RESOLUTION flag on resetMarcel Holtmann1-0/+1
When the controller is being reset or power cycled, then the flag HCI_LL_RPA_RESOLUTION which indicates if controller based address resolution is active needs to be also reset. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2020-04-15Bluetooth: Enable LE Enhanced Connection Complete event.Marcel Holtmann1-0/+1
In case LL Privacy is supported by the controller, it is also a good idea to use the LE Enhanced Connection Complete event for getting all information about the new connection and its addresses. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2020-04-15Bluetooth: Sort list of LE features constantsMarcel Holtmann1-3/+1
The list of LE features constants has gotten a bit confused. It lost the order and gained duplicated. Clean this up. Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2020-04-14Merge tag 'hyperv-fixes-signed' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - a series from Tianyu Lan to fix crash reporting on Hyper-V - three miscellaneous cleanup patches * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/Hyper-V: Report crash data in die() when panic_on_oops is set x86/Hyper-V: Report crash register data when sysctl_record_panic_msg is not set x86/Hyper-V: Report crash register data or kmsg before running crash kernel x86/Hyper-V: Trigger crash enlightenment only once during system crash. x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump x86/Hyper-V: Unload vmbus channel in hv panic callback x86: hyperv: report value of misc_features hv_debugfs: Make hv_debug_root static hv: hyperv_vmbus.h: Replace zero-length array with flexible-array member
2020-04-14Merge tag 'for-5.7-rc1-tag' of ↵Linus Torvalds1-6/+4
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "We have a few regressions and one fix for stable: - revert fsync optimization - fix lost i_size update - fix a space accounting leak - build fix, add back definition of a deprecated ioctl flag - fix search condition for old roots in relocation" * tag 'for-5.7-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: re-instantiate the removed BTRFS_SUBVOL_CREATE_ASYNC definition btrfs: fix reclaim counter leak of space_info objects btrfs: make full fsyncs always operate on the entire file again btrfs: fix lost i_size update after cloning inline extent btrfs: check commit root generation in should_ignore_root
2020-04-14cfg80211: fix kernel-doc notationLothar Rubusch1-0/+10
Update missing kernel-doc annotations and fix of related warnings at 'make htmldocs'. Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com> Link: https://lore.kernel.org/r/20200408231013.28370-1-l.rubusch@gmail.com [fix indentation, attribute references] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2020-04-12Merge tag 'locking-urgent-2020-04-12' of ↵Linus Torvalds1-5/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fixes from Thomas Gleixner: "Three small fixes/updates for the locking core code: - Plug a task struct reference leak in the percpu rswem implementation. - Document the refcount interaction with PID_MAX_LIMIT - Improve the 'invalid wait context' data dump in lockdep so it contains all information which is required to decode the problem" * tag 'locking-urgent-2020-04-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: locking/lockdep: Improve 'invalid wait context' splat locking/refcount: Document interaction with PID_MAX_LIMIT locking/percpu-rwsem: Fix a task_struct refcount
2020-04-11Merge tag 'kbuild-v5.7-2' of ↵Linus Torvalds1-4/+2
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull more Kbuild updates from Masahiro Yamada: - raise minimum supported binutils version to 2.23 - remove old CONFIG_AS_* macros that we know binutils >= 2.23 supports - move remaining CONFIG_AS_* tests to Kconfig from Makefile - enable -Wtautological-compare warnings to catch more issues - do not support GCC plugins for GCC <= 4.7 - fix various breakages of 'make xconfig' - include the linker version used for linking the kernel into LINUX_COMPILER, which is used for the banner, and also exposed to /proc/version - link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y, which allows us to remove the lib-ksyms.o workaround, and to solve the last known issue of the LLVM linker - add dummy tools in scripts/dummy-tools/ to enable all compiler tests in Kconfig, which will be useful for distro maintainers - support the single switch, LLVM=1 to use Clang and all LLVM utilities instead of GCC and Binutils. - support LLVM_IAS=1 to enable the integrated assembler, which is still experimental * tag 'kbuild-v5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (36 commits) kbuild: fix comment about missing include guard detection kbuild: support LLVM=1 to switch the default tools to Clang/LLVM kbuild: replace AS=clang with LLVM_IAS=1 kbuild: add dummy toolchains to enable all cc-option etc. in Kconfig kbuild: link lib-y objects to vmlinux forcibly when CONFIG_MODULES=y MIPS: fw: arc: add __weak to prom_meminit and prom_free_prom_memory kbuild: remove -I$(srctree)/tools/include from scripts/Makefile kbuild: do not pass $(KBUILD_CFLAGS) to scripts/mkcompile_h Documentation/llvm: fix the name of llvm-size kbuild: mkcompile_h: Include $LD version in /proc/version kconfig: qconf: Fix a few alignment issues kconfig: qconf: remove some old bogus TODOs kconfig: qconf: fix support for the split view mode kconfig: qconf: fix the content of the main widget kconfig: qconf: Change title for the item window kconfig: qconf: clean deprecated warnings gcc-plugins: drop support for GCC <= 4.7 kbuild: Enable -Wtautological-compare x86: update AS_* macros to binutils >=2.23, supporting ADX and AVX2 crypto: x86 - clean up poly1305-x86_64-cryptogams.S by 'make clean' ...
2020-04-11x86/Hyper-V: Report crash data in die() when panic_on_oops is setTianyu Lan1-1/+1
When oops happens with panic_on_oops unset, the oops thread is killed by die() and system continues to run. In such case, guest should not report crash register data to host since system still runs. Check panic_on_oops and return directly in hyperv_report_panic() when the function is called in the die() and panic_on_oops is unset. Fix it. Fixes: 7ed4325a44ea ("Drivers: hv: vmbus: Make panic reporting to be more useful") Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/20200406155331.2105-7-Tianyu.Lan@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2020-04-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds7-15/+73
Merge yet more updates from Andrew Morton: - Almost all of the rest of MM (memcg, slab-generic, slab, pagealloc, gup, hugetlb, pagemap, memremap) - Various other things (hfs, ocfs2, kmod, misc, seqfile) * akpm: (34 commits) ipc/util.c: sysvipc_find_ipc() should increase position index kernel/gcov/fs.c: gcov_seq_next() should increase position index fs/seq_file.c: seq_read(): add info message about buggy .next functions drivers/dma/tegra20-apb-dma.c: fix platform_get_irq.cocci warnings change email address for Pali Rohár selftests: kmod: test disabling module autoloading selftests: kmod: fix handling test numbers above 9 docs: admin-guide: document the kernel.modprobe sysctl fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() kmod: make request_module() return an error when autoloading is disabled mm/memremap: set caching mode for PCI P2PDMA memory to WC mm/memory_hotplug: add pgprot_t to mhp_params powerpc/mm: thread pgprot_t through create_section_mapping() x86/mm: introduce __set_memory_prot() x86/mm: thread pgprot_t through init_memory_mapping() mm/memory_hotplug: rename mhp_restrictions to mhp_params mm/memory_hotplug: drop the flags field from struct mhp_restrictions mm/special: create generic fallbacks for pte_special() and pte_mkspecial() mm/vma: introduce VM_ACCESS_FLAGS mm/vma: define a default value for VM_DATA_DEFAULT_FLAGS ...
2020-04-11Merge tag 'for-linus-5.7-rc1b-tag' of ↵Linus Torvalds3-14/+15
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull more xen updates from Juergen Gross: - two cleanups - fix a boot regression introduced in this merge window - fix wrong use of memory allocation flags * tag 'for-linus-5.7-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: x86/xen: fix booting 32-bit pv guest x86/xen: make xen_pvmmu_arch_setup() static xen/blkfront: fix memory allocation flags in blkfront_setup_indirect() xen: Use evtchn_type_t as a type for event channels
2020-04-11change email address for Pali RohárPali Rohár1-1/+1
For security reasons I stopped using gmail account and kernel address is now up-to-date alias to my personal address. People periodically send me emails to address which they found in source code of drivers, so this change reflects state where people can contact me. [ Added .mailmap entry as per Joe Perches - Linus ] Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Joe Perches <joe@perches.com> Link: http://lkml.kernel.org/r/20200307104237.8199-1-pali@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/memory_hotplug: add pgprot_t to mhp_paramsLogan Gunthorpe1-0/+3
devm_memremap_pages() is currently used by the PCI P2PDMA code to create struct page mappings for IO memory. At present, these mappings are created with PAGE_KERNEL which implies setting the PAT bits to be WB. However, on x86, an mtrr register will typically override this and force the cache type to be UC-. In the case firmware doesn't set this register it is effectively WB and will typically result in a machine check exception when it's accessed. Other arches are not currently likely to function correctly seeing they don't have any MTRR registers to fall back on. To solve this, provide a way to specify the pgprot value explicitly to arch_add_memory(). Of the arches that support MEMORY_HOTPLUG: x86_64, and arm64 need a simple change to pass the pgprot_t down to their respective functions which set up the page tables. For x86_32, set the page tables explicitly using _set_memory_prot() (seeing they are already mapped). For ia64, s390 and sh, reject anything but PAGE_KERNEL settings -- this should be fine, for now, seeing these architectures don't support ZONE_DEVICE. A check in __add_pages() is also added to ensure the pgprot parameter was set for all arches. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-7-logang@deltatee.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/memory_hotplug: rename mhp_restrictions to mhp_paramsLogan Gunthorpe1-8/+8
The mhp_restrictions struct really doesn't specify anything resembling a restriction anymore so rename it to be mhp_params as it is a list of extended parameters. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: http://lkml.kernel.org/r/20200306170846.9333-3-logang@deltatee.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/memory_hotplug: drop the flags field from struct mhp_restrictionsLogan Gunthorpe1-2/+0
Patch series "Allow setting caching mode in arch_add_memory() for P2PDMA", v4. Currently, the page tables created using memremap_pages() are always created with the PAGE_KERNEL cacheing mode. However, the P2PDMA code is creating pages for PCI BAR memory which should never be accessed through the cache and instead use either WC or UC. This still works in most cases, on x86, because the MTRR registers typically override the caching settings in the page tables for all of the IO memory to be UC-. However, this tends not to work so well on other arches or some rare x86 machines that have firmware which does not setup the MTRR registers in this way. Instead of this, this series proposes a change to arch_add_memory() to take the pgprot required by the mapping which allows us to explicitly set pagetable entries for P2PDMA memory to UC. This changes is pretty routine for most of the arches: x86_64, arm64 and powerpc simply need to thread the pgprot through to where the page tables are setup. x86_32 unfortunately sets up the page tables at boot so must use _set_memory_prot() to change their caching mode. ia64, s390 and sh don't appear to have an easy way to change the page tables so, for now at least, we just return -EINVAL on such mappings and thus they will not support P2PDMA memory until the work for this is done. This should be fine as they don't yet support ZONE_DEVICE. This patch (of 7): This variable is not used anywhere and should therefore be removed from the structure. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Eric Badger <ebadger@gigaio.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Mackerras <paulus@samba.org> Link: http://lkml.kernel.org/r/20200306170846.9333-2-logang@deltatee.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/special: create generic fallbacks for pte_special() and pte_mkspecial()Anshuman Khandual1-0/+12
Currently there are many platforms that dont enable ARCH_HAS_PTE_SPECIAL but required to define quite similar fallback stubs for special page table entry helpers such as pte_special() and pte_mkspecial(), as they get build in generic MM without a config check. This creates two generic fallback stub definitions for these helpers, eliminating much code duplication. mips platform has a special case where pte_special() and pte_mkspecial() visibility is wider than what ARCH_HAS_PTE_SPECIAL enablement requires. This restricts those symbol visibility in order to avoid redefinitions which is now exposed through this new generic stubs and subsequent build failure. arm platform set_pte_at() definition needs to be moved into a C file just to prevent a build failure. [anshuman.khandual@arm.com: use defined(CONFIG_ARCH_HAS_PTE_SPECIAL) in mips per Thomas] Link: http://lkml.kernel.org/r/1583851924-21603-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Guo Ren <guoren@kernel.org> [csky] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Stafford Horne <shorne@gmail.com> [openrisc] Acked-by: Helge Deller <deller@gmx.de> [parisc] Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Brian Cain <bcain@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Sam Creasey <sammy@sammy.net> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paulburton@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Greentime Hu <green.hu@gmail.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Link: http://lkml.kernel.org/r/1583802551-15406-1-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/vma: introduce VM_ACCESS_FLAGSAnshuman Khandual1-1/+5
There are many places where all basic VMA access flags (read, write, exec) are initialized or checked against as a group. One such example is during page fault. Existing vma_is_accessible() wrapper already creates the notion of VMA accessibility as a group access permissions. Hence lets just create VM_ACCESS_FLAGS (VM_READ|VM_WRITE|VM_EXEC) which will not only reduce code duplication but also extend the VMA accessibility concept in general. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Nick Hu <nickhu@andestech.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Rob Springer <rspringer@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Link: http://lkml.kernel.org/r/1583391014-8170-3-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/vma: define a default value for VM_DATA_DEFAULT_FLAGSAnshuman Khandual1-0/+14
There are many platforms with exact same value for VM_DATA_DEFAULT_FLAGS This creates a default value for VM_DATA_DEFAULT_FLAGS in line with the existing VM_STACK_DEFAULT_FLAGS. While here, also define some more macros with standard VMA access flag combinations that are used frequently across many platforms. Apart from simplification, this reduces code duplication as well. Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Mark Salter <msalter@redhat.com> Cc: Guo Ren <guoren@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Brian Cain <bcain@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Burton <paulburton@kernel.org> Cc: Nick Hu <nickhu@andestech.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Rich Felker <dalias@libc.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jeff Dike <jdike@addtoit.com> Cc: Chris Zankel <chris@zankel.net> Link: http://lkml.kernel.org/r/1583391014-8170-2-git-send-email-anshuman.khandual@arm.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm/memory.c: add vm_insert_pages()Arjun Roy1-0/+2
Add the ability to insert multiple pages at once to a user VM with lower PTE spinlock operations. The intention of this patch-set is to reduce atomic ops for tcp zerocopy receives, which normally hits the same spinlock multiple times consecutively. [akpm@linux-foundation.org: pte_alloc() no longer takes the `addr' argument] [arjunroy@google.com: add missing page_count() check to vm_insert_pages()] Link: http://lkml.kernel.org/r/20200214005929.104481-1-arjunroy.kdev@gmail.com [arjunroy@google.com: vm_insert_pages() checks if pte_index defined] Link: http://lkml.kernel.org/r/20200228054714.204424-2-arjunroy.kdev@gmail.com Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: David Miller <davem@davemloft.net> Cc: Matthew Wilcox <willy@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Link: http://lkml.kernel.org/r/20200128025958.43490-2-arjunroy.kdev@gmail.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm: hugetlb: optionally allocate gigantic hugepages using cmaRoman Gushchin1-0/+12
Commit 944d9fec8d7a ("hugetlb: add support for gigantic page allocation at runtime") has added the run-time allocation of gigantic pages. However it actually works only at early stages of the system loading, when the majority of memory is free. After some time the memory gets fragmented by non-movable pages, so the chances to find a contiguous 1GB block are getting close to zero. Even dropping caches manually doesn't help a lot. At large scale rebooting servers in order to allocate gigantic hugepages is quite expensive and complex. At the same time keeping some constant percentage of memory in reserved hugepages even if the workload isn't using it is a big waste: not all workloads can benefit from using 1 GB pages. The following solution can solve the problem: 1) On boot time a dedicated cma area* is reserved. The size is passed as a kernel argument. 2) Run-time allocations of gigantic hugepages are performed using the cma allocator and the dedicated cma area In this case gigantic hugepages can be allocated successfully with a high probability, however the memory isn't completely wasted if nobody is using 1GB hugepages: it can be used for pagecache, anon memory, THPs, etc. * On a multi-node machine a per-node cma area is allocated on each node. Following gigantic hugetlb allocation are using the first available numa node if the mask isn't specified by a user. Usage: 1) configure the kernel to allocate a cma area for hugetlb allocations: pass hugetlb_cma=10G as a kernel argument 2) allocate hugetlb pages as usual, e.g. echo 10 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages If the option isn't enabled or the allocation of the cma area failed, the current behavior of the system is preserved. x86 and arm-64 are covered by this patch, other architectures can be trivially added later. The patch contains clean-ups and fixes proposed and implemented by Aslan Bakirov and Randy Dunlap. It also contains ideas and suggestions proposed by Rik van Riel, Michal Hocko and Mike Kravetz. Thanks! Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Tested-by: Andreas Schaufler <andreas.schaufler@gmx.de> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Aslan Bakirov <aslan@fb.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Joonsoo Kim <js1304@gmail.com> Link: http://lkml.kernel.org/r/20200407163840.92263-3-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11mm: cma: NUMA node interfaceAslan Bakirov2-2/+15
I've noticed that there is no interface exposed by CMA which would let me to declare contigous memory on particular NUMA node. This patchset adds the ability to try to allocate contiguous memory on a specific node. It will fallback to other nodes if the specified one doesn't work. Implement a new method for declaring contigous memory on particular node and keep cma_declare_contiguous() as a wrapper. [akpm@linux-foundation.org: build fix] Signed-off-by: Aslan Bakirov <aslan@fb.com> Signed-off-by: Roman Gushchin <guro@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Michal Hocko <mhocko@kernel.org> Cc: Andreas Schaufler <andreas.schaufler@gmx.de> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Rik van Riel <riel@surriel.com> Cc: Joonsoo Kim <js1304@gmail.com> Link: http://lkml.kernel.org/r/20200407163840.92263-2-guro@fb.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-11docs: mm: slab.h: fix a broken cross-referenceMauro Carvalho Chehab1-1/+1
There is a typo at the cross-reference link, causing this warning: include/linux/slab.h:11: WARNING: undefined label: memory-allocation (if the link has no caption the label must precede a section header) Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Link: http://lkml.kernel.org/r/0aeac24235d356ebd935d11e147dcc6edbb6465c.1586359676.git.mchehab+huawei@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10printk: queue wake_up_klogd irq_work only if per-CPU areas are readySergey Senozhatsky1-5/+0
printk_deferred(), similarly to printk_safe/printk_nmi, does not immediately attempt to print a new message on the consoles, avoiding calls into non-reentrant kernel paths, e.g. scheduler or timekeeping, which potentially can deadlock the system. Those printk() flavors, instead, rely on per-CPU flush irq_work to print messages from safer contexts. For same reasons (recursive scheduler or timekeeping calls) printk() uses per-CPU irq_work in order to wake up user space syslog/kmsg readers. However, only printk_safe/printk_nmi do make sure that per-CPU areas have been initialised and that it's safe to modify per-CPU irq_work. This means that, for instance, should printk_deferred() be invoked "too early", that is before per-CPU areas are initialised, printk_deferred() will perform illegal per-CPU access. Lech Perczak [0] reports that after commit 1b710b1b10ef ("char/random: silence a lockdep splat with printk()") user-space syslog/kmsg readers are not able to read new kernel messages. The reason is printk_deferred() being called too early (as was pointed out by Petr and John). Fix printk_deferred() and do not queue per-CPU irq_work before per-CPU areas are initialized. Link: https://lore.kernel.org/lkml/aa0732c6-5c4e-8a8b-a1c1-75ebe3dca05b@camlintechnologies.com/ Reported-by: Lech Perczak <l.perczak@camlintechnologies.com> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Tested-by: Jann Horn <jannh@google.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: John Ogness <john.ogness@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-10Merge branch 'for-linus' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull proc fix from Eric Biederman: "A brown paper bag slipped through my proc changes, and syzcaller caught it when the code ended up in your tree. I have opted to fix it the simplest cleanest way I know how, so there is no reasonable chance for the bug to repeat" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: proc: Use a dedicated lock in struct pid
2020-04-10Merge tag 'pwm/for-5.7-rc1' of ↵Linus Torvalds3-93/+4
git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm Pull pwm updates from Thierry Reding: "There's quite a few changes this time around. Most of these are fixes and cleanups, but there's also new chip support for some drivers and a bit of rework" * tag 'pwm/for-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm: (33 commits) pwm: pca9685: Fix PWM/GPIO inter-operation pwm: Make pwm_apply_state_debug() static pwm: meson: Remove redundant assignment to variable fin_freq pwm: jz4740: Allow selection of PWM channels 0 and 1 pwm: jz4740: Obtain regmap from parent node pwm: jz4740: Improve algorithm of clock calculation pwm: jz4740: Use clocks from TCU driver pwm: sun4i: Remove redundant needs_delay pwm: omap-dmtimer: Implement .apply callback pwm: omap-dmtimer: Do not disable PWM before changing period/duty_cycle pwm: omap-dmtimer: Fix PWM enabling sequence pwm: omap-dmtimer: Update description for PWM OMAP DM timer pwm: omap-dmtimer: Drop unused header file pwm: renesas-tpu: Drop confusing registered message pwm: renesas-tpu: Fix late Runtime PM enablement pwm: rcar: Fix late Runtime PM enablement dt-bindings: pwm: renesas-tpu: Document more R-Car Gen2 support pwm: meson: Fix confusing indentation pwm: pca9685: Use gpio core provided macro GPIO_LINE_DIRECTION_OUT pwm: pca9685: Replace CONFIG_PM with __maybe_unused ...
2020-04-10Merge tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drmLinus Torvalds2-3/+4
Pull more drm fixes from Dave Airlie: "As expected, more fixes did turn up in the latter part of the week. The drm_local_map build regression fix is here, along with temporary disabling of the hugepage work due to some amdgpu related crashes. Otherwise it's just a bunch of i915, and amdgpu fixes. legacy: - fix drm_local_map.offset type ttm: - temporarily disable hugepages to debug amdgpu problems. prime: - fix sg extraction amdgpu: - Various Renoir fixes - Fix gfx clockgating sequence on gfx10 - RAS fixes - Avoid MST property creation after registration - Various cursor/viewport fixes - Fix a confusing log message about optional firmwares i915: - Flush all the reloc_gpu batch (Chris) - Ignore readonly failures when updating relocs (Chris) - Fill all the unused space in the GGTT (Chris) - Return the right vswing table (Jose) - Don't enable DDI IO power on a TypeC port in TBT mode for ICL+ (Imre) analogix_dp: - probe fix virtio: - oob fix in object create" * tag 'drm-next-2020-04-10' of git://anongit.freedesktop.org/drm/drm: (34 commits) drm/ttm: Temporarily disable the huge_fault() callback drm/bridge: analogix_dp: Split bind() into probe() and real bind() drm/legacy: Fix type for drm_local_map.offset drm/amdgpu/display: fix warning when compiling without debugfs drm/amdgpu: unify fw_write_wait for new gfx9 asics drm/amd/powerplay: error out on forcing clock setting not supported drm/amdgpu: fix gfx hang during suspend with video playback (v2) drm/amd/display: Check for null fclk voltage when parsing clock table drm/amd/display: Acknowledge wm_optimized_required drm/amd/display: Make cursor source translation adjustment optional drm/amd/display: Calculate scaling ratios on every medium/full update drm/amd/display: Program viewport when source pos changes for DCN20 hw seq drm/amd/display: Fix incorrect cursor pos on scaled primary plane drm/amd/display: change default pipe_split policy for DCN1 drm/amd/display: Translate cursor position by source rect drm/amd/display: Update stream adjust in dc_stream_adjust_vmin_vmax drm/amd/display: Avoid create MST prop after registration drm/amdgpu/psp: dont warn on missing optional TA's drm/amdgpu: update RAS related dmesg print drm/amdgpu: resolve mGPU RAS query instability ...
2020-04-10Merge tag 'sound-fix-5.7-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes gathered since the previous update. ALSA core: - Regression fix for OSS PCM emulation ASoC: - Trivial fixes in reg bit mask ops, DAPM, DPCM and topology - Lots of fixes for Intel-based devices - Minor fixes for AMD, STM32, Qualcomm, Realtek Others: - Fixes for the bugs in mixer handling in HD-audio and ice1724 drivers that were caught by the recent kctl validator - New quirks for HD-audio and USB-audio Also this contains a fix for EDD firmware fix, which slipped from anyone's hands" * tag 'sound-fix-5.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits) ALSA: hda: Add driver blacklist ALSA: usb-audio: Add mixer workaround for TRX40 and co ALSA: hda/realtek - Add quirk for MSI GL63 ALSA: ice1724: Fix invalid access for enumerated ctl items ALSA: hda: Fix potential access overflow in beep helper ASoC: cs4270: pull reset GPIO low then high ALSA: hda/realtek - Add HP new mute led supported for ALC236 ALSA: hda/realtek - Add supported new mute Led for HP ASoC: rt5645: Add platform-data for Medion E1239T ASoC: Intel: bytcr_rt5640: Add quirk for MPMAN MPWIN895CL tablet ASoC: stm32: sai: Add missing cleanup ALSA: usb-audio: Add registration quirk for Kingston HyperX Cloud Alpha S ASoC: Intel: atom: Fix uninitialized variable compiler warning ASoC: Intel: atom: Check drv->lock is locked in sst_fill_and_send_cmd_unlocked ASoC: Intel: atom: Take the drv->lock mutex before calling sst_send_slot_map() ASoC: SOF: Turn "firmware boot complete" message into a dbg message ALSA: usb-audio: Add Pioneer DJ DJM-250MK2 quirk ALSA: pcm: oss: Fix regression by buffer overflow fix (again) ALSA: pcm: oss: Fix regression by buffer overflow fix edd: Use scnprintf() for avoiding potential buffer overflow ...
2020-04-10Merge tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-blockLinus Torvalds2-30/+17
Pull block fixes from Jens Axboe: "Here's a set of fixes that should go into this merge window. This contains: - NVMe pull request from Christoph with various fixes - Better discard support for loop (Evan) - Only call ->commit_rqs() if we have queued IO (Keith) - blkcg offlining fixes (Tejun) - fix (and fix the fix) for busy partitions" * tag 'block-5.7-2020-04-10' of git://git.kernel.dk/linux-block: block: fix busy device checking in blk_drop_partitions again block: fix busy device checking in blk_drop_partitions nvmet-rdma: fix double free of rdma queue blk-mq: don't commit_rqs() if none were queued nvme-fc: Revert "add module to ops template to allow module references" nvme: fix deadlock caused by ANA update wrong locking nvmet-rdma: fix bonding failover possible NULL deref loop: Better discard support for block devices loop: Report EOPNOTSUPP properly nvmet: fix NULL dereference when removing a referral nvme: inherit stable pages constraint in the mpath stack device blkcg: don't offline parent blkcg first blkcg: rename blkcg->cgwb_refcnt to ->online_pin and always use it nvme-tcp: fix possible crash in recv error flow nvme-tcp: don't poll a non-live queue nvme-tcp: fix possible crash in write_zeroes processing nvmet-fc: fix typo in comment nvme-rdma: Replace comma with a semicolon nvme-fcloop: fix deallocation of working context nvme: fix compat address handling in several ioctls
2020-04-10btrfs: re-instantiate the removed BTRFS_SUBVOL_CREATE_ASYNC definitionEugene Syromiatnikov1-6/+4
The commit 9c1036fdb1d1ff1b ("btrfs: Remove BTRFS_SUBVOL_CREATE_ASYNC support") breaks strace build with the kernel headers from git: btrfs.c: In function "btrfs_test_subvol_ioctls": btrfs.c:531:23: error: "BTRFS_SUBVOL_CREATE_ASYNC" undeclared (first use in this function) vol_args_v2.flags = BTRFS_SUBVOL_CREATE_ASYNC; Moreover, it is improper to break UAPI, strace uses the definitions to decode ioctls that are considered part of public API. Restore the macro definition and put it under "#ifndef __KERNEL__" in order to prevent inadvertent in-kernel usage. Fixes: 9c1036fdb1d1ff1b ("btrfs: Remove BTRFS_SUBVOL_CREATE_ASYNC support") Reviewed-by: Nikolay Borisov <nborisov@suse.com> Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2020-04-09Merge tag 'riscv-for-linus-5.7' of ↵Linus Torvalds1-0/+20
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Palmer Dabbelt: "This contains a handful of new features: - Partial support for the Kendryte K210. There are still a few outstanding issues that I have patches for, but I don't actually have a board to test them so they're not included yet. - SBI v0.2 support. - Fixes to support for building with LLVM-based toolchains. The resulting images are known not to boot yet. I don't anticipate a part two, but I'll probably have something early in the RCs to finish up the K210 support" * tag 'riscv-for-linus-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (38 commits) riscv: create a loader.bin boot image for Kendryte SoC riscv: Kendryte K210 default config riscv: Add Kendryte K210 device tree riscv: Select required drivers for Kendryte SOC riscv: Add Kendryte K210 SoC support riscv: Add SOC early init support riscv: Unaligned load/store handling for M_MODE RISC-V: Support cpu hotplug RISC-V: Add supported for ordered booting method using HSM RISC-V: Add SBI HSM extension definitions RISC-V: Export SBI error to linux error mapping function RISC-V: Add cpu_ops and modify default booting method RISC-V: Move relocate and few other functions out of __init RISC-V: Implement new SBI v0.2 extensions RISC-V: Introduce a new config for SBI v0.1 RISC-V: Add SBI v0.2 extension definitions RISC-V: Add basic support for SBI v0.2 RISC-V: Mark existing SBI as 0.1 SBI. riscv: Use macro definition instead of magic number riscv: Add support to dump the kernel page tables ...