Age | Commit message | Author | Files | Lines
2020-03-31 | Merge tag 'microblaze-v5.7-rc1' of git://git.monstr.eu/linux-2.6-microblaze | Linus Torvalds | 51 | -236/+67
Pull microblaze updates from Michal Simek: - convert license headers to SPDX - cleanup header handling and use asm-generic one - get rid of earlyprintk residues - define barriers and use it in the code - get rid of setup_irq() for timer - various small addons and fixes * tag 'microblaze-v5.7-rc1' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: Replace setup_irq() by request_irq() microblaze: Stop printing the virtual memory layout microblaze: Use asm generic cmpxchg.h for !SMP case microblaze: Define percpu sestion in linker file microblaze: Remove unused boot_cpuid variable microblaze: Add missing irqflags.h header microblaze: Add sync to tlb operations microblaze: Define microblaze barrier microblaze: Remove empty headers microblaze: Remove early printk setup microblaze: Remove architecture tlb.h and use generic one microblaze: Convert headers to SPDX license microblaze: Fix _reset() function microblaze: Kernel parameters should be parsed earlier
2020-03-31 | Merge tag 'please-pull-ia64_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux | Linus Torvalds | 8 | -181/+40
Pull ia64 updates from Tony Luck: "Couple of cleanup patches" * tag 'please-pull-ia64_for_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux: tty/serial: cleanup after ioc*_serial driver removal ia64: replace setup_irq() by request_irq()
2020-03-31 | Makefile: Update kselftest help information | Shuah Khan | 1 | -6/+9
Update kselftest help information. Signed-off-by: Shuah Khan <skhan@linuxfoundation.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-03-31 | Merge tag 'mips_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux | Linus Torvalds | 136 | -1924/+1369
Pull MIPS updates from Thomas Bogendoerfer: - loongson64 irq rework - dmi support loongson - replace setup_irq() by request_irq() - jazz cleanups - minor cleanups and fixes * tag 'mips_5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: (44 commits) MIPS: ralink: mt7621: Fix soc_device introduction MIPS: Exclude more dsemul code when CONFIG_MIPS_FP_SUPPORT=n MIPS/tlbex: Fix LDDIR usage in setup_pw() for Loongson-3 MIPS: do not compile generic functions for CONFIG_CAVIUM_OCTEON_SOC MAINTAINERS: Update Loongson64 entry MIPS: Loongson64: Load built-in dtbs MIPS: Loongson64: Add generic dts dt-bindings: mips: Add loongson boards MIPS: Loongson64: Drop legacy IRQ code dt-bindings: interrupt-controller: Add Loongson-3 HTPIC irqchip: Add driver for Loongson-3 HyperTransport PIC controller dt-bindings: interrupt-controller: Add Loongson LIOINTC irqchip: loongson-liointc: Workaround LPC IRQ Errata irqchip: Add driver for Loongson I/O Local Interrupt Controller docs: mips: remove no longer needed au1xxx_ide.rst documentation MIPS: Alchemy: remove no longer used au1xxx_ide.h header ide: remove no longer used au1xxx-ide driver MIPS: Add support for Desktop Management Interface (DMI) firmware: dmi: Add macro SMBIOS_ENTRY_POINT_SCAN_START MIPS: ralink: mt7621: introduce 'soc_device' initialization ...
2020-03-31 | Merge tag 'm68k-for-v5.7-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k | Linus Torvalds | 33 | -396/+351
Pull m68k updates from Geert Uytterhoeven: - pagetable layout rewrite, to facilitate global READ_ONCE() rework - Zorro (Amiga) and DIO (HP 9000/300) bus cleanups - defconfig updates - minor cleanups and fixes * tag 'm68k-for-v5.7-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: (23 commits) m68k: defconfig: Update defconfigs for v5.6-rc4 zorro: Replace zero-length array with flexible-array member m68k: Switch to asm-generic/hardirq.h fbdev: c2p: Use BUILD_BUG() instead of custom solution dio: Remove unused dio_dev_driver() dio: Fix dio_bus_match() kerneldoc dio: Make dio_match_device() static zorro: Move zorro_bus_type to bus-private header file zorro: Remove unused zorro_dev_driver() zorro: Use zorro_match_device() helper in zorro_bus_match() zorro: Fix zorro_bus_match() kerneldoc zorro: Make zorro_match_device() static m68k: Fix Kconfig indentation m68k: mm: Change ColdFire pgtable_t m68k: mm: Fully initialize the page-table allocator m68k: mm: Extend table allocator for multiple sizes m68k: mm: Use table allocator for pgtables m68k: mm: Improve kernel_page_table() m68k: mm: Restructure Motorola MMU page-table layout m68k: mm: Move the pointer table allocator to motorola.c ...
2020-03-31 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | David S. Miller | 12 | -61/+115
2020-03-31 | netdevsim: dev: Fix memory leak in nsim_dev_take_snapshot_write | Gustavo A. R. Silva | 1 | -0/+1
In case memory resources for dummy_data were allocated, release them before return. Addresses-Coverity-ID: 1491997 ("Resource leak") Fixes: 7ef19d3b1d5e ("devlink: report error once U32_MAX snapshot ids have been used") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | Merge branch 'stmmac-Add-additional-EHL-PCI-info-and-PCI-ID' | David S. Miller | 4 | -313/+602
Voon Weifeng says: ==================== stmmac: Add additional EHL PCI info and PCI ID Thanks Jose Miguel Abreu for the feedback. Summary of v2 patches: 1/3: As suggested to keep the stmmac_pci.c file simple. So created a new file dwmac-intel.c and moved all the Intel specific PCI device out of stmmac_pci.c. 2/3: Added Intel(R) Programmable Services Engine (Intel(R) PSE) MAC PCI ID and PCI info 3/3: Added EHL 2.5Gbps PCI ID and info Changes from v1: -Added a patch to move all Intel specific PCI device from stmmac_pci.c to a new file named dwmac-intel.c. -Combine v1 patch 1/3 and 2/3 into single patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: stmmac: add EHL 2.5Gbps PCI info and PCI ID | Voon Weifeng | 1 | -8/+16
Add EHL SGMII 2.5Gbps PCI info and PCI ID Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: stmmac: add EHL PSE0 & PSE1 1Gbps PCI info and PCI ID | Voon Weifeng | 1 | -0/+75
Add EHL PSE0/1 RGMII & SGMII 1Gbps PCI info and PCI ID Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: stmmac: create dwmac-intel.c to contain all Intel platform | Voon Weifeng | 4 | -313/+519
As stmmac_pci.c file is getting bigger and more complex, it is reasonable to separate all the Intel specific dwmac pci device to a different file. This move includes Intel Quark, TGL and EHL. A new kernel config CONFIG_DWMAC_INTEL is introduced and depends on X86. For this initial patch, all the necessary function such as probe() and exit() are identical besides the function name. Signed-off-by: Voon Weifeng <weifeng.voon@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | Merge branch 'net-dsa-b53-and-bcm_sf2-updates-for-7278' | David S. Miller | 3 | -42/+136
Florian Fainelli says: ==================== net: dsa: b53 & bcm_sf2 updates for 7278 This patch series contains some updates to the b53 and bcm_sf2 drivers specifically for the 7278 Ethernet switch. The first patch is technically a bug fix so it should ideally be backported to -stable, provided that Dan also agrees with my resolution on this. Patches #2 through #4 are minor changes to the core b53 driver to restore VLAN configuration upon system resumption as well as deny specific bridge/VLAN operations on port 7 with the 7278 which is special and does not support VLANs. Patches #5 through #9 add support for matching VLAN TCI keys/masks to the CFP code. Changes in v2: - fixed some code comments and arrange some code for easier reading ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: bcm_sf2: Support specifying VLAN tag egress rule | Florian Fainelli | 1 | -2/+38
The port to which the ASP is connected on 7278 is not capable of processing VLAN tags as part of the Ethernet frame, so allow a user to configure the egress VLAN policy they want to see applied by purposing the h_ext.data[1] field. Bit 0 is used to indicate that 0=tagged, 1=untagged. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: bcm_sf2: Add support for matching VLAN TCI | Florian Fainelli | 1 | -15/+38
Update relevant code paths to support the programming and matching of VLAN TCI, this is the only member of the ethtool_flow_ext that we can match, the switch does not permit matching the VLAN Ethernet Type field. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: bcm_sf2: Move writing of CFP_DATA(5) into slicing functions | Florian Fainelli | 1 | -32/+32
In preparation for matching VLANs, move the writing of CFP_DATA(5) into the IPv4 and IPv6 slicing logic since they are part of the per-flow configuration. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: bcm_sf2: Check earlier for FLOW_EXT and FLOW_MAC_EXT | Florian Fainelli | 1 | -2/+3
We do not currently support matching on FLOW_EXT or FLOW_MAC_EXT, but we were not checking for those bits being set in the flow specification. The check for FLOW_EXT and FLOW_MAC_EXT are separated out because a subsequent commit will add support for matching VLAN TCI which are covered by FLOW_EXT. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: bcm_sf2: Disable learning for ASP port | Florian Fainelli | 1 | -1/+9
We don't want to enable learning for the ASP port since it only receives directed traffic, this allows us to bypass ARL-driven forwarding rules which could conflict with Broadcom tags and/or CFP forwarding. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: b53: Deny enslaving port 7 for 7278 into a bridge | Florian Fainelli | 1 | -0/+6
On 7278, port 7 connects to the ASP which should only receive frames through the use of CFP rules, it is not desirable to have it be part of a bridge at all since that would make it pick up unwanted traffic that it may not even be able to filter or sustain. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: b53: Prevent tagged VLAN on port 7 for 7278 | Florian Fainelli | 1 | -0/+8
On 7278, port 7 of the switch connects to the ASP UniMAC which is not capable of processing VLAN tagged frames. We can still allow the port to be part of a VLAN entry, and we may want it to be untagged on egress on that VLAN because of that limitation. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: b53: Restore VLAN entries upon (re)configuration | Florian Fainelli | 1 | -0/+15
The first time b53_configure_vlan() is called we have not configured any VLAN entries yet, since that happens later when interfaces get brought up. When b53_configure_vlan() is called again from suspend/resume we need to restore all VLAN entries though. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | net: dsa: bcm_sf2: Fix overflow checks | Florian Fainelli | 1 | -6/+3
Commit f949a12fd697 ("net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc") tried to fix some user-controlled buffer overflows in bcm_sf2_cfp_rule_set() and bcm_sf2_cfp_rule_del(), but the fix was using CFP_NUM_RULES, which, while it is correct so as not to overflow the bitmaps, is not representative of what the device actually supports. Correct that by using bcm_sf2_cfp_rule_size() instead. The latter subtracts 1 from the number of rules, so change the checks from "greater than or equal to" to "greater than" accordingly. Fixes: f949a12fd697 ("net: dsa: bcm_sf2: fix buffer overflow doing set_rxnfc") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
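The bounds reasoning in this commit message can be illustrated with a small sketch. It is written in Python rather than the driver's C, and `BITMAP_CAPACITY`, `DEVICE_NUM_RULES`, `rule_size()` and both check helpers are hypothetical stand-ins chosen for illustration; they are not the driver's actual constants or functions.

```python
# Hypothetical values: the bitmap is sized for more rules than the device supports.
BITMAP_CAPACITY = 256    # stand-in for a compile-time bitmap size
DEVICE_NUM_RULES = 128   # stand-in for what a given device really supports


def rule_size(num_rules: int = DEVICE_NUM_RULES) -> int:
    """Stand-in helper returning the highest valid rule index (count - 1)."""
    return num_rules - 1


def old_check_rejects(location: int) -> bool:
    # Compared against a count, so the comparison is "greater than or equal".
    return location >= BITMAP_CAPACITY


def new_check_rejects(location: int) -> bool:
    # Compared against "count - 1", so the comparison becomes "greater than".
    return location > rule_size()


# Index 200 fits in the bitmap but is beyond what the device supports:
# only the new check rejects it.
for location in (0, 127, 128, 200, 256):
    print(location, old_check_rejects(location), new_check_rejects(location))
```

The point of the sketch is only that switching the limit from a count to "count minus one" requires switching `>=` to `>` to keep the same boundary.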
2020-03-31 | Merge tag 'x86-timers-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 1 | -16/+112
Pull x86 timer updates from Thomas Gleixner: "A series of commits to make the MSR derived CPU and TSC frequency more accurate. It turned out that the frequency tables which have been taken from the SDM are inaccurate because the SDM provides truncated and rounded values, e.g. 83.3MHz (83.3333...) or 116.7MHz (116.6666...). This causes time drift in the range of ~1 second per hour (20-30 seconds per day). On some of these SoCs it's not possible to recalibrate the TSC because there is no reference (PIT, HPET) available. With some reverse engineering it was established that the possible frequencies are derived from the base clock with fixed multiplier / divider pairs. For the CPU models which have a known crystal frequency the kernel now uses multiplier / divider pairs which bring the frequencies closer to reality and fix the observed time drift issues" * tag 'x86-timers-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/tsc_msr: Make MSR derived TSC frequency more accurate x86/tsc_msr: Fix MSR_FSB_FREQ mask for Cherry Trail devices x86/tsc_msr: Use named struct initializers
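The size of the drift caused by a rounded frequency can be checked with a little arithmetic. The sketch below is a back-of-the-envelope model in Python, not kernel code: it assumes the real frequencies are exactly 250/3 MHz and 350/3 MHz (the repeating decimals quoted in the message) and that elapsed time is computed by dividing a cycle count by the tabulated frequency. The function name `drift_seconds` is made up for illustration.

```python
from fractions import Fraction

SECONDS_PER_HOUR = 3600


def drift_seconds(true_mhz: Fraction, table_mhz: Fraction, interval_s: int) -> float:
    """Timekeeping error after interval_s real seconds when cycles counted at
    true_mhz are converted to time using table_mhz instead."""
    measured = interval_s * true_mhz / table_mhz
    return float(abs(measured - interval_s))


# Assumption: 83.3333... MHz is exactly 250/3 and 116.6666... MHz is exactly 350/3.
cases = {
    "83.3 MHz table value": (Fraction(250, 3), Fraction("83.3")),
    "116.7 MHz table value": (Fraction(350, 3), Fraction("116.7")),
}

for label, (true_mhz, table_mhz) in cases.items():
    per_hour = drift_seconds(true_mhz, table_mhz, SECONDS_PER_HOUR)
    print(f"{label}: {per_hour:.2f} s/hour, {per_hour * 24:.1f} s/day")
```

Under these assumptions both cases come out at roughly one to one and a half seconds per hour, the same order of magnitude as the drift described in the pull message.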
2020-03-31 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | David S. Miller | 107 | -1728/+6086
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | hv_netvsc: Remove unnecessary round_up for recv_completion_cnt | Haiyang Zhang | 1 | -4/+5
vzalloc_node() already rounds the total size to whole pages, and sizeof(u64) is smaller than sizeof(struct recv_comp_data). So round_up of recv_completion_cnt is not necessary, and may cause extra memory allocation. To save memory, remove this unnecessary round_up for recv_completion_cnt. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next | David S. Miller | 21 | -198/+280
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: 1) Add support to specify a stateful expression in set definitions, this allows users to specify e.g. counters per set elements. 2) Flowtable software counter support. 3) Flowtable hardware offload counter support, from wenxu. 4) Parallelize flowtable hardware offload requests, from Paul Blakey. This includes a patch to add one work entry per offload command. 5) Several patches to rework nf_queue refcount handling, from Florian Westphal. 6) A few fixes for the flowtable tunnel offload: Fix crash if tunneling information is missing and set up indirect flow block as TC_SETUP_FT, patch from wenxu. 7) Stricter netlink attribute sanity check on filters, from Romain Bellan and Florent Fourcot. 8) Annotations to make sparse happy, from Jules Irenge. 9) Improve icmp errors in debugging information, from Haishuang Yan. 10) Fix warning in IPVS icmp error debugging, from Haishuang Yan. 11) Fix endianness issue in tcp extension header, from Sergey Marinkevich. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | Merge tag 'x86-splitlock-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 9 | -3/+258
Pull x86 splitlock updates from Thomas Gleixner: "Support for 'split lock' detection: Atomic operations (lock prefixed instructions) which span two cache lines have to acquire the global bus lock. This is at least 1k cycles slower than an atomic operation within a cache line and disrupts performance on other cores. Aside from the performance disruption, this is an unprivileged form of DoS. Some newer CPUs have the capability to raise an #AC trap when such an operation is attempted. The detection is by default enabled in warning mode which will warn once when a user space application is caught. A command line option allows disabling the detection or selecting fatal mode, which will terminate offending applications with SIGBUS" * tag 'x86-splitlock-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/split_lock: Avoid runtime reads of the TEST_CTRL MSR x86/split_lock: Rework the initialization flow of split lock detection x86/split_lock: Enable split lock detection by kernel
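Whether a given access "spans two cache lines" is simple address arithmetic. The sketch below shows that check in Python; the 64-byte line size is an assumption (common on x86 but not stated in the message), and `spans_two_cache_lines` is a hypothetical helper, not a kernel function.

```python
CACHE_LINE_SIZE = 64  # assumption: typical x86 cache line size in bytes


def spans_two_cache_lines(address: int, size: int,
                          line_size: int = CACHE_LINE_SIZE) -> bool:
    """Return True if the bytes [address, address + size) cross a line boundary."""
    first_line = address // line_size
    last_line = (address + size - 1) // line_size
    return first_line != last_line


# An 8-byte operand at offset 56 ends exactly at the line boundary: no split.
print(spans_two_cache_lines(56, 8))
# The same operand at offset 60 covers bytes 60..67 and straddles two lines.
print(spans_two_cache_lines(60, 8))
```

A lock-prefixed instruction whose operand fails this test is the case the split lock detection described above is meant to catch.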
2020-03-31 | Merge tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 49 | -1304/+1209
Pull x86 entry code updates from Thomas Gleixner: - Convert the 32bit syscalls to be pt_regs based which removes the requirement to push all 6 potential arguments onto the stack and consolidates the interface with the 64bit variant - The first small portion of the exception and syscall related entry code consolidation which aims to address the recently discovered issues vs. RCU, int3, NMI and some other exceptions which can interrupt any context. The bulk of the changes is still work in progress and aimed for 5.8. - A few lockdep namespace cleanups which have been applied into this branch to keep the prerequisites for the ongoing work confined. * tag 'x86-entry-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) x86/entry: Fix build error x86 with !CONFIG_POSIX_TIMERS lockdep: Rename trace_{hard,soft}{irq_context,irqs_enabled}() lockdep: Rename trace_softirqs_{on,off}() lockdep: Rename trace_hardirq_{enter,exit}() x86/entry: Rename ___preempt_schedule x86: Remove unneeded includes x86/entry: Drop asmlinkage from syscalls x86/entry/32: Enable pt_regs based syscalls x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments x86/entry/32: Rename 32-bit specific syscalls x86/entry/32: Clean up syscall_32.tbl x86/entry: Remove ABI prefixes from functions in syscall tables x86/entry/64: Add __SYSCALL_COMMON() x86/entry: Remove syscall qualifier support x86/entry/64: Remove ptregs qualifier from syscall table x86/entry: Move max syscall number calculation to syscallhdr.sh x86/entry/64: Split X32 syscall table into its own file x86/entry/64: Move sys_ni_syscall stub to common.c x86/entry/64: Use syscall wrappers for x32_rt_sigreturn x86/entry: Refactor SYS_NI macros ...
2020-03-31 | Merge tag 'timers-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 123 | -927/+1289
Pull timekeeping and timer updates from Thomas Gleixner: "Core: - Consolidation of the vDSO build infrastructure to address the difficulties of cross-builds for ARM64 compat vDSO libraries by restricting the exposure of header content to the vDSO build. This is achieved by splitting out header content into separate headers which contain only the minimally required information which is necessary to build the vDSO. These new headers are included from the kernel headers and the vDSO specific files. - Enhancements to the generic vDSO library allowing more fine grained control over the compiled in code, further reducing architecture specific storage and preparing for adopting the generic library by PPC. - Cleanup and consolidation of the exit related code in posix CPU timers. - Small cleanups and enhancements here and there Drivers: - The obligatory new drivers: Ingenic JZ47xx and X1000 TCU support - Correct the clock rate of PIT64b global clock - setup_irq() cleanup - Preparation for PWM and suspend support for the TI DM timer - Expand the fttmr010 driver to support ast2600 systems - The usual small fixes, enhancements and cleanups all over the place" * tag 'timers-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits) Revert "clocksource/drivers/timer-probe: Avoid creating dead devices" vdso: Fix clocksource.h macro detection um: Fix header inclusion arm64: vdso32: Enable Clang Compilation lib/vdso: Enable common headers arm: vdso: Enable arm to use common headers x86/vdso: Enable x86 to use common headers mips: vdso: Enable mips to use common headers arm64: vdso32: Include common headers in the vdso library arm64: vdso: Include common headers in the vdso library arm64: Introduce asm/vdso/processor.h arm64: vdso32: Code clean up linux/elfnote.h: Replace elf.h with UAPI equivalent scripts: Fix the inclusion order in modpost common: Introduce processor.h linux/ktime.h: Extract common header for vDSO linux/jiffies.h: Extract common header for vDSO linux/time64.h: Extract common header for vDSO linux/time32.h: Extract common header for vDSO linux/time.h: Extract common header for vDSO ...
2020-03-31 | Merge tag 'timers-nohz-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 8 | -17/+19
Pull NOHZ update from Thomas Gleixner: "Remove TIF_NOHZ from three architectures. These architectures use a static key to decide whether context tracking needs to be invoked and the TIF_NOHZ flag just causes a pointless slowpath execution for nothing" * tag 'timers-nohz-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: arm64: Remove TIF_NOHZ arm: Remove TIF_NOHZ x86: Remove TIF_NOHZ context-tracking: Introduce CONFIG_HAVE_TIF_NOHZ x86/entry: Remove _TIF_NOHZ from _TIF_WORK_SYSCALL_ENTRY
2020-03-31 | Merge tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds | 20 | -97/+194
Pull core SMP updates from Thomas Gleixner: "CPU (hotplug) updates: - Support for locked CSD objects in smp_call_function_single_async() which allows simplifying callsites in the scheduler core and MIPS - Treewide consolidation of CPU hotplug functions which ensures the consistency between the sysfs interface and kernel state. The low level functions cpu_up/down() are now confined to the core code and no longer accessible from random code" * tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits) cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus() cpu/hotplug: Hide cpu_up/down() cpu/hotplug: Move bringup of secondary CPUs out of smp_init() torture: Replace cpu_up/down() with add/remove_cpu() firmware: psci: Replace cpu_up/down() with add/remove_cpu() xen/cpuhotplug: Replace cpu_up/down() with device_online/offline() parisc: Replace cpu_up/down() with add/remove_cpu() sparc: Replace cpu_up/down() with add/remove_cpu() powerpc: Replace cpu_up/down() with add/remove_cpu() x86/smp: Replace cpu_up/down() with add/remove_cpu() arm64: hibernate: Use bringup_hibernate_cpu() cpu/hotplug: Provide bringup_hibernate_cpu() arm64: Use reboot_cpu instead of hardconding it to 0 arm64: Don't use disable_nonboot_cpus() ARM: Use reboot_cpu instead of hardcoding it to 0 ARM: Don't use disable_nonboot_cpus() ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus() cpu/hotplug: Create a new function to shutdown nonboot cpus cpu/hotplug: Add new {add,remove}_cpu() functions sched/core: Remove rq.hrtick_csd_pending ...
2020-03-31 | Merge branch 'Add-packet-trap-policers-support' | David S. Miller | 16 | -51/+1778
Ido Schimmel says: ==================== Add packet trap policers support Background ========== Devices capable of offloading the kernel's datapath and performing functions such as bridging and routing must also be able to send (trap) specific packets to the kernel (i.e., the CPU) for processing. For example, a device acting as a multicast-aware bridge must be able to trap IGMP membership reports to the kernel for processing by the bridge module. Motivation ========== In most cases, the underlying device is capable of handling packet rates that are several orders of magnitude higher compared to those that can be handled by the CPU. Therefore, in order to prevent the underlying device from overwhelming the CPU, devices usually include packet trap policers that are able to police the trapped packets to rates that can be handled by the CPU. Proposed solution ================= This patch set allows capable device drivers to register their supported packet trap policers with devlink. User space can then tune the parameters of these policers (currently, rate and burst size) and read from the device the number of packets that were dropped by the policer, if supported. These packet trap policers can then be bound to existing packet trap groups, which are used to aggregate logically related packet traps. As a result, trapped packets are policed to rates that can be handled by the host CPU. 
Example usage ============= Instantiate netdevsim: Dump available packet trap policers: netdevsim/netdevsim10: policer 1 rate 1000 burst 128 policer 2 rate 2000 burst 256 policer 3 rate 3000 burst 512 Change the parameters of a packet trap policer: Bind a packet trap policer to a packet trap group: Dump parameters and statistics of a packet trap policer: netdevsim/netdevsim10: policer 3 rate 100 burst 16 stats: rx: dropped 92 Unbind a packet trap policer from a packet trap group: Patch set overview ================== Patch #1 adds the core infrastructure in devlink which allows capable device drivers to register their supported packet trap policers with devlink. Patch #2 extends the existing devlink-trap documentation. Patch #3 extends netdevsim to register a few dummy packet trap policers with devlink. Used later on to selftest the core infrastructure. Patches #4-#5 add infrastructure in devlink to allow binding of packet trap policers to packet trap groups. Patch #6 extends netdevsim to allow such binding. Patch #7 adds a selftest over netdevsim that verifies the core devlink-trap policers functionality. Patches #8-#14 gradually add devlink-trap policers support in mlxsw. Patch #15 adds a selftest over mlxsw. All registered packet trap policers are verified to handle the configured rate and burst size. Future plans ============ * Allow changing default association between packet traps and packet trap groups * Add more packet traps. For example, for control packets (e.g., IGMP) v3: * Rebase v2 (address comments from Jiri and Jakub): * Patch #1: Add 'strict_start_type' in devlink policy * Patch #1: Have device drivers provide max/min rate/burst size for each policer. 
Use them to check validity of user provided parameters * Patch #3: Remove check about burst size being a power of 2 and instead add a debugfs knob to fail the operation * Patch #3: Provide max/min rate/burst size when registering policers and remove the validity checks from nsim_dev_devlink_trap_policer_set() * Patch #5: Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in devlink_trap_group_set() and bail if not present * Patch #5: Add extack error message in case trap group was partially modified * Patch #7: Add test case with new 'fail_trap_policer_set' knob * Patch #7: Add test case for partially modified trap group * Patch #10: Provide max/min rate/burst size when registering policers * Patch #11: Remove the max/min validity checks from __mlxsw_sp_trap_policer_set() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
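To make the "rate" and "burst size" parameters of a policer concrete, here is a minimal token-bucket sketch in Python. It is a generic textbook model, not the algorithm used by mlxsw hardware or netdevsim; the class name `TrapPolicer`, its fields, and the packets-per-second unit are assumptions made for illustration.

```python
class TrapPolicer:
    """Generic token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: int, burst: int) -> None:
        self.rate = rate              # assumed unit: packets per second
        self.burst = burst            # maximum number of stored tokens
        self.tokens = float(burst)    # start with a full bucket
        self.last_time = 0.0
        self.dropped = 0              # counter analogous to a "dropped" statistic

    def admit(self, now: float) -> bool:
        """Return True if a packet arriving at time `now` may go to the CPU."""
        elapsed = now - self.last_time
        self.last_time = now
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.dropped += 1
        return False


policer = TrapPolicer(rate=100, burst=16)
# 20 packets arriving at the same instant: the burst allowance admits 16 of them.
admitted = sum(policer.admit(0.0) for _ in range(20))
print(admitted, policer.dropped)
# After 0.1 s the bucket has refilled by rate * 0.1 = 10 tokens.
print(policer.admit(0.1))
```

In this model, lowering `rate` bounds the sustained load on the CPU, while `burst` bounds how many back-to-back packets can get through before policing kicks in.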
2020-03-31 | selftests: mlxsw: Add test cases for devlink-trap policers | Ido Schimmel | 2 | -0/+390
Add test cases that verify that each registered packet trap policer: * Honors the imposed limitations of rate and burst size * Is able to police trapped packets to the specified rate * Is able to police trapped packets to the specified burst size * Is able to be unbound from its trap group Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Add support for setting of packet trap group parameters | Ido Schimmel | 5 | -5/+46
Implement support for setting of packet trap group parameters by invoking the trap_group_init() callback with the new parameters. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Switch to use correct packet trap group | Ido Schimmel | 3 | -18/+16
Some packet traps are currently exposed to user space as being member of "l3_drops" trap group, but internally they are member of a different group. Switch these traps to use the correct group so that they are all subject to the same policer, as exposed to user space. Set the trap priority of packets trapped due to loopback error during routing to the lowest priority. Such packets are not routed again by the kernel and therefore should not mask other traps (e.g., host miss) that should be routed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Do not initialize dedicated discard policer | Ido Schimmel | 1 | -9/+1
The policer is now initialized as part of the registration with devlink, so there is no need to initialize it before the registration. Remove the initialization. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Add devlink-trap policer support | Ido Schimmel | 6 | -11/+297
Register supported packet trap policers with devlink and implement callbacks to change their parameters and read their counters. Prevent user space from passing invalid policer parameters down to the device by checking their validity and communicating the failure via an appropriate extack message. v2: * Remove the max/min validity checks from __mlxsw_sp_trap_policer_set() Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum_trap: Prepare policers for registration with devlink | Ido Schimmel | 2 | -1/+77
Prepare an array of policer IDs to register with devlink and their associated parameters. The array is composed from both policers that are currently bound to exposed trap groups and policers that are not bound to any trap group. v2: * Provide max/min rate/burst size when registering policers Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | mlxsw: spectrum: Track used packet trap policer IDs | Ido Schimmel | 4 | -3/+38
During initialization the driver configures various packet trap groups and binds policers to them. Currently, most of these groups are not exposed to user space and therefore their policers should not be exposed as well. Otherwise, user space will be able to alter policer parameters without knowing which packet traps are policed by the policer. Use a bitmap to track the used policer IDs so that these policers will not be registered with devlink in a subsequent patch. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
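The idea of tracking used policer IDs in a bitmap can be sketched as follows. This is an illustrative Python model using an integer as the bitmap; `PolicerIdTracker`, its methods, and the limit of 8 IDs are hypothetical and do not reflect the driver's actual data structures.

```python
class PolicerIdTracker:
    """Tracks which policer IDs are already used internally, one bit per ID."""

    def __init__(self, max_policers: int) -> None:
        self.max_policers = max_policers
        self.usage = 0  # bit i set means policer ID i is in use

    def mark_used(self, policer_id: int) -> None:
        if not 0 <= policer_id < self.max_policers:
            raise ValueError(f"policer ID {policer_id} out of range")
        self.usage |= 1 << policer_id

    def is_used(self, policer_id: int) -> bool:
        return bool((self.usage >> policer_id) & 1)

    def unused_ids(self) -> list:
        """IDs that are free and could therefore be exposed for registration."""
        return [i for i in range(self.max_policers) if not self.is_used(i)]


tracker = PolicerIdTracker(max_policers=8)  # hypothetical limit
tracker.mark_used(1)   # e.g. bound to a trap group that is not exposed
tracker.mark_used(5)
print(tracker.unused_ids())
```

Keeping the bookkeeping in one bitmap makes the later "which policers may be registered" decision a simple scan over the unset bits.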
2020-03-31 | mlxsw: reg: Extend QPCR register | Ido Schimmel | 1 | -0/+17
The QoS Policer Configuration Register (QPCR) is used to configure hardware policers. Extend this register with following fields and defines which will be used by subsequent patches: 1. Violate counter: reads number of packets dropped by the policer 2. Clear counter: to ensure we start counting from 0 3. Rate and burst size limits Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | selftests: netdevsim: Add test cases for devlink-trap policers | Ido Schimmel | 2 | -0/+153
Add test cases for packet trap policer set / show commands as well as for the binding of these policers to packet trap groups. Both good and bad flows are tested for maximum coverage. v2: * Add test case with new 'fail_trap_policer_set' knob * Add test case for partially modified trap group Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31 | netdevsim: Add support for setting of packet trap group parameters | Ido Schimmel | 2 | -0/+18
Add a dummy callback to set trap group parameters. Return an error when the 'fail_trap_group_set' debugfs file is set in order to exercise error paths and verify that the error is propagated to user space when it should be. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31devlink: Allow setting of packet trap group parametersIdo Schimmel2-2/+63
The previous patch allowed device drivers to publish their default binding between packet trap policers and packet trap groups. However, some users might not be content with this binding and would like to change it. In case user space passed a packet trap policer identifier when setting a packet trap group, invoke the appropriate device driver callback and pass the new policer identifier. v2: * Check for presence of 'DEVLINK_ATTR_TRAP_POLICER_ID' in devlink_trap_group_set() and bail if not present * Add extack error message in case trap group was partially modified Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31devlink: Add packet trap group parameters supportIdo Schimmel4-9/+43
Packet trap groups are used to aggregate logically related packet traps. Currently, these groups allow user space to batch operations such as setting the trap action of all member traps. In order to prevent the CPU from being overwhelmed by too many trapped packets, it is desirable to bind a packet trap policer to these groups. For example, to limit all the packets that encountered an exception during routing to 10Kpps. Allow device drivers to bind default packet trap policers to packet trap groups when the latter are registered with devlink. The next patch will enable user space to change this default binding. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31netdevsim: Add devlink-trap policer supportIdo Schimmel2-1/+86
Register three dummy packet trap policers with devlink and implement callbacks to change their parameters and read their counters. This will be used later on in the series to test the devlink-trap policer infrastructure. v2: * Remove check about burst size being a power of 2 and instead add a debugfs knob to fail the operation * Provide max/min rate/burst size when registering policers and remove the validity checks from nsim_dev_devlink_trap_policer_set() Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31Documentation: Add description of packet trap policersIdo Schimmel1-0/+26
Extend devlink-trap documentation with information about packet trap policers. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-31devlink: Add packet trap policers supportIdo Schimmel3-0/+515
Devices capable of offloading the kernel's datapath and performing functions such as bridging and routing must also be able to send (trap) specific packets to the kernel (i.e., the CPU) for processing. For example, a device acting as a multicast-aware bridge must be able to trap IGMP membership reports to the kernel for processing by the bridge module. In most cases, the underlying device is capable of handling packet rates that are several orders of magnitude higher compared to those that can be handled by the CPU. Therefore, in order to prevent the underlying device from overwhelming the CPU, devices usually include packet trap policers that are able to police the trapped packets to rates that can be handled by the CPU. This patch allows capable device drivers to register their supported packet trap policers with devlink. User space can then tune the parameters of these policers (currently, rate and burst size) and read from the device the number of packets that were dropped by the policer, if supported. Subsequent patches in the series will allow device drivers to create a default binding between these policers and packet trap groups and allow user space to change the binding. v2: * Add 'strict_start_type' in devlink policy * Have device drivers provide max/min rate/burst size for each policer. Use them to check validity of user provided parameters Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
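To make the rate, burst size and drop counter parameters concrete, here is a minimal software sketch in C of a packet-rate policer modelled as a token bucket, with parameter changes validated against driver-provided limits. It is an assumption-laden illustration: real policers are implemented in hardware, the token bucket model is chosen only for clarity, and `struct policer`, `policer_set()` and `policer_admit()` are hypothetical names, not devlink API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical software model of one packet trap policer. */
struct policer {
    uint64_t rate;                 /* packets per second */
    uint64_t burst;                /* bucket capacity, in packets */
    uint64_t min_rate, max_rate;   /* limits supplied at registration */
    uint64_t min_burst, max_burst;
    uint64_t tokens;               /* packets currently allowed */
    uint64_t last_ns;              /* time of the last refill */
    uint64_t drops;                /* packets dropped by the policer */
};

/* Apply user-provided parameters only if they fall within the limits. */
static bool policer_set(struct policer *p, uint64_t rate, uint64_t burst)
{
    if (rate < p->min_rate || rate > p->max_rate)
        return false;
    if (burst < p->min_burst || burst > p->max_burst)
        return false;
    p->rate = rate;
    p->burst = burst;
    if (p->tokens > burst)
        p->tokens = burst;
    return true;
}

/*
 * Decide whether a packet arriving at 'now_ns' may pass to the CPU.
 * Sketch only: 'elapsed * rate' can overflow for very large values.
 */
static bool policer_admit(struct policer *p, uint64_t now_ns)
{
    assert(now_ns >= p->last_ns);

    uint64_t elapsed = now_ns - p->last_ns;
    uint64_t refill = elapsed * p->rate / NSEC_PER_SEC;

    if (refill > 0) {
        /* Advance the clock only by the time that produced whole tokens. */
        p->last_ns += refill * NSEC_PER_SEC / p->rate;
        p->tokens = (p->tokens + refill > p->burst) ? p->burst
                                                    : p->tokens + refill;
    }
    if (p->tokens == 0) {
        p->drops++;
        return false;
    }
    p->tokens--;
    return true;
}
```

Passing the timestamp explicitly keeps the sketch deterministic; a rejected `policer_set()` call leaves the previous rate and burst size untouched.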
2020-03-31Merge branch 'cgroup-bpf_link'Alexei Starovoitov14-99/+930
Andrii Nakryiko says: ==================== bpf_link abstraction itself was formalized in [0] with justifications for why its semantics is a good fit for attaching BPF programs of various types. This patch set adds bpf_link-based BPF program attachment mechanism for cgroup BPF programs. Cgroup BPF link is semantically compatible with current BPF_F_ALLOW_MULTI semantics of attaching cgroup BPF programs directly. Thus cgroup bpf_link can co-exist with legacy BPF program multi-attachment. bpf_link is destroyed and automatically detached when the last open FD holding the reference to bpf_link is closed. This means that by default, when the process that created bpf_link exits, attached BPF program will be automatically detached due to bpf_link's clean up code. Cgroup bpf_link, like any other bpf_link, can be pinned in BPF FS and by those means survive the exit of process that created the link. This is useful in many scenarios to provide long-living BPF program attachments. Pinning also means that there could be many owners of bpf_link through independent FDs. Additionally, auto-detachment of cgroup bpf_link is implemented. When cgroup is dying it will automatically detach all active bpf_links. This ensures that cgroup clean up is not delayed due to active bpf_link even despite no chance for any BPF program to be run for a given cgroup. In that sense it's similar to existing behavior of dropping refcnt of attached bpf_prog. But in the case of bpf_link, bpf_link is not destroyed and is still available to user as long as at least one active FD is still open (or if it's pinned in BPF FS). There are two main cgroup-specific differences between bpf_link-based and direct bpf_prog-based attachment. First, as opposed to direct bpf_prog attachment, cgroup itself doesn't "own" bpf_link, which makes it possible to auto-clean up attached bpf_link when user process abruptly exits without explicitly detaching BPF program. 
This makes for a safe default behavior proven in BPF tracing program types. But bpf_link doesn't bump cgroup->bpf.refcnt as well and because of that doesn't prevent cgroup from cleaning up its BPF state. Second, only owners of bpf_link (those who created bpf_link in the first place or obtained a new FD by opening bpf_link from BPF FS) can detach and/or update it. This makes sure that no other process can accidentally remove/replace BPF program. This patch set also implements LINK_UPDATE sub-command, which allows to replace bpf_link's underlying bpf_prog, similarly to BPF_F_REPLACE flag behavior for direct bpf_prog cgroup attachment. Similarly to LINK_CREATE, it is supposed to be generic command for different types of bpf_links. [0] https://lore.kernel.org/bpf/20200228223948.360936-1-andriin@fb.com/ v2->v3: - revert back to just MULTI mode (Alexei); - fix tinyconfig compilation warning (kbuild test robot); v1->v2: - implement exclusive and overridable exclusive modes (Andrey Ignatov); - fix build for !CONFIG_CGROUP_BPF build; - add more selftests for non-multi mode and inter-operability; ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-03-31selftests/bpf: Test FD-based cgroup attachmentAndrii Nakryiko2-0/+268
Add selftests to exercise FD-based cgroup BPF program attachments and their intermixing with legacy cgroup BPF attachments. Auto-detachment and program replacement (both unconditional and cmpxchg-like) are tested as well. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330030001.2312810-5-andriin@fb.com
2020-03-31libbpf: Add support for bpf_link-based cgroup attachmentAndrii Nakryiko6-1/+122
Add bpf_program__attach_cgroup(), which uses BPF_LINK_CREATE subcommand to create an FD-based kernel bpf_link. Also add low-level bpf_link_create() API. If expected_attach_type is not specified explicitly with bpf_program__set_expected_attach_type(), libbpf will try to determine proper attach type from BPF program's section definition. Also add support for bpf_link's underlying BPF program replacement: - unconditional through high-level bpf_link__update_program() API; - cmpxchg-like with specifying expected current BPF program through low-level bpf_link_update() API. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330030001.2312810-4-andriin@fb.com
2020-03-31bpf: Implement bpf_prog replacement for an active bpf_cgroup_linkAndrii Nakryiko5-0/+186
Add new operation (LINK_UPDATE), which allows to replace active bpf_prog from under given bpf_link. Currently this is only supported for bpf_cgroup_link, but will be extended to other kinds of bpf_links in follow-up patches. For bpf_cgroup_link, implemented functionality matches existing semantics for direct bpf_prog attachment (including BPF_F_REPLACE flag). User can either unconditionally set new bpf_prog regardless of which bpf_prog is currently active under given bpf_link, or, optionally, can specify expected active bpf_prog. If active bpf_prog doesn't match expected one, no changes are performed, old bpf_link stays intact and attached, operation returns a failure. cgroup_bpf_replace() operation is resolving race between auto-detachment and bpf_prog update in the same fashion as it's done for bpf_link detachment, except in this case update has no way of succeeding because of target cgroup marked as dying. So in this case error is returned. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330030001.2312810-3-andriin@fb.com
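The update semantics described above (unconditional replacement, optional "expected current program" check, and failure once the cgroup is dying) can be modelled in a few lines of C. This is a user-space sketch under stated assumptions, not the kernel implementation: `struct prog`, `struct link`, `link_update()` and the result codes are hypothetical, a NULL expected program stands in for "no expectation given", and the locking that the real code needs is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for bpf_prog and bpf_cgroup_link. */
struct prog {
    int id;
};

struct link {
    struct prog *prog;   /* currently active program */
    bool detached;       /* set once the target cgroup is dying */
};

/* Hypothetical result codes; the kernel reports errno values instead. */
enum update_result {
    UPDATE_OK,
    UPDATE_MISMATCH,     /* expected program is not the active one */
    UPDATE_DETACHED,     /* link was auto-detached, update cannot succeed */
};

/*
 * Replace the program under 'l'. If 'expected_old' is non-NULL the swap
 * happens only when it matches the active program (cmpxchg-like);
 * otherwise the swap is unconditional. On failure the link is unchanged.
 */
static enum update_result link_update(struct link *l, struct prog *new_prog,
                                      struct prog *expected_old)
{
    assert(l != NULL && new_prog != NULL);

    if (l->detached)
        return UPDATE_DETACHED;
    if (expected_old != NULL && l->prog != expected_old)
        return UPDATE_MISMATCH;

    l->prog = new_prog;
    return UPDATE_OK;
}
```

The compare step is what lets two cooperating updaters detect that someone else already swapped the program, instead of silently overwriting each other.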