summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/i40e/i40e_xsk.c
AgeCommit message (Collapse)AuthorFilesLines
2020-05-29i40e: trivial fixup of comments in i40e_xsk.cJesper Dangaard Brouer1-3/+1
The comment above i40e_run_xdp_zc() was clearly copy-pasted from function i40e_xsk_umem_setup, which is just above. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOLBjörn Töpel1-317/+36
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. The AF_XDP zero-copy rx_bi ring is now simply a struct xdp_buff pointer. v4->v5: Fixed "warning: Excess function parameter 'bi' description in 'i40e_construct_skb_zc'". (Jakub) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-9-bjorn.topel@gmail.com
2020-05-22i40e: Separate kernel allocated rx_bi rings from AF_XDP ringsBjörn Töpel1-34/+40
Continuing the path to support MEM_TYPE_XSK_BUFF_POOL, the AF_XDP zero-copy/sk_buff rx_bi rings are now separate. Functions to properly allocate the different rings are added as well. v3->v4: Made i40e_fd_handle_status() static. (kbuild test robot) v4->v5: Fix kdoc for i40e_clean_programming_status(). (Jakub) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-8-bjorn.topel@gmail.com
2020-05-22i40e: Refactor rx_bi accessesBjörn Töpel1-6/+12
As a first step to migrate i40e to the new MEM_TYPE_XSK_BUFF_POOL APIs, code that accesses the rx_bi (SW/shadow ring) is refactored to use an accessor function. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-7-bjorn.topel@gmail.com
2020-05-22xsk: Move driver interface to xdp_sock_drv.hMagnus Karlsson1-1/+1
Move the AF_XDP zero-copy driver interface to its own include file called xdp_sock_drv.h. This, hopefully, will make it more clear for NIC driver implementors to know what functions to use for zero-copy support. v4->v5: Fix -Wmissing-prototypes by include header file. (Jakub) Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200520192103.355233-4-bjorn.topel@gmail.com
2020-05-15xdp: For Intel AF_XDP drivers add XDP frame_szJesper Dangaard Brouer1-0/+2
Intel drivers implement native AF_XDP zerocopy in separate C-files, that have its own invocation of bpf_prog_run_xdp(). The setup of xdp_buff is also handled in separately from normal code path. This patch update XDP frame_sz for AF_XDP zerocopy drivers i40e, ice and ixgbe, as the code changes needed are very similar. Introduce a helper function xsk_umem_xdp_frame_sz() for calculating frame size. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Cc: intel-wired-lan@lists.osuosl.org Cc: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/158945347511.97035.8536753731329475655.stgit@firesoul
2020-02-06i40e: Relax i40e_xsk_wakeup's return value when PF is busyMaciej Fijalkowski1-1/+1
Return -EAGAIN instead of -ENETDOWN to provide a slightly milder information to user space so that an application will know to retry the syscall when __I40E_CONFIG_BUSY bit is set on pf->state. Fixes: b3873a5be757 ("net/i40e: Fix concurrency issues between config flow and XSK") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/20200205045834.56795-2-maciej.fijalkowski@intel.com
2019-12-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-2/+2
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-12-27 The following pull-request contains BPF updates for your *net-next* tree. We've added 127 non-merge commits during the last 17 day(s) which contain a total of 110 files changed, 6901 insertions(+), 2721 deletions(-). There are three merge conflicts. Conflicts and resolution looks as follows: 1) Merge conflict in net/bpf/test_run.c: There was a tree-wide cleanup c593642c8be0 ("treewide: Use sizeof_field() macro") which gets in the way with b590cb5f802d ("bpf: Switch to offsetofend in BPF_PROG_TEST_RUN"): <<<<<<< HEAD if (!range_is_zero(__skb, offsetof(struct __sk_buff, priority) + sizeof_field(struct __sk_buff, priority), ======= if (!range_is_zero(__skb, offsetofend(struct __sk_buff, priority), >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c There are a few occasions that look similar to this. Always take the chunk with offsetofend(). Note that there is one where the fields differ in here: <<<<<<< HEAD if (!range_is_zero(__skb, offsetof(struct __sk_buff, tstamp) + sizeof_field(struct __sk_buff, tstamp), ======= if (!range_is_zero(__skb, offsetofend(struct __sk_buff, gso_segs), >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c Just take the one with offsetofend() /and/ gso_segs. Latter is correct due to 850a88cc4096 ("bpf: Expose __sk_buff wire_len/gso_segs to BPF_PROG_TEST_RUN"). 2) Merge conflict in arch/riscv/net/bpf_jit_comp.c: (I'm keeping Bjorn in Cc here for a double-check in case I got it wrong.) <<<<<<< HEAD if (is_13b_check(off, insn)) return -1; emit(rv_blt(tcc, RV_REG_ZERO, off >> 1), ctx); ======= emit_branch(BPF_JSLT, RV_REG_T1, RV_REG_ZERO, off, ctx); >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c Result should look like: emit_branch(BPF_JSLT, tcc, RV_REG_ZERO, off, ctx); 3) Merge conflict in arch/riscv/include/asm/pgtable.h: <<<<<<< HEAD ======= #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then * position vmemmap directly below the VMALLOC region. */ #define VMEMMAP_SHIFT \ (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT) #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT) #define VMEMMAP_END (VMALLOC_START - 1) #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) #define vmemmap ((struct page *)VMEMMAP_START) >>>>>>> 7c8dce4b166113743adad131b5a24c4acc12f92c Only take the BPF_* defines from there and move them higher up in the same file. Remove the rest from the chunk. The VMALLOC_* etc defines got moved via 01f52e16b868 ("riscv: define vmemmap before pfn_to_page calls"). Result: [...] #define __S101 PAGE_READ_EXEC #define __S110 PAGE_SHARED_EXEC #define __S111 PAGE_SHARED_EXEC #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1) #define VMALLOC_END (PAGE_OFFSET - 1) #define VMALLOC_START (PAGE_OFFSET - VMALLOC_SIZE) #define BPF_JIT_REGION_SIZE (SZ_128M) #define BPF_JIT_REGION_START (PAGE_OFFSET - BPF_JIT_REGION_SIZE) #define BPF_JIT_REGION_END (VMALLOC_END) /* * Roughly size the vmemmap space to be large enough to fit enough * struct pages to map half the virtual address space. Then * position vmemmap directly below the VMALLOC region. */ #define VMEMMAP_SHIFT \ (CONFIG_VA_BITS - PAGE_SHIFT - 1 + STRUCT_PAGE_MAX_SHIFT) #define VMEMMAP_SIZE BIT(VMEMMAP_SHIFT) #define VMEMMAP_END (VMALLOC_START - 1) #define VMEMMAP_START (VMALLOC_START - VMEMMAP_SIZE) [...] Let me know if there are any other issues. Anyway, the main changes are: 1) Extend bpftool to produce a struct (aka "skeleton") tailored and specific to a provided BPF object file. This provides an alternative, simplified API compared to standard libbpf interaction. Also, add libbpf extern variable resolution for .kconfig section to import Kconfig data, from Andrii Nakryiko. 2) Add BPF dispatcher for XDP which is a mechanism to avoid indirect calls by generating a branch funnel as discussed back in bpfconf'19 at LSF/MM. Also, add various BPF riscv JIT improvements, from Björn Töpel. 3) Extend bpftool to allow matching BPF programs and maps by name, from Paul Chaignon. 4) Support for replacing cgroup BPF programs attached with BPF_F_ALLOW_MULTI flag for allowing updates without service interruption, from Andrey Ignatov. 5) Cleanup and simplification of ring access functions for AF_XDP with a bonus of 0-5% performance improvement, from Magnus Karlsson. 6) Enable BPF JITs for x86-64 and arm64 by default. Also, final version of audit support for BPF, from Daniel Borkmann and latter with Jiri Olsa. 7) Move and extend test_select_reuseport into BPF program tests under BPF selftests, from Jakub Sitnicki. 8) Various BPF sample improvements for xdpsock for customizing parameters to set up and benchmark AF_XDP, from Jay Jayatheerthan. 9) Improve libbpf to provide a ulimit hint on permission denied errors. Also change XDP sample programs to attach in driver mode by default, from Toke Høiland-Jørgensen. 10) Extend BPF test infrastructure to allow changing skb mark from tc BPF programs, from Nikita V. Shirokov. 11) Optimize prologue code sequence in BPF arm32 JIT, from Russell King. 12) Fix xdp_redirect_cpu BPF sample to manually attach to tracepoints after libbpf conversion, from Jesper Dangaard Brouer. 13) Minor misc improvements from various others. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-21xsk: ixgbe: i40e: ice: mlx5: Xsk_umem_discard_addr to xsk_umem_release_addrMagnus Karlsson1-2/+2
Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to better reflect the new naming of the AF_XDP queue manipulation functions. As this functions is used by drivers implementing support for AF_XDP zero-copy, it requires a name change to these drivers. The function xsk_umem_release_addr_rq has also changed name in the same fashion. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1576759171-28550-10-git-send-email-magnus.karlsson@intel.com
2019-12-19net/i40e: Fix concurrency issues between config flow and XSKMaxim Mikityanskiy1-0/+4
Use synchronize_rcu to wait until the XSK wakeup function finishes before destroying the resources it uses: 1. i40e_down already calls synchronize_rcu. On i40e_down either __I40E_VSI_DOWN or __I40E_CONFIG_BUSY is set. Check the latter in i40e_xsk_wakeup (the former is already checked there). 2. After switching the XDP program, call synchronize_rcu to let i40e_xsk_wakeup exit before the XDP program is freed. 3. Changing the number of channels brings the interface down (see i40e_prep_for_reset and i40e_pf_quiesce_all_vsi). 4. Disabling UMEM sets __I40E_CONFIG_BUSY, too. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191217162023.16011-4-maximmi@mellanox.com
2019-11-09i40e: need_wakeup flag might not be set for TxMagnus Karlsson1-8/+2
The need_wakeup flag for Tx might not be set for AF_XDP sockets that are only used to send packets. This happens if there is at least one outstanding packet that has not been completed by the hardware and we get that corresponding completion (which will not generate an interrupt since interrupts are disabled in the napi poll loop) between the time we stopped processing the Tx completions and interrupts are enabled again. In this case, the need_wakeup flag will have been cleared at the end of the Tx completion processing as we believe we will get an interrupt from the outstanding completion at a later point in time. But if this completion interrupt occurs before interrupts are enable, we lose it and should at that point really have set the need_wakeup flag since there are no more outstanding completions that can generate an interrupt to continue the processing. When this happens, user space will see a Tx queue need_wakeup of 0 and skip issuing a syscall, which means will never get into the Tx processing again and we have a deadlock. This patch introduces a quick fix for this issue by just setting the need_wakeup flag for Tx to 1 all the time. I am working on a proper fix for this that will toggle the flag appropriately, but it is more challenging than I anticipated and I am afraid that this patch will not be completed before the merge window closes, therefore this easier fix for now. This fix has a negative performance impact in the range of 0% to 4%. Towards the higher end of the scale if you have driver and application on the same core and issue a lot of packets, and towards no negative impact if you use two cores, lower transmission speeds and/or a workload that also receives packets. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-01i40e: Fix receive buffer starvation for AF_XDPJeff Kirsher1-5/+0
Magnus's fix to resolve a potential receive buffer starvation for AF_XDP got applied to both the i40e_xsk_umem_enable/disable() functions, when it should have only been applied to the "enable". So clean up the undesired code in the disable function. CC: Magnus Karlsson <magnus.karlsson@intel.com> Fixes: 1f459bdc2007 ("i40e: fix potential RX buffer starvation for AF_XDP") Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
2019-09-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller1-2/+2
Daniel Borkmann says: ==================== pull-request: bpf-next 2019-09-16 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Now that initial BPF backend for gcc has been merged upstream, enable BPF kselftest suite for bpf-gcc. Also fix a BE issue with access to bpf_sysctl.file_pos, from Ilya. 2) Follow-up fix for link-vmlinux.sh to remove bash-specific extensions related to recent work on exposing BTF info through sysfs, from Andrii. 3) AF_XDP zero copy fixes for i40e and ixgbe driver which caused umem headroom to be added twice, from Ciara. 4) Refactoring work to convert sock opt tests into test_progs framework in BPF kselftests, from Stanislav. 5) Fix a general protection fault in dev_map_hash_update_elem(), from Toke. 6) Cleanup to use BPF_PROG_RUN() macro in KCM, from Sami. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-16i40e: fix xdp handle calculationsCiara Loftus1-2/+2
Commit 4c5d9a7fa149 ("i40e: fix xdp handle calculations") reintroduced the addition of the umem headroom to the xdp handle in the i40e_zca_free, i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc functions. However, the headroom is already added to the handle in the function i40_run_xdp_zc. This commit removes the latter addition and fixes the case where the headroom is non-zero. Fixes: 4c5d9a7fa149 ("i40e: fix xdp handle calculations") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-11i40e: fix potential RX buffer starvation for AF_XDPMagnus Karlsson1-0/+5
When the RX rings are created they are also populated with buffers so that packets can be received. Usually these are kernel buffers, but for AF_XDP in zero-copy mode, these are user-space buffers and in this case the application might not have sent down any buffers to the driver at this point. And if no buffers are allocated at ring creation time, no packets can be received and no interrupts will be generated so the NAPI poll function that allocates buffers to the rings will never get executed. To rectify this, we kick the NAPI context of any queue with an attached AF_XDP zero-copy socket in two places in the code. Once after an XDP program has loaded and once after the umem is registered. This take care of both cases: XDP program gets loaded first then AF_XDP socket is created, and the reverse, AF_XDP socket is created first, then XDP program is loaded. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-09-05i40e: fix xdp handle calculationsKevin Laatz1-3/+4
Currently, we don't add headroom to the handle in i40e_zca_free, i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc. The addition of the headroom to the handle was removed in commit 2f86c806a8a8 ("i40e: modify driver for handling offsets"), which will break things when headroom is non-zero. This patch fixes this and uses xsk_umem_adjust_offset to add it appropritely based on the mode being run. Fixes: 2f86c806a8a8 ("i40e: modify driver for handling offsets") Reported-by: Bjorn Topel <bjorn.topel@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31i40e: modify driver for handling offsetsKevin Laatz1-4/+9
With the addition of the unaligned chunks option, we need to make sure we handle the offsets accordingly based on the mode we are currently running in. This patch modifies the driver to appropriately mask the address for each case. Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-31i40e: simplify Rx buffer recycleKevin Laatz1-10/+3
Currently, the dma, addr and handle are modified when we reuse Rx buffers in zero-copy mode. However, this is not required as the inputs to the function are copies, not the original values themselves. As we use the copies within the function, we can use the original 'old_bi' values directly without having to mask and add the headroom. Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-18i40e: add support for AF_XDP need_wakeup featureMagnus Karlsson1-0/+18
This patch adds support for the need_wakeup feature of AF_XDP. If the application has told the kernel that it might sleep using the new bind flag XDP_USE_NEED_WAKEUP, the driver will then set this flag if it has no more buffers on the NIC Rx ring and yield to the application. For Tx, it will set the flag if it has no outstanding Tx completion interrupts and return to the application. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-08-18xsk: replace ndo_xsk_async_xmit with ndo_xsk_wakeupMagnus Karlsson1-3/+4
This commit replaces ndo_xsk_async_xmit with ndo_xsk_wakeup. This new ndo provides the same functionality as before but with the addition of a new flags field that is used to specifiy if Rx, Tx or both should be woken up. The previous ndo only woke up Tx, as implied by the name. The i40e and ixgbe drivers (which are all the supported ones) are updated with this new interface. This new ndo will be used by the new need_wakeup functionality of XDP sockets that need to be able to wake up both Rx and Tx driver processing. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-27xsk: Return the whole xdp_desc from xsk_umem_consume_txMaxim Mikityanskiy1-5/+7
Some drivers want to access the data transmitted in order to implement acceleration features of the NICs. It is also useful in AF_XDP TX flow. Change the xsk_umem_consume_tx API to return the whole xdp_desc, that contains the data pointer, length and DMA address, instead of only the latter two. Adapt the implementation of i40e and ixgbe to this change. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Cc: Björn Töpel <bjorn.topel@intel.com> Cc: Magnus Karlsson <magnus.karlsson@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-14i40e: mark expected switch fall-throughGustavo A. R. Silva1-0/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. This patch fixes the following warning: drivers/net/ethernet/intel/i40e/i40e_xsk.c: In function ‘i40e_run_xdp_zc’: drivers/net/ethernet/intel/i40e/i40e_xsk.c:217:3: warning: this statement may fall through [-Wimplicit-fallthrough=] bpf_warn_invalid_xdp_action(act); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/intel/i40e/i40e_xsk.c:218:2: note: here case XDP_ABORTED: ^~~~ Signed-off-by: "Gustavo A. R. Silva" <gustavo@embeddedor.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-04-01i40e: add tracking of AF_XDP ZC state for each queue pairBjörn Töpel1-0/+3
In commit f3fef2b6e1cc ("i40e: Remove umem from VSI") a regression was introduced; When the VSI was reset, the setup code would try to enable AF_XDP ZC unconditionally (as long as there was a umem placed in the netdev._rx struct). Here, we add a bitmap to the VSI that tracks if a certain queue pair has been "zero-copy enabled" via the ndo_bpf. The bitmap is used in i40e_xsk_umem, and enables zero-copy if and only if XDP is enabled, the corresponding qid in the bitmap is set and the umem is non-NULL. Fixes: f3fef2b6e1cc ("i40e: Remove umem from VSI") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+5
Three conflicts, one of which, for marvell10g.c is non-trivial and requires some follow-up from Heiner or someone else. The issue is that Heiner converted the marvell10g driver over to use the generic c45 code as much as possible. However, in 'net' a bug fix appeared which makes sure that a new local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0 is cleared. Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-21i40e: fix potential RX buffer starvation for AF_XDPMagnus Karlsson1-0/+5
When the RX rings are created they are also populated with buffers so that packets can be received. Usually these are kernel buffers, but for AF_XDP in zero-copy mode, these are user-space buffers and in this case the application might not have sent down any buffers to the driver at this point. And if no buffers are allocated at ring creation time, no packets can be received and no interrupts will be generated so the NAPI poll function that allocates buffers to the rings will never get executed. To rectify this, we kick the NAPI context of any queue with an attached AF_XDP zero-copy socket in two places in the code. Once after an XDP program has loaded and once after the umem is registered. This take care of both cases: XDP program gets loaded first then AF_XDP socket is created, and the reverse, AF_XDP socket is created first, then XDP program is loaded. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-02-15net: bpf: remove XDP_QUERY_XSK_UMEM enumeratorJan Sokolowski1-28/+0
Commit c9b47cc1fabc ("xsk: fix bug when trying to use both copy and zero-copy on one queue id") moved the umem query code to the AF_XDP core, and therefore removed the need to query the netdevice for a umem. This patch removes XDP_QUERY_XSK_UMEM and all code that implement that behavior, which is just dead code. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-01-22i40e: Remove umem from VSIJan Sokolowski1-92/+18
As current implementation of netdev already contains and provides umems for us, we no longer have the need to contain these structures in i40e_vsi. Refactor the code to operate on netdev-provided umems. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-13i40e: DRY rx_ptype handling codeMichał Mirosław1-5/+1
Move rx_ptype extracting to i40e_process_skb_fields() to avoid duplicating the code. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-12-13i40e: fix VLAN.TCI == 0 RX HW offloadMichał Mirosław1-5/+1
This fixes two bugs in hardware VLAN offload: 1. VLAN.TCI == 0 was being dropped 2. there was a race between disabling of VLAN RX feature in hardware and processing RX queue, where packets processed in this window could have their VLAN information dropped Fix moves the VLAN handling into i40e_process_skb_fields() to save on duplicated code. i40e_receive_skb() becomes trivial and so is removed. Signed-off-by: Michał Mirosław <michal.miroslaw@atendesoftware.pl> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-11-28i40e: fix kerneldoc for xsk methodsJan Sokolowski1-7/+7
One method, xsk_umem_setup, had an incorrect kernel doc description, which has been corrected. Also fixes small typos found in the comments. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25i40e: disallow changing the number of descriptors when AF_XDP is onBjörn Töpel1-0/+21
When an AF_XDP UMEM is attached to any of the Rx rings, we disallow a user to change the number of descriptors via e.g. "ethtool -G IFNAME". Otherwise, the size of the stash/reuse queue can grow unbounded, which would result in OOM or leaking userspace buffers. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25i40e: clean zero-copy XDP Rx ring on shutdown/resetBjörn Töpel1-8/+92
Outstanding Rx descriptors are temporarily stored on a stash/reuse queue. When/if the HW rings comes up again, entries from the stash are used to re-populate the ring. The latter required some restructuring of the allocation scheme for the AF_XDP zero-copy implementation. There is now a fast, and a slow allocation. The "fast allocation" is used from the fast-path and obtains free buffers from the fill ring and the internal recycle mechanism. The "slow allocation" is only used in ring setup, and obtains buffers from the fill ring and the stash (if any). Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-25i40e: clean zero-copy XDP Tx ring on shutdown/resetBjörn Töpel1-0/+30
When the zero-copy enabled XDP Tx ring is torn down, due to configuration changes, outstanding frames on the hardware descriptor ring are queued on the completion ring. The completion ring has a back-pressure mechanism that will guarantee that there is sufficient space on the ring. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2018-09-01xsk: i40e: get rid of useless struct xdp_umem_propsMagnus Karlsson1-2/+2
This commit gets rid of the structure xdp_umem_props. It was there to be able to break a dependency at one point, but this is no longer needed. The values in the struct are instead stored directly in the xdp_umem structure. This simplifies the xsk code as well as af_xdp zero-copy drivers and as a bonus gets rid of one internal header file. The i40e driver is also adapted to the new interface in this commit. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-09-01i40e: fix possible compiler warning in xsk TX pathMagnus Karlsson1-4/+2
With certain gcc versions, it was possible to get the warning "'tx_desc' may be used uninitialized in this function" for the i40e_xmit_zc. This was not possible, however this commit simplifies the code path so that this warning is no longer emitted. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-29i40e: add AF_XDP zero-copy Tx supportMagnus Karlsson1-0/+173
This patch adds zero-copy Tx support for AF_XDP sockets. It implements the ndo_xsk_async_xmit netdev ndo and performs all the Tx logic from a NAPI context. This means pulling egress packets from the Tx ring, placing the frames on the NIC HW descriptor ring and completing sent frames back to the application via the completion ring. The regular XDP Tx ring is used for AF_XDP as well. This rationale for this is as follows: XDP_REDIRECT guarantees mutual exclusion between different NAPI contexts based on CPU id. In other words, a netdev can XDP_REDIRECT to another netdev with a different NAPI context, since the operation is bound to a specific core and each core has its own hardware ring. As the AF_XDP Tx action is running in the same NAPI context and using the same ring, it will also be protected from XDP_REDIRECT actions with the exact same mechanism. As with AF_XDP Rx, all AF_XDP Tx specific functions are added to i40e_xsk.c. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-29i40e: add AF_XDP zero-copy Rx supportBjörn Töpel1-0/+661
This patch adds zero-copy Rx support for AF_XDP sockets. Instead of allocating buffers of type MEM_TYPE_PAGE_SHARED, the Rx frames are allocated as MEM_TYPE_ZERO_COPY when AF_XDP is enabled for a certain queue. All AF_XDP specific functions are added to a new file, i40e_xsk.c. Note that when AF_XDP zero-copy is enabled, the XDP action XDP_PASS will allocate a new buffer and copy the zero-copy frame prior passing it to the kernel stack. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>