path: root/drivers/net/ethernet/google/gve/gve_main.c
2024-05-07  gve: Implement queue api  (Shailend Chand, 1 file, -12/+165)
The new netdev queue api is implemented for gve. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Link: https://lore.kernel.org/all/20240501232549.1327174-11-shailend@google.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
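For context, the queue api referred to here is the per-queue management ops table from include/net/netdev_queues.h. The sketch below shows roughly what wiring a driver into it looks like; the gve-side callback names and the ring type used for the mem size are illustrative assumptions, not the exact code from this patch.

    /* Rough shape of the per-queue management hooks (gve-side names illustrative). */
    static const struct netdev_queue_mgmt_ops gve_queue_mgmt_ops_sketch = {
            .ndo_queue_mem_size  = sizeof(struct gve_rx_ring),
            .ndo_queue_mem_alloc = gve_rx_queue_mem_alloc,  /* allocate resources only */
            .ndo_queue_mem_free  = gve_rx_queue_mem_free,
            .ndo_queue_start     = gve_rx_queue_start,      /* install and make live */
            .ndo_queue_stop      = gve_rx_queue_stop,
    };

    /* registered once at probe time: */
    dev->queue_mgmt_ops = &gve_queue_mgmt_ops_sketch;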
2024-05-05  gve: Alloc and free QPLs with the rings  (Shailend Chand, 1 file, -257/+86)
Every tx and rx ring has its own queue-page-list (QPL) that serves as the bounce buffer. Previously we were allocating QPLs for all queues before the queues themselves were allocated and later associating a QPL with a queue. This is avoidable complexity: it is much more natural for each queue to allocate and free its own QPL. Moreover, the advent of new queue-manipulating ndo hooks makes it hard to keep things as is: we would need to transfer a QPL from an old queue to a new queue, and that is unpleasant. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-05  gve: Avoid rescheduling napi if on wrong cpu  (Shailend Chand, 1 file, -2/+31)
In order to make possible the implementation of per-queue ndo hooks, gve_turnup was changed in a previous patch to account for queues already having some unprocessed descriptors: it does a one-off napi_schedule to handle them. If conditions of consistent high traffic persist in the immediate aftermath of this, the poll routine for a queue can be "stuck" on the cpu on which the ndo hooks ran, instead of the cpu its irq has affinity with. This situation is exacerbated by the fact that the ndo hooks for all the queues are invoked on the same cpu, potentially causing all the napi poll routines to reside on the same cpu. A self-correcting mechanism in the poll method itself solves this problem. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
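A minimal sketch of such a self-correcting check in the poll routine, assuming a helper that knows the irq's affinity mask; the re-arm helper and the mask variable are hypothetical, not gve's exact code:

    if (reschedule) {
            /* Already on a cpu in the irq's affinity mask: keep polling here. */
            if (cpumask_test_cpu(raw_smp_processor_id(), irq_affinity_mask))
                    return budget;
            /* Otherwise complete napi and re-arm the irq, so the next interrupt
             * reschedules napi on the cpu the irq is affined to.
             */
            if (napi_complete_done(napi, min(work_done, budget - 1)))
                    gve_write_irq_doorbell(priv, block);    /* hypothetical re-arm */
            return min(work_done, budget - 1);
    }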
2024-05-05  gve: Make gve_turnup work for nonempty queues  (Shailend Chand, 1 file, -0/+14)
gVNIC has a requirement that all queues have to be quiesced before any queue is operated on (created or destroyed). To enable the implementation of future ndo hooks that work on a single queue, we need to evolve gve_turnup to account for queues already having some unprocessed descriptors in the ring. Say rxq 4 is being stopped and started via the queue api. Due to gve's requirement of quiescence, queues 0 through 3 are not processing their rings while queue 4 is being toggled. Once they are made live, these queues need to be poked to cause them to check their rings for descriptors that were written during their brief period of quiescence. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-05  gve: Make gve_turn(up|down) ignore stopped queues  (Shailend Chand, 1 file, -0/+10)
Currently the queues are either all live or all dead, toggling from one state to the other via the ndo open and stop hooks. The future addition of single-queue ndo hooks changes this, and thus gve_turnup and gve_turndown should evolve to account for a state where some queues are live and some aren't. Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-19  gve: Remove qpl_cfg struct since qpl_ids map with queues respectively  (Ziwei Xiao, 1 file, -37/+1)
The qpl_cfg struct was used to make sure that no two different queues are using a QPL with the same qpl_id. We can remove the qpl_cfg struct since the qpl_ids now map to the queues as follows: for tx queues, qpl_id = tx_qid; for rx queues, qpl_id = max_tx_queues + rx_qid. And when XDP is used, the user needs to reduce the tx queues to at most half of max_tx_queues. XDP then uses the same number of tx queues, starting from the end of the existing tx queues. So the XDP queues will not exceed the max_tx_queues range and will not overlap with the rx queues, and so the qpl_ids will not overlap either. Given that, we remove the qpl_cfg struct and get the qpl_id directly based on the queue id. Unless we are erroneously allocating an rx/tx queue that has already been allocated, we would never allocate the qpl with the same qpl_id twice. In that case, it should fail much earlier than the QPL assignment. Suggested-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Shailend Chand <shailend@google.com> Link: https://lore.kernel.org/r/20240417205757.778551-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
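The mapping described above reduces to two one-line helpers, roughly as below; the helper names are illustrative, not necessarily the ones used in the patch:

    static u32 example_tx_qpl_id(u32 tx_qid)
    {
            return tx_qid;
    }

    static u32 example_rx_qpl_id(u32 max_tx_queues, u32 rx_qid)
    {
            return max_tx_queues + rx_qid;
    }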
2024-04-03  gve: add support to change ring size via ethtool  (Harshitha Ramamurthy, 1 file, -8/+8)
Allow the user to change ring size via ethtool if supported by the device. The driver relies on the ring size ranges queried from device to validate ring sizes requested by the user. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-03  gve: set page count for RX QPL for GQI and DQO queue formats  (Harshitha Ramamurthy, 1 file, -5/+9)
Fulfill the requirement that for GQI, the number of pages per RX QPL is equal to the ring size. Set this value to be equal to the ring size. Because of this change, the rx_data_slot_cnt and rx_pages_per_qpl fields stored in the priv structure are not needed, so remove their usage. And for DQO, the number of pages per RX QPL is more than the ring size to account for out-of-order completions, so set it to two times the rx ring size. Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
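The sizing rule amounts to something like the following sketch; the helper name is hypothetical:

    static u32 example_rx_pages_per_qpl(bool is_dqo, u32 rx_ring_size)
    {
            /* GQI: one page per ring entry; DQO: 2x to absorb out-of-order completions. */
            return is_dqo ? 2 * rx_ring_size : rx_ring_size;
    }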
2024-03-05  net: introduce page_frag_cache_drain()  (Yunsheng Lin, 1 file, -9/+2)
When draining a page_frag_cache, most users follow the same steps, so introduce an API to avoid code duplication. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
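The duplicated pattern being factored out looks roughly like this, assuming the existing page_frag_cache helpers; exact call sites vary per driver:

    /* before: each driver open-coded the drain */
    if (nc->va) {
            __page_frag_cache_drain(virt_to_head_page(nc->va), nc->pagecnt_bias);
            nc->va = NULL;
    }

    /* after: a single helper call */
    page_frag_cache_drain(nc);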
2024-03-04  gve: Add header split data path  (Jeroen de Borst, 1 file, -0/+57)
Add header buffers and ethtool support to enable header split via the tcp-data-split flag in ethtool's ringparam config. Coherent DMA memory is allocated for the header buffers. There is one header buffer per ring entry, located by computing an offset from the header buffers' starting address. The header buffer is always copied directly into the skb and the payload is always added as frags. When there is a header buffer overflow or the header length is 0, the driver places the whole unsplit packet in frags. When toggling header split, the driver will call gve_adjust_config to set its queues appropriately. If header split is enabled by the user and the max packet buffer size is no less than 4KB, the driver will set the packet buffer size to 4KB to support TCP_ZEROCOPY_RECEIVE. Otherwise the driver will use the default 2KB as the packet buffer size. `ethtool -G <dev> tcp-data-split on/off` is the command to toggle header split. `ethtool -g <dev>` will show the status of header split in the `tcp-data-split` field. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
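Locating a ring entry's header buffer inside the single coherent allocation is just an offset calculation, roughly as below; the field names here are hypothetical:

    /* one coherent DMA region, one fixed-size header buffer per ring entry */
    void *hdr_buf           = rx->hdr_bufs_va  + idx * priv->header_buf_size;
    dma_addr_t hdr_buf_addr = rx->hdr_bufs_dma + idx * priv->header_buf_size;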
2024-03-04  gve: Add header split device option  (Jeroen de Borst, 1 file, -6/+2)
To enable header split via ethtool, we first need to query the device to get the max rx buffer size and header buffer size. Add a device option to get these values and store them in the driver. If the header buffer size received from the device is non-zero, it means header split is supported by the device. Currently the max rx buffer size is only used when header split is enabled, in which case data_buffer_size_dqo is set to the max rx buffer size. Also change data_buffer_size_dqo from int to u16, since we are modifying it and to make it consistent with max_rx_buffer_size. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-24  gve: Alloc before freeing when changing features  (Shailend Chand, 1 file, -20/+21)
Previously, existing queues were being freed before the resources for the new queues were being allocated. This would take down the interface if someone were to attempt to change feature flags under a resource crunch. Signed-off-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20240122182632.1102721-7-shailend@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24  gve: Alloc before freeing when adjusting queues  (Shailend Chand, 1 file, -22/+67)
Previously, existing queues were being freed before the resources for the new queues were being allocated. This would take down the interface if someone were to attempt to change queue counts under a resource crunch. Signed-off-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20240122182632.1102721-6-shailend@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
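The two "Alloc before freeing" entries above describe the same rough flow: allocate resources for the new configuration first, so a failure leaves the running queues untouched. A sketch with hypothetical function names:

    err = example_alloc_new_queue_resources(priv, &new_cfg, &new_res);
    if (err)
            return err;                     /* old queues keep serving traffic */
    example_stop_and_free_old_queues(priv);
    example_install_and_start_queues(priv, &new_cfg, &new_res);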
2024-01-24  gve: Refactor gve_open and gve_close  (Shailend Chand, 1 file, -40/+119)
gve_open is rewritten to be composed of two funcs: gve_queues_mem_alloc and gve_queues_start. The former only allocates queue resources without doing anything to install the queues, which is taken up by the latter. Similarly gve_close is split into gve_queues_stop and gve_queues_mem_free. Separating queue resource allocation from making the queues live helps with subsequent changes that aim to not take down the datapath when applying new configurations. Signed-off-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Link: https://lore.kernel.org/r/20240122182632.1102721-5-shailend@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
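Using the function names given above, the post-refactor structure is roughly as follows; argument lists are simplified and illustrative:

    static int gve_open(struct net_device *dev)
    {
            struct gve_priv *priv = netdev_priv(dev);
            int err;

            err = gve_queues_mem_alloc(priv);   /* only allocates resources */
            if (err)
                    return err;
            return gve_queues_start(priv);      /* installs queues, makes them live */
    }

    static int gve_close(struct net_device *dev)
    {
            struct gve_priv *priv = netdev_priv(dev);

            gve_queues_stop(priv);
            gve_queues_mem_free(priv);
            return 0;
    }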
2024-01-24  gve: Switch to config-aware queue allocation  (Shailend Chand, 1 file, -249/+352)
The new config-aware functions will help achieve the goal of being able to allocate resources for new queues while there already are active queues serving traffic. These new functions work off of arbitrary queue allocation configs rather than just the currently active config in priv, and they return the newly allocated resources instead of writing them into priv. Signed-off-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240122182632.1102721-4-shailend@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24  gve: Refactor napi add and remove functions  (Shailend Chand, 1 file, -17/+3)
This change makes the napi poll functions non-static and moves the gve_(add|remove)_napi functions to gve_utils.c, to make possible future "start queue" hooks in the datapath files. Signed-off-by: Shailend Chand <shailend@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240122182632.1102721-3-shailend@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29  gve: Remove dependency on 4k page size.  (John Fraker, 1 file, -2/+2)
Prior to this change, gve crashes when attempting to run in kernels with page sizes other than 4k. This change removes unnecessary references to PAGE_SIZE and replaces them with more meaningful constants. Signed-off-by: Jordan Kimbrough <jrkim@google.com> Signed-off-by: John Fraker <jfraker@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231128002648.320892-6-jfraker@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-17  gve: add gve_features_check()  (Eric Dumazet, 1 file, -0/+13)
It is suboptimal to attempt skb linearization from ndo_start_xmit() if a gso skb has pathological layout, or if host stack does not have access to the payload (TCP direct). Linearization of large skbs can also fail under memory pressure. We should instead have an ndo_features_check() so that we can fallback to GSO, which is supported even for TCP direct, and generally much more efficient (no payload copy). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Bailey Forrest <bcf@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Jeroen de Borst <jeroendb@google.com> Cc: Praveen Kaligineedi <pkaligineedi@google.com> Cc: Shailend Chand <shailend@google.com> Cc: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
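The ndo_features_check hook has the signature below. A minimal sketch of the idea is to mask out GSO so the stack segments the skb in software instead of the driver linearizing it; the layout predicate is hypothetical, standing in for the driver's real criteria:

    static netdev_features_t gve_features_check_sketch(struct sk_buff *skb,
                                                       struct net_device *dev,
                                                       netdev_features_t features)
    {
            /* hypothetical check for layouts the hardware cannot transmit as-is */
            if (!example_skb_layout_ok(skb))
                    features &= ~NETIF_F_GSO_MASK;
            return features;
    }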
2023-11-15  gve: Fixes for napi_poll when budget is 0  (Ziwei Xiao, 1 file, -1/+7)
Netpoll will explicitly pass the polling call a budget of 0 to indicate it's clearing the Tx path only. gve_rx_poll and gve_xdp_poll were mistakenly taking a 0 budget as an indication to do all the work. Add a check to avoid the rx path and xdp path being called when budget is 0, and also avoid calling napi_complete_done when budget is 0 for netpoll. Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Link: https://lore.kernel.org/r/20231114004144.2022268-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
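In essence the fix is a pair of budget checks in the poll routine, roughly like this simplified sketch (variable names illustrative):

    if (budget)                     /* rx/xdp work only when budget > 0 */
            work_done = gve_rx_poll(block, budget);

    if (!budget)                    /* netpoll tx-only pass: never complete napi */
            return 0;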
2023-10-12  netdev: replace napi_reschedule with napi_schedule  (Christian Marangi, 1 file, -1/+1)
Now that napi_schedule returns a bool, we can drop napi_reschedule, which does exactly the same thing. The function comes from a very old commit bfe13f54f502 ("ibm_emac: Convert to use napi_struct independent of struct net_device") and its purpose is effectively deprecated in favour of different logic. Convert every user of napi_reschedule to napi_schedule. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Acked-by: Jeff Johnson <quic_jjohnson@quicinc.com> # ath10k Acked-by: Nick Child <nnac123@linux.ibm.com> # ibm Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for can/dev/rx-offload.c Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/r/20231009133754.9834-3-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
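For gve this is the usual mechanical one-liner (the napi struct lives in the notify block):

    /* before */  napi_reschedule(&block->napi);
    /* after  */  napi_schedule(&block->napi);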
2023-09-17  gve: Use size_add() in call to struct_size()  (Gustavo A. R. Silva, 1 file, -1/+1)
If, for any reason, `tx_stats_num + rx_stats_num` wraps around, the protection that struct_size() adds against potential integer overflows is defeated. Fix this by hardening call to struct_size() with size_add(). Fixes: 691f4077d560 ("gve: Replace zero-length array with flexible-array member") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
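The hardened call described above looks roughly like this, with field names per gve's stats report as best understood:

    priv->stats_report_len = struct_size(priv->stats_report, stats,
                                         size_add(tx_stats_num, rx_stats_num));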
2023-08-06  gve: Control path for DQO-QPL  (Rushil Gupta, 1 file, -6/+14)
GVE supports QPL ("queue-page-list") mode, where all data is communicated through a set of pre-registered pages. Add this mode to the DQO descriptor format. Add checks, ABI changes and device options to support QPL mode for DQO in addition to GQI. Also, use the pages-per-qpl value supplied by the device option to control the size of the "queue-page-list". Signed-off-by: Rushil Gupta <rushilg@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-07-10  gve: unify driver name usage  (Junfeng Guo, 1 file, -5/+6)
The current codebase used two different names for this driver (`gvnic` and `gve`), which is unfriendly for users, especially when trying to bind or unbind the driver manually. The corresponding kernel module is registered with the name `gve`, so it's more reasonable to align the driver name with the module. Fixes: 893ce44df565 ("gve: Add basic driver framework for Compute Engine Virtual NIC") Cc: csully@google.com Signed-off-by: Junfeng Guo <junfeng.guo@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24  gve: Support IPv6 Big TCP on DQ  (Coco Li, 1 file, -0/+5)
Add support for using IPv6 Big TCP on DQ which can handle large TSO/GRO packets. See https://lwn.net/Articles/895398/. This can improve the throughput and CPU usage.

Perf test result:
  ip -d link show $DEV
    gso_max_size 185000 gso_max_segs 65535 tso_max_size 262143 tso_max_segs 65535 gro_max_size 185000

For performance, tested with neper using 9k MTU on hardware that supports 200Gb/s line rate. In single streams when line rate is not saturated, we expect throughput improvements. When the networking is performing at line rate, we expect cpu usage improvements.

Tcp_stream (unidirectional stream test, T=thread, F=flow):
  skb=180kb, T=1,  F=1,   no zerocopy:  throughput average=64576.88 Mb/s,  sender stime=8.3,   receiver stime=10.68
  skb=64kb,  T=1,  F=1,   no zerocopy:  throughput average=64862.54 Mb/s,  sender stime=9.96,  receiver stime=12.67
  skb=180kb, T=1,  F=1,   yes zerocopy: throughput average=146604.97 Mb/s, sender stime=10.61, receiver stime=5.52
  skb=64kb,  T=1,  F=1,   yes zerocopy: throughput average=131357.78 Mb/s, sender stime=12.11, receiver stime=12.25
  skb=180kb, T=20, F=100, no zerocopy:  throughput average=182411.37 Mb/s, sender stime=41.62, receiver stime=79.4
  skb=64kb,  T=20, F=100, no zerocopy:  throughput average=182892.02 Mb/s, sender stime=57.39, receiver stime=72.69
  skb=180kb, T=20, F=100, yes zerocopy: throughput average=182337.65 Mb/s, sender stime=27.94, receiver stime=39.7
  skb=64kb,  T=20, F=100, yes zerocopy: throughput average=182144.20 Mb/s, sender stime=47.06, receiver stime=39.01

Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Coco Li <lixiaoyan@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230522201552.3585421-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10  gve: Remove the code of clearing PBA bit  (Ziwei Xiao, 1 file, -13/+0)
Clearing the PBA bit from the driver is race prone and it may lead to dropped interrupt events. This could potentially lead to the traffic being completely halted. Fixes: 5e8c5adf95f8 ("gve: DQO: Add core netdev features") Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Bailey Forrest <bcf@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17  gve: Add AF_XDP zero-copy support for GQI-QPL format  (Praveen Kaligineedi, 1 file, -1/+173)
Adding AF_XDP zero-copy support. Note: Although these changes support AF_XDP socket in zero-copy mode, there is still a copy happening within the driver between XSK buffer pool and QPL bounce buffers in GQI-QPL format. In GQI-QPL queue format, the driver needs to allocate a fixed size memory, the size specified by vNIC device, for RX/TX and register this memory as a bounce buffer with the vNIC device when a queue is created. The number of pages in the bounce buffer is limited and the pages need to be made available to the vNIC by copying the RX data out to prevent head-of-line blocking. Therefore, we cannot pass the XSK buffer pool to the vNIC. The number of copies on RX path from the bounce buffer to XSK buffer is 2 for AF_XDP copy mode (bounce buffer -> allocated page frag -> XSK buffer) and 1 for AF_XDP zero-copy mode (bounce buffer -> XSK buffer). This patch contains the following changes: 1) Enable and disable XSK buffer pool 2) Copy XDP packets from QPL bounce buffers to XSK buffer on rx 3) Copy XDP packets from XSK buffer to QPL bounce buffers and ring the doorbell as part of XDP TX napi poll 4) ndo_xsk_wakeup callback support Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17  gve: Add XDP REDIRECT support for GQI-QPL format  (Praveen Kaligineedi, 1 file, -0/+19)
This patch contains the following changes: 1) Support for XDP REDIRECT action on rx 2) ndo_xdp_xmit callback support In GQI-QPL queue format, the driver needs to allocate a fixed size memory, the size specified by vNIC device, for RX/TX and register this memory as a bounce buffer with the vNIC device when a queue is created. The number of pages in the bounce buffer is limited and the pages need to be made available to the vNIC by copying the RX data out to prevent head-of-line blocking. The XDP_REDIRECT packets are therefore immediately copied to a newly allocated page. Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17  gve: Add XDP DROP and TX support for GQI-QPL format  (Praveen Kaligineedi, 1 file, -19/+403)
Add support for XDP PASS, DROP and TX actions. This patch contains the following changes: 1) Support installing/uninstalling XDP program 2) Add dedicated XDP TX queues 3) Add support for XDP DROP action 4) Add support for XDP TX action Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17  gve: Changes to add new TX queues  (Praveen Kaligineedi, 1 file, -23/+60)
Changes to enable adding and removing TX queues without calling gve_close() and gve_open(). Made the following changes: 1) priv->tx, priv->rx and priv->qpls arrays are allocated based on max tx queues and max rx queues 2) Changed gve_adminq_create_tx_queues(), gve_adminq_destroy_tx_queues(), gve_tx_alloc_rings() and gve_tx_free_rings() functions to add/remove a subset of TX queues rather than all the TX queues. Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-17  gve: XDP support GQI-QPL: helper function changes  (Praveen Kaligineedi, 1 file, -11/+16)
This patch adds/modifies helper functions needed to add XDP support. Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06  gve: Fix gve interrupt names  (Praveen Kaligineedi, 1 file, -5/+4)
IRQs are currently requested before the netdevice is registered and a proper name is assigned to the device. Change the interrupt name to avoid using the netdev format string in the name. Interrupt name before change: eth%d-ntfy-block.<blk_id> Interrupt name after change: gve-ntfy-blk<blk_id>@pci:<pci_name> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
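The new name is built from the PCI device name rather than the netdev format string, along the lines of the sketch below; the notify-block name field is an assumption:

    snprintf(block->name, sizeof(block->name), "gve-ntfy-blk%u@pci:%s",
             i, pci_name(priv->pdev));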
2022-11-21  gve: Adding a new AdminQ command to verify driver  (Jeroen de Borst, 1 file, -0/+52)
Check whether the driver is compatible with the device presented. Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-29  net: Remove the obsolete u64_stats_fetch_*_irq() users (drivers).  (Thomas Gleixner, 1 file, -6/+6)
Now that the 32bit UP oddity is gone and 32bit always uses a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
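The conversion is purely mechanical: the _irq suffix is dropped on both ends of the retry loop, for example (gve ring field names assumed):

    do {
            start   = u64_stats_fetch_begin(&priv->tx[ring].statss);
            packets = priv->tx[ring].pkt_done;
            bytes   = priv->tx[ring].bytes_done;
    } while (u64_stats_fetch_retry(&priv->tx[ring].statss, start));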
2022-09-29  net: drop the weight argument from netif_napi_add  (Jakub Kicinski, 1 file, -2/+1)
We tell driver developers to always pass NAPI_POLL_WEIGHT as the weight to netif_napi_add(). This may be confusing to newcomers, so drop the weight argument; those who really need to tweak the weight can use netif_napi_add_weight(). Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
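After this change the call drops to three arguments; drivers that genuinely need a non-default weight switch to the _weight variant. A sketch (poll function name illustrative):

    netif_napi_add(priv->dev, &block->napi, gve_poll);
    /* only if a custom weight is really needed: */
    netif_napi_add_weight(priv->dev, &block->napi, gve_poll, weight);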
2022-08-29  net: Use u64_stats_fetch_begin_irq() for stats fetch.  (Sebastian Andrzej Siewior, 1 file, -6/+6)
On 32bit-UP u64_stats_fetch_begin() disables only preemption. If the reader is in preemptible context and the writer side (u64_stats_update_begin*()) runs in an interrupt context (IRQ or softirq) then the writer can update the stats during the read operation. This update remains undetected. Use u64_stats_fetch_begin_irq() to ensure the stats fetch on 32bit-UP are not interrupted by a writer. 32bit-SMP remains unaffected by this change. Cc: "David S. Miller" <davem@davemloft.net> Cc: Catherine Sullivan <csully@google.com> Cc: David Awogbemila <awogbemila@google.com> Cc: Dimitris Michailidis <dmichail@fungible.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Hans Ulli Kroll <ulli.kroll@googlemail.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jeroen de Borst <jeroendb@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <simon.horman@corigine.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-wireless@vger.kernel.org Cc: netdev@vger.kernel.org Cc: oss-drivers@corigine.com Cc: stable@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16  gve: enhance no queue page list detection  (Haiyue Wang, 1 file, -4/+2)
The commit a5886ef4f4bf ("gve: Introduce per netdev `enum gve_queue_format`") introduces three queue format types; only the GVE_GQI_QPL_FORMAT queue has a page list. So the queue page list number should be used to detect a zero-size queue page list. Correct the design logic. Using 'queue_format == GVE_GQI_RDA_FORMAT' may lead to a zero-sized memory allocation request, for example if the queue format is GVE_DQO_RDA_FORMAT. The kernel memory subsystem will return ZERO_SIZE_PTR, which is not a NULL address, so the driver can run successfully. Also the code still checks the queue page list number first, then accesses the allocated memory, so a zero-count queue page list allocation will not lead to an access fault. Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Reviewed-by: Bailey Forrest <bcf@google.com> Link: https://lore.kernel.org/r/20220215051751.260866-1-haiyue.wang@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-27  gve: Fix GFP flags when allocing pages  (Catherine Sullivan, 1 file, -3/+3)
Use GFP_ATOMIC when allocating pages in the hotpath; continue to use GFP_KERNEL when allocating pages during setup. GFP_KERNEL allows blocking, which lets the allocation succeed more often in a low-memory environment, but in the hotpath we do not want to allow the allocation to block. Fixes: f5cedc84a30d2 ("gve: Add transmit and receive support") Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David Awogbemila <awogbemila@google.com> Link: https://lore.kernel.org/r/20220126003843.3584521-1-awogbemila@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
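The distinction in practice, as a sketch rather than the exact gve call sites:

    struct page *page;

    page = alloc_page(GFP_KERNEL);  /* queue setup: blocking is fine, succeeds more often */
    page = alloc_page(GFP_ATOMIC);  /* rx refill in the napi/hot path: must not sleep */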
2021-12-16  gve: Add tx|rx-coalesce-usec for DQO  (Tao Liu, 1 file, -6/+9)
Adding ethtool support for changing rx-coalesce-usec and tx-coalesce-usec when using the DQO queue format. Signed-off-by: Tao Liu <xliutaox@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16  gve: Implement suspend/resume/shutdown  (Catherine Sullivan, 1 file, -0/+57)
Add support for suspend, resume and shutdown. Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16  gve: Update gve_free_queue_page_list signature  (Catherine Sullivan, 1 file, -2/+1)
The id field should be a u32 not a signed int. Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16  gve: Move the irq db indexes out of the ntfy block struct  (Catherine Sullivan, 1 file, -11/+25)
Giving the device access to other kernel structs is not ideal. Move the indexes into their own array and just keep pointers to them in the ntfy block struct. Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-09  gve: Fix off by one in gve_tx_timeout()  (Dan Carpenter, 1 file, -1/+1)
The priv->ntfy_blocks[] array has "priv->num_ntfy_blks" elements, so this '>' needs to be '>=' to prevent an off-by-one bug. The priv->ntfy_blocks[] array is allocated in gve_alloc_notify_blocks(). Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
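The fix in essence, since valid indexes run 0..num_ntfy_blks-1 (variable name assumed from the message above):

    if (ntfy_idx >= priv->num_ntfy_blks)    /* was: '>' */
            return;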
2021-10-25  gve: Implement packet continuation for RX.  (David Awogbemila, 1 file, -8/+0)
This enables the driver to receive RX packets spread across multiple buffers: For a given multi-fragment packet the "packet continuation" bit is set on all descriptors except the last one. These descriptors' payloads are combined into a single SKB before the SKB is handed to the networking stack. This change adds a "packet buffer size" notion for RX queues. The CreateRxQueue AdminQueue command sent to the device now includes the packet_buffer_size. We opt for a packet_buffer_size of PAGE_SIZE / 2 to give the driver the opportunity to flip pages where we can instead of copying. Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12  gve: Recover from queue stall due to missed IRQ  (John Fraker, 1 file, -1/+47)
Don't always reset the driver on a TX timeout. Attempt to recover by kicking the queue in case an IRQ was missed. Fixes: 9e5f7d26a4c08 ("gve: Add workqueue and reset support") Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12  gve: Do lazy cleanup in TX path  (Tao Liu, 1 file, -3/+3)
When the TX queue is full, attempt to process enough TX completions to avoid stalling the queue. Fixes: f5cedc84a30d2 ("gve: Add transmit and receive support") Signed-off-by: Tao Liu <xliutaox@google.com> Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-12  gve: Switch to use napi_complete_done  (Yangchun Fu, 1 file, -16/+22)
Use napi_complete_done to allow for the use of gro_flush_timeout. Fixes: f5cedc84a30d2 ("gve: Add transmit and receive support") Signed-off-by: Yangchun Fu <yangchun@google.com> Signed-off-by: Catherine Sullivan <csully@google.com> Signed-off-by: David Awogbemila <awogbemila@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski, 1 file, -16/+29)
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-06  gve: report 64bit tx_bytes counter from gve_handle_report_stats()  (Eric Dumazet, 1 file, -2/+3)
Each tx queue maintains a 64bit counter for bytes; there is no reason to truncate this to 32bit (or at least it has not been documented). Fixes: 24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yangchun Fu <yangchun@google.com> Cc: Kuo Zhao <kuozhao@google.com> Cc: David Awogbemila <awogbemila@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-06  gve: fix gve_get_stats()  (Eric Dumazet, 1 file, -4/+9)
gve_get_stats() can report wrong numbers if/when u64_stats_fetch_retry() returns true. What is needed here is to sample values in temporary variables, and only use them after each loop is ended. Fixes: f5cedc84a30d ("gve: Add transmit and receive support") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Catherine Sullivan <csully@google.com> Cc: Sagi Shahar <sagis@google.com> Cc: Jon Olson <jonolson@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Luigi Rizzo <lrizzo@google.com> Cc: Jeroen de Borst <jeroendb@google.com> Cc: Tao Liu <xliutaox@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
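The corrected sampling pattern reads each counter into locals inside the retry loop and only accumulates them once the loop exits cleanly, roughly as below; the gve ring field names are assumptions:

    do {
            start   = u64_stats_fetch_begin(&priv->rx[ring].statss);
            packets = priv->rx[ring].rpackets;
            bytes   = priv->rx[ring].rbytes;
    } while (u64_stats_fetch_retry(&priv->rx[ring].statss, start));
    s->rx_packets += packets;
    s->rx_bytes   += bytes;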
2021-10-06  gve: Avoid freeing NULL pointer  (Tao Liu, 1 file, -10/+17)
Prevent possible crashes when cleaning up after unsuccessful initializations. Fixes: 893ce44df5658 ("gve: Add basic driver framework for Compute Engine Virtual NIC") Signed-off-by: Tao Liu <xliutaox@google.com> Signed-off-by: Catherine Sully <csully@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>