author    David S. Miller <davem@davemloft.net> 2021-02-14 01:32:04 +0300
committer David S. Miller <davem@davemloft.net> 2021-02-14 01:32:04 +0300
commit    c4762993129f48f5f5e233f09c246696815ef263
tree      69fbf812f67550da5b4ed3781bc650bbba4c2a0e /include/linux/skbuff.h
parent    773dc50d71690202afd7b5017c060c6ca8c75dd9
parent    9243adfc311a20371c3f4d8eaf0af4b135e6fac3
Merge branch 'skbuff-introduce-skbuff_heads-bulking-and-reusing'

Alexander Lobakin says:

====================
skbuff: introduce skbuff_heads bulking and reusing

Currently, all sorts of skb allocation always allocate skbuff_heads
one by one via kmem_cache_alloc().
On the other hand, we have the percpu napi_alloc_cache to store
skbuff_heads queued up for freeing and flush them in bulk.

We can use this cache not only for bulk-wiping, but also to obtain
heads for new skbs and avoid unconditional allocations, as well as
for bulk-allocating (like XDP's cpumap code and the veth driver
already do).

As this might affect latencies, cache pressure and lots of hardware
and driver-dependent stuff, this new feature is mostly optional and
can be used via:
 - a new napi_build_skb() function (as a replacement for build_skb());
 - existing {,__}napi_alloc_skb() and napi_get_frags() functions;
 - __alloc_skb() with SKB_ALLOC_NAPI passed in flags.

iperf3 showed 35-70 Mbps bumps for both TCP and UDP while performing
VLAN NAT on a 1.2 GHz MIPS board. The boost is likely to be bigger on
more powerful hosts and NICs with tens of Mpps.

Note on skbuff_heads from distant slabs or pfmemalloc'ed slabs:
 - kmalloc()/kmem_cache_alloc() itself allows by default allocating
   memory from the remote nodes to defragment their slabs. This is
   controlled by sysctl, but given this, an skbuff_head from a remote
   node is an OK case;
 - The easiest way to check if the slab of an skbuff_head is remote or
   pfmemalloc'ed is:

	if (!dev_page_is_reusable(virt_to_head_page(skb)))
		/* drop it */;

   ...*but*, given that most slabs are built of compound pages,
   virt_to_head_page() will hit the unlikely branch on every single
   call. This check cost at least 20 Mbps in test scenarios and it
   seems like it'd be better to _not_ do this.

Since v5 [4]:
 - revert flags-to-bool conversion and simplify flags testing in
   __alloc_skb() (Alexander Duyck).

Since v4 [3]:
 - rebase on top of net-next and address kernel build robot issue;
 - reorder checks a bit in __alloc_skb() to make the new condition
   even more harmless.

Since v3 [2]:
 - make the feature mostly optional, so driver developers can decide
   whether to use it or not (Paolo Abeni).
   This reuses the old flag for __alloc_skb() and introduces
   a new napi_build_skb();
 - reduce bulk-allocation size from 32 to 16 elements (also Paolo).
   This equals the value of XDP's devmap and veth batch processing
   (which were tested a lot) and should be sane enough;
 - don't waste cycles on an explicit in_serving_softirq() check.

Since v2 [1]:
 - also cover {,__}alloc_skb() and {,__}build_skb() cases (became
   handy after the changes that pass tiny skb requests to the kmalloc
   layer);
 - cover the cache with KASAN instrumentation (suggested by Eric
   Dumazet, with help from Dmitry Vyukov);
 - completely drop redundant __kfree_skb_flush() (also Eric);
 - lots of code cleanups;
 - expand the commit message with NUMA and pfmemalloc points (Jakub).

Since v1 [0]:
 - use one unified cache instead of two separate ones to greatly
   simplify the logic and reduce hotpath overhead (Edward Cree);
 - new: also recycle GRO_MERGED_FREE skbs instead of freeing them
   immediately;
 - correct performance numbers after optimizations and performing
   lots of tests for different use cases.

[0] https://lore.kernel.org/netdev/20210111182655.12159-1-alobakin@pm.me
[1] https://lore.kernel.org/netdev/20210113133523.39205-1-alobakin@pm.me
[2] https://lore.kernel.org/netdev/20210209204533.327360-1-alobakin@pm.me
[3] https://lore.kernel.org/netdev/20210210162732.80467-1-alobakin@pm.me
[4] https://lore.kernel.org/netdev/20210211185220.9753-1-alobakin@pm.me
====================

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
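
The bulking-and-reusing idea from the cover letter can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the kernel implementation: `struct head`, `struct head_cache`, `cache_get()`, `cache_put()` and `cache_destroy()` are hypothetical names, `malloc()`/`free()` stand in for the slab allocator, the capacity of 64 and the flush-half policy are assumptions, and only the bulk size of 16 comes from the message above.

```c
#include <assert.h>
#include <stdlib.h>

#define CACHE_SIZE 64               /* assumed capacity, illustrative only */
#define CACHE_BULK 16               /* bulk-allocation size from the cover letter */
#define CACHE_HALF (CACHE_SIZE / 2) /* assumed flush threshold */

/* Hypothetical stand-in for an skbuff_head. */
struct head {
	char pad[64];
};

/* Hypothetical per-CPU style cache of free heads. */
struct head_cache {
	unsigned int count;             /* heads currently cached */
	unsigned long refills;          /* bulk refills performed */
	unsigned long flushes;          /* bulk flushes performed */
	struct head *slots[CACHE_SIZE];
};

/* Take a head from the cache; refill in one bulk when it is empty. */
static struct head *cache_get(struct head_cache *c)
{
	if (c->count == 0) {
		for (unsigned int i = 0; i < CACHE_BULK; i++) {
			struct head *h = malloc(sizeof(*h));

			if (!h)
				break;
			c->slots[c->count++] = h;
		}
		if (c->count == 0)
			return NULL;    /* allocation failed entirely */
		c->refills++;
	}
	return c->slots[--c->count];
}

/* Queue a head for reuse; free the upper half in one bulk when full. */
static void cache_put(struct head_cache *c, struct head *h)
{
	assert(c->count < CACHE_SIZE);
	c->slots[c->count++] = h;
	if (c->count == CACHE_SIZE) {
		for (unsigned int i = CACHE_HALF; i < CACHE_SIZE; i++)
			free(c->slots[i]);
		c->count = CACHE_HALF;
		c->flushes++;
	}
}

/* Release everything still held by the cache. */
static void cache_destroy(struct head_cache *c)
{
	while (c->count)
		free(c->slots[--c->count]);
}
```

In this sketch a head returned through `cache_put()` is handed out again by the next `cache_get()` without touching the allocator, which is the reuse path the series describes. The allocator is only reached once per 16 heads on refill and once per 32 heads on flush.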
Diffstat (limited to 'include/linux/skbuff.h')
-rw-r--r-- include/linux/skbuff.h | 4
1 files changed, 3 insertions, 1 deletions
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0a4e91a2f873..6d0a33d1c0db 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1087,6 +1087,8 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size);
 struct sk_buff *build_skb_around(struct sk_buff *skb,
 				 void *data, unsigned int frag_size);
 
+struct sk_buff *napi_build_skb(void *data, unsigned int frag_size);
+
 /**
  * alloc_skb - allocate a network buffer
  * @size: size to allocate
@@ -2919,7 +2921,7 @@ static inline struct sk_buff *napi_alloc_skb(struct napi_struct *napi,
 }
 
 void napi_consume_skb(struct sk_buff *skb, int budget);
-void __kfree_skb_flush(void);
+void napi_skb_free_stolen_head(struct sk_buff *skb);
 void __kfree_skb_defer(struct sk_buff *skb);
 
 /**