author: Alexander Lobakin <aleksander.lobakin@intel.com> 2023-08-04 21:05:26 +0300
committer: Jakub Kicinski <kuba@kernel.org> 2023-08-07 23:05:53 +0300
commit: 06d0fbdad612cb8def19065cf1fa14fc34dba9f8
tree: e902a88ce2ea04869be76384380ce95718e5c686 /net/core
parent: 75eaf63ea7afeafd026ffef03bdc69e31f10829b
page_pool: place frag_* fields in one cacheline
On x86_64, frag_* fields of struct page_pool are scattered across two
cachelines despite their total size of 24 bytes. All three fields are used
in pretty much the same places, but the last field, ::frag_users, is pushed
out to the next CL, provoking unwanted false sharing on the hotpath (frags
allocation code).

There are some holes and cold members to move around. Move frag_* one block
up, placing them right after &page_pool_params, exactly at the beginning of
CL2. This doesn't do anything meaningful to the second block, as those are
some destroy-path cold structures, and doesn't do anything to ::alloc_stats,
which still starts at 200-byte offset, 8 bytes after CL3 (still fitting into
1 cacheline).

On my setup, this yields a 1-2% gain in Mpps when using PP frags actively.

When it comes to 32-bit architectures with 32-byte CL: &page_pool_params
plus ::pad is 44 bytes, the block taken care of is 16 bytes within one CL,
so there should be at least no regressions from the actual change.

::pages_state_hold_cnt is not related directly to that triple, but is
currently paired with ::frags_offset, and decoupling them would mean either
two 4-byte holes or more invasive layout changes.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://lore.kernel.org/r/20230804180529.2483231-4-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
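
The layout argument in the message above can be checked mechanically. The following is only a rough sketch, not the kernel's code: struct demo_params, struct demo_pool, DEMO_CACHELINE and both helper functions are hypothetical names, and the 64-byte stand-in for &page_pool_params is an assumption made for illustration. It shows one way to test whether a block of fields stays inside a single cacheline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 64-byte cachelines, as is typical on x86_64. */
#define DEMO_CACHELINE 64

/* Hypothetical opaque stand-in for the parameters block; the 64-byte
 * size is an assumption for this sketch, not taken from the kernel. */
struct demo_params {
	unsigned char opaque[64];
};

/* Hypothetical pool: the frag_* block sits right after the parameters,
 * followed by the counter that shares an 8-byte slot with the offset. */
struct demo_pool {
	struct demo_params p;
	long frag_users;
	void *frag_page;
	unsigned int frag_offset;
	unsigned int pages_state_hold_cnt;
};

/* True if the byte range [start, end) lies within one cacheline of size cl. */
static bool spans_one_cacheline(size_t start, size_t end, size_t cl)
{
	return start / cl == (end - 1) / cl;
}

/* Checks the hot block of the hypothetical struct above. */
static bool demo_frag_block_in_one_cacheline(void)
{
	size_t start = offsetof(struct demo_pool, frag_users);
	size_t end = offsetof(struct demo_pool, pages_state_hold_cnt) +
		     sizeof(unsigned int);

	return spans_one_cacheline(start, end, DEMO_CACHELINE);
}

/* Compile-time check: the hot block starts on a cacheline boundary. */
static_assert(offsetof(struct demo_pool, frag_users) % DEMO_CACHELINE == 0,
	      "frag block should start a cacheline");
```

With these definitions, spans_one_cacheline(44, 60, 32) models the 32-bit case from the message (a 16-byte block at offset 44 with 32-byte cachelines) and reports that the block fits in one line. On a real kernel build, a tool such as pahole reports the actual offsets.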
Diffstat (limited to 'net/core')
0 files changed, 0 insertions, 0 deletions