path: root/net/core/page_pool_user.c
2024-03-07  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (Jakub Kicinski; 1 file changed, -1/+2)
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

net/core/page_pool_user.c
  0b11b1c5c320 ("netdev: let netlink core handle -EMSGSIZE errors")
  429679dcf7d9 ("page_pool: fix netlink dump stop/resume")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-06  netdev: let netlink core handle -EMSGSIZE errors  (Jakub Kicinski; 1 file changed, -2/+0)
A previous change added -EMSGSIZE handling to af_netlink, so we no longer
have to hide these errors. Theoretically the error handling changes from:

  if (err == -EMSGSIZE)

to:

  if (err == -EMSGSIZE && skb->len)

everywhere, but in practice it doesn't matter: all messages fit into
NLMSG_GOODSIZE, so an overflow of an empty skb cannot happen.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
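As a rough sketch of what this means for a dump callback (helper names here
are hypothetical, not the actual page_pool_user.c functions):

  static int pool_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
  {
          int err;

          err = pool_fill_next(skb, cb);  /* hypothetical fill helper */

          /*
           * Old pattern: hide -EMSGSIZE ourselves so the partial skb is
           * delivered and the dump resumes on the next recvmsg():
           *
           *        if (err == -EMSGSIZE)
           *                return skb->len;
           *
           * New pattern: just propagate the error. af_netlink now swallows
           * -EMSGSIZE when the skb already holds data, i.e. the effective
           * check becomes (err == -EMSGSIZE && skb->len).
           */
          return err;
  }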
2024-03-04  page_pool: fix netlink dump stop/resume  (Jakub Kicinski; 1 file changed, -1/+2)
If the message fills up, we need to stop writing. 'break' only gets us out
of the iteration over the pools of a single netdev; we need to stop walking
netdevs as well. Otherwise the result is either an infinite dump or missing
pools, depending on whether the message fills up on the last netdev
(infinite dump) or on a non-last one (missing pools).

Fixes: 950ab53b77ab ("net: page_pool: implement GET in the netlink API")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
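A minimal sketch of the two-level iteration this fix is about (all names
hypothetical, not the real code); the bug was using 'break', which only
leaves the inner loop, where both loops need to stop:

  static int pools_dumpit(struct sk_buff *skb, struct dump_ctx *ctx)
  {
          struct net *net = sock_net(skb->sk);
          struct net_device *netdev;
          struct my_pool *pool;
          int err = 0;

          for_each_netdev(net, netdev) {                  /* outer walk */
                  hlist_for_each_entry(pool, &netdev->pools, node) {
                          err = fill_one(skb, pool);      /* may return -EMSGSIZE */
                          if (err) {
                                  ctx->resume_id = pool->id;
                                  goto out;               /* leave BOTH loops */
                          }
                  }
          }
  out:
          return err;     /* netlink core resumes the dump from ctx */
  }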
2023-11-30  net: page_pool: fix general protection fault in page_pool_unlist  (Eric Dumazet; 1 file changed, -1/+3)
syzbot was able to trigger a crash [1] in page_pool_unlist().

page_pool_list() only inserts a page pool into a netdev's page pool list
if a netdev was set in params. Even if the kzalloc() call in
page_pool_create() happens to initialize pool->user.list, I chose to be
more explicit in page_pool_list() and add an INIT_HLIST_NODE().

We could test in page_pool_unlist() whether a netdev was set, but since
the netdev can be changed to lo, it seems more robust to check whether
pool->user.list is hashed before calling hlist_del().

[1]

Illegal XDP return value 4294946546 on prog (id 2) dev N/A, expect packet loss!
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 PID: 5064 Comm: syz-executor391 Not tainted 6.7.0-rc2-syzkaller-00533-ga379972973a8 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:__hlist_del include/linux/list.h:988 [inline]
RIP: 0010:hlist_del include/linux/list.h:1002 [inline]
RIP: 0010:page_pool_unlist+0xd1/0x170 net/core/page_pool_user.c:342
Code: df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 90 00 00 00 4c 8b a3 f0 06 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 75 68 48 85 ed 49 89 2c 24 74 24 e8 1b ca 07 f9 48 8d
RSP: 0018:ffffc900039ff768 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: ffff88814ae02000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff88814ae026f0
RBP: 0000000000000000 R08: 0000000000000000 R09: fffffbfff1d57fdc
R10: ffffffff8eabfee3 R11: ffffffff8aa0008b R12: 0000000000000000
R13: ffff88814ae02000 R14: dffffc0000000000 R15: 0000000000000001
FS:  000055555717a380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000002555398 CR3: 0000000025044000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 __page_pool_destroy net/core/page_pool.c:851 [inline]
 page_pool_release+0x507/0x6b0 net/core/page_pool.c:891
 page_pool_destroy+0x1ac/0x4c0 net/core/page_pool.c:956
 xdp_test_run_teardown net/bpf/test_run.c:216 [inline]
 bpf_test_run_xdp_live+0x1578/0x1af0 net/bpf/test_run.c:388
 bpf_prog_test_run_xdp+0x827/0x1530 net/bpf/test_run.c:1254
 bpf_prog_test_run kernel/bpf/syscall.c:4041 [inline]
 __sys_bpf+0x11bf/0x4920 kernel/bpf/syscall.c:5402
 __do_sys_bpf kernel/bpf/syscall.c:5488 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5486 [inline]
 __x64_sys_bpf+0x78/0xc0 kernel/bpf/syscall.c:5486

Fixes: 083772c9f972 ("net: page_pool: record pools per netdev")
Reported-and-tested-by: syzbot+f9f8efb58a4db2ca98d0@syzkaller.appspotmail.com
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231130092259.3797753-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
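The defensive pattern described above, roughly (field names illustrative,
not the exact page_pool_user.c code):

  static void pool_list(struct my_pool *pool, struct net_device *netdev)
  {
          /* explicit init: a node that is never linked stays "unhashed" */
          INIT_HLIST_NODE(&pool->user.list);
          if (netdev)
                  hlist_add_head(&pool->user.list, &netdev->page_pools);
  }

  static void pool_unlist(struct my_pool *pool)
  {
          /*
           * hlist_del() on a never-linked node dereferences its NULL
           * pprev pointer - the GPF in the report above. Only unlink
           * nodes that were actually added to a list.
           */
          if (!hlist_unhashed(&pool->user.list))
                  hlist_del(&pool->user.list);
  }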
2023-11-28  net: page_pool: expose page pool stats via netlink  (Jakub Kicinski; 1 file changed, -0/+103)
Dump the stats into netlink. More clever approaches, like dumping the stats
per-CPU for each CPU individually to see where the packets get consumed,
can be implemented in the future.

A trimmed example from a real (but recently booted) system:

  $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
             --dump page-pool-stats-get
  [{'info': {'id': 19, 'ifindex': 2},
    'alloc-empty': 48,
    'alloc-fast': 3024,
    'alloc-refill': 0,
    'alloc-slow': 48,
    'alloc-slow-high-order': 0,
    'alloc-waive': 0,
    'recycle-cache-full': 0,
    'recycle-cached': 0,
    'recycle-released-refcnt': 0,
    'recycle-ring': 0,
    'recycle-ring-full': 0},
   {'info': {'id': 18, 'ifindex': 2},
    'alloc-empty': 66,
    'alloc-fast': 11811,
    'alloc-refill': 35,
    'alloc-slow': 66,
    'alloc-slow-high-order': 0,
    'alloc-waive': 0,
    'recycle-cache-full': 1145,
    'recycle-cached': 6541,
    'recycle-released-refcnt': 0,
    'recycle-ring': 1275,
    'recycle-ring-full': 0},
   {'info': {'id': 17, 'ifindex': 2},
    'alloc-empty': 73,
    'alloc-fast': 62099,
    'alloc-refill': 413,
    ...

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
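For context, the recycle counters are kept per-CPU inside the pool, so a
dump like the one above has to fold them into one aggregate first; a sketch
with illustrative struct and field names:

  struct recycle_sum {
          u64 cached, cache_full, ring, ring_full, released_refcnt;
  };

  static void pool_sum_recycle(const struct my_pool *pool, struct recycle_sum *s)
  {
          int cpu;

          memset(s, 0, sizeof(*s));
          for_each_possible_cpu(cpu) {
                  /* pool->recycle_stats is assumed to be a __percpu pointer */
                  const struct recycle_sum *c = per_cpu_ptr(pool->recycle_stats, cpu);

                  s->cached           += c->cached;
                  s->cache_full       += c->cache_full;
                  s->ring             += c->ring;
                  s->ring_full        += c->ring_full;
                  s->released_refcnt  += c->released_refcnt;
          }
  }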
2023-11-28  net: page_pool: report when page pool was destroyed  (Jakub Kicinski; 1 file changed, -0/+12)
Report when a page pool was destroyed. Together with the inflight / memory
use reporting this can serve as a replacement for the warning about leaked
page pools we currently print to dmesg.

Example output for a fake leaked page pool using some hacks in netdevsim
(one "live" pool, and one "leaked" on the same dev):

  $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
             --dump page-pool-get
  [{'id': 2, 'ifindex': 3},
   {'id': 1, 'ifindex': 3,
    'destroyed': 133, 'inflight': 1}]

Tested-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
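Roughly how such a "destroyed" attribute can be emitted only for pools that
are already gone (attribute and field names here are made up for
illustration, not the real uAPI):

  static int pool_put_destroyed(struct sk_buff *rsp, const struct my_pool *pool)
  {
          if (!pool->destroy_time)        /* pool is still live */
                  return 0;

          /* report when it died; 'inflight' tells whether it leaked pages */
          if (nla_put_u64_64bit(rsp, MY_A_POOL_DESTROYED,
                                pool->destroy_time, MY_A_POOL_PAD))
                  return -EMSGSIZE;
          return 0;
  }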
2023-11-28  net: page_pool: report amount of memory held by page pools  (Jakub Kicinski; 1 file changed, -0/+8)
Advanced deployments need the ability to check memory use of various
system components. It makes it possible to make informed decisions about
memory allocation and to find regressions and leaks.

Report memory use of page pools. Report both number of references
and bytes held.

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
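The bytes figure follows directly from the reference count: each
outstanding reference pins one page (or a higher-order block). A sketch of
the accounting, with illustrative names:

  static void pool_mem_usage(const struct my_pool *pool, u64 *refs, u64 *bytes)
  {
          /* pool_inflight() is a hypothetical helper: pages handed out
           * by the pool and not yet returned to it */
          u64 inflight = pool_inflight(pool);

          *refs  = inflight;
          *bytes = inflight * (PAGE_SIZE << pool->order);
  }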
2023-11-28  net: page_pool: add netlink notifications for state changes  (Jakub Kicinski; 1 file changed, -0/+36)
Generate netlink notifications about page pool state changes.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
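A generic-netlink notification of this kind typically builds the same
message the GET handler would and multicasts it to interested listeners; a
sketch with hypothetical family, group, and fill-helper names:

  static void pool_notify(struct my_pool *pool, u32 cmd)
  {
          struct sk_buff *ntf;

          ntf = genlmsg_new(GENLMSG_DEFAULT_SIZE, GFP_KERNEL);
          if (!ntf)
                  return;

          if (pool_fill(ntf, pool, cmd)) {        /* hypothetical fill helper */
                  nlmsg_free(ntf);
                  return;
          }

          genlmsg_multicast(&my_nl_family, ntf, 0, MY_NLGRP_PAGE_POOL, GFP_KERNEL);
  }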
2023-11-28  net: page_pool: implement GET in the netlink API  (Jakub Kicinski; 1 file changed, -0/+127)
Expose the very basic page pool information via netlink.

Example using ynl-py for a system with 9 queues:

  $ ./cli.py --no-schema --spec netlink/specs/netdev.yaml \
             --dump page-pool-get
  [{'id': 19, 'ifindex': 2, 'napi-id': 147},
   {'id': 18, 'ifindex': 2, 'napi-id': 146},
   {'id': 17, 'ifindex': 2, 'napi-id': 145},
   {'id': 16, 'ifindex': 2, 'napi-id': 144},
   {'id': 15, 'ifindex': 2, 'napi-id': 143},
   {'id': 14, 'ifindex': 2, 'napi-id': 142},
   {'id': 13, 'ifindex': 2, 'napi-id': 141},
   {'id': 12, 'ifindex': 2, 'napi-id': 140},
   {'id': 11, 'ifindex': 2, 'napi-id': 139},
   {'id': 10, 'ifindex': 2, 'napi-id': 138}]

Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-11-28  net: page_pool: stash the NAPI ID for easier access  (Jakub Kicinski; 1 file changed, -1/+3)
To avoid any issues with race conditions on accessing napi and having to
think about the lifetime of NAPI objects in netlink GET - stash the
napi_id to which page pool was linked at creation time.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
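The stash itself is essentially a one-liner at pool creation time; a sketch
with illustrative field names:

  static void pool_stash_napi_id(struct my_pool *pool, const struct napi_struct *napi)
  {
          /* copy the ID now; the netlink GET path later reports this plain
           * integer instead of chasing a napi pointer of uncertain lifetime */
          pool->user.napi_id = napi ? napi->napi_id : 0;
  }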
2023-11-28  net: page_pool: record pools per netdev  (Jakub Kicinski; 1 file changed, -0/+84)
Link the page pools with netdevs. This needs to be netns compatible, so we
have two options. Either we record the pools per netns and have to worry
about moving them as the netdev gets moved. Or we record them directly on
the netdev so they move with the netdev without any extra work.

Implement the latter option. Since pools may outlast the netdev, we need a
place to store orphans. In time honored tradition, use loopback for this
purpose.

Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
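The orphan handling can be pictured roughly like this (field names are
illustrative): when a netdev unregisters while some of its pools still hold
pages, those pools get re-hung on the netns's loopback device.

  static void pool_unreg_netdev(struct net_device *netdev)
  {
          struct net_device *lo = dev_net(netdev)->loopback_dev;
          struct hlist_node *tmp;
          struct my_pool *pool;

          hlist_for_each_entry_safe(pool, tmp, &netdev->page_pools, user.list) {
                  hlist_del(&pool->user.list);
                  pool->user.netdev = lo;         /* loopback adopts the orphan */
                  hlist_add_head(&pool->user.list, &lo->page_pools);
          }
  }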
2023-11-28  net: page_pool: id the page pools  (Jakub Kicinski; 1 file changed, -0/+36)
To give ourselves the flexibility of creating netlink commands and the
ability to refer to page pool instances in uAPIs, create IDs for page
pools.

Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
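One plausible way to hand out such IDs (not necessarily the exact scheme
used by this commit) is cyclic allocation from an XArray, so IDs start at 1
and are not reused immediately:

  static DEFINE_XARRAY_FLAGS(my_pools, XA_FLAGS_ALLOC1);  /* ID 0 reserved */
  static u32 my_pool_next_id;

  static int my_pool_assign_id(struct my_pool *pool)
  {
          int err;

          err = xa_alloc_cyclic(&my_pools, &pool->user.id, pool,
                                xa_limit_32b, &my_pool_next_id, GFP_KERNEL);
          /* xa_alloc_cyclic() returns 1 if the ID space wrapped around */
          return err < 0 ? err : 0;
  }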