diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-10-31 18:10:11 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-10-31 18:10:11 +0300 |
commit | 89ed67ef126c4160349c1b96fdb775ea6170ac90 (patch) | |
tree | 98caaf8bba44b21f9345a0af1dd2bd9987764e27 /tools/testing/selftests/bpf/bpf_experimental.h | |
parent | 5a6a09e97199d6600d31383055f9d43fbbcbe86f (diff) | |
parent | f1c73396133cb3d913e2075298005644ee8dfade (diff) | |
download | linux-89ed67ef126c4160349c1b96fdb775ea6170ac90.tar.xz |
Merge tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support usec resolution of TCP timestamps, enabled selectively by a
route attribute.
- Defer regular TCP ACK while processing socket backlog, try to send
a cumulative ACK at the end. Increase single TCP flow performance
on a 200Gbit NIC by 20% (100Gbit -> 120Gbit).
- The Fair Queuing (FQ) packet scheduler:
- add built-in 3 band prio / WRR scheduling
- support bypass if the qdisc is mostly idle (5% speed up for TCP RR)
- improve inactive flow reporting
- optimize the layout of structures for better cache locality
- Support TCP Authentication Option (RFC 5925, TCP-AO), a more modern
replacement for the old MD5 option.
- Add more retransmission timeout (RTO) related statistics to
TCP_INFO.
- Support sending fragmented skbs over vsock sockets.
- Make sure we send SIGPIPE for vsock sockets if socket was
shutdown().
- Add sysctl for ignoring lower limit on lifetime in Router
Advertisement PIO, based on an in-progress IETF draft.
- Add sysctl to control activation of TCP ping-pong mode.
- Add sysctl to make connection timeout in MPTCP configurable.
- Support rcvlowat and notsent_lowat on MPTCP sockets, to help apps
limit the number of wakeups.
- Support netlink GET for MDB (multicast forwarding), allowing user
space to request a single MDB entry instead of dumping the entire
table.
- Support selective FDB flushing in the VXLAN tunnel driver.
- Allow limiting learned FDB entries in bridges, prevent OOM attacks.
- Allow controlling via configfs netconsole targets which were
created via the kernel cmdline at boot, rather than via configfs at
runtime.
- Support multiple PTP timestamp event queue readers with different
filters.
- MCTP over I3C.
BPF:
- Add new veth-like netdevice where BPF program defines the logic of
the xmit routine. It can operate in L3 and L2 mode.
- Support exceptions - allow asserting conditions which should never
be true but are hard for the verifier to infer. With some extra
flexibility around handling of the exit / failure:
https://lwn.net/Articles/938435/
- Add support for local per-cpu kptr, allow allocating and storing
per-cpu objects in maps. Access to those objects operates on the
value for the current CPU.
This allows to deprecate local one-off implementations of per-CPU
storage like BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE maps.
- Extend cgroup BPF sockaddr hooks for UNIX sockets. The use case is
for systemd to re-implement the LogNamespace feature which allows
running multiple instances of systemd-journald to process the logs
of different services.
- Enable open-coded task_vma iteration, after maple tree conversion
made it hard to directly walk VMAs in tracing programs.
- Add open-coded task, css_task and css iterator support. One of the
use cases is customizable OOM victim selection via BPF.
- Allow source address selection with bpf_*_fib_lookup().
- Add ability to pin BPF timer to the current CPU.
- Prevent creation of infinite loops by combining tail calls and
fentry/fexit programs.
- Add missed stats for kprobes to retrieve the number of missed
kprobe executions and subsequent executions of BPF programs.
- Inherit system settings for CPU security mitigations.
- Add BPF v4 CPU instruction support for arm32 and s390x.
Changes to common code:
- overflow: add DEFINE_FLEX() for on-stack definition of structs with
flexible array members.
- Process doc update with more guidance for reviewers.
Driver API:
- Simplify locking in WiFi (cfg80211 and mac80211 layers), use wiphy
mutex in most places and remove a lot of smaller locks.
- Create a common DPLL configuration API. Allow configuring and
querying state of PLL circuits used for clock syntonization, in
network time distribution.
- Unify fragmented and full page allocation APIs in page pool code.
Let drivers be ignorant of PAGE_SIZE.
- Rework PHY state machine to avoid races with calls to phy_stop().
- Notify DSA drivers of MAC address changes on user ports, improve
correctness of offloads which depend on matching port MAC
addresses.
- Allow antenna control on injected WiFi frames.
- Reduce the number of variants of napi_schedule().
- Simplify error handling when composing devlink health messages.
Misc:
- A lot of KCSAN data race "fixes", from Eric.
- A lot of __counted_by() annotations, from Kees.
- A lot of strncpy -> strscpy and printf format fixes.
- Replace master/slave terminology with conduit/user in DSA drivers.
- Handful of KUnit tests for netdev and WiFi core.
Removed:
- AppleTalk COPS.
- AppleTalk ipddp.
- TI AR7 CPMAC Ethernet driver.
Drivers:
- Ethernet high-speed NICs:
- Intel (100G, ice, idpf):
- add a driver for the Intel E2000 IPUs
- make CRC/FCS stripping configurable
- cross-timestamping for E823 devices
- basic support for E830 devices
- use aux-bus for managing client drivers
- i40e: report firmware versions via devlink
- nVidia/Mellanox:
- support 4-port NICs
- increase max number of channels to 256
- optimize / parallelize SF creation flow
- Broadcom (bnxt):
- enhance NIC temperature reporting
- support PAM4 speeds and lane configuration
- Marvell OcteonTX2:
- PTP pulse-per-second output support
- enable hardware timestamping for VFs
- Solarflare/AMD:
- conntrack NAT offload and offload for tunnels
- Wangxun (ngbe/txgbe):
- expose HW statistics
- Pensando/AMD:
- support PCI level reset
- narrow down the condition under which skbs are linearized
- Netronome/Corigine (nfp):
- support CHACHA20-POLY1305 crypto in IPsec offload
- Ethernet NICs embedded, slower, virtual:
- Synopsys (stmmac):
- add Loongson-1 SoC support
- enable use of HW queues with no offload capabilities
- enable PPS input support on all 5 channels
- increase TX coalesce timer to 5ms
- RealTek USB (r8152): improve efficiency of Rx by using GRO frags
- xen: support SW packet timestamping
- add drivers for implementations based on TI's PRUSS (AM64x EVM)
- nVidia/Mellanox Ethernet datacenter switches:
- avoid poor HW resource use on Spectrum-4 by better block
selection for IPv6 multicast forwarding and ordering of blocks
in ACL region
- Ethernet embedded switches:
- Microchip:
- support configuring the drive strength for EMI compliance
- ksz9477: partial ACL support
- ksz9477: HSR offload
- ksz9477: Wake on LAN
- Realtek:
- rtl8366rb: respect device tree config of the CPU port
- Ethernet PHYs:
- support Broadcom BCM5221 PHYs
- TI dp83867: support hardware LED blinking
- CAN:
- add support for Linux-PHY based CAN transceivers
- at91_can: clean up and use rx-offload helpers
- WiFi:
- MediaTek (mt76):
- new sub-driver for mt7925 USB/PCIe devices
- HW wireless <> Ethernet bridging in MT7988 chips
- mt7603/mt7628 stability improvements
- Qualcomm (ath12k):
- WCN7850:
- enable 320 MHz channels in 6 GHz band
- hardware rfkill support
- enable IEEE80211_HW_SINGLE_SCAN_ON_ALL_BANDS to
make scan faster
- read board data variant name from SMBIOS
- QCN9274: mesh support
- RealTek (rtw89):
- TDMA-based multi-channel concurrency (MCC)
- Silicon Labs (wfx):
- Remain-On-Channel (ROC) support
- Bluetooth:
- ISO: many improvements for broadcast support
- mark BCM4378/BCM4387 as BROKEN_LE_CODED
- add support for QCA2066
- btmtksdio: enable Bluetooth wakeup from suspend"
* tag 'net-next-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1816 commits)
net: pcs: xpcs: Add 2500BASE-X case in get state for XPCS drivers
net: bpf: Use sockopt_lock_sock() in ip_sock_set_tos()
net: mana: Use xdp_set_features_flag instead of direct assignment
vxlan: Cleanup IFLA_VXLAN_PORT_RANGE entry in vxlan_get_size()
iavf: delete the iavf client interface
iavf: add a common function for undoing the interrupt scheme
iavf: use unregister_netdev
iavf: rely on netdev's own registered state
iavf: fix the waiting time for initial reset
iavf: in iavf_down, don't queue watchdog_task if comms failed
iavf: simplify mutex_trylock+sleep loops
iavf: fix comments about old bit locks
doc/netlink: Update schema to support cmd-cnt-name and cmd-max-name
tools: ynl: introduce option to process unknown attributes or types
ipvlan: properly track tx_errors
netdevsim: Block until all devices are released
nfp: using napi_build_skb() to replace build_skb()
net: dsa: microchip: ksz9477: Fix spelling mistake "Enery" -> "Energy"
net: dsa: microchip: Ensure Stable PME Pin State for Wake-on-LAN
net: dsa: microchip: Refactor switch shutdown routine for WoL preparation
...
Diffstat (limited to 'tools/testing/selftests/bpf/bpf_experimental.h')
-rw-r--r-- | tools/testing/selftests/bpf/bpf_experimental.h | 346 |
1 files changed, 346 insertions, 0 deletions
diff --git a/tools/testing/selftests/bpf/bpf_experimental.h b/tools/testing/selftests/bpf/bpf_experimental.h index 209811b1993a..1386baf9ae4a 100644 --- a/tools/testing/selftests/bpf/bpf_experimental.h +++ b/tools/testing/selftests/bpf/bpf_experimental.h @@ -131,4 +131,350 @@ extern int bpf_rbtree_add_impl(struct bpf_rb_root *root, struct bpf_rb_node *nod */ extern struct bpf_rb_node *bpf_rbtree_first(struct bpf_rb_root *root) __ksym; +/* Description + * Allocates a percpu object of the type represented by 'local_type_id' in + * program BTF. User may use the bpf_core_type_id_local macro to pass the + * type ID of a struct in program BTF. + * + * The 'local_type_id' parameter must be a known constant. + * The 'meta' parameter is rewritten by the verifier, no need for BPF + * program to set it. + * Returns + * A pointer to a percpu object of the type corresponding to the passed in + * 'local_type_id', or NULL on failure. + */ +extern void *bpf_percpu_obj_new_impl(__u64 local_type_id, void *meta) __ksym; + +/* Convenience macro to wrap over bpf_percpu_obj_new_impl */ +#define bpf_percpu_obj_new(type) ((type __percpu_kptr *)bpf_percpu_obj_new_impl(bpf_core_type_id_local(type), NULL)) + +/* Description + * Free an allocated percpu object. All fields of the object that require + * destruction will be destructed before the storage is freed. + * + * The 'meta' parameter is rewritten by the verifier, no need for BPF + * program to set it. + * Returns + * Void. + */ +extern void bpf_percpu_obj_drop_impl(void *kptr, void *meta) __ksym; + +struct bpf_iter_task_vma; + +extern int bpf_iter_task_vma_new(struct bpf_iter_task_vma *it, + struct task_struct *task, + unsigned long addr) __ksym; +extern struct vm_area_struct *bpf_iter_task_vma_next(struct bpf_iter_task_vma *it) __ksym; +extern void bpf_iter_task_vma_destroy(struct bpf_iter_task_vma *it) __ksym; + +/* Convenience macro to wrap over bpf_obj_drop_impl */ +#define bpf_percpu_obj_drop(kptr) bpf_percpu_obj_drop_impl(kptr, NULL) + +/* Description + * Throw a BPF exception from the program, immediately terminating its + * execution and unwinding the stack. The supplied 'cookie' parameter + * will be the return value of the program when an exception is thrown, + * and the default exception callback is used. Otherwise, if an exception + * callback is set using the '__exception_cb(callback)' declaration tag + * on the main program, the 'cookie' parameter will be the callback's only + * input argument. + * + * Thus, in case of default exception callback, 'cookie' is subjected to + * constraints on the program's return value (as with R0 on exit). + * Otherwise, the return value of the marked exception callback will be + * subjected to the same checks. + * + * Note that throwing an exception with lingering resources (locks, + * references, etc.) will lead to a verification error. + * + * Note that callbacks *cannot* call this helper. + * Returns + * Never. + * Throws + * An exception with the specified 'cookie' value. + */ +extern void bpf_throw(u64 cookie) __ksym; + +/* This macro must be used to mark the exception callback corresponding to the + * main program. For example: + * + * int exception_cb(u64 cookie) { + * return cookie; + * } + * + * SEC("tc") + * __exception_cb(exception_cb) + * int main_prog(struct __sk_buff *ctx) { + * ... + * return TC_ACT_OK; + * } + * + * Here, exception callback for the main program will be 'exception_cb'. Note + * that this attribute can only be used once, and multiple exception callbacks + * specified for the main program will lead to verification error. + */ +#define __exception_cb(name) __attribute__((btf_decl_tag("exception_callback:" #name))) + +#define __bpf_assert_signed(x) _Generic((x), \ + unsigned long: 0, \ + unsigned long long: 0, \ + signed long: 1, \ + signed long long: 1 \ +) + +#define __bpf_assert_check(LHS, op, RHS) \ + _Static_assert(sizeof(&(LHS)), "1st argument must be an lvalue expression"); \ + _Static_assert(sizeof(LHS) == 8, "Only 8-byte integers are supported\n"); \ + _Static_assert(__builtin_constant_p(__bpf_assert_signed(LHS)), "internal static assert"); \ + _Static_assert(__builtin_constant_p((RHS)), "2nd argument must be a constant expression") + +#define __bpf_assert(LHS, op, cons, RHS, VAL) \ + ({ \ + (void)bpf_throw; \ + asm volatile ("if %[lhs] " op " %[rhs] goto +2; r1 = %[value]; call bpf_throw" \ + : : [lhs] "r"(LHS), [rhs] cons(RHS), [value] "ri"(VAL) : ); \ + }) + +#define __bpf_assert_op_sign(LHS, op, cons, RHS, VAL, supp_sign) \ + ({ \ + __bpf_assert_check(LHS, op, RHS); \ + if (__bpf_assert_signed(LHS) && !(supp_sign)) \ + __bpf_assert(LHS, "s" #op, cons, RHS, VAL); \ + else \ + __bpf_assert(LHS, #op, cons, RHS, VAL); \ + }) + +#define __bpf_assert_op(LHS, op, RHS, VAL, supp_sign) \ + ({ \ + if (sizeof(typeof(RHS)) == 8) { \ + const typeof(RHS) rhs_var = (RHS); \ + __bpf_assert_op_sign(LHS, op, "r", rhs_var, VAL, supp_sign); \ + } else { \ + __bpf_assert_op_sign(LHS, op, "i", RHS, VAL, supp_sign); \ + } \ + }) + +/* Description + * Assert that a conditional expression is true. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert(cond) if (!(cond)) bpf_throw(0); + +/* Description + * Assert that a conditional expression is true. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_with(cond, value) if (!(cond)) bpf_throw(value); + +/* Description + * Assert that LHS is equal to RHS. This statement updates the known value + * of LHS during verification. Note that RHS must be a constant value, and + * must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert_eq(LHS, RHS) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, ==, RHS, 0, true); \ + }) + +/* Description + * Assert that LHS is equal to RHS. This statement updates the known value + * of LHS during verification. Note that RHS must be a constant value, and + * must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_eq_with(LHS, RHS, value) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, ==, RHS, value, true); \ + }) + +/* Description + * Assert that LHS is less than RHS. This statement updates the known + * bounds of LHS during verification. Note that RHS must be a constant + * value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert_lt(LHS, RHS) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, <, RHS, 0, false); \ + }) + +/* Description + * Assert that LHS is less than RHS. This statement updates the known + * bounds of LHS during verification. Note that RHS must be a constant + * value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_lt_with(LHS, RHS, value) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, <, RHS, value, false); \ + }) + +/* Description + * Assert that LHS is greater than RHS. This statement updates the known + * bounds of LHS during verification. Note that RHS must be a constant + * value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert_gt(LHS, RHS) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, >, RHS, 0, false); \ + }) + +/* Description + * Assert that LHS is greater than RHS. This statement updates the known + * bounds of LHS during verification. Note that RHS must be a constant + * value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_gt_with(LHS, RHS, value) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, >, RHS, value, false); \ + }) + +/* Description + * Assert that LHS is less than or equal to RHS. This statement updates the + * known bounds of LHS during verification. Note that RHS must be a + * constant value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert_le(LHS, RHS) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, <=, RHS, 0, false); \ + }) + +/* Description + * Assert that LHS is less than or equal to RHS. This statement updates the + * known bounds of LHS during verification. Note that RHS must be a + * constant value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_le_with(LHS, RHS, value) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, <=, RHS, value, false); \ + }) + +/* Description + * Assert that LHS is greater than or equal to RHS. This statement updates + * the known bounds of LHS during verification. Note that RHS must be a + * constant value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert_ge(LHS, RHS) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, >=, RHS, 0, false); \ + }) + +/* Description + * Assert that LHS is greater than or equal to RHS. This statement updates + * the known bounds of LHS during verification. Note that RHS must be a + * constant value, and must fit within the data type of LHS. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_ge_with(LHS, RHS, value) \ + ({ \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, >=, RHS, value, false); \ + }) + +/* Description + * Assert that LHS is in the range [BEG, END] (inclusive of both). This + * statement updates the known bounds of LHS during verification. Note + * that both BEG and END must be constant values, and must fit within the + * data type of LHS. + * Returns + * Void. + * Throws + * An exception with the value zero when the assertion fails. + */ +#define bpf_assert_range(LHS, BEG, END) \ + ({ \ + _Static_assert(BEG <= END, "BEG must be <= END"); \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, >=, BEG, 0, false); \ + __bpf_assert_op(LHS, <=, END, 0, false); \ + }) + +/* Description + * Assert that LHS is in the range [BEG, END] (inclusive of both). This + * statement updates the known bounds of LHS during verification. Note + * that both BEG and END must be constant values, and must fit within the + * data type of LHS. + * Returns + * Void. + * Throws + * An exception with the specified value when the assertion fails. + */ +#define bpf_assert_range_with(LHS, BEG, END, value) \ + ({ \ + _Static_assert(BEG <= END, "BEG must be <= END"); \ + barrier_var(LHS); \ + __bpf_assert_op(LHS, >=, BEG, value, false); \ + __bpf_assert_op(LHS, <=, END, value, false); \ + }) + +struct bpf_iter_css_task; +struct cgroup_subsys_state; +extern int bpf_iter_css_task_new(struct bpf_iter_css_task *it, + struct cgroup_subsys_state *css, unsigned int flags) __weak __ksym; +extern struct task_struct *bpf_iter_css_task_next(struct bpf_iter_css_task *it) __weak __ksym; +extern void bpf_iter_css_task_destroy(struct bpf_iter_css_task *it) __weak __ksym; + +struct bpf_iter_task; +extern int bpf_iter_task_new(struct bpf_iter_task *it, + struct task_struct *task, unsigned int flags) __weak __ksym; +extern struct task_struct *bpf_iter_task_next(struct bpf_iter_task *it) __weak __ksym; +extern void bpf_iter_task_destroy(struct bpf_iter_task *it) __weak __ksym; + +struct bpf_iter_css; +extern int bpf_iter_css_new(struct bpf_iter_css *it, + struct cgroup_subsys_state *start, unsigned int flags) __weak __ksym; +extern struct cgroup_subsys_state *bpf_iter_css_next(struct bpf_iter_css *it) __weak __ksym; +extern void bpf_iter_css_destroy(struct bpf_iter_css *it) __weak __ksym; + #endif |