path: root/include/net/netfilter/nf_tables.h
Age | Commit message | Author | Files | Lines
2022-12-22 | netfilter: nf_tables: honor set timeout and garbage collection updates | Pablo Neira Ayuso | 1 | -1/+12
Set timeout and garbage collection interval updates are ignored on updates. Add transaction to update global set element timeout and garbage collection interval. Fixes: 96518518cc41 ("netfilter: add nftables") Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-21 | netfilter: nf_tables: consolidate set description | Pablo Neira Ayuso | 1 | -0/+12
Add the following fields to the set description:
- key type
- data type
- object type
- policy
- gc_int: garbage collection interval
- timeout: element timeout
This prepares for stricter set type checks on updates in a follow-up patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-15 | netfilter: nf_tables: Introduce NFT_MSG_GETRULE_RESET | Phil Sutter | 1 | -1/+1
Analogous to NFT_MSG_GETOBJ_RESET, but for rules: Reset stateful expressions like counters or quotas. The latter two are the only consumers, adjust their 'dump' callbacks to respect the parameter introduced earlier. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-15 | netfilter: nf_tables: Extend nft_expr_ops::dump callback parameters | Phil Sutter | 1 | -1/+2
Add a 'reset' flag just like with nft_object_ops::dump. This will be useful to reset "anonymous stateful objects", e.g. simple rule counters. No functional change intended. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-10-25 | netfilter: nft_inner: add percpu inner context | Pablo Neira Ayuso | 1 | -0/+1
Add NFT_PKTINFO_INNER_FULL flag to annotate that inner offsets are available. Store nft_inner_tun_ctx object in percpu area to cache existing inner offsets for this skbuff. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-10-25 | netfilter: nft_inner: support for inner tunnel header matching | Pablo Neira Ayuso | 1 | -0/+5
This new expression allows you to match on the inner headers that are encapsulated by any of the existing tunneling protocols. This expression parses the inner packet to set the link, network and transport offsets, so the existing expressions (with a few updates) can be reused to match on the inner headers.
The inner expression supports different tunnel combinations, such as:
- ethernet frame over IPv4/IPv6 packet, e.g. VxLAN.
- IPv4/IPv6 packet over IPv4/IPv6 packet, e.g. IPIP.
- IPv4/IPv6 packet over IPv4/IPv6 + transport header, e.g. GRE.
- transport header (ESP or SCTP) over transport header (usually UDP).
The following fields are used to describe the tunnel protocol:
- flags, which describe how to parse the inner headers:
  NFT_PAYLOAD_CTX_INNER_TUN, the tunnel provides its own header.
  NFT_PAYLOAD_CTX_INNER_ETHER, the ethernet frame is available as inner header.
  NFT_PAYLOAD_CTX_INNER_NH, the network header is available as inner header.
  NFT_PAYLOAD_CTX_INNER_TH, the transport header is available as inner header.
For example, VxLAN sets all of these flags, while GRE only sets NFT_PAYLOAD_CTX_INNER_NH and NFT_PAYLOAD_CTX_INNER_TH. ESP over UDP only sets NFT_PAYLOAD_CTX_INNER_TH.
The tunnel description is composed of the following attributes:
- header size: in case the tunnel comes with its own header, e.g. VxLAN.
- type: this provides a hint to userspace on how to delinearize the rule. This is useful for VxLAN and Geneve since they run over UDP and the transport header does not provide a hint. This is also useful in case hardware offload is ever supported. The type is not currently interpreted by the kernel.
- expression: currently only payload is supported. A follow-up patch also adds inner meta support, which is required by autogenerated dependencies. The exthdr expression should be supported too at some point.
There is a new inner_ops operation that needs to be set to allow an existing expression to be used from the inner expression.
This patch adds a new NFT_PAYLOAD_TUN_HEADER base which allows matching on the tunnel header fields, e.g. vxlan vni. The payload expression is embedded into the nft_inner private area and this private data area is passed to the payload inner eval function via direct call. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-10-25 | netfilter: nf_tables: reduce nft_pktinfo by 8 bytes | Florian Westphal | 1 | -2/+2
structure is reduced from 32 to 24 bytes. While at it, also check that iphdrlen is sane, this is guaranteed for NFPROTO_IPV4 but not for ingress or bridge, so add checks for this. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-24 | netfilter: nf_tables: make table handle allocation per-netns friendly | Pablo Neira Ayuso | 1 | -0/+1
mutex is per-netns, move table_netns to the pernet area.
*read-write* to 0xffffffff883a01e8 of 8 bytes by task 6542 on cpu 0:
 nf_tables_newtable+0x6dc/0xc00 net/netfilter/nf_tables_api.c:1221
 nfnetlink_rcv_batch net/netfilter/nfnetlink.c:513 [inline]
 nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:634 [inline]
 nfnetlink_rcv+0xa6a/0x13a0 net/netfilter/nfnetlink.c:652
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x652/0x730 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x643/0x740 net/netlink/af_netlink.c:1921
Fixes: f102d66b335a ("netfilter: nf_tables: use dedicated mutex to guard transactions") Reported-by: Abhishek Shah <abhishek.shah@columbia.edu> Reviewed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-09 | netfilter: nf_tables: disallow jump to implicit chain from set element | Pablo Neira Ayuso | 1 | -0/+5
Extend struct nft_data_desc to add a flag field that specifies nft_data_init() is being called for set element data. Use it to disallow jump to implicit chain from set element, only jump to chain via immediate expression is allowed. Fixes: d0e2c7de92c7 ("netfilter: nf_tables: add NFT_CHAIN_BINDING") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-08-09 | netfilter: nf_tables: upfront validation of data via nft_data_init() | Pablo Neira Ayuso | 1 | -2/+2
Instead of parsing the data and then validating that type and length are correct, pass a description of the expected data so it can be validated upfront, before parsing it, to bail out earlier. This patch adds a new .size field to specify the maximum size of the data area. The .len field is optional and it is used as an input/output field: it provides the specific length of the expected data in the input path. If the .len field is not specified, then the length obtained from the netlink attribute is stored. This is required by cmp, bitwise, range and immediate, which provide no netlink attribute that describes the data length. The immediate expression uses the destination register type to infer the expected data type. Relying on opencoded validation of the expected data might lead to subtle bugs as described in 7e6bc1f6cabc ("netfilter: nf_tables: stricter validation of element data"). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
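The input/output behaviour of .size and .len described above can be sketched in plain userspace C. This is a simplified model, not the kernel's nft_data_init(): `struct data_desc`, `data_init()`, the `DATA_*` constants and the choice of error codes are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum data_type { DATA_VALUE, DATA_VERDICT };

/* Description of the data the caller expects (hypothetical layout). */
struct data_desc {
	enum data_type type;  /* expected data type */
	unsigned int size;    /* maximum size of the destination area */
	unsigned int len;     /* in: exact expected length, 0 if unknown; out: actual length */
};

/* Validate the attribute against the description first, copy only afterwards. */
int data_init(void *dst, struct data_desc *desc, enum data_type attr_type,
	      const void *attr, unsigned int attr_len)
{
	assert(desc != NULL);

	if (attr_type != desc->type)
		return -EINVAL;
	if (attr_len > desc->size)
		return -EOVERFLOW;
	if (desc->len) {
		/* Caller knows the exact length: enforce it. */
		if (attr_len != desc->len)
			return -EINVAL;
	} else {
		/* Caller has no length information: report what was found. */
		desc->len = attr_len;
	}
	memcpy(dst, attr, attr_len);
	return 0;
}
```

With this shape, an expression that has no separate length attribute passes `len = 0` and reads the length back from the descriptor after a successful call.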
2022-08-09 | netfilter: nf_tables: validate variable length element extension | Pablo Neira Ayuso | 1 | -1/+3
Update template to validate variable length extensions. This patch adds a new .ext_len[id] field to the template to store the expected extension length. This is used to sanity check the initialization of the variable length extension. Use PTR_ERR() in nft_set_elem_init() to report errors since, after this update, there are two reasons why this might fail: either ENOMEM or insufficient room in the extension field (EINVAL). Kernels up until 7e6bc1f6cabc ("netfilter: nf_tables: stricter validation of element data") allowed copying more data to the extension than was allocated. This ext_len field allows validating that the destination has the correct size as an additional check. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-07-21 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next | Jakub Kicinski | 1 | -0/+15
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter/IPVS updates for net-next:
1) Simplify nf_ct_get_tuple(), from Jackie Liu.
2) Add format to request_module() call, from Bill Wendling.
3) Add /proc/net/stats/nf_flowtable to monitor in-flight pending hardware offload objects to be processed, from Vlad Buslov.
4) Missing rcu annotation and accessors in the netfilter tree, from Florian Westphal.
5) Merge h323 conntrack helper nat hooks into single object, also from Florian.
6) A batch of updates to fix sparse warnings treewide, from Florian Westphal.
7) Move nft_cmp_fast_mask() where it is used, from Florian.
8) Missing const in nf_nat_initialized(), from James Yonan.
9) Use bitmap API for Maglev IPVS scheduler, from Christophe Jaillet.
10) Use refcount_inc instead of _inc_not_zero in flowtable, from Florian Westphal.
11) Remove pr_debug in xt_TPROXY, from Nathan Cancellor.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: xt_TPROXY: remove pr_debug invocations
  netfilter: flowtable: prefer refcount_inc
  netfilter: ipvs: Use the bitmap API to allocate bitmaps
  netfilter: nf_nat: in nf_nat_initialized(), use const struct nf_conn *
  netfilter: nf_tables: move nft_cmp_fast_mask to where its used
  netfilter: nf_tables: use correct integer types
  netfilter: nf_tables: add and use BE register load-store helpers
  netfilter: nf_tables: use the correct get/put helpers
  netfilter: x_tables: use correct integer types
  netfilter: nfnetlink: add missing __be16 cast
  netfilter: nft_set_bitmap: Fix spelling mistake
  netfilter: h323: merge nat hook pointers into one
  netfilter: nf_conntrack: use rcu accessors where needed
  netfilter: nf_conntrack: add missing __rcu annotations
  netfilter: nf_flow_table: count pending offload workqueue tasks
  net/sched: act_ct: set 'net' pointer when creating new nf_flow_table
  netfilter: conntrack: use correct format characters
  netfilter: conntrack: use fallthrough to cleanup
====================
Link: https://lore.kernel.org/r/20220720230754.209053-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-11 | netfilter: nf_tables: add and use BE register load-store helpers | Florian Westphal | 1 | -0/+15
Same as the existing ones, no conversions. This is only for sparse's sake, so that we no longer mix be16/u16 and be32/u32 types. The alternative is to add __force __beX in various places, but this seems nicer. objdiff shows no changes. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-07-09 | netfilter: nf_tables: replace BUG_ON by element length check | Pablo Neira Ayuso | 1 | -5/+9
BUG_ON can be triggered from userspace with an element with a large userdata area. Replace it by length check and return EINVAL instead. Over time extensions have been growing in size. Pick a sufficiently old Fixes: tag to propagate this fix. Fixes: 7d7402642eaf ("netfilter: nf_tables: variable sized set element keys / data") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
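The change in error handling can be illustrated with a small userspace model: the running template length is checked before it is stored into a narrow offset field, and the caller gets an error instead of a crash. The names `struct ext_tmpl`, `ext_add_length()` and `EXT_NUM`, and the exact bounds used, are assumptions for this sketch rather than the kernel's definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_NUM 8u   /* hypothetical number of extension slots */
/* Round x up to a multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + (a) - 1u) & ~((a) - 1u))

/* Template recording where each extension lives inside an element. */
struct ext_tmpl {
	uint8_t len;              /* bytes used so far */
	uint8_t offset[EXT_NUM];  /* per-extension offset, limited to 8 bits */
};

/* Reserve len bytes for extension id; fail with -EINVAL instead of
 * asserting when the result no longer fits the 8-bit fields. */
int ext_add_length(struct ext_tmpl *tmpl, unsigned int id,
		   unsigned int len, unsigned int align)
{
	unsigned int off;

	assert(tmpl != NULL);

	if (id >= EXT_NUM)
		return -EINVAL;

	off = ALIGN_UP((unsigned int)tmpl->len, align);
	if (off > UINT8_MAX || off + len > UINT8_MAX)
		return -EINVAL;

	tmpl->offset[id] = (uint8_t)off;
	tmpl->len = (uint8_t)(off + len);
	return 0;
}
```

A request carrying an oversized userdata area then fails cleanly and leaves the template unchanged.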
2022-06-27 | netfilter: nf_tables: avoid skb access on nf_stolen | Florian Westphal | 1 | -6/+10
When verdict is NF_STOLEN, the skb might have been freed. When tracing is enabled, this can result in a use-after-free:
1. access to skb->nf_trace
2. access to skb->mark
3. computation of trace id
4. dump of packet payload
To avoid 1, keep a cached copy of skb->nf_trace in the trace state struct. Refresh this copy whenever verdict is != STOLEN. Avoid 2 by skipping skb->mark access if verdict is STOLEN. 3 is avoided by precomputing the trace id. Only dump the packet when verdict is not "STOLEN". Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
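The caching strategy in this message can be modelled in a few lines of userspace C. The sketch assumes a stand-in `struct pkt` in place of the skb; `struct trace_info`, `trace_init()` and `trace_update()` are hypothetical names, not the kernel's tracing API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum verdict { VERDICT_ACCEPT, VERDICT_STOLEN };

/* Stand-in for the few packet fields the tracer reads. */
struct pkt {
	uint8_t  nf_trace;
	uint32_t mark;
};

struct trace_info {
	uint8_t  nf_trace;   /* cached copy of pkt->nf_trace */
	uint32_t trace_id;   /* computed once, up front */
	uint32_t mark;       /* only valid when have_mark is set */
	int      have_mark;
};

/* Called before any expression runs, while the packet is certainly valid. */
void trace_init(struct trace_info *info, const struct pkt *pkt, uint32_t trace_id)
{
	assert(pkt != NULL);
	info->nf_trace = pkt->nf_trace;
	info->trace_id = trace_id;
	info->mark = 0;
	info->have_mark = 0;
}

/* Called after a verdict; pkt is never dereferenced when it was stolen. */
void trace_update(struct trace_info *info, const struct pkt *pkt, enum verdict v)
{
	if (v == VERDICT_STOLEN) {
		info->have_mark = 0;   /* keep cached nf_trace and trace_id only */
		return;
	}
	info->nf_trace = pkt->nf_trace;
	info->mark = pkt->mark;
	info->have_mark = 1;
}
```

After a stolen verdict the tracer works from the cached values alone, so nothing depends on memory that may already be freed.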
2022-06-02 | netfilter: nf_tables: delete flowtable hooks via transaction list | Pablo Neira Ayuso | 1 | -1/+0
Remove inactive bool field in nft_hook object that was introduced in abadb2f865d7 ("netfilter: nf_tables: delete devices from flowtable"). Move stale flowtable hooks to transaction list instead. Deleting twice the same device does not result in ENOENT. Fixes: abadb2f865d7 ("netfilter: nf_tables: delete devices from flowtable") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-03-20 | netfilter: nf_tables: cancel tracking for clobbered destination registers | Pablo Neira Ayuso | 1 | -0/+14
Output of expressions might be larger than one single register, this might clobber existing data. Reset tracking for all destination registers that are required to store the expression output. This patch adds three new helper functions:
- nft_reg_track_update: cancel previous register tracking and update it.
- nft_reg_track_cancel: cancel any previous register tracking info.
- __nft_reg_track_cancel: cancel only one single register tracking info.
Partial register clobbering detection is also supported by checking the .num_reg field, which describes the number of registers that are used. This patch updates the following expressions to use these helper functions:
- meta_bridge
- bitwise
- byteorder
- meta
- payload
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
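The idea behind the helpers can be shown with a small userspace model in which each 32-bit register remembers which expression last wrote it. All names below (`struct regs_track`, `reg_track_update()`, `reg_track_cancel()`, `reg_track_match()`) and the register count are assumptions for this sketch; the kernel helpers listed above have their own signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG32_NUM  20u  /* assumed number of 32-bit registers */
#define REG32_SIZE 4u   /* bytes per register */

struct reg_track {
	const void *selector;  /* expression that last stored here, or NULL */
	uint8_t num_reg;       /* how many registers that store covered */
};

struct regs_track {
	struct reg_track regs[REG32_NUM];
};

static unsigned int regs_needed(unsigned int len)
{
	return (len + REG32_SIZE - 1u) / REG32_SIZE;
}

void reg_track_init(struct regs_track *t)
{
	assert(t != NULL);
	memset(t, 0, sizeof(*t));
}

/* Forget everything known about the registers a len-byte store touches. */
void reg_track_cancel(struct regs_track *t, unsigned int dreg, unsigned int len)
{
	unsigned int n = regs_needed(len);

	for (unsigned int i = 0; i < n && dreg + i < REG32_NUM; i++) {
		t->regs[dreg + i].selector = NULL;
		t->regs[dreg + i].num_reg = 0;
	}
}

/* Cancel old tracking, then record expr as the owner of every register used. */
void reg_track_update(struct regs_track *t, const void *expr,
		      unsigned int dreg, unsigned int len)
{
	unsigned int n = regs_needed(len);

	reg_track_cancel(t, dreg, len);
	for (unsigned int i = 0; i < n && dreg + i < REG32_NUM; i++) {
		t->regs[dreg + i].selector = expr;
		t->regs[dreg + i].num_reg = (uint8_t)n;
	}
}

/* A store is redundant only if every register it needs still holds expr. */
int reg_track_match(const struct regs_track *t, const void *expr,
		    unsigned int dreg, unsigned int len)
{
	unsigned int n = regs_needed(len);

	if (dreg + n > REG32_NUM)
		return 0;
	for (unsigned int i = 0; i < n; i++)
		if (t->regs[dreg + i].selector != expr)
			return 0;
	return 1;
}
```

Checking all covered registers in `reg_track_match()` is what catches partial clobbering: a 16-byte value stops matching as soon as another expression overwrites any one of its four registers.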
2022-03-20 | netfilter: nf_tables: do not reduce read-only expressions | Pablo Neira Ayuso | 1 | -0/+8
Skip register tracking for expressions that perform read-only operations on the registers. Define and use a cookie pointer NFT_REDUCE_READONLY to avoid defining stubs for these expressions. This patch re-enables register tracking which was disabled in ed5f85d42290 ("netfilter: nf_tables: disable register tracking"). Follow up patches add remaining register tracking for existing expressions. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-02-20 | netfilter: nf_tables_offload: incorrect flow offload action array size | Pablo Neira Ayuso | 1 | -1/+1
immediate verdict expression needs to allocate one slot in the flow offload action array, however, immediate data expression does not need to do so. fwd and dup expression need to allocate one slot, this is missing. Add a new offload_action interface to report if this expression needs to allocate one slot in the flow offload action array. Fixes: be2861dc36d7 ("netfilter: nft_{fwd,dup}_netdev: add offload support") Reported-and-tested-by: Nick Gregory <Nick.Gregory@Sophos.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-10 | netfilter: nft_bitwise: track register operations | Pablo Neira Ayuso | 1 | -0/+2
Check if the destination register already contains the data that this bitwise expression performs. This allows to skip this redundant operation. If the destination contains a different bitwise operation, cancel the register tracking information. If the destination contains no bitwise operation, update the register tracking information. Update the payload and meta expression to check if this bitwise operation has been already performed on the register. Hence, both the payload/meta and the bitwise expressions are reduced. There is also a special case: If source register != destination register and source register is not updated by a previous bitwise operation, then transfer selector from the source register to the destination register. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-10 | netfilter: nf_tables: add register tracking infrastructure | Pablo Neira Ayuso | 1 | -0/+12
This patch adds new infrastructure to skip redundant selector store operations on the same register to achieve a performance boost from the packet path. This is particularly noticeable in pure linear rulesets but it also helps in rulesets which are already heavily relying on maps to avoid linear ruleset inspection. The idea is to keep data of the most recurrent store operations on registers to reuse them with cmp and lookup expressions. This infrastructure allows for dynamic ruleset updates since the ruleset blob reduction happens in the kernel. Userspace still needs to be updated to maximize register utilization, so as to improve register data reuse and reduce the number of store-to-register operations. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-10 | netfilter: nf_tables: add NFT_REG32_NUM | Pablo Neira Ayuso | 1 | -1/+3
Add a definition including the maximum number of 32-bit registers that are used as a scratchpad memory area to store data. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-01-10 | netfilter: nf_tables: add rule blob layout | Pablo Neira Ayuso | 1 | -4/+18
This patch adds a blob layout per chain to represent the ruleset in the packet datapath.
  size (unsigned long)
    struct nft_rule_dp
      struct nft_expr
      ...
    struct nft_rule_dp
      struct nft_expr
      ...
    struct nft_rule_dp (is_last=1)
The new structure nft_rule_dp represents the rule in a more compact way (smaller memory footprint) compared to the control-plane nft_rule structure. The ruleset blob is a read-only data structure. The first field contains the blob size, then the rules containing expressions. There is a trailing rule which is used by the tracing infrastructure which is equivalent to the NULL rule marker in the previous representation. The blob size field does not include the size of this trailing rule marker. The ruleset blob is generated from the commit path. This patch reuses the infrastructure available since 0cbc06b3faba ("netfilter: nf_tables: remove synchronize_rcu in commit phase") to build the array of rules per chain. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
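A minimal userspace sketch of the layout described above: a leading size field, a run of rule headers each followed by its expression bytes, and a trailing marker rule that the size field does not count. `struct rule_dp` here is a simplified stand-in with assumed field names and widths, and `blob_build()`, `blob_size()` and `blob_count_rules()` are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified rule header; fields are assumptions for this sketch. */
struct rule_dp {
	uint32_t is_last;  /* 1 only for the trailing marker rule */
	uint32_t dlen;     /* bytes of expression data after this header */
};

/* Append one header plus dlen bytes of dummy expression data. */
size_t blob_put_rule(unsigned char *buf, size_t off, uint32_t dlen, uint32_t is_last)
{
	struct rule_dp hdr = { .is_last = is_last, .dlen = dlen };

	assert(buf != NULL);
	memcpy(buf + off, &hdr, sizeof(hdr));
	memset(buf + off + sizeof(hdr), 0, dlen);
	return off + sizeof(hdr) + dlen;
}

/* Build: size field, n rules, trailing marker. Returns total bytes written. */
size_t blob_build(unsigned char *buf, const uint32_t *dlens, size_t n)
{
	size_t off = sizeof(unsigned long);
	unsigned long size;

	for (size_t i = 0; i < n; i++)
		off = blob_put_rule(buf, off, dlens[i], 0);

	/* The size field covers the rules only, not the trailing marker. */
	size = (unsigned long)(off - sizeof(unsigned long));
	memcpy(buf, &size, sizeof(size));

	return blob_put_rule(buf, off, 0, 1);
}

unsigned long blob_size(const unsigned char *buf)
{
	unsigned long size;

	memcpy(&size, buf, sizeof(size));
	return size;
}

/* Walk the rules until the is_last marker, as a datapath loop would. */
size_t blob_count_rules(const unsigned char *buf)
{
	size_t off = sizeof(unsigned long), count = 0;
	struct rule_dp hdr;

	for (;;) {
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.is_last)
			break;
		count++;
		off += sizeof(hdr) + hdr.dlen;
	}
	return count;
}
```

The walker terminates on the marker rather than on the size field, which mirrors the message's point that the trailing rule takes over the role of the old NULL rule marker.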
2021-11-01 | netfilter: nft_payload: support for inner header matching / mangling | Pablo Neira Ayuso | 1 | -0/+2
Allow to match and mangle on inner headers / payload data after the transport header. There is a new field in the pktinfo structure that stores the inner header offset which is calculated only when requested. Only TCP and UDP supported at this stage. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-11-01 | netfilter: nf_tables: convert pktinfo->tprot_set to flags field | Pablo Neira Ayuso | 1 | -2/+6
Generalize boolean field to store more flags on the pktinfo structure. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-10-02 | netfilter: nf_tables: honor NLM_F_CREATE and NLM_F_EXCL in event notification | Pablo Neira Ayuso | 1 | -1/+1
Include the NLM_F_CREATE and NLM_F_EXCL flags in netlink event notifications, otherwise userspace cannot distinguish between create and add commands. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-06-07 | Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net | David S. Miller | 1 | -6/+0
Bug fixes overlapping feature additions and refactoring, mostly. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-29 | netfilter: nf_tables: remove xt_action_param from nft_pktinfo | Florian Westphal | 1 | -12/+13
Init it on demand in the nft_compat expression. This reduces size of nft_pktinfo from 48 to 24 bytes on x86_64. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 | netfilter: nf_tables: remove unused arg in nft_set_pktinfo_unspec() | Florian Westphal | 1 | -2/+1
The functions pass an extra skb arg, but either it's not used or the helpers can already access it via pkt->skb. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 | netfilter: nf_tables: add and use nft_thoff helper | Florian Westphal | 1 | -0/+5
This allows to change storage placement later on without changing readers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-29 | netfilter: nf_tables: add and use nft_sk helper | Florian Westphal | 1 | -0/+5
This allows to change storage placement later on without changing readers. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-05-24 | netfilter: nf_tables: fix table flag updates | Pablo Neira Ayuso | 1 | -6/+0
The dormant flag needs to be updated from the preparation phase, otherwise, two consecutive requests to dorm a table in the same batch might try to remove the same hooks twice, resulting in the following warning:
hook not found, pf 3 num 0
WARNING: CPU: 0 PID: 334 at net/netfilter/core.c:480 __nf_unregister_net_hook+0x1eb/0x610 net/netfilter/core.c:480
Modules linked in:
CPU: 0 PID: 334 Comm: kworker/u4:5 Not tainted 5.12.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
RIP: 0010:__nf_unregister_net_hook+0x1eb/0x610 net/netfilter/core.c:480
This patch is a partial revert of 0ce7cf4127f1 ("netfilter: nftables: update table flags from the commit phase") to restore the previous behaviour. However, there is still another problem: a batch containing a series of dorm-wakeup-dorm table updates (or vice versa) also triggers the warning above, since hook unregistration happens from the preparation phase while hook registration occurs from the commit phase. To fix this problem, this patch adds two internal flags to annotate the original dormant flag status, __NFT_TABLE_F_WAS_DORMANT and __NFT_TABLE_F_WAS_AWAKEN, to restore it from the abort path. The __NFT_TABLE_F_UPDATE bitmask allows handling the dormant flag update with one single transaction. Reported-by: syzbot+7ad5cd1615f2d89c6e7e@syzkaller.appspotmail.com Fixes: 0ce7cf4127f1 ("netfilter: nftables: update table flags from the commit phase") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
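The role of the two "was" flags can be modelled as a tiny state machine: the first update in a batch records the original dormant state, later updates in the same batch leave that record alone, and abort restores it. This is a deliberately simplified userspace model; the flag values and the `table_prepare()`, `table_commit()` and `table_abort()` functions are assumptions and do not reproduce the kernel's transaction code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values are assumptions chosen for this sketch. */
#define TABLE_F_DORMANT      0x1u
#define TABLE_F_WAS_DORMANT  0x4u
#define TABLE_F_WAS_AWAKEN   0x8u
#define TABLE_F_UPDATE       (TABLE_F_WAS_DORMANT | TABLE_F_WAS_AWAKEN)

/* Preparation phase: remember the original state once per batch,
 * then apply the requested dormant state. */
void table_prepare(uint32_t *flags, int want_dormant)
{
	assert(flags != NULL);

	if (!(*flags & TABLE_F_UPDATE))
		*flags |= (*flags & TABLE_F_DORMANT) ? TABLE_F_WAS_DORMANT
						     : TABLE_F_WAS_AWAKEN;
	if (want_dormant)
		*flags |= TABLE_F_DORMANT;
	else
		*flags &= ~TABLE_F_DORMANT;
}

/* Commit: the new state sticks, internal bookkeeping is dropped. */
void table_commit(uint32_t *flags)
{
	*flags &= ~TABLE_F_UPDATE;
}

/* Abort: restore whatever state the table had before the batch. */
void table_abort(uint32_t *flags)
{
	if (*flags & TABLE_F_WAS_DORMANT)
		*flags |= TABLE_F_DORMANT;
	else if (*flags & TABLE_F_WAS_AWAKEN)
		*flags &= ~TABLE_F_DORMANT;
	*flags &= ~TABLE_F_UPDATE;
}
```

Because only the first update in a batch writes the record, a dorm-wakeup-dorm sequence still aborts back to the true original state.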
2021-04-27 | netfilter: nftables: add catch-all set element support | Pablo Neira Ayuso | 1 | -0/+5
This patch extends the set infrastructure to add a special catch-all set element. If the lookup fails to find an element (or range) in the set, then the catch-all element is selected. Users can specify a mapping, expression(s) and timeout to be attached to the catch-all element. This patch adds a catchall list to the set, this list might contain more than one single catch-all element (e.g. in case that the catch-all element is removed and a new one is added in the same transaction). However, most of the time, there will be either one element or no elements at all in this list. The catch-all element is identified via NFT_SET_ELEM_CATCHALL flag and such special element has no NFTA_SET_ELEM_KEY attribute. There is a new nft_set_elem_catchall object that stores a reference to the dummy catch-all element (catchall->elem) whose layout is the same as the set element type to reuse the existing set element codebase. The set size does not apply to the catch-all element, users can define a catch-all element even if the set is full. The check for valid set element flags has been updated to report EOPNOTSUPP in case userspace requests flags that are not supported when using new userspace nftables and old kernel. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
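The lookup fallback described above reduces to "try the regular elements first, then return the catch-all if there is one". A minimal sketch, assuming a flat array instead of the kernel's set backends; `struct set`, `struct elem` and `set_lookup()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct elem {
	uint32_t key;
	int value;
};

struct set {
	const struct elem *elems;     /* regular elements */
	size_t n;
	const struct elem *catchall;  /* NULL when no catch-all is defined */
};

/* Return the matching element, else the catch-all, else NULL. */
const struct elem *set_lookup(const struct set *s, uint32_t key)
{
	assert(s != NULL);

	for (size_t i = 0; i < s->n; i++)
		if (s->elems[i].key == key)
			return &s->elems[i];

	/* The catch-all has no key of its own and is kept outside the
	 * regular elements, so it does not count against the set size. */
	return s->catchall;
}
```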
2021-04-26 | netfilter: nftables: add nft_pernet() helper function | Pablo Neira Ayuso | 1 | -0/+8
Consolidate call to net_generic(net, nf_tables_net_id) in this wrapper function. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-18 | netfilter: nftables: counter hardware offload support | Pablo Neira Ayuso | 1 | -0/+2
This patch adds the .offload_stats operation to synchronize hardware stats with the expression data. Update the counter expression to use this new interface. The hardware stats are retrieved from the netlink dump path via FLOW_CLS_STATS command to the driver. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-04-06 | netfilter: nf_tables: use net_generic infra for transaction data | Florian Westphal | 1 | -0/+11
This moves all nf_tables pernet data from struct net to a net_generic extension, with the exception of the gencursor. The latter is used in the data path and also outside of the nf_tables core. All others are only used from the configuration plane. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-31 | netfilter: nft_log: perform module load from nf_tables | Florian Westphal | 1 | -0/+5
modprobe calls from the nf_logger_find_get() API cause deadlock in very special cases because they occur with the nf_tables transaction mutex held. In the specific case of nf_log, deadlock is via:
A nf_tables -> transaction mutex -> nft_log -> modprobe -> nf_log_syslog \
    -> pernet_ops rwsem -> wait for C
B netlink event -> rtnl_mutex -> nf_tables transaction mutex -> wait for A
C close() -> ip6mr_sk_done -> rtnl_mutex -> wait for B
Earlier patch added NFLOG/xt_LOG module softdeps to avoid the need to load the backend module during a transaction. For nft_log we would have to add a softdep for both nfnetlink_log and nf_log_syslog, since we do not know in advance which of the two backends is going to be configured. This defers the modprobe op until after the transaction mutex is released. Tested-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-26 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | David S. Miller | 1 | -0/+3
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-18 | netfilter: nftables: update table flags from the commit phase | Pablo Neira Ayuso | 1 | -3/+6
Do not update table flags from the preparation phase. Store the flags update into the transaction, then update the flags from the commit phase. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-03-18 | netfilter: nftables: allow to update flowtable flags | Pablo Neira Ayuso | 1 | -0/+3
Honor flowtable flags from the control update path. Disallow toggling hardware offload support, though. Fixes: 8bb69f3b2918 ("netfilter: nf_tables: add flowtable offload control plane") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-02-15 | netfilter: nftables: introduce table ownership | Pablo Neira Ayuso | 1 | -0/+6
A userspace daemon like firewalld might need to monitor for netlink updates to detect removal of its ruleset by the (global) flush ruleset command to ensure ruleset persistency. This adds extra complexity in userspace and, for a short time, the firewall policy is not in place. This patch adds the NFT_TABLE_F_OWNER flag which allows a userspace program to exclusively own the table that it creates. Tables that are owned...
- can only be updated and removed by the owner; non-owners hit EPERM if they try to update or remove them.
- are destroyed when the owner closes the netlink socket or the process is gone (implicit netlink socket closure).
- are skipped by the global flush ruleset command.
- are listed in the global ruleset.
The userspace process that sets the NFT_TABLE_F_OWNER flag needs to keep the netlink socket open. A new NFTA_TABLE_OWNER netlink attribute specifies the netlink port ID to identify the owner from userspace. This patch also updates error reporting when an unknown table flag is specified, changing it from EINVAL to EOPNOTSUPP given that EINVAL is usually reserved for reporting malformed netlink messages to userspace. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
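The access rule for owned tables (only the owning netlink port may update or remove them) can be sketched as a single check. This is an illustrative model: the flag value, `struct table` and `table_check_owner()` are assumptions, not the kernel's implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_F_OWNER 0x2u   /* flag value assumed for this sketch */

struct table {
	uint32_t flags;
	uint32_t nlpid;   /* netlink port ID of the owner, if owned */
};

/* Return 0 if the requester may modify the table, -EPERM otherwise. */
int table_check_owner(const struct table *t, uint32_t requester_portid)
{
	assert(t != NULL);

	if ((t->flags & TABLE_F_OWNER) && t->nlpid != requester_portid)
		return -EPERM;
	return 0;
}
```

Tables without the owner flag pass the check for any requester, which keeps the behaviour of unowned tables unchanged.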
2021-02-07 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next | Jakub Kicinski | 1 | -6/+5
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
1) Remove indirection and use nf_ct_get() instead from nfnetlink_log and nfnetlink_queue, from Florian Westphal.
2) Add weighted random twos choice least-connection scheduling for IPVS, from Darby Payne.
3) Add a __hash placeholder in the flow tuple structure to identify the field to be included in the rhashtable key hash calculation.
4) Add a new nft_parse_register_load() and nft_parse_register_store() to consolidate register load and store in the core.
5) Statify nft_parse_register() since it has no more module clients.
6) Remove redundant assignment in nft_cmp, from Colin Ian King.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next:
  netfilter: nftables: remove redundant assignment of variable err
  netfilter: nftables: statify nft_parse_register()
  netfilter: nftables: add nft_parse_register_store() and use it
  netfilter: nftables: add nft_parse_register_load() and use it
  netfilter: flowtable: add hash offset field to tuple
  ipvs: add weighted random twos choice algorithm
  netfilter: ctnetlink: remove get_ct indirection
====================
Link: https://lore.kernel.org/r/20210206015005.23037-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-28 | netfilter: nftables: statify nft_parse_register() | Pablo Neira Ayuso | 1 | -1/+0
This function is not used anymore by any extension, statify it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-01-28 | netfilter: nftables: add nft_parse_register_store() and use it | Pablo Neira Ayuso | 1 | -4/+4
This new function combines the netlink register attribute parser and the store validation function. This update requires to replace: enum nft_registers dreg:8; in many of the expression private areas otherwise compiler complains with: error: cannot take address of bit-field ‘dreg’ when passing the register field as reference. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
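The compiler error quoted above follows from a C rule: a bit-field has no address, so a field declared as `enum nft_registers dreg:8;` cannot be passed as `&priv->dreg`. A minimal sketch of the after state, assuming an 8-bit plain field; `struct expr_priv`, `parse_register_store()`, `REG_MAX` and the error code are hypothetical stand-ins, not the kernel function's actual signature.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define REG_MAX 20u   /* hypothetical upper bound on register numbers */

/* Expression private area. With 'unsigned int dreg:8;' instead of the
 * plain byte below, taking &priv->dreg would not compile. */
struct expr_priv {
	uint8_t dreg;
	uint8_t len;
};

/* Parse and validate in one step, writing the result through a pointer. */
int parse_register_store(uint32_t attr, uint8_t *dreg)
{
	assert(dreg != NULL);

	if (attr >= REG_MAX)
		return -ERANGE;
	*dreg = (uint8_t)attr;
	return 0;
}
```

Writing through the pointer only after validation means a rejected attribute leaves the previously stored register untouched.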
2021-01-28 | netfilter: nftables: add nft_parse_register_load() and use it | Pablo Neira Ayuso | 1 | -1/+1
This new function combines the netlink register attribute parser and the load validation function. This update requires to replace: enum nft_registers sreg:8; in many of the expression private areas otherwise compiler complains with: error: cannot take address of bit-field ‘sreg’ when passing the register field as reference. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-01-16 | netfilter: nft_dynset: honor stateful expressions in set definition | Pablo Neira Ayuso | 1 | -0/+2
If the set definition contains stateful expressions, allocate them for the newly added entries from the packet path. Fixes: 65038428b2c6 ("netfilter: nf_tables: allow to specify stateful expression in set definition") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-12-15 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next | Jakub Kicinski | 1 | -37/+58
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
1) Missing dependencies in NFT_BRIDGE_REJECT, from Randy Dunlap.
2) Use atomic_inc_return() instead of atomic_add_return() in IPVS, from Yejune Deng.
3) Simplify check for overquota in xt_nfacct, from Kaixu Xia.
4) Move nfnl_acct_list away from struct net, from Miao Wang.
5) Pass actual sk in reject actions, from Jan Engelhardt.
6) Add timeout and protoinfo to ctnetlink destroy events, from Florian Westphal.
7) Four patches to generalize set infrastructure to support multiple expressions per set element.
* git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next:
  netfilter: nftables: netlink support for several set element expressions
  netfilter: nftables: generalize set extension to support for several expressions
  netfilter: nftables: move nft_expr before nft_set
  netfilter: nftables: generalize set expressions support
  netfilter: ctnetlink: add timeout and protoinfo to destroy events
  netfilter: use actual socket sk for REJECT action
  netfilter: nfnl_acct: remove data from struct net
  netfilter: Remove unnecessary conversion to bool
  ipvs: replace atomic_add_return()
  netfilter: nft_reject_bridge: fix build errors due to code movement
====================
Link: https://lore.kernel.org/r/20201212230513.3465-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-12 | netfilter: nftables: generalize set extension to support for several expressions | Pablo Neira Ayuso | 1 | -8/+28
This patch replaces NFT_SET_EXPR by NFT_SET_EXT_EXPRESSIONS. This new extension allows to attach several expressions to one set element (not only one single expression as NFT_SET_EXPR provides). This patch prepares for support for several expressions per set element in the netlink userspace API. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-12-12 | netfilter: nftables: move nft_expr before nft_set | Pablo Neira Ayuso | 1 | -28/+26
Move the nft_expr structure definition before nft_set. Expressions are used by rules and sets, remove unnecessary forward declarations. This comes as preparation to support for multiple expressions per set element. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-12-12 | netfilter: nftables: generalize set expressions support | Pablo Neira Ayuso | 1 | -1/+4
Currently, the set infrastructure allows for one single expression per element. This patch extends the existing infrastructure to allow for up to two expressions. This is not updating the netlink API yet, this is coming as an initial preparation patch. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>