summaryrefslogtreecommitdiff
path: root/net/netfilter/ipset/ip_set_hash_ipmark.c
AgeCommit message (Collapse)AuthorFilesLines
2023-01-02netfilter: ipset: Rework long task execution when adding/deleting entriesJozsef Kadlecsik1-6/+7
When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step. The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel. Fixes: 5f7b51bf09ba ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") Reported-by: syzbot+9204e7399656300bf271@syzkaller.appspotmail.com Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2021-08-04netfilter: ipset: Limit the maximal range of consecutive elements to add/deleteJozsef Kadlecsik1-1/+9
The range size of consecutive elements were not limited. Thus one could define a huge range which may result soft lockup errors due to the long execution time. Now the range size is limited to 2^20 entries. Reported-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-31netfilter: ipset: Expose the initval hash parameter to userspaceJozsef Kadlecsik1-1/+2
It makes possible to reproduce exactly the same set after a save/restore. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2020-10-31netfilter: ipset: Add bucketsize parameter to all hash typesJozsef Kadlecsik1-2/+4
The parameter defines the upper limit in any hash bucket at adding new entries from userspace - if the limit would be exceeded, ipset doubles the hash size and rehashes. It means the set may consume more memory but gives faster evaluation at matching in the set. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-10-08netfilter: ipset: remove inline from static functions in .c files.Jeremy Sowden1-4/+4
The inline function-specifier should not be used for static functions defined in .c files since it bloats the kernel. Instead leave the compiler to decide which functions to inline. While a couple of the files affected (ip_set_*_gen.h) are technically headers, they contain templates for generating the common parts of particular set-types and so we treat them like .c files. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Acked-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextPablo Neira Ayuso1-7/+2
Resolve conflict between d2912cb15bdd ("treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500") removing the GPL disclaimer and fe03d4745675 ("Update my email address") which updates Jozsef Kadlecsik's email. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-06-19treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500Thomas Gleixner1-4/+1
Based on 2 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation this program is free software you can redistribute it and or modify it under the terms of the gnu general public license version 2 as published by the free software foundation # extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 4122 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Enrico Weigelt <info@metux.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Allison Randal <allison@lohutok.net> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190604081206.933168790@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-10Update my email addressJozsef Kadlecsik1-1/+1
It's better to use my kadlec@netfilter.org email address in the source code. I might not be able to use kadlec@blackhole.kfki.hu in the future. Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2017-09-26netfilter: ipset: Fix adding an IPv4 range containing more than 2^31 addressesJozsef Kadlecsik1-1/+1
Wrong comparison prevented the hash types to add a range with more than 2^31 addresses but reported as a success. Fixes Netfilter's bugzilla id #1005, reported by Oleg Serditov and Oliver Ford. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-11-10netfilter: ipset: Make struct htype per ipset familyJozsef Kadlecsik1-5/+5
Before this patch struct htype created at the first source of ip_set_hash_gen.h and it is common for both IPv4 and IPv6 set variants. Make struct htype per ipset family and use NLEN to make nets array fixed size to simplify struct htype allocation. Ported from a patch proposed by Sergey Popovich <popovich_sergei@mail.ua>. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Fix coding styles reported by checkpatch.plJozsef Kadlecsik1-6/+3
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Introduce RCU locking in hash:* typesJozsef Kadlecsik1-0/+1
Three types of data need to be protected in the case of the hash types: a. The hash buckets: standard rcu pointer operations are used. b. The element blobs in the hash buckets are stored in an array and a bitmap is used for book-keeping to tell which elements in the array are used or free. c. Networks per cidr values and the cidr values themselves are stored in fix sized arrays and need no protection. The values are modified in such an order that in the worst case an element testing is repeated once with the same cidr value. The ipset hash approach uses arrays instead of lists and therefore is incompatible with rhashtable. Performance is tested by Jesper Dangaard Brouer: Simple drop in FORWARD ~~~~~~~~~~~~~~~~~~~~~~ Dropping via simple iptables net-mask match:: iptables -t raw -N simple || iptables -t raw -F simple iptables -t raw -I simple -s 198.18.0.0/15 -j DROP iptables -t raw -D PREROUTING -j simple iptables -t raw -I PREROUTING -j simple Drop performance in "raw": 11.3Mpps Generator: sending 12.2Mpps (tx:12264083 pps) Drop via original ipset in RAW table ~~~~~~~~~~~~~~~~~~~~~~~~~~~ Create a set with lots of elements:: sudo ./ipset destroy test echo "create test hash:ip hashsize 65536" > test.set for x in `seq 0 255`; do for y in `seq 0 255`; do echo "add test 198.18.$x.$y" >> test.set done done sudo ./ipset restore < test.set Dropping via ipset:: iptables -t raw -F iptables -t raw -N net198 || iptables -t raw -F net198 iptables -t raw -I net198 -m set --match-set test src -j DROP iptables -t raw -I PREROUTING -j net198 Drop performance in "raw" with ipset: 8Mpps Perf report numbers ipset drop in "raw":: + 24.65% ksoftirqd/1 [ip_set] [k] ip_set_test - 21.42% ksoftirqd/1 [kernel.kallsyms] [k] _raw_read_lock_bh - _raw_read_lock_bh + 99.88% ip_set_test - 19.42% ksoftirqd/1 [kernel.kallsyms] [k] _raw_read_unlock_bh - _raw_read_unlock_bh + 99.72% ip_set_test + 4.31% ksoftirqd/1 [ip_set_hash_ip] [k] hash_ip4_kadt + 2.27% ksoftirqd/1 [ixgbe] [k] ixgbe_fetch_rx_buffer + 2.18% ksoftirqd/1 [ip_tables] [k] ipt_do_table + 1.81% ksoftirqd/1 [ip_set_hash_ip] [k] hash_ip4_test + 1.61% ksoftirqd/1 [kernel.kallsyms] [k] __netif_receive_skb_core + 1.44% ksoftirqd/1 [kernel.kallsyms] [k] build_skb + 1.42% ksoftirqd/1 [kernel.kallsyms] [k] ip_rcv + 1.36% ksoftirqd/1 [kernel.kallsyms] [k] __local_bh_enable_ip + 1.16% ksoftirqd/1 [kernel.kallsyms] [k] dev_gro_receive + 1.09% ksoftirqd/1 [kernel.kallsyms] [k] __rcu_read_unlock + 0.96% ksoftirqd/1 [ixgbe] [k] ixgbe_clean_rx_irq + 0.95% ksoftirqd/1 [kernel.kallsyms] [k] __netdev_alloc_frag + 0.88% ksoftirqd/1 [kernel.kallsyms] [k] kmem_cache_alloc + 0.87% ksoftirqd/1 [xt_set] [k] set_match_v3 + 0.85% ksoftirqd/1 [kernel.kallsyms] [k] inet_gro_receive + 0.83% ksoftirqd/1 [kernel.kallsyms] [k] nf_iterate + 0.76% ksoftirqd/1 [kernel.kallsyms] [k] put_compound_page + 0.75% ksoftirqd/1 [kernel.kallsyms] [k] __rcu_read_lock Drop via ipset in RAW table with RCU-locking ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ With RCU locking, the RW-lock is gone. Drop performance in "raw" with ipset with RCU-locking: 11.3Mpps Performance-tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Make sure we always return line number on batchSergey Popovich1-6/+6
Even if we return with generic IPSET_ERR_PROTOCOL it is good idea to return line number if we called in batch mode. Moreover we are not always exiting with IPSET_ERR_PROTOCOL. For example hash:ip,port,net may return IPSET_ERR_HASH_RANGE_UNSUPPORTED or IPSET_ERR_INVALID_CIDR. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Permit CIDR equal to the host address CIDR in IPv6Sergey Popovich1-3/+9
Permit userspace to supply CIDR length equal to the host address CIDR length in netlink message. Prohibit any other CIDR length for IPv6 variant of the set. Also return -IPSET_ERR_HASH_RANGE_UNSUPPORTED instead of generic -IPSET_ERR_PROTOCOL in IPv6 variant of hash:ip,port,net when IPSET_ATTR_IP_TO attribute is given. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-06-14netfilter: ipset: Check extensions attributes before getting extensions.Sergey Popovich1-13/+1
Make all extensions attributes checks within ip_set_get_extensions() and reduce number of duplicated code. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2015-05-13netfilter: ipset: Improve preprocessor macros checksSergey Popovich1-3/+3
Check if mandatory MTYPE, HTYPE and HOST_MASK macros defined. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Fix hashing for ipv6 setsSergey Popovich1-3/+0
HKEY_DATALEN remains defined after first inclusion of ip_set_hash_gen.h, so it is incorrectly reused for IPv6 code. Undefine HKEY_DATALEN in ip_set_hash_gen.h at the end. Also remove some useless defines of HKEY_DATALEN in ip_set_hash_{ip{,mark,port},netiface}.c as ip_set_hash_gen.h defines it correctly for such set types anyway. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Check for comment netlink attribute lengthSergey Popovich1-1/+2
Ensure userspace supplies string not longer than IPSET_MAX_COMMENT_SIZE. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Return bool values instead of intSergey Popovich1-4/+4
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Use HOST_MASK literal to represent host address CIDR lenSergey Popovich1-1/+1
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Return ipset error instead of boolSergey Popovich1-4/+10
Statement ret = func1() || func2() returns 0 when both func1() and func2() return 0, or 1 if func1() or func2() returns non-zero. However in our case func1() and func2() returns error code on failure, so it seems good to propagate such error codes, rather than returning 1 in case of failure. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Preprocessor directices cleanupSergey Popovich1-3/+0
* Undefine mtype_data_reset_elem before defining. * Remove duplicated mtype_gc_init undefine, move mtype_gc_init define closer to mtype_gc define. * Use htype instead of HTYPE in IPSET_TOKEN(HTYPE, _create)(). * Remove PF definition from sets: no more used. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Fix sparse warningJozsef Kadlecsik1-2/+2
"warning: cast to restricted __be32" warnings are fixed Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-09-16netfilter: ipset: Add skbinfo extension kernel support for the hash set types.Anton Danilov1-2/+12
Add skbinfo extension kernel support for the hash set types. Inroduce the new revisions of all hash set types. Signed-off-by: Anton Danilov <littlesmilingcloud@gmail.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06netfilter: ipset: add forceadd kernel support for hash set typesJosh Hunt1-1/+1
Adds a new property for hash set types, where if a set is created with the 'forceadd' option and the set becomes full the next addition to the set may succeed and evict a random entry from the set. To keep overhead low eviction is done very simply. It checks to see which bucket the new entry would be added. If the bucket's pos value is non-zero (meaning there's at least one entry in the bucket) it replaces the first entry in the bucket. If pos is zero, then it continues down the normal add process. This property is useful if you have a set for 'ban' lists where it may not matter if you release some entries from the set early. Signed-off-by: Josh Hunt <johunt@akamai.com> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06netfilter: ipset: add markmask for hash:ip,mark data typeVytas Dauksa1-0/+9
Introduce packet mark mask for hash:ip,mark data type. This allows to set mark bit filter for the ip set. Change-Id: Id8dd9ca7e64477c4f7b022a1d9c1a5b187f1c96e Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
2014-03-06netfilter: ipset: add hash:ip,mark data type to ipsetVytas Dauksa1-0/+312
Introduce packet mark support with new ip,mark hash set. This includes userspace and kernelspace code, hash:ip,mark set tests and man page updates. The intended use of ip,mark set is similar to the ip:port type, but for protocols which don't use a predictable port number. Instead of port number it matches a firewall mark determined by a layer 7 filtering program like opendpi. As well as allowing or blocking traffic it will also be used for accounting packets and bytes sent for each protocol. Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>