summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2018-06-15l2tp: reject creation of non-PPP sessions on L2TPv2 tunnelsGuillaume Nault1-0/+6
The /proc/net/pppol2tp handlers (pppol2tp_seq_*()) iterate over all L2TPv2 tunnels, and rightfully expect that only PPP sessions can be found there. However, l2tp_netlink accepts creating Ethernet sessions regardless of the underlying tunnel version. This confuses pppol2tp_seq_session_show(), which expects that l2tp_session_priv() returns a pppol2tp_session structure. When the session is an Ethernet pseudo-wire, a struct l2tp_eth_sess is returned instead. This leads to invalid memory access when pppol2tp_session_get_sock() later tries to dereference ps->sk. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15ipv6: Only emit append events for appended routesIdo Schimmel1-3/+2
Current code will emit an append event in the FIB notification chain for any route added with NLM_F_APPEND set, even if the route was not appended to any existing route. This is inconsistent with IPv4 where such an event is only emitted when the new route is appended after an existing one. Align IPv6 behavior with IPv4, thereby allowing listeners to more easily handle these events. Fixes: f34436a43092 ("net/ipv6: Simplify route replace and appending into multipath route") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15Merge tag 'mac80211-for-davem-2018-06-15' of ↵David S. Miller3-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== A handful of fixes: * missing RCU grace period enforcement led to drivers freeing data structures before; fix from Dedy Lansky. * hwsim module init error paths were messed up; fixed it myself after a report from Colin King (who had sent a partial patch) * kernel-doc tag errors; fix from Luca Coelho * initialize the on-stack sinfo data structure when getting station information; fix from Sven Eckelmann * TXQ state dumping is now done from init, and when TXQs aren't initialized yet at that point, bad things happen, move the initialization; fix from Toke Høiland-Jørgensen. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15cfg80211: fix rcu in cfg80211_unregister_wdevDedy Lansky1-0/+1
Callers of cfg80211_unregister_wdev can free the wdev object immediately after this function returns. This may crash the kernel because this wdev object is still in use by other threads. Add synchronize_rcu() after list_del_rcu to make sure wdev object can be safely freed. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15mac80211: Move up init of TXQsToke Høiland-Jørgensen1-6/+6
On init, ieee80211_if_add() dumps the interface. Since that now includes a dump of the TXQ state, we need to initialise that before the dump happens. So move up the TXQ initialisation to to before the call to ieee80211_if_add(). Fixes: 52539ca89f36 ("cfg80211: Expose TXQ stats and parameters to userspace") Reported-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk> Tested-by: Niklas Cassel <niklas.cassel@linaro.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15cfg80211: initialize sinfo in cfg80211_get_stationSven Eckelmann1-0/+2
Most of the implementations behind cfg80211_get_station will not initialize sinfo to zero before manipulating it. For example, the member "filled", which indicates the filled in parts of this struct, is often only modified by enabling certain bits in the bitfield while keeping the remaining bits in their original state. A caller without a preinitialized sinfo.filled can then no longer decide which parts of sinfo were filled in by cfg80211_get_station (or actually the underlying implementations). cfg80211_get_station must therefore take care that sinfo is initialized to zero. Otherwise, the caller may tries to read information which was not filled in and which must therefore also be considered uninitialized. In batadv_v_elp_get_throughput's case, an invalid "random" expected throughput may be stored for this neighbor and thus the B.A.T.M.A.N V algorithm may switch to non-optimal neighbors for certain destinations. Fixes: 7406353d43c8 ("cfg80211: implement cfg80211_get_station cfg80211 API") Reported-by: Thomas Lauer <holminateur@gmail.com> Reported-by: Marcel Schmidt <ff.z-casparistrasse@mailbox.org> Cc: b.a.t.m.a.n@lists.open-mesh.org Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
2018-06-15rds: avoid unenecessary cong_update in loop transportSantosh Shilimkar3-0/+11
Loop transport which is self loopback, remote port congestion update isn't relevant. Infact the xmit path already ignores it. Receive path needs to do the same. Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com Reviewed-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15l2tp: clean up stale tunnel or session in pppol2tp_connect's error pathGuillaume Nault1-0/+10
pppol2tp_connect() may create a tunnel or a session. Remove them in case of error. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15l2tp: prevent pppol2tp_connect() from creating kernel socketsGuillaume Nault1-0/+9
If 'fd' is negative, l2tp_tunnel_create() creates a tunnel socket using the configuration passed in 'tcfg'. Currently, pppol2tp_connect() sets the relevant fields to zero, tricking l2tp_tunnel_create() into setting up an unusable kernel socket. We can't set 'tcfg' with the required fields because there's no way to get them from the current connect() parameters. So let's restrict kernel sockets creation to the netlink API, which is the original use case. Fixes: 789a4a2c61d8 ("l2tp: Add support for static unmanaged L2TPv3 tunnels") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15l2tp: only accept PPP sessions in pppol2tp_connect()Guillaume Nault1-0/+6
l2tp_session_priv() returns a struct pppol2tp_session pointer only for PPPoL2TP sessions. In particular, if the session is an L2TP_PWTYPE_ETH pseudo-wire, l2tp_session_priv() returns a pointer to an l2tp_eth_sess structure, which is much smaller than struct pppol2tp_session. This leads to invalid memory dereference when trying to lock ps->sk_lock. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15l2tp: fix pseudo-wire type for sessions created by pppol2tp_connect()Guillaume Nault1-0/+1
Define cfg.pw_type so that the new session is created with its .pwtype field properly set (L2TP_PWTYPE_PPP). Not setting the pseudo-wire type had several annoying effects: * Invalid value returned in the L2TP_ATTR_PW_TYPE attribute when dumping sessions with the netlink API. * Impossibility to delete the session using the netlink API (because l2tp_nl_cmd_session_delete() gets the deletion callback function from an array indexed by the session's pseudo-wire type). Also, there are several cases where we should check a session's pseudo-wire type. For example, pppol2tp_connect() should refuse to connect a session that is not PPPoL2TP, but that requires the session's .pwtype field to be properly set. Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-15tcp: verify the checksum of the first data segment in a new connectionFrank van der Linden2-0/+8
commit 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") introduced an optimization for the handling of child sockets created for a new TCP connection. But this optimization passes any data associated with the last ACK of the connection handshake up the stack without verifying its checksum, because it calls tcp_child_process(), which in turn calls tcp_rcv_state_process() directly. These lower-level processing functions do not do any checksum verification. Insert a tcp_checksum_complete call in the TCP_NEW_SYN_RECEIVE path to fix this. Fixes: 079096f103fa ("tcp/dccp: install syn_recv requests into ehash table") Signed-off-by: Frank van der Linden <fllinden@amazon.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Balbir Singh <bsingharora@gmail.com> Reviewed-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-14sctp: define sctp_packet_gso_append to build GSO framesXin Long1-10/+18
Now sctp GSO uses skb_gro_receive() to append the data into head skb frag_list. However it actually only needs very few code from skb_gro_receive(). Besides, NAPI_GRO_CB has to be set while most of its members are not needed here. This patch is to add sctp_packet_gso_append() to build GSO frames instead of skb_gro_receive(), and it would avoid many unnecessary checks and make the code clearer. Note that sctp will use page frags instead of frag_list to build GSO frames in another patch. But it may take time, as sctp's GSO frames may have different size. skb_segment() can only split it into the frags with the same size, which would break the border of sctp chunks. Signed-off-by: Xin Long <lucien.xin@gmail.com> Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller10-20/+48
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter patches for your net tree: 1) Fix NULL pointer dereference from nf_nat_decode_session() if NAT is not loaded, from Prashant Bhole. 2) Fix socket extension module autoload. 3) Don't bogusly reject sets with the NFT_SET_EVAL flag set on from the dynset extension. 4) Fix races with nf_tables module removal and netns exit path, patches from Florian Westphal. 5) Don't hit BUG_ON if jumpstack goes too deep, instead hit WARN_ON_ONCE, from Taehee Yoo. 6) Another NULL pointer dereference from ctnetlink, again if NAT is not loaded, from Florian Westphal. 7) Fix x_tables match list corruption in xt_connmark module removal path, also from Florian. 8) nf_conncount doesn't properly deal with conntrack zones, hence garbage collector may get rid of entries in a different zone. From Yi-Hung Wei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-13smc: convert to ->poll_maskCong Wang1-9/+3
smc->clcsock is an internal TCP socket, after TCP socket converts to ->poll_mask, ->poll doesn't exist any more. So just convert smc socket to ->poll_mask too. Fixes: 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask") Reported-by: syzbot+f5066e369b2d5fff630f@syzkaller.appspotmail.com Cc: Christoph Hellwig <hch@lst.de> Cc: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12Revert "net: do not allow changing SO_REUSEADDR/SO_REUSEPORT on bound sockets"Bart Van Assche1-14/+1
Revert the patch mentioned in the subject because it breaks at least the Avahi mDNS daemon. That patch namely causes the Ubuntu 18.04 Avahi daemon to fail to start: Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: Successfully called chroot(). Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: Successfully dropped remaining capabilities. Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: No service file found in /etc/avahi/services. Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: SO_REUSEADDR failed: Structure needs cleaning Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: SO_REUSEADDR failed: Structure needs cleaning Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: Failed to create server: No suitable network protocol available Jun 12 09:49:24 ubuntu-vm avahi-daemon[529]: avahi-daemon 0.7 exiting. Jun 12 09:49:24 ubuntu-vm systemd[1]: avahi-daemon.service: Main process exited, code=exited, status=255/n/a Jun 12 09:49:24 ubuntu-vm systemd[1]: avahi-daemon.service: Failed with result 'exit-code'. Jun 12 09:49:24 ubuntu-vm systemd[1]: Failed to start Avahi mDNS/DNS-SD Stack. Fixes: f396922d862a ("net: do not allow changing SO_REUSEADDR/SO_REUSEPORT on bound sockets") Cc: Maciej Żenczykowski <maze@google.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12netfilter: nf_conncount: Fix garbage collection with zonesYi-Hung Wei2-5/+10
Currently, we use check_hlist() for garbage colleciton. However, we use the ‘zone’ from the counted entry to query the existence of existing entries in the hlist. This could be wrong when they are in different zones, and this patch fixes this issue. Fixes: e59ea3df3fc2 ("netfilter: xt_connlimit: honor conntrack zone if available") Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: xt_connmark: fix list corruption on rmmodFlorian Westphal1-1/+1
This needs to use xt_unregister_targets, else new revision is left on the list which then causes list to point to a target struct that has been free'd. Fixes: 472a73e00757 ("netfilter: xt_conntrack: Support bit-shifting for CONNMARK & MARK targets.") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: ctnetlink: avoid null pointer dereferenceFlorian Westphal1-1/+2
Dan Carpenter points out that deref occurs after NULL check, we should re-fetch the pointer and check that instead. Fixes: 2c205dd3981f7 ("netfilter: add struct nf_nat_hook and use it") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: nf_tables: use WARN_ON_ONCE instead of BUG_ON in nft_do_chain()Taehee Yoo1-1/+2
When depth of chain is bigger than NFT_JUMP_STACK_SIZE, the nft_do_chain crashes. But there is no need to crash hard here. Suggested-by: Florian Westphal <fw@strlen.de> Signed-off-by: Taehee Yoo <ap420073@gmail.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: nf_tables: close race between netns exit and rmmodFlorian Westphal2-3/+15
If net namespace is exiting while nf_tables module is being removed we can oops: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040 IP: nf_tables_flowtable_event+0x43/0xf0 [nf_tables] PGD 0 P4D 0 Oops: 0000 [#1] SMP PTI Modules linked in: nf_tables(-) nfnetlink [..] unregister_netdevice_notifier+0xdd/0x130 nf_tables_module_exit+0x24/0x3a [nf_tables] SyS_delete_module+0x1c5/0x240 do_syscall_64+0x74/0x190 Avoid this by attempting to take reference on the net namespace from the notifiers. If it fails the namespace is exiting already, and nft core is taking care of cleanup work. We also need to make sure the netdev hook type gets removed before netns ops removal, else notifier might be invoked with device event for a netns where net->nft was never initialised (because pernet ops was removed beforehand). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: nf_tables: fix module unload raceFlorian Westphal2-6/+16
We must first remove the nfnetlink protocol handler when nf_tables module is unloaded -- we don't want userspace to submit new change requests once we've started to tear down nft state. Furthermore, nfnetlink must not call any subsystem function after call_batch returned -EAGAIN. EAGAIN means the subsys mutex was dropped, so its unlikely but possible that nf_tables subsystem was removed due to 'rmmod nf_tables' on another cpu. Therefore, we must abort batch completely and not move on to next part of the batch. Last, we can't invoke ->abort unless we've checked that the subsystem is still registered. Change netns exit path of nf_tables to make sure any incompleted transaction gets removed on exit. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: nft_dynset: do not reject set updates with NFT_SET_EVALPablo Neira Ayuso1-3/+1
NFT_SET_EVAL is signalling the kernel that this sets can be updated from the evaluation path, even if there are no expressions attached to the element. Otherwise, set updates with no expressions fail. Update description to describe the right semantics. Fixes: 22fe54d5fefc ("netfilter: nf_tables: add support for dynamic set updates") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12netfilter: nft_socket: fix module autoloadPablo Neira Ayuso1-0/+1
Add alias definition for module autoload when adding socket rules. Fixes: 554ced0a6e29 ("netfilter: nf_tables: add support for native socket matching") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-12tcp: Do not reload skb pointer after skb_gro_receive().David Miller1-2/+0
This is not necessary. skb_gro_receive() will never change what 'head' points to. In it's original implementation (see commit 71d93b39e52e ("net: Add skb_gro_receive")), it did: ==================== + *head = nskb; + nskb->next = p->next; + p->next = NULL; ==================== This sequence was removed in commit 58025e46ea2d ("net: gro: remove obsolete code from skb_gro_receive()") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric Dumazet <edumazet@google.com>
2018-06-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller1-1/+2
Daniel Borkmann says: ==================== pull-request: bpf 2018-06-12 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Avoid an allocation warning in AF_XDP by adding __GFP_NOWARN for the umem setup, from Björn. 2) Silence a warning in bpf fs when an application tries to open(2) a pinned bpf obj due to missing fops. Add a dummy open fop that continues to just bail out in such case, from Daniel. 3) Fix a BPF selftest urandom_read build issue where gcc complains that it gets built twice, from Anders. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12net/ipv6: Ensure cfg is properly initialized in ipv6_create_tempaddrDavid Ahern1-1/+1
Valdis reported a BUG in ipv6_add_addr: [ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209 [ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10 [ 1820.832732] Code: 49 8b 1f 0f 84 6a 0a 00 00 48 85 db 0f 84 4e 0a 00 00 48 8b 03 48 8b 53 08 49 89 45 00 49 8b 47 10 49 89 55 08 48 85 c0 74 15 <48> 8b 50 08 48 8b 00 49 89 95 b8 01 00 00 49 89 85 b0 01 00 00 4c [ 1820.832847] RSP: 0018:ffffaa07c2fd7880 EFLAGS: 00010202 [ 1820.832853] RAX: 0000000000000201 RBX: ffffaa07c2fd79b0 RCX: 0000000000000000 [ 1820.832858] RDX: a4cfbfba2cbfa64c RSI: 0000000000000000 RDI: ffffffff8a8e9fa0 [ 1820.832862] RBP: ffffaa07c2fd7920 R08: 000000000000017a R09: ffffffff8a555300 [ 1820.832866] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888d18e71c00 [ 1820.832871] R13: ffff888d0a9b1200 R14: 0000000000000000 R15: ffffaa07c2fd7980 [ 1820.832876] FS: 00007faa51bdb800(0000) GS:ffff888d1d400000(0000) knlGS:0000000000000000 [ 1820.832880] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1820.832885] CR2: 0000000000000209 CR3: 000000021e8f8001 CR4: 00000000001606e0 [ 1820.832888] Call Trace: [ 1820.832898] ? __local_bh_enable_ip+0x119/0x260 [ 1820.832904] ? ipv6_create_tempaddr+0x259/0x5a0 [ 1820.832912] ? __local_bh_enable_ip+0x139/0x260 [ 1820.832921] ipv6_create_tempaddr+0x2da/0x5a0 [ 1820.832926] ? ipv6_create_tempaddr+0x2da/0x5a0 [ 1820.832941] manage_tempaddrs+0x1a5/0x240 [ 1820.832951] inet6_addr_del+0x20b/0x3b0 [ 1820.832959] ? nla_parse+0xce/0x1e0 [ 1820.832968] inet6_rtm_deladdr+0xd9/0x210 [ 1820.832981] rtnetlink_rcv_msg+0x1d4/0x5f0 Looking at the code I found 1 element (peer_pfx) of the newly introduced ifa6_config struct that is not initialized. Use a memset rather than hard coding an init for each struct element. Reported-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Fixes: e6464b8c63619 ("net/ipv6: Convert ipv6_add_addr to struct ifa6_config") Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12tls: fix NULL pointer dereference on pollDaniel Borkmann2-11/+10
While hacking on kTLS, I ran into the following panic from an unprivileged netserver / netperf TCP session: BUG: unable to handle kernel NULL pointer dereference at 0000000000000000 PGD 800000037f378067 P4D 800000037f378067 PUD 3c0e61067 PMD 0 Oops: 0010 [#1] SMP KASAN PTI CPU: 1 PID: 2289 Comm: netserver Not tainted 4.17.0+ #139 Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET47W (1.21 ) 11/28/2016 RIP: 0010: (null) Code: Bad RIP value. RSP: 0018:ffff88036abcf740 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff88036f5f6800 RCX: 1ffff1006debed26 RDX: ffff88036abcf920 RSI: ffff8803cb1a4f00 RDI: ffff8803c258c280 RBP: ffff8803c258c280 R08: ffff8803c258c280 R09: ffffed006f559d48 R10: ffff88037aacea43 R11: ffffed006f559d49 R12: ffff8803c258c280 R13: ffff8803cb1a4f20 R14: 00000000000000db R15: ffffffffc168a350 FS: 00007f7e631f4700(0000) GS:ffff8803d1c80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffffffffd6 CR3: 00000003ccf64005 CR4: 00000000003606e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? tls_sw_poll+0xa4/0x160 [tls] ? sock_poll+0x20a/0x680 ? do_select+0x77b/0x11a0 ? poll_schedule_timeout.constprop.12+0x130/0x130 ? pick_link+0xb00/0xb00 ? read_word_at_a_time+0x13/0x20 ? vfs_poll+0x270/0x270 ? deref_stack_reg+0xad/0xe0 ? __read_once_size_nocheck.constprop.6+0x10/0x10 [...] Debugging further, it turns out that calling into ctx->sk_poll() is invalid since sk_poll itself is NULL which was saved from the original TCP socket in order for tls_sw_poll() to invoke it. Looks like the recent conversion from poll to poll_mask callback started in 152524231023 ("net: add support for ->poll_mask in proto_ops") missed to eventually convert kTLS, too: TCP's ->poll was converted over to the ->poll_mask in commit 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask") and therefore kTLS wrongly saved the ->poll old one which is now NULL. Convert kTLS over to use ->poll_mask instead. Also instead of POLLIN | POLLRDNORM use the proper EPOLLIN | EPOLLRDNORM bits as the case in tcp_poll_mask() as well that is mangled here. Fixes: 2c7d3dacebd4 ("net/tcp: convert to ->poll_mask") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Watson <davejwatson@fb.com> Tested-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12xsk: silence warning on memory allocation failureBjörn Töpel1-1/+2
syzkaller reported a warning from xdp_umem_pin_pages(): WARNING: CPU: 1 PID: 4537 at mm/slab_common.c:996 kmalloc_slab+0x56/0x70 mm/slab_common.c:996 ... __do_kmalloc mm/slab.c:3713 [inline] __kmalloc+0x25/0x760 mm/slab.c:3727 kmalloc_array include/linux/slab.h:634 [inline] kcalloc include/linux/slab.h:645 [inline] xdp_umem_pin_pages net/xdp/xdp_umem.c:205 [inline] xdp_umem_reg net/xdp/xdp_umem.c:318 [inline] xdp_umem_create+0x5c9/0x10f0 net/xdp/xdp_umem.c:349 xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531 __sys_setsockopt+0x1bd/0x390 net/socket.c:1935 __do_sys_setsockopt net/socket.c:1946 [inline] __se_sys_setsockopt net/socket.c:1943 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe This is a warning about attempting to allocate more than KMALLOC_MAX_SIZE memory. The request originates from userspace, and if the request is too big, the kernel is free to deny its allocation. In this patch, the failed allocation attempt is silenced with __GFP_NOWARN. Fixes: c0c77d8fb787 ("xsk: add user memory registration support sockopt") Reported-by: syzbot+4abadc5d69117b346506@syzkaller.appspotmail.com Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller11-19/+54
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for your net tree: 1) Reject non-null terminated helper names from xt_CT, from Gao Feng. 2) Fix KASAN splat due to out-of-bound access from commit phase, from Alexey Kodanev. 3) Missing conntrack hook registration on IPVS FTP helper, from Julian Anastasov. 4) Incorrect skbuff allocation size in bridge nft_reject, from Taehee Yoo. 5) Fix inverted check on packet xmit to non-local addresses, also from Julian. 6) Fix ebtables alignment compat problems, from Alin Nastac. 7) Hook mask checks are not correct in xt_set, from Serhey Popovych. 8) Fix timeout listing of element in ipsets, from Jozsef. 9) Cap maximum timeout value in ipset, also from Jozsef. 10) Don't allow family option for hash:mac sets, from Florent Fourcot. 11) Restrict ebtables to work with NFPROTO_BRIDGE targets only, this Florian. 12) Another bug reported by KASAN in the rbtree set backend, from Taehee Yoo. 13) Missing __IPS_MAX_BIT update doesn't include IPS_OFFLOAD_BIT. From Gao Feng. 14) Missing initialization of match/target in ebtables, from Florian Westphal. 15) Remove useless nft_dup.h file in include path, from C. Labbe. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12net: dsa: add error handling for pskb_trim_rcsumZhouyang Jia1-1/+2
When pskb_trim_rcsum fails, the lack of error-handling code may cause unexpected results. This patch adds error-handling code after calling pskb_trim_rcsum. Signed-off-by: Zhouyang Jia <jiazhouyang09@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-12ipv6: allow PMTU exceptions to local routesJulian Anastasov1-3/+0
IPVS setups with local client and remote tunnel server need to create exception for the local virtual IP. What we do is to change PMTU from 64KB (on "lo") to 1460 in the common case. Suggested-by: Martin KaFai Lau <kafai@fb.com> Fixes: 45e4fd26683c ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception") Fixes: 7343ff31ebf0 ("ipv6: Don't create clones of host routes.") Signed-off-by: Julian Anastasov <ja@ssi.bg> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds15-40/+60
Pull networking fixes from David Miller: 1) Fix several bpfilter/UMH bugs, in particular make the UMH build not depend upon X86 specific Kconfig symbols. From Alexei Starovoitov. 2) Fix handling of modified context pointer in bpf verifier, from Daniel Borkmann. 3) Kill regression in ifdown/ifup sequences for hv_netvsc driver, from Dexuan Cui. 4) When the bonding primary member name changes, we have to re-evaluate the bond->force_primary setting, from Xiangning Yu. 5) Eliminate possible padding beyone end of SKB in cdc_ncm driver, from Bjørn Mork. 6) RX queue length reported for UDP sockets in procfs and socket diag are inaccurate, from Paolo Abeni. 7) Fix br_fdb_find_port() locking, from Petr Machata. 8) Limit sk_rcvlowat values properly in TCP, from Soheil Hassas Yeganeh. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (23 commits) tcp: limit sk_rcvlowat by the maximum receive buffer net: phy: dp83822: use BMCR_ANENABLE instead of BMSR_ANEGCAPABLE for DP83620 socket: close race condition between sock_close() and sockfs_setattr() net: bridge: Fix locking in br_fdb_find_port() udp: fix rx queue len reported by diag and proc interface cdc_ncm: avoid padding beyond end of skb net/sched: act_simple: fix parsing of TCA_DEF_DATA net: fddi: fix a possible null-ptr-deref net: aquantia: fix unsigned numvecs comparison with less than zero net: stmmac: fix build failure due to missing COMMON_CLK dependency bpfilter: fix race in pipe access bpf, xdp: fix crash in xdp_umem_unaccount_pages xsk: Fix umem fill/completion queue mmap on 32-bit tools/bpf: fix selftest get_cgroup_id_user bpfilter: fix OUTPUT_FORMAT umh: fix race condition net: mscc: ocelot: Fix uninitialized error in ocelot_netdevice_event() bonding: re-evaluate force_primary when the primary slave name changes ip_tunnel: Fix name string concatenate in __ip_tunnel_create() hv_netvsc: Fix a network regression after ifdown/ifup ...
2018-06-11tcp: limit sk_rcvlowat by the maximum receive bufferSoheil Hassas Yeganeh1-5/+7
The user-provided value to setsockopt(SO_RCVLOWAT) can be larger than the maximum possible receive buffer. Such values mute POLLIN signals on the socket which can stall progress on the socket. Limit the user-provided value to half of the maximum receive buffer, i.e., half of sk_rcvbuf when the receive buffer size is set by the user, or otherwise half of sysctl_tcp_rmem[2]. Fixes: d1361840f8c5 ("tcp: fix SO_RCVLOWAT and RCVBUF autotuning") Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-10socket: close race condition between sock_close() and sockfs_setattr()Cong Wang1-3/+15
fchownat() doesn't even hold refcnt of fd until it figures out fd is really needed (otherwise is ignored) and releases it after it resolves the path. This means sock_close() could race with sockfs_setattr(), which leads to a NULL pointer dereference since typically we set sock->sk to NULL in ->release(). As pointed out by Al, this is unique to sockfs. So we can fix this in socket layer by acquiring inode_lock in sock_close() and checking against NULL in sockfs_setattr(). sock_release() is called in many places, only the sock_close() path matters here. And fortunately, this should not affect normal sock_close() as it is only called when the last fd refcnt is gone. It only affects sock_close() with a parallel sockfs_setattr() in progress, which is not common. Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.") Reported-by: shankarapailoor <shankarapailoor@gmail.com> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-09net: bridge: Fix locking in br_fdb_find_port()Petr Machata1-1/+3
Callers of br_fdb_find() need to hold the hash lock, which br_fdb_find_port() doesn't do. However, since br_fdb_find_port() is not doing any actual FDB manipulation, the hash lock is not really needed at all. So convert to br_fdb_find_rcu(), surrounded by rcu_read_lock() / _unlock() pair. The device pointer copied from inside the FDB entry is then kept alive by the RTNL lock, which br_fdb_find_port() asserts. Fixes: 4d4fd36126d6 ("net: bridge: Publish bridge accessor functions") Signed-off-by: Petr Machata <petrm@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-09udp: fix rx queue len reported by diag and proc interfacePaolo Abeni4-6/+7
After commit 6b229cf77d68 ("udp: add batching to udp_rmem_release()") the sk_rmem_alloc field does not measure exactly anymore the receive queue length, because we batch the rmem release. The issue is really apparent only after commit 0d4a6608f68c ("udp: do rmem bulk free even if the rx sk queue is empty"): the user space can easily check for an empty socket with not-0 queue length reported by the 'ss' tool or the procfs interface. We need to use a custom UDP helper to report the correct queue length, taking into account the forward allocation deficit. Reported-by: trevor.francis@46labs.com Fixes: 6b229cf77d68 ("UDP: add batching to udp_rmem_release()") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-09net/sched: act_simple: fix parsing of TCA_DEF_DATADavide Caratti1-9/+6
use nla_strlcpy() to avoid copying data beyond the length of TCA_DEF_DATA netlink attribute, in case it is less than SIMP_MAX_DATA and it does not end with '\0' character. v2: fix errors in the commit message, thanks Hangbin Liu Fixes: fa1b1cff3d06 ("net_cls_act: Make act_simple use of netlink policy.") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-08netfilter: x_tables: initialise match/target check parameter structFlorian Westphal3-0/+4
syzbot reports following splat: BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162 ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162 xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506 ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline] ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline] The uninitialised access is xt_mtchk_param->nft_compat ... which should be set to 0. Fix it by zeroing the struct beforehand, same for tgchk. ip(6)tables targetinfo uses c99-style initialiser, so no change needed there. Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com Fixes: 55917a21d0cc0 ("netfilter: x_tables: add context to know if extension runs from nft_compat") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2018-06-08net/9p/trans_xen.c: don't inclide rwlock.h directlySebastian Andrzej Siewior1-1/+0
rwlock.h should not be included directly. Instead linux/splinlock.h should be included. One thing it does is to break the RT build. Link: http://lkml.kernel.org/r/20180504100319.11880-1-bigeasy@linutronix.de Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-08net/9p: detect invalid options as much as possibleChengguang Xu1-8/+5
Currently when detecting invalid options in option parsing, some options(e.g. msize) just set errno and allow to continuously validate other options so that it can detect invalid options as much as possible and give proper error messages together. This patch applies same rule to option 'trans' and 'version' when detecting -EINVAL. Link: http://lkml.kernel.org/r/1525340676-34072-1-git-send-email-cgxu519@gmx.com Signed-off-by: Chengguang Xu <cgxu519@gmx.com> Reviewed-by: Andrew Morton <akpm@linux-foundation.org> Cc: Eric Van Hensbergen <ericvh@gmail.com> Cc: Ron Minnich <rminnich@sandia.gov> Cc: Latchesar Ionkov <lucho@ionkov.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-08bpfilter: fix race in pipe accessAlexei Starovoitov1-3/+7
syzbot reported the following crash [ 338.293946] bpfilter: read fail -512 [ 338.304515] kasan: GPF could be caused by NULL-ptr deref or user memory access [ 338.311863] general protection fault: 0000 [#1] SMP KASAN [ 338.344360] RIP: 0010:__vfs_write+0x4a6/0x960 [ 338.426363] Call Trace: [ 338.456967] __kernel_write+0x10c/0x380 [ 338.460928] __bpfilter_process_sockopt+0x1d8/0x35b [ 338.487103] bpfilter_mbox_request+0x4d/0xb0 [ 338.491492] bpfilter_ip_get_sockopt+0x6b/0x90 This can happen when multiple cpus trying to talk to user mode process via bpfilter_mbox_request(). One cpu grabs the mutex while another goes to sleep on the same mutex. Then former cpu sees that umh pipe is down and shuts down the pipes. Later cpu finally acquires the mutex and crashes on freed pipe. Fix the race by using info.pid as an indicator that umh and pipes are healthy and check it after acquiring the mutex. Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Reported-by: syzbot+7ade6c94abb2774c0fee@syzkaller.appspotmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-3/+5
Daniel Borkmann says: ==================== pull-request: bpf 2018-06-08 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix in the BPF verifier to reject modified ctx pointers on helper functions, from Daniel. 2) Fix in BPF kselftests for get_cgroup_id_user() helper to only record the cgroup id for a provided pid in order to reduce test failures from processes interferring with the test, from Yonghong. 3) Fix a crash in AF_XDP's mem accounting when the process owning the sock has CAP_IPC_LOCK capabilities set, from Daniel. 4) Fix an issue for AF_XDP on 32 bit machines where XDP_UMEM_PGOFF_*_RING defines need ULL suffixes and use loff_t type as they are otherwise truncated, from Geert. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-08bpf, xdp: fix crash in xdp_umem_unaccount_pagesDaniel Borkmann1-2/+4
syzkaller was able to trigger the following panic for AF_XDP: BUG: KASAN: null-ptr-deref in atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline] BUG: KASAN: null-ptr-deref in atomic_long_sub include/asm-generic/atomic-long.h:199 [inline] BUG: KASAN: null-ptr-deref in xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135 Write of size 8 at addr 0000000000000060 by task syz-executor246/4527 CPU: 1 PID: 4527 Comm: syz-executor246 Not tainted 4.17.0+ #89 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1b9/0x294 lib/dump_stack.c:113 kasan_report_error mm/kasan/report.c:352 [inline] kasan_report.cold.7+0x6d/0x2fe mm/kasan/report.c:412 check_memory_region_inline mm/kasan/kasan.c:260 [inline] check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267 kasan_check_write+0x14/0x20 mm/kasan/kasan.c:278 atomic64_sub include/asm-generic/atomic-instrumented.h:144 [inline] atomic_long_sub include/asm-generic/atomic-long.h:199 [inline] xdp_umem_unaccount_pages.isra.4+0x3d/0x80 net/xdp/xdp_umem.c:135 xdp_umem_reg net/xdp/xdp_umem.c:334 [inline] xdp_umem_create+0xd6c/0x10f0 net/xdp/xdp_umem.c:349 xsk_setsockopt+0x443/0x550 net/xdp/xsk.c:531 __sys_setsockopt+0x1bd/0x390 net/socket.c:1935 __do_sys_setsockopt net/socket.c:1946 [inline] __se_sys_setsockopt net/socket.c:1943 [inline] __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1943 do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x49/0xbe In xdp_umem_reg() the call to xdp_umem_account_pages() passed with CAP_IPC_LOCK where we didn't need to end up charging rlimit on memlock for the current user and therefore umem->user continues to be NULL. Later on through fault injection syzkaller triggered a failure in either umem->pgs or umem->pages allocation such that we bail out and undo accounting in xdp_umem_unaccount_pages() where we eventually hit the panic since it tries to deref the umem->user. The code is pretty close to mm_account_pinned_pages() and mm_unaccount_pinned_pages() pair and potentially could reuse it even in a later cleanup, and it appears that the initial commit c0c77d8fb787 ("xsk: add user memory registration support sockopt") got this right while later follow-up introduced the bug via a49049ea2576 ("xsk: simplified umem setup"). Fixes: a49049ea2576 ("xsk: simplified umem setup") Reported-by: syzbot+979217770b09ebf5c407@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-06-08xsk: Fix umem fill/completion queue mmap on 32-bitGeert Uytterhoeven1-1/+1
With gcc-4.1.2 on 32-bit: net/xdp/xsk.c:663: warning: integer constant is too large for ‘long’ type net/xdp/xsk.c:665: warning: integer constant is too large for ‘long’ type Add the missing "ULL" suffixes to the large XDP_UMEM_PGOFF_*_RING values to fix this. net/xdp/xsk.c:663: warning: comparison is always false due to limited range of data type net/xdp/xsk.c:665: warning: comparison is always false due to limited range of data type "unsigned long" is 32-bit on 32-bit systems, hence the offset is truncated, and can never be equal to any of the XDP_UMEM_PGOFF_*_RING values. Use loff_t (and the required cast) to fix this. Fixes: 423f38329d267969 ("xsk: add umem fill queue support and mmap") Fixes: fe2308328cd2f26e ("xsk: add umem completion queue support and mmap") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-06-07bpfilter: fix OUTPUT_FORMATAlexei Starovoitov1-1/+1
CONFIG_OUTPUT_FORMAT is x86 only macro. Used objdump to extract elf file format. Fixes: d2ba09c17a06 ("net: add skeleton of bpfilter kernel module") Reported-by: David S. Miller <davem@davemloft.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-07ip_tunnel: Fix name string concatenate in __ip_tunnel_create()Sultan Alsawaf1-2/+2
By passing a limit of 2 bytes to strncat, strncat is limited to writing fewer bytes than what it's supposed to append to the name here. Since the bounds are checked on the line above this, just remove the string bounds checks entirely since they're unneeded. Signed-off-by: Sultan Alsawaf <sultanxda@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-07net: in virtio_net_hdr only add VLAN_HLEN to csum_start if payload holds vlanWillem de Bruijn1-2/+2
Tun, tap, virtio, packet and uml vector all use struct virtio_net_hdr to communicate packet metadata to userspace. For skbuffs with vlan, the first two return the packet as it may have existed on the wire, inserting the VLAN tag in the user buffer. Then virtio_net_hdr.csum_start needs to be adjusted by VLAN_HLEN bytes. Commit f09e2249c4f5 ("macvtap: restore vlan header on user read") added this feature to macvtap. Commit 3ce9b20f1971 ("macvtap: Fix csum_start when VLAN tags are present") then fixed up csum_start. Virtio, packet and uml do not insert the vlan header in the user buffer. When introducing virtio_net_hdr_from_skb to deduplicate filling in the virtio_net_hdr, the variant from macvtap which adds VLAN_HLEN was applied uniformly, breaking csum offset for packets with vlan on virtio and packet. Make insertion of VLAN_HLEN optional. Convert the callers to pass it when needed. Fixes: e858fae2b0b8f4 ("virtio_net: use common code for virtio_net_hdr and skb GSO conversion") Fixes: 1276f24eeef2 ("packet: use common code for virtio_net_hdr and skb GSO conversion") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-07netfilter: nf_tables: add NFT_LOGLEVEL_* enumeration and use itPablo Neira Ayuso1-5/+5
This is internal, not exposed through uapi, and although it maps with userspace LOG_*, with the introduction of LOGLEVEL_AUDIT we are incurring in namespace pollution. This patch adds the NFT_LOGLEVEL_ enumeration and use it from nft_log. Fixes: 1a893b44de45 ("netfilter: nf_tables: Add audit support to log statement") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-06-07Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds332-8128/+20336
Pull networking updates from David Miller: 1) Add Maglev hashing scheduler to IPVS, from Inju Song. 2) Lots of new TC subsystem tests from Roman Mashak. 3) Add TCP zero copy receive and fix delayed acks and autotuning with SO_RCVLOWAT, from Eric Dumazet. 4) Add XDP_REDIRECT support to mlx5 driver, from Jesper Dangaard Brouer. 5) Add ttl inherit support to vxlan, from Hangbin Liu. 6) Properly separate ipv6 routes into their logically independant components. fib6_info for the routing table, and fib6_nh for sets of nexthops, which thus can be shared. From David Ahern. 7) Add bpf_xdp_adjust_tail helper, which can be used to generate ICMP messages from XDP programs. From Nikita V. Shirokov. 8) Lots of long overdue cleanups to the r8169 driver, from Heiner Kallweit. 9) Add BTF ("BPF Type Format"), from Martin KaFai Lau. 10) Add traffic condition monitoring to iwlwifi, from Luca Coelho. 11) Plumb extack down into fib_rules, from Roopa Prabhu. 12) Add Flower classifier offload support to igb, from Vinicius Costa Gomes. 13) Add UDP GSO support, from Willem de Bruijn. 14) Add documentation for eBPF helpers, from Quentin Monnet. 15) Add TLS tx offload to mlx5, from Ilya Lesokhin. 16) Allow applications to be given the number of bytes available to read on a socket via a control message returned from recvmsg(), from Soheil Hassas Yeganeh. 17) Add x86_32 eBPF JIT compiler, from Wang YanQing. 18) Add AF_XDP sockets, with zerocopy support infrastructure as well. From Björn Töpel. 19) Remove indirect load support from all of the BPF JITs and handle these operations in the verifier by translating them into native BPF instead. From Daniel Borkmann. 20) Add GRO support to ipv6 gre tunnels, from Eran Ben Elisha. 21) Allow XDP programs to do lookups in the main kernel routing tables for forwarding. From David Ahern. 22) Allow drivers to store hardware state into an ELF section of kernel dump vmcore files, and use it in cxgb4. From Rahul Lakkireddy. 23) Various RACK and loss detection improvements in TCP, from Yuchung Cheng. 24) Add TCP SACK compression, from Eric Dumazet. 25) Add User Mode Helper support and basic bpfilter infrastructure, from Alexei Starovoitov. 26) Support ports and protocol values in RTM_GETROUTE, from Roopa Prabhu. 27) Support bulking in ->ndo_xdp_xmit() API, from Jesper Dangaard Brouer. 28) Add lots of forwarding selftests, from Petr Machata. 29) Add generic network device failover driver, from Sridhar Samudrala. * ra.kernel.org:/pub/scm/linux/kernel/git/davem/net-next: (1959 commits) strparser: Add __strp_unpause and use it in ktls. rxrpc: Fix terminal retransmission connection ID to include the channel net: hns3: Optimize PF CMDQ interrupt switching process net: hns3: Fix for VF mailbox receiving unknown message net: hns3: Fix for VF mailbox cannot receiving PF response bnx2x: use the right constant Revert "net: sched: cls: Fix offloading when ingress dev is vxlan" net: dsa: b53: Fix for brcm tag issue in Cygnus SoC enic: fix UDP rss bits netdev-FAQ: clarify DaveM's position for stable backports rtnetlink: validate attributes in do_setlink() mlxsw: Add extack messages for port_{un, }split failures netdevsim: Add extack error message for devlink reload devlink: Add extack to reload and port_{un, }split operations net: metrics: add proper netlink validation ipmr: fix error path when ipmr_new_table fails ip6mr: only set ip6mr_table from setsockopt when ip6mr_new_table succeeds net: hns3: remove unused hclgevf_cfg_func_mta_filter netfilter: provide udp*_lib_lookup for nf_tproxy qed*: Utilize FW 8.37.2.0 ...