Age | Commit message | Author | Files | Lines |
|
8021q is one of the stackable net devices which pass the hardware
timestamping ops to the real device through ndo_eth_ioctl(). This
prevents converting any device driver to the new hwtimestamping API
without regressions.
Remove that limitation in the vlan driver by using the newly introduced
helpers for timestamping through lower devices, that handle both the new
and the old driver API.
Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The stackable net devices with hwtstamping support (vlan, macvlan,
bonding) only pass the hwtstamping ops to the lower (real) device.
These drivers are the first that need to be converted to the new
timestamping API, because if they aren't prepared to handle that,
then no real device driver can be converted to the new API either.
After studying what vlan_dev_ioctl(), macvlan_eth_ioctl() and
bond_eth_ioctl() have in common, here we propose two generic
implementations of ndo_hwtstamp_get() and ndo_hwtstamp_set() which
can be called by those 3 drivers, with "dev" being their lower device.
These helpers cover both cases, when the lower driver is converted to
the new API or unconverted.
We need some hacks in case of an unconverted driver, namely to stuff
some pointers in struct kernel_hwtstamp_config which shouldn't have
been there (since the new API isn't supposed to need it). These will
be removed once all drivers have been converted to the new API.
Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
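
The fallback described in the commit above can be modeled in user space roughly as follows. This is only a sketch in Python, not kernel code: `ConvertedDriver`, `LegacyDriver`, `generic_hwtstamp_get_lower()` and the dictionary-based config are hypothetical stand-ins chosen for illustration, whereas the real helpers operate on struct kernel_hwtstamp_config.

```python
class ConvertedDriver:
    """Hypothetical lower device that implements the new-style operation."""

    def hwtstamp_get(self):
        return {"tx_type": 1, "rx_filter": 1}


class LegacyDriver:
    """Hypothetical lower device that only has the ioctl-style entry point."""

    def eth_ioctl(self, cmd, ifr):
        if cmd != "SIOCGHWTSTAMP":
            raise OSError("unsupported command")
        # The legacy path fills a caller-provided request buffer in place.
        ifr["data"] = {"tx_type": 0, "rx_filter": 0}


def generic_hwtstamp_get_lower(dev):
    """Query the lower device, whichever of the two APIs it implements."""
    if hasattr(dev, "hwtstamp_get"):
        # Converted driver: call the new operation directly.
        return dev.hwtstamp_get()
    # Unconverted driver: build a request and go through the old entry point.
    ifr = {"data": None}
    dev.eth_ioctl("SIOCGHWTSTAMP", ifr)
    return ifr["data"]
```

A stacked driver in this model would call `generic_hwtstamp_get_lower(lower_dev)` without caring which kind of lower device it sits on.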
|
|
Current hardware timestamping API for NICs requires implementing
.ndo_eth_ioctl() for SIOCGHWTSTAMP and SIOCSHWTSTAMP.
That API has some boilerplate such as request parameter translation
between user and kernel address spaces, handling possible translation
failures correctly, etc. Since it is the same all across the board, it
would be desirable to handle it through generic code.
Here we introduce .ndo_hwtstamp_get() and .ndo_hwtstamp_set(), which
implement that boilerplate and allow drivers to just act upon requests.
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Maxim Georgiev <glipus@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Link: https://lore.kernel.org/r/20230801142824.1772134-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
packet_alloc_skb() is currently calling sock_alloc_send_pskb()
forcing order-0 page allocations.
Switch to PAGE_ALLOC_COSTLY_ORDER, to increase max size by 8x.
Also add logic to increase the linear part if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801205254.400094-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Refactor alloc_skb_with_frags() to allow bigger packet allocations.
Instead of assuming that only order-0 allocations will be attempted,
use the caller supplied max order.
v2: try harder to use high-order pages, per Willem feedback.
Link: https://lore.kernel.org/netdev/CANn89iJQfmc_KeUr3TeXvsLQwo3ZymyoCr7Y6AnHrkWSuz0yAg@mail.gmail.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tahsin Erdogan <trdgn@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20230801205254.400094-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
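
To illustrate what a caller-supplied max order buys, here is a small user-space sketch in Python. It is not the kernel implementation: `plan_frag_orders()` is a hypothetical name, a 4 KiB page size is assumed, and allocation failures (where the real code would retry with a smaller order) are not modeled.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def plan_frag_orders(data_len, max_order):
    """Return the page orders used to cover data_len bytes, largest first."""
    npages = (data_len + PAGE_SIZE - 1) // PAGE_SIZE
    orders = []
    order = max_order
    while npages > 0:
        # Step down until a chunk of 2**order pages fits what remains.
        while (1 << order) > npages:
            order -= 1
        orders.append(order)
        npages -= 1 << order
    return orders
```

With `max_order` set to 0 every chunk is a single page. With `max_order` set to 3 a single chunk can cover up to 8 pages, which matches the 8x figure quoted for PAGE_ALLOC_COSTLY_ORDER.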
|
|
Since commit b1edeb102397 ("netlabel: Replace protocol/NetLabel linking with refrerence counts")
this declaration is unused and can be removed.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20230801143453.24452-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 642c2c95585d ("ila: xlat changes") removed the ila_xlat_outgoing()
and ila_xlat_incoming() functions, which made this file unnecessary.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230801143129.40652-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch enables offload for TC classifier
flower rules which match against the SPI field.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for tc flower rules to classify ESP/AH
packets by matching on the SPI field.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for dissecting the IPsec SPI field (which is
32 bits in size) for ESP and AH packets.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Instead of allocating from 1 use proper xa_init flag,
to protect ourselves from IDs wrapping back to 0.
Fixes: 759ab1edb56c ("net: store netdevs in an xarray")
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Link: https://lore.kernel.org/all/20230728162350.2a6d4979@hermes.local/
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230731171159.988962-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently checksum is recalculated and dsa tag stripped even if we later
don't find the dev.
To improve the code, exit early if we don't find the dev and skip additional
operations on the skb since it will be freed anyway.
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/20230730074113.21889-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add extack to warn that delete was rejected because
the class is still in use
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The 'filter_cnt' counter is used to control a Qdisc class lifetime.
Each filter referencing this class by its id will eventually
increment/decrement this counter in their respective
'add/update/delete' routines.
As these operations are always serialized under rtnl lock, we don't
need an atomic type like 'refcount_t'.
Not using refcount_t also means that we lose the overflow/underflow
checks it already provides. Those checks are valuable for hunting down
bugs where the unsigned counter wraps around, because they let automated
tools like syzkaller scream in such situations.
Wrap the open coded increment/decrement into helper functions and
add overflow checks to the operations.
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
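
The idea of wrapping the counter updates in checked helpers can be sketched in user space as follows. This is a Python model, not the kernel code: the helper names `class_get()`, `class_put()` and `class_in_use()` are hypothetical, and raising an exception stands in for whatever warning the kernel helpers emit.

```python
U32_MAX = 0xFFFFFFFF  # the modeled counter is an unsigned 32-bit value


class QdiscClass:
    """Minimal stand-in for a class whose lifetime is tied to filter_cnt."""

    def __init__(self):
        self.filter_cnt = 0


def class_get(cl):
    """Take a filter reference, refusing to wrap past the maximum."""
    if cl.filter_cnt == U32_MAX:
        raise OverflowError("filter_cnt overflow")
    cl.filter_cnt += 1


def class_put(cl):
    """Drop a filter reference, refusing to wrap below zero."""
    if cl.filter_cnt == 0:
        raise OverflowError("filter_cnt underflow")
    cl.filter_cnt -= 1


def class_in_use(cl):
    """A class that is still referenced by filters must not be deleted."""
    return cl.filter_cnt > 0
```

Because all updates go through two helpers, an unbalanced put is reported at the point where it happens instead of silently turning the counter into a huge value.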
|
|
Commit f421436a591d ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
introduced these but never implemented them.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230729123456.36340-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These were never implemented since their introduction in
commit d021c344051a ("VSOCK: Introduce VM Sockets")
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20230729122036.32988-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit f9aab6f2ce57 ("net/smc: immediate freeing in smc_lgr_cleanup_early()")
left behind the smc_lgr_schedule_free_work_fast() declaration.
Also, since commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock")
smc_ib_modify_qp_reset() is no longer used.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/20230729121929.17180-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") started
applying strict rules to standard string functions.
It does not work well with conventional socket code around each protocol-
specific sockaddr_XXX struct, which is cast from sockaddr_storage and has
a bigger size than fortified functions expect. See these commits:
commit 06d4c8a80836 ("af_unix: Fix fortify_panic() in unix_bind_bsd().")
commit ecb4534b6a1c ("af_unix: Terminate sun_path when bind()ing pathname socket.")
commit a0ade8404c3b ("af_packet: Fix warning of fortified memcpy() in packet_getname().")
We must cast the protocol-specific address back to sockaddr_storage
to call such functions.
However, in the case of getsockopt(SO_PEERNAME), the rationale is a bit
unclear as the buffer is defined by char[128] which is the same size as
sockaddr_storage.
Let's use sockaddr_storage explicitly.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As the 32 bits of dissector->used_keys are exhausted,
increase the size to 64 bits.
This is the base change for the ESP/AH flow dissector patch.
Please find patch and discussions at
https://lore.kernel.org/netdev/ZMDNjD46BvZ5zp5I@corigine.com/T/#t
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw
Tested-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Chuck Lever says:
====================
In-kernel support for the TLS Alert protocol
IMO the kernel doesn't need user space (ie, tlshd) to handle the TLS
Alert protocol. Instead, a set of small helper functions can be used
to handle sending and receiving TLS Alerts for in-kernel TLS
consumers.
====================
Merged on top of a tag in case it's needed in the NFS tree.
Link: https://lore.kernel.org/r/169047923706.5241.1181144206068116926.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add observability for the new TLS Alert infrastructure.
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047947409.5241.14548832149596892717.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the helpers to parse the level and description fields in
incoming alerts. "Warning" alerts are discarded, and "fatal"
alerts mean the session is no longer valid.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047944747.5241.1974889594004407123.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
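
The handling rule in the commit above can be sketched in user space like this. The code is an illustrative Python model rather than the kernel helpers: `parse_alert()` and `session_still_valid()` are hypothetical names, while the two-byte alert layout and the level values 1 (warning) and 2 (fatal) come from the TLS specification.

```python
TLS_ALERT_LEVEL_WARNING = 1
TLS_ALERT_LEVEL_FATAL = 2


def parse_alert(payload):
    """Split a TLS Alert record payload into (level, description)."""
    if len(payload) != 2:
        raise ValueError("a TLS Alert payload is exactly two bytes")
    return payload[0], payload[1]


def session_still_valid(payload):
    """Warnings are discarded; anything else ends the session in this model."""
    level, _description = parse_alert(payload)
    return level == TLS_ALERT_LEVEL_WARNING
```

Treating every non-warning level as fatal is a choice made for this sketch; the commit message only states the behaviour for the "warning" and "fatal" levels.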
|
|
Kernel TLS consumers can replace common TLS Alert parsing code with
these helpers.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047942074.5241.13791647439480672048.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Before closing a TCP connection, the TLS protocol wants peers to
send session close Alert notifications. Add those in both the RPC
client and server.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047939404.5241.14392506226409865832.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This helper sends an alert only if a TLS session was established.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047936730.5241.618595693821012638.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kernel TLS consumers will need definitions of various parts of the
TLS protocol, but often do not need the function declarations and
other infrastructure provided in <net/tls.h>.
Break out existing standardized protocol elements into a separate
header, and make room for a few more elements in subsequent patches.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Link: https://lore.kernel.org/r/169047931374.5241.7713175865185969309.stgit@oracle-102.nfsv4bat.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
accept_ra_min_rtr_lft only considered the lifetime of the default route
and discarded entire RAs accordingly.
This change renames accept_ra_min_rtr_lft to accept_ra_min_lft, and
applies the value to individual RA sections; in particular, router
lifetime, PIO preferred lifetime, and RIO lifetime. If any of those
lifetimes are lower than the configured value, the specific RA section
is ignored.
In order for the sysctl to be useful to Android, it should really apply
to all lifetimes in the RA, since that is what determines the minimum
frequency at which RAs must be processed by the kernel. Android uses
hardware offloads to drop RAs for a fraction of the minimum of all
lifetimes present in the RA (some networks have very frequent RAs (5s)
with high lifetimes (2h)). Despite this, we have encountered networks
that set the router lifetime to 30s which results in very frequent CPU
wakeups. Instead of disabling IPv6 (and dropping IPv6 ethertype in the
WiFi firmware) entirely on such networks, it seems better to ignore the
misconfigured routers while still processing RAs from other IPv6 routers
on the same network (i.e. to support IoT applications).
The previous implementation dropped the entire RA based on router
lifetime. This turned out to be hard to expand to the other lifetimes
present in the RA in a consistent manner; dropping the entire RA based
on RIO/PIO lifetimes would essentially require parsing the whole thing
twice.
Fixes: 1671bcfd76fd ("net: add sysctl accept_ra_min_rtr_lft")
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230726230701.919212-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
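
The per-section filtering described above can be pictured with the following user-space sketch. It is a simplified Python model, not the ndisc code: the dictionary layout and the `filter_ra()` name are assumptions made for illustration, and any special treatment the kernel may give to particular lifetime values (such as zero) is deliberately left out.

```python
def filter_ra(ra, min_lft):
    """Return only the RA sections whose lifetime is at least min_lft."""
    kept = {"router_lifetime": None, "pios": [], "rios": []}
    # Default router information is dropped on its own, not the whole RA.
    if ra["router_lifetime"] >= min_lft:
        kept["router_lifetime"] = ra["router_lifetime"]
    # Each prefix and route information option is judged individually.
    kept["pios"] = [p for p in ra["pios"] if p["preferred_lft"] >= min_lft]
    kept["rios"] = [r for r in ra["rios"] if r["lifetime"] >= min_lft]
    return kept
```

An RA with a 30s router lifetime but long-lived prefixes then loses only its default-router section when the configured minimum is higher than 30s.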
|
|
Reap the benefits of easier iteration thanks to the xarray.
Convert just the genetlink ones, those are easier to test.
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Iterating over the netdev hash table for netlink dumps is hard.
Dumps are done in "chunks" so we need to save the position
after each chunk, so we know where to restart from. Because
netdevs are stored in a hash table we remember which bucket
we were in and how many devices we dumped.
Since we don't hold any locks across the "chunks" - devices may
come and go while we're dumping. If that happens we may miss
a device (if device is deleted from the bucket we were in).
We indicate to user space that this may have happened by setting
NLM_F_DUMP_INTR. User space is supposed to dump again (I think)
if it sees that. Somehow I doubt most user space gets this right..
To illustrate let's look at an example:
System state:
start: # [A, B, C]
del: B # [A, C]
with the hash table we may dump [A, B], missing C completely even
tho it existed both before and after the "del B".
Add an xarray and use it to allocate ifindexes. This way we
can iterate ifindexes in order, without the worry that we'll
skip one. We may still generate a dump of a state which "never
existed", for example for a set of values and sequence of ops:
System state:
start: # [A, B]
add: C # [A, C, B]
del: B # [A, C]
we may generate a dump of [A], if C got an index between A and B.
System has never been in such state. But I'm 90% sure that's perfectly
fine, important part is that we can't _miss_ devices which exist before
and after. User space which wants to mirror kernel's state subscribes
to notifications and does periodic dumps so it will know that C exists
from the notification about its creation or from the next dump
(next dump is _guaranteed_ to include C, if it doesn't get removed).
To avoid any perf regressions keep the hash table for now. Most
net namespaces have very few devices and microbenchmarking 1M lookups
on Skylake I get the following results (not counting loopback
to number of devs):
 #devs |  hash |   xa | delta
     2 |  18.3 | 20.1 | + 9.8%
    16 |  18.3 | 20.1 | + 9.5%
    64 |  18.3 | 26.3 | +43.8%
   128 |  20.4 | 26.3 | +28.6%
   256 |  20.0 | 26.4 | +32.1%
  1024 |  26.6 | 26.7 | + 0.2%
  8192 | 541.3 | 33.5 | -93.8%
No surprises since the hash table has 256 entries.
The microbenchmark scans indexes in order, if the pattern is more
random xa starts to win at 512 devices already. But that's a lot
of devices, in practice.
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230726185530.2247698-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
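
The first example from the commit message above (start with [A, B, C], delete B between chunks) can be replayed with a small user-space simulation. This is a Python sketch under simplifying assumptions: a single hash bucket is modeled as a list, the xarray as a dictionary keyed by ifindex, and both function names are hypothetical.

```python
def dump_hash_bucket(bucket, chunk, between_chunks):
    """Resume by counting how many entries of the bucket were already dumped."""
    dumped, skip = [], 0
    while True:
        part = bucket[skip:skip + chunk]
        if not part:
            return dumped
        dumped += part
        skip += len(part)
        between_chunks()  # no lock is held here, so devices may come and go


def dump_by_ifindex(devs, chunk, between_chunks):
    """Resume from the last ifindex dumped, walking indexes in order."""
    dumped, last = [], 0
    while True:
        part = sorted(i for i in devs if i > last)[:chunk]
        if not part:
            return dumped
        dumped += [devs[i] for i in part]
        last = part[-1]
        between_chunks()
```

With a chunk size of two and B removed after the first chunk, the position-based dump returns [A, B] and never reports C, while the index-based dump resumes after B's ifindex and still reaches C.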
|
|
Add extack info for IPv6 address add/delete, which would be useful for
users to understand the problem without having to read kernel code.
Suggested-by: Beniamino Galvani <bgalvani@redhat.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ETHTOOL_GRXFH correctly copies in the full struct ethtool_rxnfc when
FLOW_RSS is set; ETHTOOL_SRXFH needs a similar code path to handle the
FLOW_RSS case so that ethtool can set the flow hash for custom RSS
contexts (if supported by the driver).
The copy code from ETHTOOL_GRXFH has been pulled out into a helper so
that it can be called in both ETHTOOL_{G,S}RXFH code paths.
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it was merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Acked-by: Alex Elder <elder@linaro.org>
Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230727014944.3972546-1-robh@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter updates for net-next
1. silence a harmless warning for CONFIG_NF_CONNTRACK_PROCFS=n builds,
from Zhu Wang.
2, 3:
Allow NLA_POLICY_MASK to be used with BE16/BE32 types, and replace a few
manual checks with nla_policy based one in nf_tables, from myself.
4: cleanup in ctnetlink to validate while parsing rather than
using two steps, from Lin Ma.
5: refactor boyer-moore textsearch by moving a small chunk to
a helper function, from Jeremy Sowden.
* tag 'nf-next-23-07-27' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
lib/ts_bm: add helper to reduce indentation and improve readability
netfilter: conntrack: validate cta_ip via parsing
netfilter: nf_tables: use NLA_POLICY_MASK to test for valid flag options
netlink: allow be16 and be32 types in all uint policy checks
nf_conntrack: fix -Wunused-const-variable=
====================
Link: https://lore.kernel.org/r/20230727133604.8275-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Hannes Reinecke says:
====================
net/tls: fixes for NVMe-over-TLS
here are some small fixes to get NVMe-over-TLS up and running.
The first set are just minor modifications to have MSG_EOR handled
for TLS, but the second set implements the ->read_sock() callback
for tls_sw.
The ->read_sock() callbacks return -EIO when encountering any TLS
Alert message, but as that's the default behaviour anyway I guess
we can get away with it.
====================
Applied on top of the tag in case Sagi gets convinced to pull it.
Link: https://lore.kernel.org/r/20230726191556.41714-1-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Implement ->read_sock() function for use with nvme-tcp.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Cc: Boris Pismenny <boris.pismenny@gmail.com>
Link: https://lore.kernel.org/r/20230726191556.41714-7-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Split tls_rx_reader_{lock,unlock} into an 'acquire/release' and
the actual locking part.
With that we can use the tls_rx_reader_lock in situations where
the socket is already locked.
Suggested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-6-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TLS resets the protocol operations, so the read_sock() callback might
be changed, too.
In this case using sock->ops->read_sock() in tls_strp_read_copyin() will
enter an infinite recursion if the read_sock() callback is calling
tls_rx_rec_wait() which will call into sock->ops->read_sock() via
tls_strp_read_copyin().
But as tls_strp_read_copyin() is supposed to produce data from the
consumed socket and that socket is always a TCP socket we can call
tcp_read_sock() directly without having to deal with callbacks.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-5-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tls_push_data() already handles MSG_MORE, but bails out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of MSG_MORE
this patch adds handling MSG_EOR by treating it as the
absence of MSG_MORE.
Consequently we should return an error when both are set.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-3-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tls_sw_sendmsg() already handles MSG_MORE, but bails
out on MSG_EOR.
Seeing that MSG_EOR is basically the opposite of
MSG_MORE this patch adds handling MSG_EOR by treating
it as the negation of MSG_MORE.
Error out if MSG_EOR is specified together with MSG_MORE.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20230726191556.41714-2-hare@suse.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
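
The flag rule from the two MSG_EOR commits above can be written down as a short user-space sketch. This is a Python model, not the TLS code: `more_data_expected()` is a hypothetical helper, the flag values are the Linux ones, and the choice of EINVAL as the error is an assumption since the commit messages only say that an error is returned.

```python
import errno

MSG_EOR = 0x80     # Linux value
MSG_MORE = 0x8000  # Linux value


def more_data_expected(flags):
    """Return True if the sender signalled that more data will follow."""
    if flags & MSG_MORE and flags & MSG_EOR:
        # The two flags contradict each other, so reject the request.
        raise OSError(errno.EINVAL, "MSG_EOR and MSG_MORE are mutually exclusive")
    # MSG_EOR is treated exactly like the absence of MSG_MORE.
    return bool(flags & MSG_MORE)
```

In this model MSG_EOR needs no code path of its own: once the contradictory combination is rejected, it behaves the same as passing no flag at all.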
|
|
This is never used, so remove it.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230726143239.9904-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit 19e3a9c90c53 ("net: bridge: convert multicast to generic rhashtable")
this is not used, so remove it.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://lore.kernel.org/r/20230726143141.11704-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Removes superfluous (and misplaced) comment from ndisc_router_discovery.
Signed-off-by: Patrick Rohr <prohr@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230726184742.342825-1-prohr@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This adds support for the MSG_PEEK flag for SOCK_SEQPACKET sockets.
The difference from SOCK_STREAM is that this callback returns either the
length of the message or an error.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This reworks the current implementation of the MSG_PEEK logic:
1) Replaces 'skb_queue_walk_safe()' with 'skb_queue_walk()'. There is
no need for the former, as no skb is removed in the loop.
2) Removes the nested while loop. The MSG_PEEK logic can be implemented
without it: just iterate over the skbs without removing them and copy
data from each until the destination buffer is full.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Reviewed-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
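
The single-loop peek described in the commit above can be sketched in user space as follows. This is a Python model, not the virtio/vsock code: the receive queue is represented as a list of byte strings and `peek()` is a hypothetical name.

```python
def peek(queue, buf_len):
    """Copy up to buf_len bytes from the queued buffers without consuming them."""
    out = bytearray()
    for skb in queue:  # plain walk: nothing is unlinked from the queue
        if len(out) == buf_len:
            break  # destination buffer is full
        out += skb[:buf_len - len(out)]
    return bytes(out)
```

Because nothing is removed while iterating, a plain forward walk is enough, which mirrors the reasoning for dropping the "safe" iterator variant.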
|
|
In the current ctnetlink_parse_tuple_ip() function, nested parsing and
validation are split into two steps, which can be simplified because
the nla_parse_nested_deprecated() function supports validation on the
fly. Both paths finally reach the same place, __nla_validate_parse(),
with the same validate flag.
nla_parse_nested_deprecated
  __nla_parse(.., NL_VALIDATE_LIBERAL, ..)
    __nla_validate_parse
nla_validate_nested_deprecated
  __nla_validate_nested(.., NL_VALIDATE_LIBERAL, ..)
    __nla_validate
      __nla_validate_parse
This commit removes the call to nla_validate_nested_deprecated() and passes
cta_ip_nla_policy when parsing.
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
|
|
nf_tables relies on manual tests of netlink attributes coming from userspace
even in cases where this could be handled via netlink policy.
Convert a bunch of 'flag' attributes to use NLA_POLICY_MASK checks.
Signed-off-by: Florian Westphal <fw@strlen.de>
|