summaryrefslogtreecommitdiff
path: root/net/mptcp/pm.c
AgeCommit message (Collapse)AuthorFilesLines
2024-02-05mptcp: annotate lockless access for tokenPaolo Abeni1-1/+1
The token field is manipulated under the msk socket lock and accessed lockless in a few spots, add proper ONCE annotation Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-27mptcp: drop useless ssk in pm_subflow_check_nextGeliang Tang1-1/+1
The code using 'ssk' parameter of mptcp_pm_subflow_check_next() has been dropped in commit "95d686517884 (mptcp: fix subflow accounting on close)". So drop this useless parameter ssk. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20231025-send-net-next-20231025-v1-4-db8f25f798eb@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-23mptcp: drop last_snd and MPTCP_RESET_SCHEDULERGeliang Tang1-8/+1
Since the burst check conditions have moved out of the function mptcp_subflow_get_send(), it makes all msk->last_snd useless. This patch drops them as well as the macro MPTCP_RESET_SCHEDULER. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20230821-upstream-net-next-20230818-v1-2-0c860fb256a8@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: unify pm set_flags interfacesGeliang Tang1-0/+9
This patch unifies the three PM set_flags() interfaces: mptcp_pm_nl_set_flags() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_set_flags() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_set_flags() in mptcp/pm.c based on whether token is NULL or not. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: unify pm get_flags_and_ifindex_by_idGeliang Tang1-0/+14
This patch unifies the three PM get_flags_and_ifindex_by_id() interfaces: mptcp_pm_nl_get_flags_and_ifindex_by_id() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_get_flags_and_ifindex_by_id() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_get_flags_and_ifindex_by_id() in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-10mptcp: unify pm get_local_id interfacesGeliang Tang1-1/+17
This patch unifies the three PM get_local_id() interfaces: mptcp_pm_nl_get_local_id() in mptcp/pm_netlink.c for the in-kernel PM and mptcp_userspace_pm_get_local_id() in mptcp/pm_userspace.c for the userspace PM. They'll be switched in the common PM infterface mptcp_pm_get_local_id() in mptcp/pm.c based on whether mptcp_pm_is_userspace() or not. Also put together the declarations of these three functions in protocol.h. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-4/+19
Cross-merge networking fixes after downstream PR. Conflicts: net/sched/sch_taprio.c d636fc5dd692 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping") dced11ef84fb ("net/sched: taprio: don't overwrite "sch" variable in taprio_dump_class_stats()") net/ipv4/sysctl_net_ipv4.c e209fee4118f ("net/ipv4: ping_group_range: allow GID from 2147483648 to 4294967294") ccce324dabfe ("tcp: make the first N SYN RTO backoffs linear") https://lore.kernel.org/all/20230605100816.08d41a7b@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-05mptcp: update userspace pm infosGeliang Tang1-4/+19
Increase pm subflows counter on both server side and client side when userspace pm creates a new subflow, and decrease the counter when it closes a subflow. Increase add_addr_signaled counter in mptcp_nl_cmd_announce() when the address is announced by userspace PM. This modification is similar to how the in-kernel PM is updating the counter: when additional subflows are created/removed. Fixes: 9ab4807c84a4 ("mptcp: netlink: Add MPTCP_PM_CMD_ANNOUNCE") Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/329 Cc: stable@vger.kernel.org Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-19mptcp: introduces more address related mibsPaolo Abeni1-2/+4
Currently we don't track explicitly a few events related to address management suboption handling; this patch adds new mibs for ADD_ADDR and RM_ADDR options tx and for missed tx events due to internal storage exhaustion. The self-tests must be updated to properly handle different mibs with the same/shared prefix. Additionally removes a couple of warning tracking the loss event. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/378 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-17mptcp: drop unneeded argumentPaolo Abeni1-2/+2
After commit 3a236aef280e ("mptcp: refactor passive socket initialization"), every mptcp_pm_fully_established() call is always invoked with a GFP_ATOMIC argument. We can then drop it. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-14mptcp: netlink: respect v4/v6-only socketsMatthieu Baerts1-0/+25
If an MPTCP socket has been created with AF_INET6 and the IPV6_V6ONLY option has been set, the userspace PM would allow creating subflows using IPv4 addresses, e.g. mapped in v6. The kernel side of userspace PM will also accept creating subflows with local and remote addresses having different families. Depending on the subflow socket's family, different behaviours are expected: - If AF_INET is forced with a v6 address, the kernel will take the last byte of the IP and try to connect to that: a new subflow is created but to a non expected address. - If AF_INET6 is forced with a v4 address, the kernel will try to connect to a v4 address (v4-mapped-v6). A -EBADF error from the connect() part is then expected. It is then required to check the given families can be accepted. This is done by using a new helper for addresses family matching, taking care of IPv4 vs IPv4-mapped-IPv6 addresses. This helper will be re-used later by the in-kernel path-manager to use mixed IPv4 and IPv6 addresses. While at it, a clear error message is now reported if there are some conflicts with the families that have been passed by the userspace. Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29mptcp: invoke MP_FAIL response when neededGeliang Tang1-5/+4
mptcp_mp_fail_no_response shouldn't be invoked on each worker run, it should be invoked only when MP_FAIL response timeout occurs. This patch refactors the MP_FAIL response logic. It leverages the fact that only the MPC/first subflow can gracefully fail to avoid unneeded subflows traversal: the failing subflow can be only msk->first. A new 'fail_tout' field is added to the subflow context to record the MP_FAIL response timeout and use such field to reliably share the timeout timer between the MP_FAIL event and the MPTCP socket close timeout. Finally, a new ack is generated to send out MP_FAIL notification as soon as we hit the relevant condition, instead of waiting a possibly unbound time for the next data packet. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/281 Fixes: d9fb797046c5 ("mptcp: Do not traverse the subflow connection list without lock") Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29mptcp: fix error mibs accountingPaolo Abeni1-1/+0
The current accounting for MP_FAIL and FASTCLOSE is not very accurate: both can be increased even when the related option is not really sent. Move the accounting into the correct place. Fixes: eb7f33654dc1 ("mptcp: add the mibs for MP_FAIL") Fixes: 1e75629cb964 ("mptcp: add the mibs for MP_FASTCLOSE") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-20mptcp: Check for orphaned subflow before handling MP_FAIL timerMat Martineau1-5/+2
MP_FAIL timeout (waiting for a peer to respond to an MP_FAIL with another MP_FAIL) is implemented using the MPTCP socket's sk_timer. That timer is also used at MPTCP socket close, so it's important to not have the two timer users interfere with each other. At MPTCP socket close, all subflows are orphaned before sk_timer is manipulated. By checking the SOCK_DEAD flag on the subflows, each subflow can determine if the timer is safe to alter without acquiring any MPTCP-level lock. This replaces code that was using the mptcp_data_lock and MPTCP-level socket state checks that did not correctly protect the timer. Fixes: 49fa1919d6bc ("mptcp: reset subflow when MP_FAIL doesn't respond") Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-20mptcp: stop using the mptcp_has_another_subflow() helperPaolo Abeni1-1/+1
The mentioned helper requires the msk socket lock, and the current callers don't own it nor can't acquire it, so the access is racy. All the current callers are really checking for infinite mapping fallback, and the latter condition is explicitly tracked by the relevant msk variable: we can safely remove the caller usage - and the caller itself. The issue is present since MP_FAIL implementation, but the fix only applies since the infinite fallback support, ence the somewhat unexpected fixes tag. Fixes: 0530020a7c8f ("mptcp: track and update contiguous data status") Acked-and-tested-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-3/+2
drivers/net/ethernet/mellanox/mlx5/core/main.c b33886971dbc ("net/mlx5: Initialize flow steering during driver probe") 40379a0084c2 ("net/mlx5_fpga: Drop INNOVA TLS support") f2b41b32cde8 ("net/mlx5: Remove ipsec_ops function table") https://lore.kernel.org/all/20220519040345.6yrjromcdistu7vh@sx1/ 16d42d313350 ("net/mlx5: Drain fw_reset when removing device") 8324a02c342a ("net/mlx5: Add exit route when waiting for FW") https://lore.kernel.org/all/20220519114119.060ce014@canb.auug.org.au/ tools/testing/selftests/net/mptcp/mptcp_join.sh e274f7154008 ("selftests: mptcp: add subflow limits test-cases") b6e074e171bc ("selftests: mptcp: add infinite map testcase") 5ac1d2d63451 ("selftests: mptcp: Add tests for userspace PM type") https://lore.kernel.org/all/20220516111918.366d747f@canb.auug.org.au/ net/mptcp/options.c ba2c89e0ea74 ("mptcp: fix checksum byte order") 1e39e5a32ad7 ("mptcp: infinite mapping sending") ea66758c1795 ("tcp: allow MPTCP to update the announced window") https://lore.kernel.org/all/20220519115146.751c3a37@canb.auug.org.au/ net/mptcp/pm.c 95d686517884 ("mptcp: fix subflow accounting on close") 4d25247d3ae4 ("mptcp: bypass in-kernel PM restrictions for non-kernel PMs") https://lore.kernel.org/all/20220516111435.72f35dca@canb.auug.org.au/ net/mptcp/subflow.c ae66fb2ba6c3 ("mptcp: Do TCP fallback on early DSS checksum failure") 0348c690ed37 ("mptcp: add the fallback check") f8d4bcacff3b ("mptcp: infinite mapping receiving") https://lore.kernel.org/all/20220519115837.380bb8d4@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-14mptcp: fix subflow accounting on closePaolo Abeni1-3/+2
If the PM closes a fully established MPJ subflow or the subflow creation errors out in it's early stage the subflows counter is not bumped accordingly. This change adds the missing accounting, additionally taking care of updating accordingly the 'accept_subflow' flag. Fixes: a88c9e496937 ("mptcp: do not block subflows creation on errors") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04mptcp: handle local addrs announced by userspace PMsKishen Maloor1-0/+1
This change adds an internal function to store/retrieve local addrs announced by userspace PM implementations to/from its kernel context. The function addresses the requirements of three scenarios: 1) ADD_ADDR announcements (which require that a local id be provided), 2) retrieving the local id associated with an address, and also where one may need to be assigned, and 3) reissuance of ADD_ADDRs when there's a successful match of addr/id. The list of all stored local addr entries is held under the MPTCP sock structure. Memory for these entries is allocated from the sock option buffer, so the list of addrs is bounded by optmem_max. The list if not released via REMOVE_ADDR signals is ultimately freed when the sock is destructed. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mptcp: reflect remote port (not 0) in ANNOUNCED eventsKishen Maloor1-2/+4
Per RFC 8684, if no port is specified in an ADD_ADDR message, MPTCP SHOULD attempt to connect to the specified address on the same port as the port that is already in use by the subflow on which the ADD_ADDR signal was sent. To facilitate that, this change reflects the specific remote port in use by that subflow in MPTCP_EVENT_ANNOUNCED events. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04mptcp: bypass in-kernel PM restrictions for non-kernel PMsKishen Maloor1-2/+13
Current limits on the # of addresses/subflows must apply only to in-kernel PM managed sockets. Thus this change removes such restrictions on connections overseen by non-kernel (e.g. userspace) PMs. This change also ensures that the kernel does not record stats inside struct mptcp_pm_data updated along kernel code paths when exercised via non-kernel PMs. Additionally, address announcements are acknolwedged and subflow requests are honored only when it's deemed that a userspace path manager is active at the time. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-30mptcp: Add a per-namespace sysctl to set the default path manager typeMat Martineau1-11/+23
The new net.mptcp.pm_type sysctl determines which path manager will be used by each newly-created MPTCP socket. v2: Handle builds without CONFIG_SYSCTL v3: Clarify logic for type-specific PM init (Geliang Tang and Paolo Abeni) Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-30mptcp: Bypass kernel PM when userspace PM is enabledMat Martineau1-1/+1
When a MPTCP connection is managed by a userspace PM, bypass the kernel PM for incoming advertisements and subflow events. Netlink events are still sent to userspace. v2: Remove unneeded check in mptcp_pm_rm_addr_received() (Kishen Maloor) v3: Add and use helper function for PM mode (Paolo Abeni) Acked-by: Paolo Abeni <pabeni@redhat.com> Co-developed-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Kishen Maloor <kishen.maloor@intel.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-30mptcp: Add a member to mptcp_pm_data to track kernel vs userspace modeMat Martineau1-0/+4
When adding support for netlink path management commands, the kernel needs to know whether paths are being controlled by the in-kernel path manager or a userspace PM. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-30mptcp: Remove redundant assignments in path manager initMat Martineau1-14/+18
A few members of the mptcp_pm_data struct were assigned to hard-coded values in mptcp_pm_data_reset(), and then immediately changed in mptcp_pm_nl_data_init(). Instead, flatten all the assignments in to mptcp_pm_data_reset(). v2: Resolve conflicts due to rename of mptcp_pm_data_reset() v4: Resolve conflict in mptcp_pm_data_reset() Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27mptcp: reset subflow when MP_FAIL doesn't respondGeliang Tang1-0/+8
This patch adds a new msk->flags bit MPTCP_FAIL_NO_RESPONSE, then reuses sk_timer to trigger a check if we have not received a response from the peer after sending MP_FAIL. If the peer doesn't respond properly, reset the subflow. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27mptcp: add MP_FAIL response supportGeliang Tang1-1/+9
This patch adds a new struct member mp_fail_response_expect in struct mptcp_subflow_context to support MP_FAIL response. In the single subflow with checksum error and contiguous data special case, a MP_FAIL is sent in response to another MP_FAIL. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23mptcp: infinite mapping sendingGeliang Tang1-0/+6
This patch adds the infinite mapping sending logic. Add a new flag send_infinite_map in struct mptcp_subflow_context. Set it true when a single contiguous subflow is in use and the allow_infinite_fallback flag is true in mptcp_pm_mp_fail_received(). In mptcp_sendmsg_frag(), if this flag is true, call the new function mptcp_update_infinite_map() to set the infinite mapping. Add a new flag infinite_map in struct mptcp_ext, set it true in mptcp_update_infinite_map(), and check this flag in a new helper mptcp_check_infinite_map(). In mptcp_update_infinite_map(), set data_len to 0, and clear the send_infinite_map flag, then do fallback. In mptcp_established_options(), use the helper mptcp_check_infinite_map() to let the infinite mapping DSS can be sent out in the fallback mode. Suggested-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11mptcp: reset the packet scheduler on incoming MP_PRIOPaolo Abeni1-4/+15
When an incoming MP_PRIO option changes the backup status of any subflow, we need to reset the packet scheduler status, or the next send could keep using the previously selected subflow, without taking in account the new priorities. Reported-by: Davide Caratti <dcaratti@redhat.com> Fixes: 40453a5c61f4 ("mptcp: add the incoming MP_PRIO support") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski1-2/+6
tools/testing/selftests/net/mptcp/mptcp_join.sh 34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers") 857898eb4b28 ("selftests: mptcp: add missing join check") 6ef84b1517e0 ("selftests: mptcp: more robust signal race test") https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/ drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions") c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information") 09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr") 84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr") efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow") 3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-19mptcp: add mibs counter for ignored incoming optionsPaolo Abeni1-2/+6
The MPTCP in kernel path manager has some constraints on incoming addresses announce processing, so that in edge scenarios it can end-up dropping (ignoring) some of such announces. The above is not very limiting in practice since such scenarios are very uncommon and MPTCP will recover due to ADD_ADDR retransmissions. This patch adds a few MIB counters to account for such drop events to allow easier introspection of the critical scenarios. Fixes: f7efc7771eac ("mptcp: drop argument port from mptcp_pm_announce_addr") Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17mptcp: constify a bunch of of helpersPaolo Abeni1-2/+2
A few pm-related helpers don't touch arguments which lacking the const modifier, let's constify them. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17mptcp: drop port parameter of mptcp_pm_add_addr_signalGeliang Tang1-3/+4
Drop the port parameter of mptcp_pm_add_addr_signal() and reflect it to avoid passing too many parameters. Signed-off-by: Geliang Tang <geliang.tang@suse.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-07mptcp: do not block subflows creation on errorsPaolo Abeni1-2/+21
If the MPTCP configuration allows for multiple subflows creation, and the first additional subflows never reach the fully established status - e.g. due to packets drop or reset - the in kernel path manager do not move to the next subflow. This patch introduces a new PM helper to cope with MPJ subflow creation failure and delay and hook it where appropriate. Such helper triggers additional subflow creation, as needed and updates the PM subflow counter, if the current one is closing. Additionally start all the needed additional subflows as soon as the MPTCP socket is fully established, so we don't have to cope with slow MPJ handshake blocking the next subflow creation. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07mptcp: keep track of local endpoint still available for each mskPaolo Abeni1-0/+1
Include into the path manager status a bitmap tracking the list of local endpoints still available - not yet used - for the relevant mptcp socket. Keep such map updated at endpoint creation/deletion time, so that we can easily skip already used endpoint at local address selection time. The endpoint used by the initial subflow is lazyly accounted at subflow creation time: the usage bitmap is be up2date before endpoint selection and we avoid such unneeded task in some relevant scenarios - e.g. busy servers accepting incoming subflows but not creating any additional ones nor annuncing additional addresses. Overall this allows for fair local endpoints usage in case of subflow failure. As a side effect, this patch also enforces that each endpoint is used at most once for each mptcp connection. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-07mptcp: full disconnect implementationPaolo Abeni1-3/+7
The current mptcp_disconnect() implementation lacks several steps, we additionally need to reset the msk socket state and flush the subflow list. Factor out the needed helper to avoid code duplication. Additionally ensure that the initial subflow is disposed only after mptcp_close(), just reset it at disconnect time. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-25mptcp: MP_FAIL suboption receivingGeliang Tang1-0/+5
This patch added handling for receiving MP_FAIL suboption. Add a new members mp_fail and fail_seq in struct mptcp_options_received. When MP_FAIL suboption is received, set mp_fail to 1 and save the sequence number to fail_seq. Then invoke mptcp_pm_mp_fail_received to deal with the MP_FAIL suboption. Signed-off-by: Geliang Tang <geliangtang@xiaomi.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: remove MPTCP_ADD_ADDR_IPV6 and MPTCP_ADD_ADDR_PORTYonglong Li1-5/+1
MPTCP_ADD_ADDR_IPV6 and MPTCP_ADD_ADDR_PORT are not necessary, we can get these info from pm.local or pm.remote. Drop mptcp_pm_should_add_signal_ipv6 and mptcp_pm_should_add_signal_port too. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: build ADD_ADDR/echo-ADD_ADDR option according pm.add_signalYonglong Li1-5/+9
According to the MPTCP_ADD_ADDR_SIGNAL or MPTCP_ADD_ADDR_ECHO flag, build the ADD_ADDR/ADD_ADDR_ECHO option. In mptcp_pm_add_addr_signal(), use opts->addr to save the announced ADD_ADDR or ADD_ADDR_ECHO address. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: fix ADD_ADDR and RM_ADDR maybe flush addr_signal each otherYonglong Li1-3/+10
ADD_ADDR shares pm.addr_signal with RM_ADDR, so after RM_ADDR/ADD_ADDR has done, we should not clean ADD_ADDR/RM_ADDR's addr_signal. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: make MPTCP_ADD_ADDR_SIGNAL and MPTCP_ADD_ADDR_ECHO separateYonglong Li1-6/+10
Use MPTCP_ADD_ADDR_SIGNAL only for the action of sending ADD_ADDR, and use MPTCP_ADD_ADDR_ECHO only for the action of sending ADD_ADDR echo. Use msk->pm.local to save the announced ADD_ADDR address only, and reuse msk->pm.remote to save the announced ADD_ADDR_ECHO address. To prepare for the next patch. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-24mptcp: move drop_other_suboptions check under pm lockYonglong Li1-2/+13
This patch moved the drop_other_suboptions check from mptcp_established_options_add_addr() into mptcp_pm_add_addr_signal(), do it under the PM lock to avoid the race between this check and mptcp_pm_add_addr_signal(). For this, added a new parameter for mptcp_pm_add_addr_signal() to get the drop_other_suboptions value. And drop the other suboptions after the option length check if drop_other_suboptions is true. Additionally, always drop the other suboption for TCP pure ack: that makes both the code simpler and the MPTCP behaviour more consistent. Co-developed-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Yonglong Li <liyonglong@chinatelecom.cn> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14mptcp: add mibs for stale subflows processingPaolo Abeni1-0/+2
This allows monitoring exceptional events like active backup scenarios. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14mptcp: faster active backup recoveryPaolo Abeni1-0/+2
The msk can use backup subflows to transmit in-sequence data only if there are no other active subflow. On active backup scenario, the MPTCP connection can do forward progress only due to MPTCP retransmissions - rtx can pick backup subflows. This patch introduces a new flag flow MPTCP subflows: if the underlying TCP connection made no progresses for long time, and there are other less problematic subflows available, the given subflow become stale. Stale subflows are not considered active: if all non backup subflows become stale, the MPTCP scheduler can pick backup subflows for plain transmissions. Stale subflows can return in active state, as soon as any reply from the peer is observed. Active backup scenarios can now leverage the available b/w with no restrinction. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-14mptcp: less aggressive retransmission strategyPaolo Abeni1-0/+17
The current mptcp re-inject strategy is very aggressive, we have mptcp-level retransmissions even on single subflow connection, if the link in-use is lossy. Let's be a little more conservative: we do retransmit only if at least a subflow has write and rtx queue empty. Additionally use the backup subflows only if the active subflows are stale - no progresses in at least an rtx period and ignore stale subflows for rtx timeout update Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/207 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-23mptcp: add deny_join_id0 in mptcp_options_receivedGeliang Tang1-0/+1
This patch added a new flag named deny_join_id0 in struct mptcp_options_received. Set it when MP_CAPABLE with the flag MPTCP_CAP_DENYJOIN_ID0 is received. Also add a new flag remote_deny_join_id0 in struct mptcp_pm_data. When the flag deny_join_id0 is set, set this remote_deny_join_id0 flag. In mptcp_pm_create_subflow_or_signal_addr, if the remote_deny_join_id0 flag is set, and the remote address id is zero, stop this connection. Suggested-by: Florian Westphal <fw@strlen.de> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-27mptcp: rename mptcp_pm_nl_add_addr_send_ackGeliang Tang1-1/+1
Since mptcp_pm_nl_add_addr_send_ack is now used for both ADD_ADDR and RM_ADDR cases, rename it to mptcp_pm_nl_addr_send_ack. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-27mptcp: send ack for rm_addrGeliang Tang1-0/+1
This patch changes the sending ACK conditions for the ADD_ADDR, send an ACK packet for RM_ADDR too. In mptcp_pm_remove_addr, invoke mptcp_pm_nl_add_addr_send_ack to send the ACK packet. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-27mptcp: move to next addr when subflow creation failGeliang Tang1-0/+15
When an invalid address was announced, the subflow couldn't be created for this address. Therefore mptcp_pm_nl_subflow_established couldn't be invoked. Then the next addresses in the local address list didn't have a chance to be announced. This patch invokes the new function mptcp_pm_add_addr_echoed when the address is echoed. In it, use mptcp_lookup_anno_list_by_saddr to check whether this address is in the anno_list. If it is, PM schedules the status MPTCP_PM_SUBFLOW_ESTABLISHED to invoke mptcp_pm_create_subflow_or_signal_addr to deal with the next address in the local address list. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-27mptcp: drop unused subflow in mptcp_pm_subflow_establishedGeliang Tang1-2/+1
This patch drops the unused parameter subflow in mptcp_pm_subflow_established(). Fixes: 926bdeab5535 ("mptcp: Implement path manager interface commands") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-27mptcp: drop argument port from mptcp_pm_announce_addrGeliang Tang1-3/+3
Drop the redundant argument 'port' from mptcp_pm_announce_addr, use the port field of another argument 'addr' instead. Fixes: 0f5c9e3f079f ("mptcp: add port parameter for mptcp_pm_announce_addr") Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>