summaryrefslogtreecommitdiff
path: root/include/rdma/ib_addr.h
AgeCommit message (Collapse)AuthorFilesLines
2018-10-16RDMA/core: Annotate timeout as unsigned longLeon Romanovsky1-1/+1
The ucma users supply timeout in u32 format, it means that any number with most significant bit set will be converted to negative value by various rdma_*, cma_* and sa_query functions, which treat timeout as int. In the lowest level, the timeout is converted back to be unsigned long. Remove this ambiguous conversion by updating all function signatures to receive unsigned long. Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/core: Align multiple functions to kernel coding styleLeon Romanovsky1-2/+1
This patch changes the small number of functions to be aligned to kernel coding style. It is needed to minimize the diffstat of the following patch. It doesn't change any functionality. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-13RDMA: Remove duplicated include from ib_addr.hYueHaibing1-1/+0
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-13RDMA/core: Consider net ns of gid attribute for RoCEParav Pandit1-0/+3
When resolving destination address or route, when net namespace is unavailable, refer to the net namespace of the netdevice of the SGID attribute. This is typically the case for requests arriving from the network for RoCE ports. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-13RDMA/core: Rename rdma_copy_addr to rdma_copy_src_l2_addrParav Pandit1-4/+0
Now that rdma_copy_addr() only copies the source addresses and all callers are interested in copying only source addresses, simplify it to drop the destination address argument. Given that it only copies source layer2 addresses, rename it to rdma_copy_src_l2_addr for better code readability. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-31RDMA/core: Constify dst_addr argumentParav Pandit1-2/+2
Following APIs are not supposed to modify addr or dest_addr contents. Therefore make those function argument const for better code readability. 1. rdma_resolve_ip() 2. rdma_addr_size() 3. rdma_resolve_addr() Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-25IB/cm: Keep track of the sgid_attr that created the cm idParav Pandit1-0/+2
Hold reference to the the sgid_attr which is used in a cm_id until the cm_id is destroyed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-04-18RDMA/rdma_cm: Delete rdma_addr_clientJason Gunthorpe1-19/+1
The only thing it does is block module unload while work is posted from rdma_resolve_ip(). However, this is not the right place to do this. The users of rdma_resolve_ip() must ensure their own module does not unload until rdma_resolve_ip() calls the callback, or until rdma_addr_cancel() is called. Similarly callers to rdma_addr_find_l2_eth_by_grh() must ensure their module does not unload while they are calling code. The only two users are already safe, so there is no need for this. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-04-07Merge tag 'for-linus-unmerged' of ↵Linus Torvalds1-9/+0
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "Doug and I are at a conference next week so if another PR is sent I expect it to only be bug fixes. Parav noted yesterday that there are some fringe case behavior changes in his work that he would like to fix, and I see that Intel has a number of rc looking patches for HFI1 they posted yesterday. Parav is again the biggest contributor by patch count with his ongoing work to enable container support in the RDMA stack, followed by Leon doing syzkaller inspired cleanups, though most of the actual fixing went to RC. There is one uncomfortable series here fixing the user ABI to actually work as intended in 32 bit mode. There are lots of notes in the commit messages, but the basic summary is we don't think there is an actual 32 bit kernel user of drivers/infiniband for several good reasons. However we are seeing people want to use a 32 bit user space with 64 bit kernel, which didn't completely work today. So in fixing it we required a 32 bit rxe user to upgrade their userspace. rxe users are still already quite rare and we think a 32 bit one is non-existing. - Fix RDMA uapi headers to actually compile in userspace and be more complete - Three shared with netdev pull requests from Mellanox: * 7 patches, mostly to net with 1 IB related one at the back). This series addresses an IRQ performance issue (patch 1), cleanups related to the fix for the IRQ performance problem (patches 2-6), and then extends the fragmented completion queue support that already exists in the net side of the driver to the ib side of the driver (patch 7). * Mostly IB, with 5 patches to net that are needed to support the remaining 10 patches to the IB subsystem. This series extends the current 'representor' framework when the mlx5 driver is in switchdev mode from being a netdev only construct to being a netdev/IB dev construct. The IB dev is limited to raw Eth queue pairs only, but by having an IB dev of this type attached to the representor for a switchdev port, it enables DPDK to work on the switchdev device. * All net related, but needed as infrastructure for the rdma driver - Updates for the hns, i40iw, bnxt_re, cxgb3, cxgb4, hns drivers - SRP performance updates - IB uverbs write path cleanup patch series from Leon - Add RDMA_CM support to ib_srpt. This is disabled by default. Users need to set the port for ib_srpt to listen on in configfs in order for it to be enabled (/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port) - TSO and Scatter FCS support in mlx4 - Refactor of modify_qp routine to resolve problems seen while working on new code that is forthcoming - More refactoring and updates of RDMA CM for containers support from Parav - mlx5 'fine grained packet pacing', 'ipsec offload' and 'device memory' user API features - Infrastructure updates for the new IOCTL interface, based on increased usage - ABI compatibility bug fixes to fully support 32 bit userspace on 64 bit kernel as was originally intended. See the commit messages for extensive details - Syzkaller bugs and code cleanups motivated by them" * tag 'for-linus-unmerged' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (199 commits) IB/rxe: Fix for oops in rxe_register_device on ppc64le arch IB/mlx5: Device memory mr registration support net/mlx5: Mkey creation command adjustments IB/mlx5: Device memory support in mlx5_ib net/mlx5: Query device memory capabilities IB/uverbs: Add device memory registration ioctl support IB/uverbs: Add alloc/free dm uverbs ioctl support IB/uverbs: Add device memory capabilities reporting IB/uverbs: Expose device memory capabilities to user RDMA/qedr: Fix wmb usage in qedr IB/rxe: Removed GID add/del dummy routines RDMA/qedr: Zero stack memory before copying to user space IB/mlx5: Add ability to hash by IPSEC_SPI when creating a TIR IB/mlx5: Add information for querying IPsec capabilities IB/mlx5: Add IPsec support for egress and ingress {net,IB}/mlx5: Add ipsec helper IB/mlx5: Add modify_flow_action_esp verb IB/mlx5: Add implementation for create and destroy action_xfrm IB/uverbs: Introduce ESP steering match filter IB/uverbs: Add modify ESP flow_action ...
2018-03-29RDMA/ucma: Introduce safer rdma_addr_size() variantsRoland Dreier1-0/+2
There are several places in the ucma ABI where userspace can pass in a sockaddr but set the address family to AF_IB. When that happens, rdma_addr_size() will return a size bigger than sizeof struct sockaddr_in6, and the ucma kernel code might end up copying past the end of a buffer not sized for a struct sockaddr_ib. Fix this by introducing new variants int rdma_addr_size_in6(struct sockaddr_in6 *addr); int rdma_addr_size_kss(struct __kernel_sockaddr_storage *addr); that are type-safe for the types used in the ucma ABI and return 0 if the size computed is bigger than the size of the type passed in. We can use these new variants to check what size userspace has passed in before copying any addresses. Reported-by: <syzbot+6800425d54ed3ed8135d@syzkaller.appspotmail.com> Signed-off-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-16IB/core: Move rdma_addr_find_l2_eth_by_grh to core_priv.hParav Pandit1-5/+0
Before commit [1], rdma_addr_find_l2_eth_by_grh() was an exported function and therefore declaration in include/rdma/ib_addr.h was fine. But now that its scope is limited to ib_core module, its better to have it in core_priv.h. [1] commit 1060f8653414 ("IB/{core/cm}: Fix generating a return AH for RoCEE") Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-16IB/core: Remove rdma_resolve_ip_route() as exported symbolParav Pandit1-4/+0
rdma_resolve_ip_route() is used only by ib_core module. Therefore it is removed as an exported symbol. Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-01-19RDMA/core: Simplify rdma_addr_get_sgid() to not support RoCEParav Pandit1-26/+7
Now that all callers who care about RoCE addresses have been converted to use rdma_read_gids() simplify rdma_addr_get_sgid() to only support real GID addresses. Callers should only use it for OPA and IB transports. The now deleted implementation for RoCE has several bugs related to IPv6 support and incorrect/inconsistent 'GID' addresses compared to the CM paths. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-19RDMA/{core, cma}: Simplify rdma_translate_ipParav Pandit1-1/+1
Since no caller needs vlan, rdma_translate_ip is simplified to avoid vlan pointer. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-19IB/core: Removed unused functionParav Pandit1-1/+0
rdma_addr_find_smac_by_sgid() is exported symbol not used by any kernel module. Therefore its removed. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-12-18IB/{core/cm}: Fix generating a return AH for RoCEEParav Pandit1-1/+1
When computing a UD reverse path (return AH) from a WC the code was not doing a route lookup anchored in a specific netdevice. This caused several bugs, including broken IPv6 link-local address support in RoCEv2. [1] This fixes the lookup by determining the GID table entry that the HW matched to the SGID for the WC and then using the netdevice from that entry to perform the route and ND lookup for the 'DGID' to build a return AH. RoCE GID table management ensures that right upper netdevices of the physical netdevices are added. Therefore init_ah_from_wc doesn't need to perform such check. Now that route lookup is done based on the netdevice of the GID entry, simplify code to not have ifindex and vlan pointers. As part of that, refactor to have netdevice as input parameter. This is already discussed at [2]. Finally ib_init_ah_from_wc resolves dmac for unicast GID in similar way as what ib_resolve_eth_dmac() does. So ib_resolve_eth_dmac is refactored to split for unicast and non unicast GIDs, so that it can be reused by ib_init_ah_from_wc. While we are at refactoring ib_resolve_eth_dmac(), it is further simplified (a) to avoid hoplimit as optional parameter, as there is only one user who always queries hoplimit. (b) for empty line. (c) avoided zero initialization of ret. (d) removed as exported symbol as only ib core uses it. For IPv6, this is tested using simple rping test as below. rping -sv -a ::0 rping -c -a fe80::268a:7ff:fe55:4661%ens2f1 -C 1 -v -d [1] https://www.spinics.net/lists/linux-rdma/msg45690.html [2] https://www.spinics.net/lists/linux-rdma/msg45710.html Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2017-11-14RDMA/core: Make function rdma_copy_addr return voidYuval Shaia1-2/+3
Function returns zero - make it void. While there make struct net_device const. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18Merge branch 'for-next-early' into for-nextDoug Ledford1-2/+2
The early for-next branch was based on v4.14-rc2, while the shared pull request I got from Mellanox used a v4.14-rc4 base. I'm making the branch that was the shared Mellanox pull request the new for-next branch and merging the early for-next branch into it. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-18IB/core: Fix calculation of maximum RoCE MTUParav Pandit1-3/+4
The original code only took into consideration the largest header possible after the IB_BTH_BYTES. This was incorrect, as the largest possible header size is the largest possible combination of headers we might run into. The new code accounts for all possible headers in the largest possible combination and subtracts that from the MTU to make sure that all packets will fit on the wire. Link: https://www.spinics.net/lists/linux-rdma/msg54558.html Fixes: 3c86aa70bf67 ("RDMA/cm: Add RDMA CM support for IBoE devices") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reported-by: Roland Dreier <roland@purestorage.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-10-15IB/core: Fix endianness annotation in rdma_is_multicast_addr()Bart Van Assche1-2/+2
Since ipv4_addr is a big endian 32-bit number, annotate it as such. Fixes: commit be1d325a3358 ("IB/core: Set RoCEv2 MGID according to spec") Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18Merge branch 'misc' into k.o/for-nextDoug Ledford1-1/+2
Conflicts: drivers/infiniband/core/iwcm.c - The rdma_netlink patches in HEAD and the iwarp cm workqueue fix (don't use WQ_MEM_RECLAIM, we aren't safe for that context) touched the same code. Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-18infiniband: avoid overflow warningArnd Bergmann1-1/+2
A sockaddr_in structure on the stack getting passed into rdma_ip2gid triggers this warning, since we memcpy into a larger sockaddr_in6 structure: In function 'memcpy', inlined from 'rdma_ip2gid' at include/rdma/ib_addr.h:175:3, inlined from 'addr_event.isra.4.constprop' at drivers/infiniband/core/roce_gid_mgmt.c:693:2, inlined from 'inetaddr_event' at drivers/infiniband/core/roce_gid_mgmt.c:716:9: include/linux/string.h:305:4: error: call to '__read_overflow2' declared with attribute error: detected read beyond size of object passed as 2nd parameter The warning seems appropriate here, but the code is also clearly correct, so we really just want to shut up this instance of the output. The best way I found so far is to avoid the memcpy() call and instead replace it with a struct assignment. Fixes: 6974f0c4555e ("include/linux/string.h: add the option of fortified string.h functions") Cc: Daniel Micay <danielmicay@gmail.com> Cc: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-24IB/core: Set RoCEv2 MGID according to specNoa Osherovich1-1/+7
RoCEv2 Annex states that for RoCEv2 over IPv4, the corresponding IPv4 address is encoded into the GID according to the following rule: GID= :ffff:<IPv4 address> Remove the 0xff0e prefix for RoCEv2 packets with IPv4 and leave it zeroed and change rdma_is_multicast_addr() to consider the new logic. Signed-off-by: Noa Osherovich <noaos@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-20IB/cma: Fix reference count leak when no ipv4 addresses are setKalderon, Michal1-2/+4
Once in_dev_get is called to receive in_device pointer, the in_device reference counter is increased, but if there are no ipv4 addresses configured on the net-device the ifa_list will be null, resulting in a flow that doesn't call in_dev_put to decrease the ref_cnt. This was exposed when running RoCE over ipv6 without any ipv4 addresses configured Fixes: commit 8e3867310c90 ("IB/cma: Fix a race condition in iboe_addr_get_sgid()") Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-07net-next: treewide use is_vlan_dev() helper function.Parav Pandit1-4/+2
This patch makes use of is_vlan_dev() function instead of flag comparison which is exactly done by is_vlan_dev() helper function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-12-22IB/cma: Fix a race condition in iboe_addr_get_sgid()Bart Van Assche1-2/+4
Code that dereferences the struct net_device ip_ptr member must be protected with an in_dev_get() / in_dev_put() pair. Hence insert calls to these functions. Fixes: commit 7b85627b9f02 ("IB/cma: IBoE (RoCE) IP-based GID addressing") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Cc: Roland Dreier <roland@purestorage.com> Cc: <stable@vger.kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-02-26net: rdma: use __ethtool_get_ksettingsDavid Decotigny1-8/+6
Signed-off-by: David Decotigny <decot@googlers.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-01-19IB/core: Use hop-limit from IP stack for RoCEMatan Barak1-1/+3
Previously, IPV6_DEFAULT_HOPLIMIT was used as the hop limit value for RoCE. Fixing that by taking ip4_dst_hoplimit and ip6_dst_hoplimit as hop limit values. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/core: Rename rdma_addr_find_dmac_by_grhMatan Barak1-2/+3
rdma_addr_find_dmac_by_grh resolves dmac, vlan_id and if_index and downsteram patch will also add hop_limit as an output parameter, thus we rename it to rdma_addr_find_l2_eth_by_grh. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Validate route when we init ahMatan Barak1-3/+7
In order to make sure API users don't try to use SGIDs which don't conform to the routing table, validate the route before searching the RoCE GID table. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add rdma_network_type to wcSomnath Kotur1-0/+1
Providers should tell IB core the wc's network type. This is used in order to search for the proper GID in the GID table. When using HCAs that can't provide this info, IB core tries to deep examine the packet and extract the GID type by itself. We choose sgid_index and type from all the matching entries in RDMA-CM based on hint from the IP stack and we set hop_limit for the IP packet based on above hint from IP stack. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Somnath Kotur <Somnath.Kotur@Avagotech.Com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-28IB/addr: Pass network namespace as a parameterGuy Shapiro1-1/+15
Add network namespace support to the ib_addr module. For that, all the address resolution and matching should be done using the appropriate namespace instead of init_net. This is achieved by: 1. Adding an explicit network namespace argument to exported function that require a namespace. 2. Saving the namespace in the rdma_addr_client structure. 3. Using it when calling networking functions. In order to preserve the behavior of calling modules, &init_net is passed as the parameter in calls from other modules. This is modified as namespace support is added on more levels. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-22IB/core: Use GID table in AH creation and dmac resolutionMatan Barak1-1/+1
Previously, vlan id and source MAC were used from QP attributes. Since the net device is now stored in the GID attributes, they could be used instead of getting this information from the QP attributes. IB_QP_SMAC, IB_QP_ALT_SMAC, IB_QP_VID and IB_QP_ALT_VID were removed because there is no known libibverbs that uses them. This commit also modifies the vendors (mlx4, ocrdma) drivers in order to use the new approach. ocrdma driver changes were done by Somnath Kotur <Somnath.Kotur@Avagotech.Com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-06-02IB/core cleanup: Add const to args - agent_send_responseIra Weiny1-3/+3
In order to support constant callers of agent_send_response we add const specifiers to the its pointer arguments. Adjust the call tree accordingly. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Hal Rosenstock <hal@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-05-05IB/core: change rdma_gid2ip into void function as it always return zeroHonggang LI1-2/+1
Signed-off-by: Honggang Li <honli@redhat.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2014-01-19IB/cma: IBoE (RoCE) IP-based GID addressingMoni Shoua1-23/+12
Currently, the IB core and specifically the RDMA-CM assumes that IBoE (RoCE) gids encode related Ethernet netdevice interface MAC address and possibly VLAN id. Change GIDs to be treated as they encode interface IP address. Since Ethernet layer 2 address parameters are not longer encoded within gids, we have to extend the Infiniband address structures (e.g. ib_ah_attr) with layer 2 address parameters, namely mac and vlan. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-01-15IB/core: Ethernet L2 attributes in verbs/cm structuresMatan Barak1-1/+41
This patch add the support for Ethernet L2 attributes in the verbs/cm/cma structures. When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority in a similar manner that the IB L2 (and the L4 PKEY) attributes are used. Thus, those attributes were added to the following structures: * ib_ah_attr - added dmac * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority) * ib_wc - added smac, vlan_id * ib_sa_path_rec - added smac, dmac, vlan_id * cm_av - added smac and vlan_id For the path record structure, extra care was taken to avoid the new fields when packing it into wire format, so we don't break the IB CM and SA wire protocol. On the active side, the CM fills. its internal structures from the path provided by the ULP. We add there taking the ETH L2 attributes and placing them into the CM Address Handle (struct cm_av). On the passive side, the CM fills its internal structures from the WC associated with the REQ message. We add there taking the ETH L2 attributes from the WC. When the HW driver provides the required ETH L2 attributes in the WC, they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core code checks for the presence of these flags, and in their absence does address resolution from the ib_init_ah_from_wc() helper function. ib_modify_qp_is_ok is also updated to consider the link layer. Some parameters are mandatory for Ethernet link layer, while they are irrelevant for IB. Vendor drivers are modified to support the new function signature. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-06-21IB/addr: Add AF_IB support to ip_addr_sizeSean Hefty1-5/+1
Add support for AF_IB to ip_addr_size, and rename the function to account for the change. Give the compiler more control over whether the call should be inline or not by moving the definition into the .c file, removing the static inline, and exporting it. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-01-04rdma/core: Fix sparse warningsSean Hefty1-1/+1
Clean up sparse warnings in the rdma core layer. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-09-16net: consolidate and fix ethtool_ops->get_settings callingJiri Pirko1-1/+5
This patch does several things: - introduces __ethtool_get_settings which is called from ethtool code and from drivers as well. Put ASSERT_RTNL there. - dev_ethtool_get_settings() is replaced by __ethtool_get_settings() - changes calling in drivers so rtnl locking is respected. In iboe_get_rate was previously ->get_settings() called unlocked. This fixes it. Also prb_calc_retire_blk_tmo() in af_packet.c had the same problem. Also fixed by calling __dev_get_by_index() instead of dev_get_by_index() and holding rtnl_lock for both calls. - introduces rtnl_lock in bnx2fc_vport_create() and fcoe_vport_create() so bnx2fc_if_create() and fcoe_if_create() are called locked as they are from other places. - use __ethtool_get_settings() in bonding code Signed-off-by: Jiri Pirko <jpirko@redhat.com> v2->v3: -removed dev_ethtool_get_settings() -added ASSERT_RTNL into __ethtool_get_settings() -prb_calc_retire_blk_tmo - use __dev_get_by_index() and lock around it and __ethtool_get_settings() call v1->v2: add missing export_symbol Reviewed-by: Ben Hutchings <bhutchings@solarflare.com> [except FCoE bits] Acked-by: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2011-04-30ethtool: Call ethtool's get/set_settings callbacks with cleaned dataDavid Decotigny1-6/+7
This makes sure that when a driver calls the ethtool's get/set_settings() callback of another driver, the data passed to it is clean. This guarantees that speed_hi will be zeroed correctly if the called callback doesn't explicitely set it: we are sure we don't get a corrupted speed from the underlying driver. We also take care of setting the cmd field appropriately (ETHTOOL_GSET/SSET). This applies to dev_ethtool_get_settings(), which now makes sure it sets up that ethtool command parameter correctly before passing it to drivers. This also means that whoever calls dev_ethtool_get_settings() does not have to clean the ethtool command parameter. This function also becomes an exported symbol instead of an inline. All drivers visible to make allyesconfig under x86_64 have been updated. Signed-off-by: David Decotigny <decot@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2010-10-25IB/core: Add VLAN support for IBoEEli Cohen1-4/+39
Add 802.1q VLAN support to IBoE. The VLAN tag is encoded within the GID derived from a link local address in the following way: GID[11] GID[12] contain the VLAN ID when the GID contains a VLAN. The 3 bits user priority field of the packets are identical to the 3 bits of the SL. In case of rdma_cm apps, the TOS field is used to generate the SL field by doing a shift right of 5 bits effectively taking to 3 MS bits of the TOS field. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2010-10-14RDMA/cm: Add RDMA CM support for IBoE devicesEli Cohen1-1/+98
Add support for IBoE device binding and IP --> GID resolution. Path resolving and multicast joining are implemented within cma.c by filling in the responses and running callbacks in the CMA work queue. IP --> GID resolution always yields IPv6 link local addresses; remote GIDs are derived from the destination MAC address of the remote port. Multicast GIDs are always mapped to multicast MACs as is done in IPv6. (IPv4 multicast is enabled by translating IPv4 multicast addresses to IPv6 multicast as described in <http://www.mail-archive.com/ipng@sunroof.eng.sun.com/msg02134.html>.) Some helper functions are added to ib_addr.h. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-20RDMA/cm: fix loopback address supportSean Hefty1-21/+10
The RDMA CM is intended to support the use of a loopback address when establishing a connection; however, the behavior of the CM when loopback addresses are used is confusing and does not always work, depending on whether loopback was specified by the server, the client, or both. The defined behavior of rdma_bind_addr is to associate an RDMA device with an rdma_cm_id, as long as the user specified a non- zero address. (ie they weren't just trying to reserve a port) Currently, if the loopback address is passed to rdam_bind_addr, no device is associated with the rdma_cm_id. Fix this. If a loopback address is specified by the client as the destination address for a connection, it will fail to establish a connection. This is true even if the server is listing across all addresses or on the loopback address itself. The issue is that the server tries to translate the IP address carried in the REQ message to a local net_device address, which fails. The translation is not needed in this case, since the REQ carries the actual HW address that should be used. Finally, cleanup loopback support to be more transport neutral. Replace separate calls to get/set the sgid and dgid from the device address to a single call that behaves correctly depending on the format of the device address. And support both IPv4 and IPv6 address formats. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ Fixed RDS build by s/ib_addr_get/rdma_addr_get/ - Roland ] Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19IB/addr: Store net_device type instead of translating to RDMA transportSean Hefty1-1/+2
The struct rdma_dev_addr stores net_device address information: the source device address, destination hardware address, and broadcast address. For consistency, store the net_device type rather than converting it to the rdma_node_type. The type indicates the format of the various hardware addresses, which is what we're concerned with, and not the RDMA node type that the address may map to. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2009-11-19RDMA/cma: Replace net_device pointer with indexSean Hefty1-1/+1
Provide the device interface when resolving route information to ensure that the correct outbound device is used. This will also simplify processing of sin6_scope_id for IPv6 support. Based on work from: David Wilder <dwilder@us.ibm.com> Jason Gunthorpe <jgunthrope@obsidianresearch.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-15RDMA/addr: Keep pointer to netdevice in struct rdma_dev_addrOr Gerlitz1-0/+1
Keep a pointer to the local (src) netdevice in struct rdma_dev_addr, and copy it in as part of rdma_copy_addr(). Use rdma_translate_ip() in cma_new_conn_id() to reduce some code duplication and also make sure the src_dev member gets set. In a high-availability configuration the netdevice pointer can be used by the RDMA CM to align RDMA sessions to use the same links as the IP stack does under fail-over and route change cases. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2008-07-15RDMA: Fix license textSean Hefty1-19/+23
The license text for several files references a third software license that was inadvertently copied in. Update the license to what was intended. This update was based on a request from HP. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>
2007-02-17IB/sa: Track multicast join/leave requestsSean Hefty1-0/+6
The IB SA tracks multicast join/leave requests on a per port basis and does not do any reference counting: if two users of the same port join the same group, and one leaves that group, then the SA will remove the port from the group even though there is one user who wants to stay a member left. Therefore, in order to support multiple users of the same multicast group from the same port, we need to perform reference counting locally. To do this, add an multicast submodule to ib_sa to perform reference counting of multicast join/leave operations. Modify ib_ipoib (the only in-kernel user of multicast) to use the new interface. Signed-off-by: Roland Dreier <rolandd@cisco.com>
2006-11-03RDMA/addr: Use client registration to fix module unload raceSean Hefty1-1/+19
Require registration with ib_addr module to prevent caller from unloading while a callback is in progress. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>