summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
AgeCommit message (Collapse)AuthorFilesLines
2014-12-11net/mlx4: Change QP allocation schemeEugenia Emantayev1-1/+1
When using BF (Blue-Flame), the QPN overrides the VLAN, CV, and SV fields in the WQE. Thus, BF may only be used for QPNs with bits 6,7 unset. The current Ethernet driver code reserves a Tx QP range with 256b alignment. This is wrong because if there are more than 64 Tx QPs in use, QPNs >= base + 65 will have bits 6/7 set. This problem is not specific for the Ethernet driver, any entity that tries to reserve more than 64 BF-enabled QPs should fail. Also, using ranges is not necessary here and is wasteful. The new mechanism introduced here will support reservation for "Eth QPs eligible for BF" for all drivers: bare-metal, multi-PF, and VFs (when hypervisors support WC in VMs). The flow we use is: 1. In mlx4_en, allocate Tx QPs one by one instead of a range allocation, and request "BF enabled QPs" if BF is supported for the function 2. In the ALLOC_RES FW command, change param1 to: a. param1[23:0] - number of QPs b. param1[31-24] - flags controlling QPs reservation Bit 31 refers to Eth blueflame supported QPs. Those QPs must have bits 6 and 7 unset in order to be used in Ethernet. Bits 24-30 of the flags are currently reserved. When a function tries to allocate a QP, it states the required attributes for this QP. Those attributes are considered "best-effort". If an attribute, such as Ethernet BF enabled QP, is a must-have attribute, the function has to check that attribute is supported before trying to do the allocation. In a lower layer of the code, mlx4_qp_reserve_range masks out the bits which are unsupported. If SRIOV is used, the PF validates those attributes and masks out unsupported attributes as well. In order to notify VFs which attributes are supported, the VF uses QUERY_FUNC_CAP command. This command's mailbox is filled by the PF, which notifies which QP allocation attributes it supports. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09net/mlx4_en: Support for configurable RSS hash functionEyal Perry1-1/+1
The ConnectX HW is capable of using one of the following hash functions: Toeplitz and an XOR hash function. This patch extends the implementation of the mlx4_en driver set/get_rxfh callbacks to support getting and setting the RSS hash function used by the device. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-23mlx4: fix mlx4_en_set_rxfh()Eric Dumazet1-0/+1
mlx4_en_set_rxfh() can crash if no RSS indir table is provided. While we are at it, allow RSS key to be changed with ethtool -X Tested: myhost:~# cat /proc/sys/net/core/netdev_rss_key b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e:c5:5d:4d:8c:22:67:30:ab:8a:6e:c3:6a myhost:~# ethtool -x eth0 RX flow hash indirection table for eth0 with 8 RX ring(s): 0: 0 1 2 3 4 5 6 7 RSS hash key: b6:89:91:f3:b2:c3:c2:90:11:e8:ce:45:e8:a9:9d:1c:f2:f6:d4:53:61:8b:26:3a:b3:9a:57:97:c3:b6:79:4d:2e:d9:66:5c:72:ed:b6:8e myhost:~# ethtool -X eth0 hkey \ 03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f myhost:~# ethtool -x eth0 RX flow hash indirection table for eth0 with 8 RX ring(s): 0: 0 1 2 3 4 5 6 7 RSS hash key: 03:0e:e2:43:fa:82:0e:73:14:2d:c0:68:21:9e:82:99:b9:84:d0:22:e2:b3:64:9f:4a:af:00:fa:cc:05:b4:4a:17:05:14:73:76:58:bd:2f Reported-by: Ben Hutchings <ben@decadent.org.uk> Fixes: b9d1ab7eb42e ("mlx4: use netdev_rss_key_fill() helper") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-11net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETEShani Michaeli1-1/+4
When processing received traffic, pass CHECKSUM_COMPLETE status to the stack, with calculated checksum for non TCP/UDP packets (such as GRE or ICMP). Although the stack expects checksum which doesn't include the pseudo header, the HW adds it. To address that, we are subtracting the pseudo header checksum from the checksum value provided by the HW. In the IPv6 case, we also compute/add the IP header checksum which is not added by the HW for such packets. Cc: Jerry Chu <hkchu@google.com> Signed-off-by: Shani Michaeli <shanim@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-03net/mlx4_en: Remove RX buffers alignment to IP_ALIGNIdo Shamay1-1/+0
When IP_ALIGN has a non zero value, hardware will write to a non aligned address. The only reader from this address is when copying the header from the first frag into the linear buffer (further access to the IP address will be from the linear buffer, in which the headers are aligned). Since the penalty of non align access by the hardware is greater than the software memcpy, changing the frag_align to always be 0. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-29net/mlx4_en: Add support for setting rxvlan offload OFF/ONSaeed Mahameed1-3/+7
Rename mlx4_en_timestamp_config to mlx4_en_reset_config and extend it to support choosing RX vlan offload configuration. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-29net/mlx4_en: Use PTYS register to query ethtool settingsSaeed Mahameed1-1/+7
- If dev cap MLX4_DEV_CAP_FLAG2_ETH_PROT_CTRL is ON, query PTYS register to fill ethtool settings. else use default values. - Use autoneg port cap and dev backplane autoneg cap to reprort autoneg interface capbilities. - Fix typo in mlx4_en_port_state struct field (transciver to transceiver). Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-06net/mlx4_en: Use the new tx_copybreak to set inline thresholdEric Dumazet1-1/+0
Instead of setting inline threshold using module parameter only on driver load, use set_tunable() to set it dynamically. No need to store the threshold per ring, using instead the netdev global priv->prof->inline_thold Initial value still is set using the module parameter, therefore backward compatability is kept. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-06net/mlx4_en: Avoid a cache line miss in TX completion for single frag skb'sEric Dumazet1-0/+3
Add frag0_dma/frag0_byte_count into mlx4_en_tx_info to avoid a cache line miss in TX completion for frames having one dma element. (We avoid reading back the tx descriptor) Note this could be extended to 2/3 dma elements later, as we have free room in mlx4_en_tx_info Also, mlx4_en_free_tx_desc() no longer accesses skb_shinfo(). We use a new nr_maps fields in mlx4_en_tx_info to avoid 2 or 3 cache misses. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-06net/mlx4_en: Avoid calling bswap in tx fast pathEric Dumazet1-1/+2
- doorbell_qpn is stored in the cpu_to_be32() way to avoid bswap() in fast path. - mdev->mr.key stored in ring->mr_key to also avoid bswap() and access to cold cache line. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-06net/mlx4_en: Align tx path structures to cache linesEric Dumazet1-40/+46
Reorganize struct mlx4_en_tx_ring to have: - One cache line containing last_nr_txbb & cons & wake_queue, used by tx completion. - One cache line containing fields dirtied by mlx4_en_xmit() - Following part is read mostly and shared by cpus. Align struct mlx4_en_tx_info to a cache line Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-06net/mlx4_en: Code cleanups in tx pathEric Dumazet1-1/+0
- Remove unused variable ring->poll_cnt - No need to set some fields if using blueflame - Add missing const's - Use unlikely - Remove unneeded new line - Make some comments more precise - struct mlx4_bf @offset field reduced to unsigned int to save space Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-05mlx4: add a new xmit_more counterEric Dumazet1-1/+4
ethtool -S reports a new counter, tracking number of time doorbell was not triggered, because skb->xmit_more was set. $ ethtool -S eth0 | egrep "tx_packet|xmit_more" tx_packets: 2413288400 xmit_more: 666121277 I merged the tso_packet false sharing avoidance in this patch as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-26mellanox: Change en_print to return voidJoe Perches1-2/+2
No caller or macro uses the return value so make it void. Signed-off-by: Joe Perches <joe@perches.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-20net/mlx4_en: Add mlx4_en_get_cqe helperIdo Shamay1-0/+6
This function derives the base address of the CQE from the CQE size, and calculates the real CQE context segment in it from the factor (this is like before). Before this change the code used the factor to calculate the base address of the CQE as well. The factor indicates in which segment of the cqe stride the cqe information is located. For 32-byte strides, the segment is 0, and for 64 byte strides, the segment is 1 (bytes 32..63). Using the factor was ok as long as we had only 32 and 64 byte strides. However, with larger strides, the factor is zero, and so cannot be used to calculate the base of the CQE. The helper uses the same method of CQE buffer pulling made by all other components that reads the CQE buffer (mlx4_ib driver and libmlx4). Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-23net/mlx4_en: Reduce memory consumption on kdump kernelAmir Vadai1-0/+1
When memory is limited, reduce number of rx and tx rings. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-23net/mlx4_en: Disable blueflame using ethtool private flagsAmir Vadai1-0/+5
Enable the user to turn off the hardware feature called BlueFlame. Since it is something specific to mlx4_en hardware, we control the feature via ethtool private flags. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+4
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09net/mlx4_en: Ignore budget on TX napi pollingAmir Vadai1-0/+3
It is recommended that TX work not count against the quota. The cost of TX packet liberation is a minute percentage of what it costs to process an RX frame. Furthermore, that SKB freeing makes memory available for other paths in the stack. Give the TX a larger budget and be more aggressive about cleaning up the Tx descriptors this budget could be changed using ethtool: $ ethtool -C eth1 tx-frames-irq <budget> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09net/mlx4_en: Fix mac_hash database inconsistencyNoa Osherovich1-1/+1
Using a local copy of dev_addr in mlx4_en_set_mac() to prevent dev_addr from being modified during error flow or when dev_addr is modified in another context (which is another problem that is being discussed over the mailing list [1]). Also fixing bad naming of priv->prev_mac into priv->current_mac. [1] - http://patchwork.ozlabs.org/patch/351489/ Reviewed-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-09net/mlx4_en: Do not count LLC/SNAP in MTU calculationYishai Hadas1-2/+0
LLC/SNAP 8 bytes should not be added as part of header calculation. If used, payload will be decreased accordingly. For MTU of 1500 we'll set 1522 instead of 1523. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Liran Liss <liranl@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-03net/mlx4_en: Don't use irq_affinity_notifier to track changes in IRQ ↵Amir Vadai1-0/+1
affinity map IRQ affinity notifier can only have a single notifier - cpu_rmap notifier. Can't use it to track changes in IRQ affinity map. Detect IRQ affinity changes by comparing CPU to current IRQ affinity map during NAPI poll thread. CC: Thomas Gleixner <tglx@linutronix.de> CC: Ben Hutchings <ben@decadent.org.uk> Fixes: 2eacc23 ("net/mlx4_core: Enforce irq affinity changes immediatly") Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12net/mlx4_en: Use affinity hintYuval Atias1-0/+1
The “affinity hint” mechanism is used by the user space daemon, irqbalancer, to indicate a preferred CPU mask for irqs. Irqbalancer can use this hint to balance the irqs between the cpus indicated by the mask. We wish the HCA to preferentially map the IRQs it uses to numa cores close to it. To accomplish this, we use cpumask_set_cpu_local_first(), that sets the affinity hint according the following policy: First it maps IRQs to “close” numa cores. If these are exhausted, the remaining IRQs are mapped to “far” numa cores. Signed-off-by: Yuval Atias <yuvala@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02Revert "net/mlx4_en: Use affinity hint"David S. Miller1-1/+0
This reverts commit 70a640d0dae3a9b1b222ce673eb5d92c263ddd61. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-02net/mlx4_en: Use affinity hintYuval Atias1-0/+1
The “affinity hint” mechanism is used by the user space daemon, irqbalancer, to indicate a preferred CPU mask for irqs. Irqbalancer can use this hint to balance the irqs between the cpus indicated by the mask. We wish the HCA to preferentially map the IRQs it uses to numa cores close to it. To accomplish this, we use cpumask_set_cpu_local_first(), that sets the affinity hint according the following policy: First it maps IRQs to “close” numa cores. If these are exhausted, the remaining IRQs are mapped to “far” numa cores. Signed-off-by: Yuval Atias <yuvala@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-09mellanox: Logging message cleanupsJoe Perches1-20/+20
Use a more current logging style. o Coalesce formats o Add missing spaces for coalesced formats o Align arguments for modified formats o Add missing newlines for some logging messages o Use DRV_NAME as part of format instead of %s, DRV_NAME to reduce overall text. o Use ..., ##__VA_ARGS__ instead of args... in macros o Correct a few format typos o Use a single line message where appropriate Signed-off-by: Joe Perches <joe@perches.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-16mlx4_en: don't use napi_synchronize inside mlx4_en_netpollChris Mason1-1/+0
The mlx4 driver is triggering schedules while atomic inside mlx4_en_netpoll: spin_lock_irqsave(&cq->lock, flags); napi_synchronize(&cq->napi); ^^^^^ msleep here mlx4_en_process_rx_cq(dev, cq, 0); spin_unlock_irqrestore(&cq->lock, flags); This was part of a patch by Alexander Guller from Mellanox in 2011, but it still isn't upstream. Signed-off-by: Chris Mason <clm@fb.com> cc: stable@vger.kernel.org Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-04-02net/mlx4: Set proper build dependancy with vxlanOr Gerlitz1-0/+2
Make sure that vxlan_get_rx_port() is present in the kernel build in a manner consistent with mlx4, else mlx4 can be made built-in where vxlan a module and the phase of the build linking fails. Add CONFIG_MLX4_EN_VXLAN for that. Also, #ifdef the advertizement and implementation of the mlx4 vxlan ndo calls and related code under this config directive. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-29net/mlx4: Implement vxlan ndo callsOr Gerlitz1-0/+3
Add implementation for the add/del vxlan port ndo calls, using the CONFIG_DEV firmware command. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-2/+2
Conflicts: drivers/net/wireless/ath/ath9k/recv.c drivers/net/wireless/mwifiex/pcie.c net/ipv6/sit.c The SIT driver conflict consists of a bug fix being done by hand in 'net' (missing u64_stats_init()) whilst in 'net-next' a helper was created (netdev_alloc_pcpu_stats()) which takes care of this. The two wireless conflicts were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03net/mlx4: Replace mlx4_en_mac_to_u64() with mlx4_mac_to_u64()Eugenia Emantayev1-1/+0
Currently, the EN driver uses a private static function mlx4_en_mac_to_u64(). Move it to a common include file (driver.h) for mlx4_en and mlx4_ib for further use. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03net/mlx4_en: Move queue stopped/waked counters to be per ringEugenia Emantayev1-0/+2
Give accurate counters and avoids cache misses when several rings update the counters of stop/wake queue. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-03-03net/mlx4_en: Verify mlx4_en module parametersEugenia Emantayev1-0/+9
Verify mlx4_en module parameters. In case they are out of range - reset to default values. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26net,IB/mlx: Bump all Mellanox driver versionsAmir Vadai1-2/+2
Bump all Mellanox driver versions. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-25net/mlx4: Set number of RX rings in a utility functionIdo Shamay1-1/+1
mlx4_en_add() is too long. Moving set number of RX rings to a utiltity function to improve readability and modulization of the code. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17netdevice: add queue selection fallback handler for ndo_select_queueDaniel Borkmann1-1/+1
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+2
2014-01-13net/mlx4_en: call gro handler for encapsulated framesEric Dumazet1-2/+2
In order to use the native GRO handling of encapsulated protocols on mlx4, we need to call napi_gro_receive() instead of netif_receive_skb() unless busy polling is in action. While we are at it, rename mlx4_en_cq_ll_polling() to mlx4_en_cq_busy_polling() Tested with GRE tunnel : GRO aggregation is now performed on the ethernet device instead of being done later on gre device. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Cc: Jerry Chu <hkchu@google.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Acked-By: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-10net: core: explicitly select a txq before doing l2 forwardingJason Wang1-1/+2
Currently, the tx queue were selected implicitly in ndo_dfwd_start_xmit(). The will cause several issues: - NETIF_F_LLTX were removed for macvlan, so txq lock were done for macvlan instead of lower device which misses the necessary txq synchronization for lower device such as txq stopping or frozen required by dev watchdog or control path. - dev_hard_start_xmit() was called with NULL txq which bypasses the net device watchdog. - dev_hard_start_xmit() does not check txq everywhere which will lead a crash when tso is disabled for lower device. Fix this by explicitly introducing a new param for .ndo_select_queue() for just selecting queues in the case of l2 forwarding offload. netdev_pick_tx() was also extended to accept this parameter and dev_queue_xmit_accel() was used to do l2 forwarding transmission. With this fixes, NETIF_F_LLTX could be preserved for macvlan and there's no need to check txq against NULL in dev_hard_start_xmit(). Also there's no need to keep a dedicated ndo_dfwd_start_xmit() and we can just reuse the code of dev_queue_xmit() to do the transmission. In the future, it was also required for macvtap l2 forwarding support since it provides a necessary synchronization method. Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: e1000-devel@lists.sourceforge.net Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-02mlx4_en: Add PTP hardware clockShawn Bohrer1-0/+6
This adds a PHC to the mlx4_en driver. We use reader/writer spinlocks to protect the timecounter since every packet received needs to call timecounter_cycle2time() when timestamping is enabled. This can become a performance bottleneck with RSS and multiple receive queues if normal spinlocks are used. This driver has been tested with both Documentation/ptp/testptp and the linuxptp project (http://linuxptp.sourceforge.net/) on a Mellanox ConnectX-3 card. Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Acked-By: Hadar Hen Zion <hadarh@mellanox.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-31net/mlx4_en: Add netdev support for TCP/IP offloads of vxlan tunnelingOr Gerlitz1-1/+2
When the device tunneling offloads mode is vxlan do the following - call SET_PORT with the relevant setting - add DMFS steering vxlan rule for the device self and multicast mac addresses of the form: {<ETH, outer-mac> <VXLAN, ANY vnid> <ETH, ANY mac>} --> RSS QP - set relevant QPC fields in RSS context and RX ring QPs - in TX flow, set WQE fields to generate HW checksum, and handle gso skbs which are marked for encapsulation such that the HW will segment them properly. - in RX flow, read HW offloaded checksum for encapsulated packets from the CQE - advertize hw_enc_features and NETIF_F_GSO_UDP_TUNNEL to the networking stack Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-20net/mlx4_en: Add NAPI support for transmit sideEugenia Emantayev1-0/+1
Add NAPI for TX side, implement its support and provide NAPI callback. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-20net/mlx4_en: Configure the XPS queue mapping on driver loadIdo Shamay1-1/+4
Only TX rings of User Piority 0 are mapped. TX rings of other UP's are using UP 0 mapping. XPS is not in use when num_tc is set. Signed-off-by: Ido Shamay <idos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08net/mlx4_en: Datapath structures are allocated per NUMA nodeEugenia Emantayev1-3/+3
For each RX/TX ring and its CQ, allocation is done on a NUMA node that corresponds to the core that the data structure should operate on. The assumption is that the core number is reflected by the ring index. The affected allocations are the ring/CQ data structures, the TX/RX info and the shared HW/SW buffer. For TX rings, each core has rings of all UPs. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-08net/mlx4_en: Datapath resources allocated dynamicallyEugenia Emantayev1-13/+14
Currently all TX/RX rings and completion queues are part of the netdev priv structure and are allocated statically. This patch will change the priv to hold only arrays of pointers and therefore all TX/RX rings and completetion queues will be allocated dynamically. This is in preparation for NUMA aware allocations. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Reviewed-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-10-09net/mlx4_en: Rename name of mlx4_en_rx_alloc membersAmir Vadai1-2/+2
Add page prefix to page related members: @size and @offset into @page_size and @page_offset CC: Eric Dumazet <edumazet@google.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-02net: rename CONFIG_NET_LL_RX_POLL to CONFIG_NET_RX_BUSY_POLLCong Wang1-5/+5
Eliezer renames several *ll_poll to *busy_poll, but forgets CONFIG_NET_LL_RX_POLL, so in case of confusion, rename it too. Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Cong Wang <amwang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-26mlx4: allow order-0 memory allocations in RX pathEric Dumazet1-6/+6
Signed-off-by: Eric Dumazet <edumazet@google.com> mlx4 exclusively uses order-2 allocations in RX path, which are likely to fail under memory pressure. We therefore drop frames more than needed. This patch tries order-3, order-2, order-1 and finally order-0 allocations to keep good performance, yet allow allocations if/when memory gets fragmented. By using larger pages, and avoiding unnecessary get_page()/put_page() on compound pages, this patch improves performance as well, lowering false sharing on struct page. Also use GFP_KERNEL allocations in initialization path, as allocating 12 MB (390 order-3 pages) can easily fail with GFP_ATOMIC. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Amir Vadai <amirv@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-20net/mlx4_en: Low Latency recv statisticsAmir Vadai1-0/+6
Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-06-20net/mlx4_en: Add Low Latency Socket (LLS) supportAmir Vadai1-0/+121
Add basic support for LLS. Signed-off-by: Amir Vadai <amirv@mellanox.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>