summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-06-12cpumask: Utility function to set n'th cpu - local cpu firstAmir Vadai2-0/+71
This function sets the n'th cpu - local cpu's first. For example: in a 16 cores server with even cpu's local, will get the following values: cpumask_set_cpu_local_first(0, numa, cpumask) => cpu 0 is set cpumask_set_cpu_local_first(1, numa, cpumask) => cpu 2 is set ... cpumask_set_cpu_local_first(7, numa, cpumask) => cpu 14 is set cpumask_set_cpu_local_first(8, numa, cpumask) => cpu 1 is set cpumask_set_cpu_local_first(9, numa, cpumask) => cpu 3 is set ... cpumask_set_cpu_local_first(15, numa, cpumask) => cpu 15 is set Curently this function will be used by multi queue networking devices to calculate the irq affinity mask, such that as many local cpu's as possible will be utilized to handle the mq device irq's. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'master' of ↵David S. Miller20-174/+453
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-06-11 This series contains updates to igb, i40e and i40evf. Todd makes a change to igb to un-hide invariant returns by getting rid of the E1000_SUCCESS define and converting those returns to return 0. Jacob separates the hardware logic from the set function, so that we can re-use it during a ptp_reset in igb. This enables the reset to return functionality to the last know timestamp mode, rather than resetting the value. Ashish implements context flags for headwb and headwb_addr so that we do not have to keep them always enabled. Shannon updates the admin queue API for the new firmware, which adds set_pf_content, nvm_config_read/write, replaces set_phy_reset with set_phy_debug and removes nvm_read/write_reg_se. Cleans up the driver to use the stored base_queue value since there is no need to read the PCI register for the PF's base queue on every single transmit queue enable and disable as we already have the value stored from reading the capability features at startup. Anjali changes the notion of source and destination for FD_SB in ethtool to align i40e with other drivers. Adds flow director statistics to the PF stats. Fixes a bug in ethtool for flow director drop packet filter where the drop action comes down as a ring_cookie value, so allow it as a special value that can be used to configure destination control. Mitch fixes the i40evf to keep the driver from going down when it is already in a down state. This prevents a CPU soft lock in napi_disable(). Also change the i40evf to check the admin queue error bits since the firmware can indicate any admin queue error states to the driver via some bits in the length registers. Neerav separates out the DCB capability and enabled flags because currently if the firmware reports DCB capability the driver enables I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver inserts a tag when transmitting a packet from the port even if there are no DCB traffic classes configured at the port. So by adding the additional flag, I40E_FLAG_DCB_CAPABLE, that will be set when the DCB capability is present and the existing enabled flag will only be set if there are more than one traffic classes configured at the port. Greg fixes the i40e driver to not automatically accept tagged packets by default so that the system must request a VLAN tag packet filter to get packets with that tag. Greg also converts i40e to use the in-kernel ether_addr_copy() instead of mempcy(). Jesse removes the FTYPE field from the receive descriptor to match the hardware implementation. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'sctp-next'David S. Miller6-68/+136
Daniel Borkmann says: ==================== SCTP update This set contains transport path selection improvements in SCTP. Please see individual patches for details. ==================== Acked-by: Vlad Yasevich <vyasevich@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: sctp: fix incorrect type in gfp initializerDaniel Borkmann1-1/+1
This fixes the following sparse warning: net/sctp/associola.c:1556:29: warning: incorrect type in initializer (different base types) net/sctp/associola.c:1556:29: expected bool [unsigned] [usertype] preload net/sctp/associola.c:1556:29: got restricted gfp_t Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: sctp: improve sctp_select_active_and_retran_path selectionDaniel Borkmann1-5/+45
In function sctp_select_active_and_retran_path(), we walk the transport list in order to look for the two most recently used ACTIVE transports (trans_pri, trans_sec). In case we didn't find anything ACTIVE, we currently just camp on a possibly PF or INACTIVE transport that is primary path; this behavior actually dates back to linux-history tree of the very early days of lksctp, and can yield a behavior that chooses suboptimal transport paths. Instead, be a bit more clever by reusing and extending the recently introduced sctp_trans_elect_best() handler. In case both transports are evaluated to have the same score resulting from their states, break the tie by looking at: 1) transport patch error count 2) last_time_heard value from each transport. This is analogous to Nishida's Quick Failover draft [1], section 5.1, 3: The sender SHOULD avoid data transmission to PF destinations. When all destinations are in either PF or Inactive state, the sender MAY either move the destination from PF to active state (and transmit data to the active destination) or the sender MAY transmit data to a PF destination. In the former scenario, (i) the sender MUST NOT notify the ULP about the state transition, and (ii) MUST NOT clear the destination's error counter. It is recommended that the sender picks the PF destination with least error count (fewest consecutive timeouts) for data transmission. In case of a tie (multiple PF destinations with same error count), the sender MAY choose the last active destination. Thus for sctp_select_active_and_retran_path(), we keep track of the best, if any, transport that is in PF state and in case no ACTIVE transport has been found (hence trans_{pri,sec} is NULL), we select the best out of the three: current primary_path and retran_path as well as a possible PF transport. The secondary may still camp on the original primary_path as before. The change in sctp_trans_elect_best() with a more fine grained tie selection also improves at the same time path selection for sctp_assoc_update_retran_path() in case of non-ACTIVE states. [1] http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: sctp: migrate most recently used transport to ktimeDaniel Borkmann4-8/+10
Be more precise in transport path selection and use ktime helpers instead of jiffies to compare and pick the better primary and secondary recently used transports. This also avoids any side-effects during a possible roll-over, and could lead to better path decision-making. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: sctp: refactor active path selectionDaniel Borkmann1-59/+61
This patch just refactors and moves the code for the active path selection into its own helper function outside of sctp_assoc_control_transport() which is already big enough. No functional changes here. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11ktime: add ktime_after and ktime_before helperDaniel Borkmann2-1/+25
Add two minimal helper functions analogous to time_before() and time_after() that will later on both be needed by SCTP code. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'mac802154'David S. Miller3-10/+18
Phoebe Buckheister says: ==================== Recent llsec code introduced a memory leak on decryption failures during rx. This fixes said leak, and optimizes the receive loops for monitor and wpan devices to only deliver skbs to devices that are actually up. Also changes a dev_kfree_skb to kfree_skb when an invalid packet is dropped before being pushed into the stack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11mac802154: don't deliver packets to devices that are downPhoebe Buckheister3-10/+17
Only one WPAN devices can be active at any given time, so only deliver packets to that one interface that is actually up. Multiple monitors may be up at any given time, but we don't have to deliver to monitors that are down either. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11mac802154: properly free incoming skbs on decryption failurePhoebe Buckheister1-0/+1
mac802154 RX did not free skbs on decryption failure, assuming that the caller would when the local rx handler returned _DROP. This was false. Signed-off-by: Phoebe Buckheister <phoebe.buckheister@itwm.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11i40e/i40evf: Bump i40e to version 0.4.10 and i40evf to 0.9.34Catherine Sullivan2-2/+2
Bump versions. Change-ID: Ic4a84354955061ca18321b1e97c9c30fe1563b5c Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e: use stored base_queue valueShannon Nelson1-3/+2
No need to read the PCI register for the PF's base queue on every single Tx queue enable and disable as we already have the value stored from reading the capability features at startup. Change-ID: Ic02fb622757742f43cb8269369c3d972d4f66555 Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e: Fix a bug in ethtool for FD drop packet filter actionAnjali Singhai Jain1-2/+7
A drop action comes down as a ring_cookie value, so allow it as a special value that can be used to configure destination control. Also fix the output to filter read command accordingly. Change-ID: I9956723cee42f3194885403317dd21ed4a151144 Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e/i40evf: Add Flow director stats to PF statsAnjali Singhai Jain6-2/+41
Add members to stat struct to keep track of Flow director ATR and SideBand filter packet matches. Change-ID: Ibbb31a53c7adcc2bb96991dd80565442a2f2513c Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e/i40evf: remove FTYPEJesse Brandeburg2-2/+0
This change drops the FTYPE field from the Rx descriptor, to match the hardware implementation. Change-ID: I66d31d2b43861da45e8ace4fb03df033abe88bab Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40evf: check admin queue error bitsMitch Williams1-0/+36
FW can indicate any admin queue error states to the driver via some bits in the length registers. Each time we process an admin queue message, check these bits and log any errors we find. Since the VF really can't do much, we just print the message and depend on the PF driver to clear things up on our behalf. Change-ID: I92bc6c53ce3b4400544e0ca19c5de2d27490bd0d Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e/i40evf: User ether_addr_copy instead of memcpyGreg Rose4-21/+18
Linux gives us a function to copy Ethernet MAC addresses, let's use it. Change-ID: I0c861900029ca5ea65a53ca39565852fb633f6fd Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e: Do not accept tagged packets by defaultGreg Rose1-0/+32
Remove the filter created by the firmware with the default MAC address it reads out of the NVM storage and a promiscuous VLAN tag and replace it with a filter that will not accept tagged packets by default. The system must request a VLAN tag packet filter to get packets with that tag. Change-ID: I119e6c3603a039bd68282ba31bf26f33a575490a Signed-off-by: Greg Rose <gregory.v.rose@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e: Separate out DCB capability and enabled flagsNeerav Parikh3-10/+25
Currently if the firmware reports DCB capability the driver enables I40E_FLAG_DCB_ENABLED flag. When this flag is enabled the driver inserts a tag when transmitting a packet from the port even if there are no DCB traffic classes configured at the port. This patch adds a new flag I40E_FLAG_DCB_CAPABLE that will be set when the DCB capability is present and the existing flag I40E_FLAG_DCB_ENABLED will be set only if there are more than one traffic classes configured at the port. Change-ID: I24ccbf53ef293db2eba80c8a9772acf729795bd5 Signed-off-by: Neerav Parikh <neerav.parikh@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40evf: don't go further downMitch Williams1-2/+4
If the device is down, there's no place to go but up, so don't try to go down even more. This prevents a CPU soft lock in napi_disable(). Change-ID: I8b058b9ee974dfa01c212fae2597f4f54b333314 Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e: Change the notion of src and dst for FD_SB in ethtoolAnjali Singhai Jain2-8/+16
In XL710 devices we program FD filter's fields from Tx perspective of the flow. However the user interface exposed in ethtool should be compliant with the previous generation of drivers where a filter src and dst field are from the RX perspective. This patch changes the ethtool interface in this regard to match the other drivers. Change-ID: Iec6ccddd87357c4fb53ccf33aa0fae699faf70cf Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e/i40evf: AdminQ API update for new FWShannon Nelson3-33/+161
Add set_pf_context, replace set_phy_reset with set_phy_debug, add nvm_config_read/write, remove nvm_read/write_reg_se and add some PHY types. With these changes we bump the API version to 1.2. Change-ID: I4dc3aec175c2316f66fc9b726b3f7d594699d84e Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11i40e/i40evf: set headwb Tx context flags and use themAshish Shah2-3/+5
Set appropriate fields in Tx queue configuration virtchnl message to pf to enable headwb and setup headwb addr. Then use that info from the VF to set headwb and headwb_addr instead of always enabling them. Change-ID: I7d393d1b2b07f0f3355b3a4f7c2d3c6ee3b0d622 Signed-off-by: Ashish Shah <ashish.n.shah@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11igb: separate hardware setting from the set_ts_config ioctlJacob Keller1-12/+38
This patch separates the hardware logic from the set function, so that we can re-use it during a ptp_reset. This enables the reset to return functionality to the last known timestamp mode, rather than resetting the value. We initialize the mode to off during the ptp_init cycle. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11igb: unhide invariant returnsTodd Fujinaka5-74/+66
Return a 0 directly rather than a constant. Reported-by: Peter Senna Tschudin <peter.senna@gmail.com> Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-06-11amd-xgbe: Rename MAX_DMA_CHANNELS to avoid powerpc conflictLendacky, Thomas2-2/+2
MAX_DMA_CHANNELS is defined in asm/scatterlist.h of the powerpc architecture. Rename this #define in xgbe.h to avoid the redefined warning issued during compilation. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11farsync: Fix confusion about DMA address and buffer offset typesBen Hutchings1-13/+8
Use dma_addr_t for DMA address parameters and u32 for shared memory offset parameters. Do not assume that dma_addr_t is the same as unsigned long; it will not be in PAE configurations. Truncate DMA addresses to 32 bits when printing them. This is OK because the DMA mask for this device is 32-bit (per default). Also rename the DMA address parameters from 'skb' to 'dma'. Compile-tested only. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'mlx4'David S. Miller2-5/+38
Or Gerlitz says: ==================== mlx4 SRIOV fixes The patch from Wei Yang is a designed fix to a regression introduced by earlier commit of him. Jack added a fix to the resource management which we got from IBM. Let's get that into 3.16-rc1 1st and later see to what stable version/s this should go. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net/mlx4_core: Keep only one driver entry release mlx4_privWei Yang1-1/+1
Following commit befdf89 "net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()", there are two mlx4 pci callbacks which will attempt to release the mlx4_priv object -- .shutdown and .remove. This leads to a use-after-free access to the already freed mlx4_priv instance and trigger a "Kernel access of bad area" crash when both .shutdown and .remove are called. During reboot or kexec, .shutdown is called, with the VFs probed to the host going through shutdown first and then the PF. Later, the PF will trigger VFs' .remove since VFs still have driver attached. Fix that by keeping only one driver entry which releases mlx4_priv. Fixes: befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()') CC: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotasJack Morgenstein1-4/+37
The Hypervisor driver tracks free slots and reserved slots at the global level and tracks allocated slots and guaranteed slots per VF. Guaranteed slots are treated as reserved by the driver, so the total reserved slots is the sum of all guaranteed slots over all the VFs. As VFs allocate resources, free (global) is decremented and allocated (per VF) is incremented for those resources. However, reserved (global) is never changed. This means that effectively, when a VF allocates a resource from its guaranteed pool, it is actually reducing that resource's free pool (since the global reserved count was not also reduced). The fix for this problem is the following: For each resource, as long as a VF's allocated count is <= its guaranteed number, when allocating for that VF, the reserved count (global) should be reduced by the allocation as well. When the global reserved count reaches zero, the remaining global free count is still accessible as the free pool for that resource. When the VF frees resources, the reverse happens: the global reserved count for a resource is incremented only once the VFs allocated number falls below its guaranteed number. This fix was developed by Rick Kready <kready@us.ibm.com> Reported-by: Rick Kready <kready@us.ibm.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: wimax: i2400m: control.c: Cleaning up conjunction always evaluates to falseRickard Strandqvist1-1/+1
Logical conjunction always evaluates to false: minor < 2 && minor > 1 I guess what you wanted is rather: minor > 2 || minor < 1 This was partly found using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: ethernet: toshiba: ps3_gelic_net.c: Cleaning up a check on a memory ↵Rickard Strandqvist1-1/+1
allocation A check on a memory allocation is checked incorrectly. This was partly found using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se> Acked-by: Geoff Levand <geoff@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11amd-xgbe: fix unused variable compilation warning in phylib driverfrançois romieu1-1/+1
Fix following compilation warning: [...] CC drivers/net/phy/amd-xgbe-phy.o drivers/net/phy/amd-xgbe-phy.c:1353:30: warning: ‘amd_xgbe_phy_ids’ defined but not used [-Wunused-variable] static struct mdio_device_id amd_xgbe_phy_ids[] = { ^ Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: filter: fix nlattr and nlattr_nest BPF testsAlexei Starovoitov1-13/+21
- 'struct nlattr' must be 2 byte aligned - provide big-endian input data for nlattr/nlattr_nest tests Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net: filter: cleanup A/X name usageAlexei Starovoitov4-302/+314
The macro 'A' used in internal BPF interpreter: #define A regs[insn->a_reg] was easily confused with the name of classic BPF register 'A', since 'A' would mean two different things depending on context. This patch is trying to clean up the naming and clarify its usage in the following way: - A and X are names of two classic BPF registers - BPF_REG_A denotes internal BPF register R0 used to map classic register A in internal BPF programs generated from classic - BPF_REG_X denotes internal BPF register R7 used to map classic register X in internal BPF programs generated from classic - internal BPF instruction format: struct sock_filter_int { __u8 code; /* opcode */ __u8 dst_reg:4; /* dest register */ __u8 src_reg:4; /* source register */ __s16 off; /* signed offset */ __s32 imm; /* signed immediate constant */ }; - BPF_X/BPF_K is 1 bit used to encode source operand of instruction In classic: BPF_X - means use register X as source operand BPF_K - means use 32-bit immediate as source operand In internal: BPF_X - means use 'src_reg' register as source operand BPF_K - means use 32-bit immediate as source operand Suggested-by: Chema Gonzalez <chema@google.com> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Chema Gonzalez <chema@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'bridge_multicast_exports'David S. Miller4-113/+326
Linus Lüssing says: ==================== bridge: multicast snooping patches / exports The first patch is simply a cosmetic patch. So far I (and maybe others too?) have been regularly confusing these two structs, therefore I'd suggest renaming them and therefore making the follow-up patches easier to understand and nicer to fit in. The second patch fixes a minor issue, but probably not worth for stable. On the other hand the first two patches are also preparations for the third and fourth patch: These two patches are exporting functionality needed to marry the bridge multicast snooping with the batman-adv multicast optimizations recently added for the 3.15 kernel, allowing to use these optimzations in common setups having a bridge on top of e.g. bat0, too. So far these bridged setups would fall back to simple flooding through the batman-adv mesh network for any multicast packet entering bat0. More information about the batman-adv multicast optimizations currently implemented can be found here: http://www.open-mesh.org/projects/batman-adv/wiki/Basic-multicast-optimizations The integration on the batman-adv side could afterwards look like this, for instance: http://git.open-mesh.org/batman-adv.git/commitdiff/576b59dd3e34737c702e548b21fa72059262f796?hp=f95ce7131746c65fbcdffcf2089cab59e2c2f7ac ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: memorize and export selected IGMP/MLD querier portLinus Lüssing3-6/+68
Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in IGMP/MLD queriers to be able to reliably serve any multicast listener behind this same bridge. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: add export of multicast database adjacent to net_devLinus Lüssing3-12/+76
With this new, exported function br_multicast_list_adjacent(net_dev) a list of IPv4/6 addresses is returned. This list contains all multicast addresses sensed by the bridge multicast snooping feature on all bridge ports of the bridge interface of net_dev, excluding addresses from the specified net_device itself. Adding bridge support to the batman-adv multicast optimization requires batman-adv knowing about the existence of bridged-in multicast listeners to be able to reliably serve them with multicast packets. Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: adhere to querier election mechanism specified by RFCsLinus Lüssing2-13/+95
MLDv1 (RFC2710 section 6), MLDv2 (RFC3810 section 7.6.2), IGMPv2 (RFC2236 section 3) and IGMPv3 (RFC3376 section 6.6.2) specify that the querier with lowest source address shall become the selected querier. So far the bridge stopped its querier as soon as it heard another querier regardless of its source address. This results in the "wrong" querier potentially becoming the active querier or a potential, unnecessary querying delay. With this patch the bridge memorizes the source address of the currently selected querier and ignores queries from queriers with a higher source address than the currently selected one. This slight optimization is supposed to make it more RFC compliant (but is rather uncritical and therefore probably not necessary to be queued for stable kernels). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11bridge: rename struct bridge_mcast_query/querierLinus Lüssing3-95/+100
The current naming of these two structs is very random, in that reversing their naming would not make any semantical difference. This patch tries to make the naming less confusing by giving them a more specific, distinguishable naming. This is also useful for the upcoming patches reintroducing the "struct bridge_mcast_querier" but for storing information about the selected querier (no matter if our own or a foreign querier). Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'cxgb4'David S. Miller12-57/+356
Hariprasad Shenai says: ==================== Adds support for CIQ and other misc. fixes for rdma/cxgb4 This patch series adds support to allocate and use IQs specifically for indirect interrupts, adds fixes to align ISS for iWARP connections & fixes related to tcp snd/rvd window for Chelsio T4/T5 adapters on iw_cxgb4. Also changes Interrupt Holdoff Packet Count threshold of response queues for cxgb4 driver. The patches series is created against 'net-next' tree. And includes patches on cxgb4 and iw_cxgb4 driver. Since this patch-series contains cxgb4 and iw_cxgb4 patches, we would like to request this patch series to get merged via David Miller's 'net-next' tree. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11cxgb4: Change default Interrupt Holdoff Packet Count ThresholdHariprasad Shenai2-30/+40
Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11iw_cxgb4: don't truncate the recv window sizeHariprasad Shenai3-4/+53
Fixed a bug that shows up with recv window sizes that exceed the size of the RCV_BUFSIZ field in opt0 (>= 1024K). If the recv window exceeds this, then we specify the max possible in opt0, add add the rest in via a RX_DATA_ACK credits. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11iw_cxgb4: Choose appropriate hw mtu index and ISS for iWARP connectionsHariprasad Shenai5-15/+180
Select the appropriate hw mtu index and initial sequence number to optimize hw memory performance. Add new cxgb4_best_aligned_mtu() which allows callers to provide enough information to be used to [possibly] select an MTU which will result in the TCP Data Segment Size (AKA Maximum Segment Size) to be an aligned value. If an RTR message exhange is required, then align the ISS to 8B - 1 + 4, so that after the SYN the send seqno will align on a 4B boundary. The RTR message exchange will leave the send seqno aligned on an 8B boundary. If an RTR is not required, then align the ISS to 8B - 1. The goal is to have the send seqno be 8B aligned when we send the first FPDU. Based on original work by Casey Leedom <leeedom@chelsio.com> and Steve Wise <swise@opengridcomputing.com> Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11iw_cxgb4: Allocate and use IQs specifically for indirect interruptsHariprasad Shenai8-9/+84
Currently indirect interrupts for RDMA CQs funnel through the LLD's RDMA RXQs, which also handle direct interrupts for offload CPLs during RDMA connection setup/teardown. The intended T4 usage model, however, is to have indirect interrupts flow through dedicated IQs. IE not to mix indirect interrupts with CPL messages in an IQ. This patch adds the concept of RDMA concentrator IQs, or CIQs, setup and maintained by the LLD and exported to iw_cxgb4 for use when creating CQs. RDMA CPLs will flow through the LLD's RDMA RXQs, and CQ interrupts flow through the CIQs. Design: cxgb4 creates and exports an array of CIQs for the RDMA ULD. These IQs are sized according to the max available CQs available at adapter init. In addition, these IQs don't need FL buffers since they only service indirect interrupts. One CIQ is setup per RX channel similar to the RDMA RXQs. iw_cxgb4 will utilize these CIQs based on the vector value passed into create_cq(). The num_comp_vectors advertised by iw_cxgb4 will be the number of CIQs configured, and thus the vector value will be the index into the array of CIQs. Based on original work by Steve Wise <swise@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11gre: allow changing mac address when device is upstephen hemminger1-0/+1
There is no need to require forcing device down on a Ethernet GRE (gretap) tunnel to change the MAC address. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11tcp: add gfp parameter to tcp_fragmentOctavian Purdila3-10/+12
tcp_fragment can be called from process context (from tso_fragment). Add a new gfp parameter to allow it to preserve atomic memory if possible. Signed-off-by: Octavian Purdila <octavian.purdila@intel.com> Reviewed-by: Christoph Paasch <christoph.paasch@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11Merge branch 'master' of ↵David S. Miller12-113/+303
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2014-06-09 This series contains more updates to i40e and i40evf. Shannon adds checks for error status bits on the admin event queue and provides notification if seen. Cleans up unused variable and memory allocation which was used earlier in driver development and is no longer needed. Also fixes the driver to not complain about removing non-existent MAC addresses. Bumps the driver versions for both i40e and i40evf. Catherine fixes a function header comment to make sure the comment correctly reflects the function name. Mitch adds code to allow for additional VSIs since the number of VSIs that the firmware reports to us is a guaranteed minimum, not an absolute maximum. The hardware actually supports for more than the reported value, which we often need. Implements anti-spoofing for VFs for both MAC addresses and VLANs, as well as enable this feature by default for all VFs. Anjali changes the interrupt distribution policy to change the way resources for special features are handled. Fixes the driver to not fall back to one queue if the only feature enabled is ATR, since FD_SB and FD_ATR need to be checked independently in order to decide if we will support multiple queue or not. Allows the RSS table entry range and GPS to be any number, not necessarily a power of 2 because hardware does not restrict us to use a power of 2 GPS in the case of RSS as long as we are not sharing the RSS table with another VSI (VMDq). Frank modifies the driver to keep SR-IOV enabled in the case that RSS, VMFq, FD_SB and DCB are disabled so that SR-IOV does not get turned off unnecessarily. Jesse fixes a bug in receive checksum where the driver was not marking packets with bad checksums correctly, especially IPv6 packets with a bad checksum. To do this correctly, we need a define that may be set by hardware in rare cases. Greg fixes the driver to delete all the old and stale MAC filters for the VF VSI when the host administrator changes the VF MAC address from under its feet. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-09i40e/i40evf: bump version to 0.4.7 for i40e and 0.9.31 for i40evfShannon Nelson2-2/+2
Bumpity and Fred Worm say it's time to change the numbers again. Signed-off-by: Shannon Nelson <shannon.nelson@intel.com> Change-ID: I658731d022ea23cedede4be2bfecd8b4cc68d270 Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>