linux-hardened

Author	SHA1	Message	Date
Eric Dumazet	ecfd2ce1a9	mlx4: change TX coalescing defaults mlx4 currently uses a too high tx coalescing setting, deferring TX completion interrupts by up to 128 us. With the recent skb_orphan() removal in commit `8112ec3b87`, performance of a single TCP flow is capped to ~4 Gbps, unless we increase tcp_limit_output_bytes. I suggest using 16 us instead of 128 us, allowing a finer control. Performance of a single TCP flow is restored to previous levels, while keeping TCP small queues fully enabled with default sysctl. This patch is also a BQL prereq. Reported-by: Vimalkumar <j.vimal@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yevgeny Petrilin <yevgenyp@mellanox.com> Cc: Or Gerlitz <ogerlitz@mellanox.com> Acked-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-11-07 15:30:19 -05:00
Amir Vadai	ee64c0ee51	net/mlx4_en: Limit the RFS filter IDs to be < RPS_NO_FILTER RFS filter id can't have the special value RPS_NO_FILTER, need to skip it when allocating id's. CC: Ben Hutchings <bhutchings@solarflare.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-26 00:23:55 -07:00
Amir Vadai	1eb8c695bd	net/mlx4_en: Add accelerated RFS support Use RFS infrastructure and flow steering in HW to keep CPU affinity of rx interrupts and application per TCP stream. A flow steering filter is added to the HW whenever the RFS ndo callback is invoked by core networking code. Because the invocation takes place in interrupt context, the actual setup of HW is done using workqueue. Whenever new filter is added, the driver checks for expiry of existing filters. Since there's window in time between the point where the core RFS code invoked the ndo callback, to the point where the HW is configured from the workqueue context, the 2nd, 3rd etc packets from that stream will cause the net core to invoke the callback again and again. To prevent inefficient/double configuration of the HW, the filters are kept in a database which is indexed using hash function to enable fast access. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-19 08:34:37 -07:00
Dan Carpenter	9c64508af2	net/mlx4_en: dereferencing freed memory We dereferenced "mclist" after the kfree(). Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-16 22:58:07 -07:00
Hadar Hen Zion	cabdc8ee37	net/mlx4_en: Add support for drop action through ethtool The drop action is implemented by allocating a QP and keeping it in a reset state such that the HW drops any packets which are steered to that QP. When a drop action is requested, we attach the relevant flow to that QP. Sign-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:06 -07:00
Hadar Hen Zion	592e49dda8	net/mlx4: Implement promiscuous mode with device managed flow-steering The device managed flow steering API has three promiscuous modes: 1. Uplink - captures all the packets that arrive to the port. 2. Allmulti - captures all multicast packets arriving to the port. 3. Function port - for future use, this mode is not implemented yet. Use these modes with the flow_attach and flow_detach firmware commands according to the promiscuous state of the netdevice. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:06 -07:00
Hadar Hen Zion	0ff1fb654b	{NET, IB}/mlx4: Add device managed flow steering firmware API The driver is modified to support three operation modes. If supported by firmware use the device managed flow steering API, that which we call device managed steering mode. Else, if the firmware supports the B0 steering mode use it, and finally, if none of the above, use the A0 steering mode. When the steering mode is device managed, the code is modified such that L2 based rules set by the mlx4_en driver for Ethernet unicast and multicast, and the IB stack multicast attach calls done through the mlx4_ib driver are all routed to use the device managed API. When attaching rule using device managed flow steering API, the firmware returns a 64 bit registration id, which is to be provided during detach. Currently the firmware is always programmed during HCA initialization to use standard L2 hashing. Future work should be done to allow configuring the flow-steering hash function with common, non proprietary means. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Hadar Hen Zion	c96d97f4d1	net/mlx4: Set steering mode according to device capabilities Instead of checking the firmware supported steering mode in various places in the code, add a dedicated field in the mlx4 device capabilities structure which is written once during the initialization flow and read across the code. This also set the grounds for add new steering modes. Currently two modes are supported, and are named after the ConnectX HW versions A0 and B0. A0 steering uses mac_index, vlan_index and priority to steer traffic into pre-defined range of QPs. B0 steering uses Ethernet L2 hashing rules and is enabled only if the firmware supports both unicast and multicast B0 steering, The current steering modes are relevant for Ethernet traffic only, such that Infiniband steering remains untouched. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Yevgeny Petrilin	6d19993788	net/mlx4_en: Re-design multicast attachments flow Currently, for every change in the net device multicast list, the driver detaches all the addresses from the HW device, and then attaches the updated list. This behavior is wrong from two aspects: first, it causes a load of firmware commands and second, there is period of time where the correct addresses are not attached, which turned into packet loss. To improve - a copy of the multicast list is saved by the driver. For every change in the multicast list, the multicast list copy is used to find the delta between those two lists and add or remove multicast addresses as needed. Reported-by: Shawn Bohrer <sbohrer@rgmadvisors.com> Cc: Shawn Bohrer <sbohrer@rgmadvisors.com> Signed-off-by: Hadar Hen Zion <hadarh@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-07-07 16:23:05 -07:00
Yevgeny Petrilin	044ca2a5f2	net/mlx4_en: Release QP range in free_resources Add a missing resource release in ring cleanup. Not doing this leaves a range of QPs that are being reserved, and no one can use them. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:30:12 -07:00
Yevgeny Petrilin	5c8e904666	net/mlx4_en: Set correct port parameters during device initialization Set valid port parameters: MTU and flow control configuration when configuring the port during HW device initialization, prior to the net device open() being called. Using invalid parameters (such as all zeros) could lead to bad firmware behavior. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-06-25 16:30:12 -07:00
Amir Vadai	bc6a4744b8	net/mlx4_en: num cores tx rings for every UP Change the TX ring scheme such that the number of rings for untagged packets and for tagged packets (per each of the vlan priorities) is the same, unlike the current situation where for tagged traffic there's one ring per priority and for untagged rings as the number of core. Queue selection is done as follows: If the mqprio qdisc is operates on the interface, such that the core networking code invoked the device setup_tc ndo callback, a mapping of skb->priority => queue set is forced - for both, tagged and untagged traffic. Else, the egress map skb->priority => User priority is used for tagged traffic, and all untagged traffic is sent through tx rings of UP 0. The patch follows the convergence of discussing that issue with John Fastabend over this thread http://comments.gmane.org/gmane.linux.network/229877 Cc: John Fastabend <john.r.fastabend@intel.com> Cc: Liran Liss <liranl@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-05-17 16:17:50 -04:00
Yevgeny Petrilin	5b263f5374	mlx4_en: Byte Queue Limit support Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:34:02 -04:00
Yevgeny Petrilin	e22979d96a	mlx4_en: Moving to Interrupts for TX completions Moving to interrupts instead of polling fpr TX completions Avoiding situations where skb can be held in by the driver for a long time (till timer expires). The change is also necessary for supporting BQL. Removing comp_lock that was required because we could handle TX completions from several contexts: Interrupts, timer, polling. Now there is only interrupts Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:34:02 -04:00
Yevgeny Petrilin	a19a848a45	mlx4_en: Added Ethtool support for TX Interrupt coalescing Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-23 22:34:02 -04:00
Amir Vadai	897d7846b4	net/mlx4_en: sk_prio <=> UP for untagged traffic Since vlan egress map is only good for tagged traffic, need to have other mapping to be used by untagged traffic. For that, the driver uses sch_mqprio mapping. This mapping could be set by using tc tool from iproute2 package. Mapped UP will be used by the HW for QoS purposes, but won't go out on the wire. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:04 -04:00
Amir Vadai	564c274c3d	net/mlx4_en: DCB QoS support Set TSA, promised BW and PFC using IEEE 802.1qaz netlink commands. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:04 -04:00
Amir Vadai	0e98b523c4	net/mlx4_en: Force user priority by QP attribute Instead of relying on HW to change schedule queue by UP, schedule queue is fixed for a tx_ring, and UP in WQE is ignored in this aspect. This resolves two issues with untagged traffic: 1. untagged traffic has no UP in packet which is needed for QoS. The change above allows setting the schedule queue (and by that the UP) of such a stream. 2. BlueFlame uses the same field used by vlan tag. So forcing UP from QPC allows using BF for untagged but prioritized traffic. In old firmware that force UP is not supported, untagged traffic will not subject to QoS. Because UP is set by QP, need to always have a tx ring per UP, even if pfcrx module paramter is false. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-05 05:08:03 -04:00
Yevgeny Petrilin	ebf8c9aa03	net/mlx4_en: Saving mem access on data path Localized the pdev->dev, and using dma_map instead of pci_map There are multiple map/unmap operations on data path, optimizing those by saving redundant pointer access. Those places were identified as hot-spots when running kernel profiling during some benchmarks. The fixes had most impact when testing packet rate with small packets, reducing several % from CPU load, and in some case being the difference between reaching wire speed or being CPU bound. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-03-06 15:19:17 -05:00
David S. Miller	d5ef8a4d87	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/infiniband/hw/nes/nes_cm.c Simple whitespace conflict. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-10 23:32:28 -05:00
Thadeu Lima de Souza Cascardo	68355f7113	mlx4: allow device removal by fixing dma unmap size After opening the network interface, Mellanox ConnectX device cannot be removed by hotplug because it has not properly unmapped all DMA memory. It happens that mlx4_en_activate_rx_rings overrides the variable that keeps the size of the memory mapped. This is fixed by passing to mlx4_en_destroy_rx_ring the same size that is given to mlx4_en_create_rx_ring. After applying this patch, hot unplugging the device works after opening the interface. Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-02-06 14:42:28 -05:00
Joe Perches	41de8d4cff	drivers/net: Remove alloc_etherdev error messages alloc_etherdev has a generic OOM/unable to alloc message. Remove the duplicative messages after alloc_etherdev calls. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-31 16:20:48 -05:00
Eugenia Emantayev	93ece0c1a7	mlx4_en: eth statistics modification In native mode display all available staticstics. In SRIOV mode on VF display only SW counters statistics, in SRIOV mode on hypervisor display SW counters and errors (got from FW) statistics. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-22 15:08:43 -05:00
Eugenia Emantayev	b477ba628a	mlx4_en: clear all eth statistics when port goes up Bug fix: Not all stats fields were cleared. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Reviewed-by: Yevgeny Petriln <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-01-22 15:08:43 -05:00
Alexander Guller	0e03567a2c	mlx4_en: nullify cached multicast address list after cleanup Solves an issue where we tried to free the same page twice after the port has been opened and closed. Signed-off-by: Alexander Guller <alexg@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-19 14:57:07 -05:00
Eugenia Emantayev	ffe455ad04	mlx4: Ethernet port management modifications The physical port is now common to the PF and VFs. The port resources and configuration is managed by the PF, VFs can only influence the MTU of the port, it is set as max among all functions, Each function allocates RX buffers of required size to meet it's MTU enforcement. Port management code was moved to mlx4_core, as the mlx4_en module is virtualization unaware Move handling qp functionality to mlx4_get_eth_qp/mlx4_put_eth_qp including reserve/release range and add/release unicast steering. Let mlx4_register/unregister_mac deal only with MAC (un)registration. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-13 13:56:07 -05:00
Jiri Pirko	8e586137e6	net: make vlan ndo_vlan_rx_[add/kill]_vid return error value Let caller know the result of adding/removing vlan id to/from vlan filter. In some drivers I make those functions to just return 0. But in those where there is able to see if hw setup went correctly, return value is set appropriately. Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-08 19:52:37 -05:00
Amir Vadai	60d6fe99e4	net/mlx4_en: adding loopback support Device must be in promiscuous mode or DMAC must be same as the host MAC, or else packet will be dropped by the HW rx filtering. Signed-off-by: Amir Vadai <amirv@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-11-27 17:17:04 -05:00
Yevgeny Petrilin	ad86107f7b	mlx4_en: Adding rxhash support Moving to Toeplitz function in RSS calculation. Reporting rxhash in skb. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-19 03:42:27 -04:00
Alexander Guller	4234144f5c	mlx4_en: Fix crash upon device initialization error Netdevice was being freed without being unregistered first if mlx4_SET_PORT_general or mlx4_INIT_PORT failed. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-09 23:42:58 -04:00
Alexander Guller	6b4d8d9fd1	mlx4_en: Adjusting moderation per each ring Moderation is now done per ring and coalescing is enabled by set_ring_param in ethtool. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-09 23:42:57 -04:00
Alexander Guller	fe0af03c69	mlx4_en: Removing reserve vectors Fixed a bug where ring size change caused insufficient memory upon driver restart due to unreleased EQs. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-09 23:42:57 -04:00
Alexander Guller	76532d0c7e	mlx4_en: Assigning TX irq per ring Until now only RX rings used irq per ring and TX used only one per port. >From now on, both of them will use the irq per ring while RX & TX ring[i] will use the same irq. Signed-off-by: Alexander Guller <alexg@mellanox.co.il> Signed-off-by: Sharon Cohen <sharonc@mellanox.co.il> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-10-09 23:42:56 -04:00
Jiri Pirko	afc4b13df1	net: remove use of ndo_set_multicast_list in drivers replace it by ndo_set_rx_mode Signed-off-by: Jiri Pirko <jpirko@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-08-17 20:22:03 -07:00
Jeff Kirsher	5a2cc190eb	mlx4: Move the Mellanox driver Moves the Mellanox driver into drivers/net/ethernet/mellanox/ and make the necessary Kconfig and Makefile changes. CC: Roland Dreier <roland@kernel.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2011-08-11 02:41:35 -07:00

35 commits