Age  Commit message  Author  Files  Lines
2024-02-01net: ena: Relocate skb_tx_timestamp() to improve time stamping accuracyDavid Arinzon1-2/+2
Move skb_tx_timestamp() closer to the actual time the driver sends the packets to the device. Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Add more information on TX timeoutsDavid Arinzon2-14/+64
The function responsible for polling TX completions might not receive the CPU resources it needs due to higher priority tasks running on the requested core.

The driver might not be able to recognize such cases, but it can use its state to suspect that they happened. If both conditions are met:
- napi hasn't been executed for longer than the TX completion timeout value
- napi is scheduled (meaning that we've received an interrupt)

Then it's more likely that the napi handler isn't scheduled because of an overloaded CPU. It was decided that for this case, the driver would wait twice as long as the regular timeout before scheduling a reset. The driver uses ENA_REGS_RESET_SUSPECTED_POLL_STARVATION reset reason to indicate this case to the device.

This patch also adds more information to the ena_tx_timeout() callback. This function is called by the kernel when it detects that a specific TX queue has been closed for too long.

Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
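The decision described in this commit message can be modelled in a few lines. The following is only a sketch in Python, not the driver's C code: the `QueueState` fields, the function name and the returned labels are hypothetical and exist purely for illustration. Only the "wait twice as long when starvation is suspected" rule comes from the message above.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    # All fields are hypothetical stand-ins for driver state.
    napi_scheduled: bool      # an interrupt arrived and napi was scheduled
    ms_since_last_napi: int   # time since the napi handler last ran
    ms_since_tx_posted: int   # age of the oldest uncompleted TX packet


def classify_tx_timeout(q: QueueState, timeout_ms: int) -> str:
    """Return an illustrative verdict for one TX queue."""
    if q.ms_since_tx_posted <= timeout_ms:
        return "ok"

    # Both conditions from the commit message: napi is scheduled but has
    # not run for longer than the TX completion timeout.
    starved = q.napi_scheduled and q.ms_since_last_napi > timeout_ms
    if starved:
        # Suspected CPU starvation: allow twice the regular timeout.
        if q.ms_since_tx_posted <= 2 * timeout_ms:
            return "wait"
        return "reset: suspected poll starvation"

    return "reset: missing tx completion"
```

With a 5000 ms timeout, a queue whose napi is scheduled but has not run for 6000 ms is given more time, and is only reset once its oldest packet is older than 10000 ms.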
2024-02-01net: ena: Change error print during ena_device_init()David Arinzon1-1/+2
The print was re-worded to a more informative one. Signed-off-by: Shahar Itzko <itzko@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Remove CQ tail pointer updateDavid Arinzon5-39/+2
The functionality was added to allow the drivers to create an SQ and CQ of different sizes. When the RX/TX SQ and CQ have the same size, such update isn't necessary as the device can safely assume it doesn't override unprocessed completions. However, if the SQ is larger than the CQ, the device might "have" more completions it wants to update about than there's room in the CQ. There's no support for different SQ and CQ sizes, therefore, removing the API and its usage. '____cacheline_aligned' compiler attribute was added to 'struct ena_com_io_cq' to ensure that the removal of the 'cq_head_db_reg' field doesn't change the cache-line layout of this struct. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Enable DIM by defaultDavid Arinzon1-0/+6
Dynamic Interrupt Moderation (DIM) is a technique designed to balance the need for timely data processing with the desire to minimize CPU overhead. Instead of generating an interrupt for every received packet, the system can dynamically adjust the rate at which interrupts are generated based on the incoming traffic patterns. Enabling DIM by default to improve the user experience. DIM can be turned on/off through ethtool: `ethtool -C <interface> adaptive-rx <on/off>` Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Osama Abboud <osamaabb@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Minor cosmetic changesDavid Arinzon1-4/+2
A few changes for better readability and style
1. Adding / Removing newlines
2. Removing an unnecessary and confusing comment
3. Using an existing variable rather than re-checking a field

Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Add more documentation for RX copybreakDavid Arinzon1-0/+6
This patch contains more details about the functionality of RX copybreak. Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: ena: Remove an unused fieldDavid Arinzon2-4/+0
Remove io_sq->header_addr field because it is no longer in use. LLQ was updated to support a bounce buffer so there is no need to save the header address of the sq. Signed-off-by: Nati Koler <nkoler@amazon.com> Signed-off-by: David Arinzon <darinzon@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01Merge branch 'net-mana-assigning-irq-affinity-on-ht-cores'Paolo Abeni4-10/+113
Souradeep Chakrabarti says: ==================== net: mana: Assigning IRQ affinity on HT cores This patch set introduces a new helper function irq_setup(), to optimize IRQ distribution for MANA network devices. The patch set makes the driver work 15% faster than with cpumask_local_spread(). ==================== Link: https://lore.kernel.org/r/1706509267-17754-1-git-send-email-schakrabarti@linux.microsoft.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: mana: Assigning IRQ affinity on HT coresSouradeep Chakrabarti1-11/+50
Existing MANA design assigns IRQ to every CPU, including sibling hyper-threads. This may cause multiple IRQs to be active simultaneously in the same core and may reduce the network performance.

Improve the performance by assigning IRQ to non sibling CPUs in local NUMA node. The performance improvement we are getting using ntttcp with following patch is around 15 percent against existing design and approximately 11 percent, when trying to assign one IRQ in each core across NUMA nodes, if enough cores are present.

The change will improve the performance for the system with high number of CPU, where number of CPUs in a node is more than 64 CPUs. Nodes with 64 CPUs or less than 64 CPUs will not be affected by this change.

The performance study was done using ntttcp tool in Azure. The node had 2 nodes with 32 cores each, total 128 vCPU and number of channels were 32 for 32 RX rings.

The below table shows a comparison between existing design and new design (each "old | new" pair gives the existing and the new assignment):

IRQ   node-num   core-num   CPU          performance(%)
1     0 | 0      0  | 0     0  | 0-1     0
2     0 | 0      0  | 1     1  | 2-3     3
3     0 | 0      1  | 2     2  | 4-5     10
4     0 | 0      1  | 3     3  | 6-7     15
5     0 | 0      2  | 4     4  | 8-9     15
...
25    0 | 0      12 | 24    24 | 48-49   12
...
32    0 | 0      15 | 31    31 | 62-63   12
33    0 | 0      16 | 0     32 | 0-1     10
...
64    0 | 0      31 | 31    63 | 62-63   0

Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01net: mana: add a function to spread IRQs per CPUsYury Norov1-0/+29
Souradeep investigated that the driver performs faster if IRQs are spread on CPUs with the following heuristics:

1. No more than one IRQ per CPU, if possible;
2. NUMA locality is the second priority;
3. Sibling dislocality is the last priority.

Let's consider this topology:

Node            0               1
Core        0       1       2       3
CPU       0   1   2   3   4   5   6   7

The most performant IRQ distribution based on the above topology and heuristics may look like this:

IRQ     Nodes   Cores   CPUs
0       1       0       0-1
1       1       1       2-3
2       1       0       0-1
3       1       1       2-3
4       2       2       4-5
5       2       3       6-7
6       2       2       4-5
7       2       3       6-7

The irq_setup() routine introduced in this patch leverages the for_each_numa_hop_mask() iterator and assigns IRQs to sibling groups as described above.

According to [1], for NUMA-aware but sibling-ignorant IRQ distribution based on cpumask_local_spread() performance test results look like this:

./ntttcp -r -m 16
NTTTCP for Linux 1.4.0
---------------------------------------------------------
08:05:20 INFO: 17 threads created
08:05:28 INFO: Network activity progressing...
08:06:28 INFO: Test run completed.
08:06:28 INFO: Test cycle finished.
08:06:28 INFO: ##### Totals: #####
08:06:28 INFO: test duration :60.00 seconds
08:06:28 INFO: total bytes :630292053310
08:06:28 INFO: throughput :84.04Gbps
08:06:28 INFO: retrans segs :4
08:06:28 INFO: cpu cores :192
08:06:28 INFO: cpu speed :3799.725MHz
08:06:28 INFO: user :0.05%
08:06:28 INFO: system :1.60%
08:06:28 INFO: idle :96.41%
08:06:28 INFO: iowait :0.00%
08:06:28 INFO: softirq :1.94%
08:06:28 INFO: cycles/byte :2.50
08:06:28 INFO: cpu busy (all) :534.41%

For NUMA- and sibling-aware IRQ distribution, the same test works 15% faster:

./ntttcp -r -m 16
NTTTCP for Linux 1.4.0
---------------------------------------------------------
08:08:51 INFO: 17 threads created
08:08:56 INFO: Network activity progressing...
08:09:56 INFO: Test run completed.
08:09:56 INFO: Test cycle finished.
08:09:56 INFO: ##### Totals: #####
08:09:56 INFO: test duration :60.00 seconds
08:09:56 INFO: total bytes :741966608384
08:09:56 INFO: throughput :98.93Gbps
08:09:56 INFO: retrans segs :6
08:09:56 INFO: cpu cores :192
08:09:56 INFO: cpu speed :3799.791MHz
08:09:56 INFO: user :0.06%
08:09:56 INFO: system :1.81%
08:09:56 INFO: idle :96.18%
08:09:56 INFO: iowait :0.00%
08:09:56 INFO: softirq :1.95%
08:09:56 INFO: cycles/byte :2.25
08:09:56 INFO: cpu busy (all) :569.22%

[1] https://lore.kernel.org/all/20231211063726.GA4977@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net/

Signed-off-by: Yury Norov <yury.norov@gmail.com> Co-developed-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
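The heuristics in this commit message can be illustrated with a small model. This is a sketch in Python, not the kernel's irq_setup(): the function name `spread_irqs` and the nested-list topology format are assumptions made for the example, and the nodes are assumed to be listed nearest first. Each IRQ is given the whole sibling group of one core, and a node receives as many IRQs as it has CPUs before the next node is used.

```python
def spread_irqs(nodes, nr_irqs):
    """Assign IRQs to sibling groups, nearest NUMA node first.

    nodes:   list of nodes; each node is a list of cores; each core is a
             list of sibling CPU numbers (hypothetical input format).
    returns: one tuple of CPUs (the affinity) per assigned IRQ.
    """
    assignment = []
    for cores in nodes:
        # One IRQ per CPU on this node at most.
        budget = sum(len(core) for core in cores)
        i = 0
        while budget > 0 and len(assignment) < nr_irqs:
            # Walk the cores round-robin so sibling threads of a core
            # are only shared after every core already has an IRQ.
            assignment.append(tuple(cores[i % len(cores)]))
            budget -= 1
            i += 1
        if len(assignment) == nr_irqs:
            break
    return assignment


# Topology from the commit message: 2 nodes, 2 cores each, 2 CPUs per core.
topology = [[[0, 1], [2, 3]], [[4, 5], [6, 7]]]
for irq, cpus in enumerate(spread_irqs(topology, 8)):
    print(irq, cpus)
```

For the example topology this yields the CPU column of the table above: IRQs 0-3 land on CPUs 0-1, 2-3, 0-1, 2-3 and IRQs 4-7 on CPUs 4-5, 6-7, 4-5, 6-7.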
2024-02-01cpumask: define cleanup function for cpumasksYury Norov1-0/+3
Now we can simplify code that allocates cpumasks for local needs. Signed-off-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01cpumask: add cpumask_weight_andnot()Yury Norov3-0/+32
Similarly to cpumask_weight_and(), cpumask_weight_andnot() is a handy helper that may help to avoid creating an intermediate mask just to calculate the number of bits that are set in the first given mask and clear in the second one. Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
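What the helper computes can be shown with plain integers used as bitmasks. This is a Python sketch of the arithmetic only, with a hypothetical function name; the kernel helper operates on struct cpumask and is not reproduced here.

```python
def weight_andnot(mask1: int, mask2: int) -> int:
    """Count bits that are set in mask1 and clear in mask2."""
    return bin(mask1 & ~mask2).count("1")


# CPUs 0-3 in the first mask, CPUs 0 and 2 in the second:
# CPUs 1 and 3 remain, so the weight is 2.
print(weight_andnot(0b1111, 0b0101))
```

The two-step alternative (build `mask1 & ~mask2` as a separate mask, then count its bits) gives the same number; the point of the helper is to avoid that temporary mask.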
2024-02-01net: dsa: Add KSZ8567 switch supportPhilippe Schenker5-1/+53
This commit introduces support for the KSZ8567, a robust 7-port Ethernet switch. The KSZ8567 features two RGMII/MII/RMII interfaces, each capable of gigabit speeds, complemented by five 10/100 Mbps MAC/PHYs. Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20240130083419.135763-2-dev@pschenker.ch Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01dt-bindings: net: dsa: Add KSZ8567 switch supportPhilippe Schenker1-0/+1
This commit adds the dt-binding for KSZ8567, a robust 7-port Ethernet switch. The KSZ8567 features two RGMII/MII/RMII interfaces, each capable of gigabit speeds, complemented by five 10/100 Mbps MAC/PHYs. This binding is necessary to set specific capabilities for this switch chip that are necessary due to the ksz dsa driver only accepting specific chip ids. The KSZ8567 is very similar to KSZ9567 however only containing 100 Mbps phys on its downstream ports. Signed-off-by: Philippe Schenker <philippe.schenker@impulsing.ch> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20240130083419.135763-1-dev@pschenker.ch Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01dt-bindings: net: qcom,ipa: do not override firmware-name $refKrzysztof Kozlowski1-1/+1
dtschema package defines firmware-name as string-array, so individual bindings should not make it a string but instead just narrow the number of expected firmware file names. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Alex Elder <elder@linaro.org> Link: https://lore.kernel.org/r/20240129142121.102450-1-krzysztof.kozlowski@linaro.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-02-01Merge branch 'tools-net-ynl-add-features-for-tc-family'Jakub Kicinski7-187/+2221
Donald Hunter says:

====================
tools/net/ynl: Add features for tc family

Add features to ynl for tc and update the tc spec to use them.

Patch 1 adds an option to output json instead of python pretty printing.
Patches 2 and 3 add support and docs for sub-messages in nested attribute spaces that reference keys from a parent space.
Patches 4 and 7-9 refactor ynl in support of nested struct definitions.
Patch 5 implements sub-message encoding for write ops.
Patch 6 adds logic to set default zero values for binary blobs.
Patches 10 and 11 add support and docs for nested struct definitions.
Patch 12 updates the ynl doc generator to include type information for struct members.
Patch 13 updates the tc spec - still a work in progress but more complete.
====================

Link: https://lore.kernel.org/r/20240129223458.52046-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01doc/netlink/specs: Update the tc specDonald Hunter1-101/+2017
Fill in many of the gaps in the tc netlink spec, including stats attrs, classes and actions. Many documentation strings have also been added. This is still a work in progress, albeit fairly complete:
- there are still many attributes left as binary blobs.
- actions have not had much testing

Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-14-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Add type info to struct members in generated docsDonald Hunter1-1/+8
Extend the ynl doc generator to include type information for struct members, ignoring the pad type. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-13-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01doc/netlink: Describe nested structs in netlink raw docsDonald Hunter1-0/+34
Add a description and example of nested struct definitions to the netlink raw documentation. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-12-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Add support for nested structsDonald Hunter3-9/+34
Make it possible for struct definitions to reference other struct definitions for binary members. For example, the tbf qdisc uses this struct definition for its parms attribute:

  -
    name: tc-tbf-qopt
    type: struct
    members:
      -
        name: rate
        type: binary
        struct: tc-ratespec
      -
        name: peakrate
        type: binary
        struct: tc-ratespec
      -
        name: limit
        type: u32
      -
        name: buffer
        type: u32
      -
        name: mtu
        type: u32

This adds the necessary schema changes and adds nested struct encoding and decoding to ynl.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-11-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
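Nested struct encoding and decoding of this kind can be sketched as a recursive walk over the member list. The code below is not the ynl implementation: the `STRUCTS` table, the function names, the little-endian packing and the shortened `tc-ratespec` member list are all assumptions for illustration. Only the `tc-tbf-qopt` members are taken from the commit message.

```python
import struct

# Scalar type -> struct format character (little-endian is assumed here).
FORMATS = {"u8": "B", "u16": "H", "u32": "I"}

STRUCTS = {
    # Simplified stand-in, NOT the real tc_ratespec layout.
    "tc-ratespec": [
        {"name": "cell-log", "type": "u8"},
        {"name": "linklayer", "type": "u8"},
        {"name": "overhead", "type": "u16"},
        {"name": "rate", "type": "u32"},
    ],
    # Members as listed in the commit message.
    "tc-tbf-qopt": [
        {"name": "rate", "type": "binary", "struct": "tc-ratespec"},
        {"name": "peakrate", "type": "binary", "struct": "tc-ratespec"},
        {"name": "limit", "type": "u32"},
        {"name": "buffer", "type": "u32"},
        {"name": "mtu", "type": "u32"},
    ],
}


def encode_struct(name, values):
    """Pack a dict into bytes, recursing into nested struct members."""
    out = b""
    for member in STRUCTS[name]:
        value = values.get(member["name"])
        if member["type"] == "binary" and "struct" in member:
            out += encode_struct(member["struct"], value or {})
        else:
            # Missing scalars default to zero.
            out += struct.pack("<" + FORMATS[member["type"]], value or 0)
    return out


def decode_struct(name, data, offset=0):
    """Unpack bytes into a dict; returns (dict, next offset)."""
    result = {}
    for member in STRUCTS[name]:
        if member["type"] == "binary" and "struct" in member:
            result[member["name"]], offset = decode_struct(
                member["struct"], data, offset)
        else:
            fmt = "<" + FORMATS[member["type"]]
            (result[member["name"]],) = struct.unpack_from(fmt, data, offset)
            offset += struct.calcsize(fmt)
    return result, offset


blob = encode_struct("tc-tbf-qopt", {"rate": {"rate": 125000}, "limit": 10000})
decoded, _ = decode_struct("tc-tbf-qopt", blob)
print(len(blob), decoded["rate"]["rate"], decoded["limit"])
```

The nested member is handled by the same function that handles the outer struct, which is the main design point: a struct reference is just another member type.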
2024-02-01tools/net/ynl: Move formatted_string method out of NlAttrDonald Hunter1-16/+15
The formatted_string() class method was in NlAttr so that it could be accessed by NlAttr.as_struct(). Now that as_struct() has been removed, move formatted_string() to YnlFamily as an internal helper method. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-10-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Rename _fixed_header_size() to _struct_size()Donald Hunter1-6/+6
Refactor the _fixed_header_size() method to be _struct_size() so that naming is consistent with _encode_struct() and _decode_struct(). Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-9-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Combine struct decoding logic in ynlDonald Hunter1-33/+14
_decode_fixed_header() and NlAttr.as_struct() both implemented struct decoding logic. Deduplicate the code into newly named _decode_struct() method. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-8-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Encode default values for binary blobsDonald Hunter1-2/+7
Add support for defaulting binary byte arrays to all zeros as well as defaulting scalar values to 0 when encoding input parameters. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-7-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Add support for encoding sub-messagesDonald Hunter1-4/+23
Add sub-message encoding to ynl. This makes it possible to create tc qdiscs and other polymorphic netlink objects. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-6-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Refactor fixed header encoding into separate methodDonald Hunter1-11/+15
Refactor the fixed header encoding into a separate _encode_struct method so that it can be reused for fixed headers in sub-messages and for encoding structs. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-5-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01doc/netlink: Describe sub-message selector resolutionDonald Hunter1-0/+8
Update the netlink-raw docs to add a description of sub-message selector resolution to explain that selector resolution is constrained by the spec. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-4-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01tools/net/ynl: Support sub-messages in nested attribute spacesDonald Hunter1-9/+29
Sub-message selectors could only be resolved using values from the current nest level. Enable value lookup in outer scopes by using collections.ChainMap to implement an ordered lookup from nested to outer scopes. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
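The lookup order that collections.ChainMap provides can be seen in a short Python sketch. The attribute names and values below are hypothetical; only the use of ChainMap for nested-to-outer lookup comes from the commit message.

```python
from collections import ChainMap

# Outer attribute space: carries the selector value.
outer = ChainMap({"kind": "tbf", "ifindex": 3})

# Entering a nest pushes a new, initially inner-only scope.
nested = outer.new_child({"limit": 10000})

# A selector missing from the current nest is found in the outer scope.
print(nested["kind"])

# A value defined in the inner scope shadows the outer one.
shadowing = outer.new_child({"kind": "netem"})
print(shadowing["kind"])
```

Leaving a nest needs no cleanup: the outer ChainMap is unchanged, since new_child() returns a new view rather than mutating the parent.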
2024-02-01tools/net/ynl: Add --output-json arg to ynl cliDonald Hunter1-3/+19
The ynl cli currently emits python pretty printed structures which is hard to consume. Add a new --output-json argument to emit JSON. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240129223458.52046-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
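Why Python pretty printing is hard to consume can be shown with the standard library alone. This sketch does not reproduce the ynl cli code; the `reply` dict is a made-up example of a decoded message.

```python
import json
import pprint

# Hypothetical decoded reply.
reply = {"ifname": "eth0", "up": True, "alias": None}

pretty = pprint.pformat(reply)   # Python repr: single quotes, True, None
as_json = json.dumps(reply)      # JSON: double quotes, true, null

print(pretty)
print(as_json)

# The JSON text round-trips through any JSON parser ...
round_trip = json.loads(as_json)

# ... while the pretty-printed text is not valid JSON.
try:
    json.loads(pretty)
    pretty_is_json = True
except ValueError:
    pretty_is_json = False
```

JSON output can be piped into tools such as jq or parsed from other languages, which is the consumption problem the commit message refers to.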
2024-02-01selftests: Declare local variable for pause in fcnal-test.shDavid Ahern1-4/+5
Running fcnal-test.sh script with -P argument is causing test failures:

  $ ./fcnal-test.sh -t ping -P
  TEST: ping out - ns-B IP                                   [ OK ]

  hit enter to continue, 'q' to quit

  fcnal-test.sh: line 106: [: ping: integer expression expected
  TEST: out,                                                 [FAIL]
      expected rc ping; actual rc 0

  hit enter to continue, 'q' to quit

The test functions use local variable 'a' for addresses and then log_test is also using 'a' without a local declaration. Fix by declaring a local variable and using 'ans' (for answer) in the read.

Signed-off-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20240130154327.33848-1-dsahern@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01Merge branch 'af_unix-remove-io_uring-dead-code-in-gc'Jakub Kicinski8-209/+143
Kuniyuki Iwashima says: ==================== af_unix: Remove io_uring dead code in GC. I will post another series that rewrites the garbage collector for AF_UNIX socket. This is a prep series to clean up changes to GC made by io_uring but now not necessary. ==================== Link: https://lore.kernel.org/r/20240129190435.57228-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01af_unix: Remove CONFIG_UNIX_SCM.Kuniyuki Iwashima8-175/+137
Originally, the code related to garbage collection was all in garbage.c. Commit f4e65870e5ce ("net: split out functions related to registering inflight socket files") moved some functions to scm.c for io_uring and added CONFIG_UNIX_SCM just in case AF_UNIX was built as module. However, since commit 97154bcf4d1b ("af_unix: Kconfig: make CONFIG_UNIX bool"), AF_UNIX is no longer built separately. Also, io_uring does not support SCM_RIGHTS now. Let's move the functions back to garbage.c Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20240129190435.57228-4-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01af_unix: Remove io_uring code for GC.Kuniyuki Iwashima3-30/+2
Since commit 705318a99a13 ("io_uring/af_unix: disable sending io_uring over sockets"), io_uring's unix socket cannot be passed via SCM_RIGHTS, so it does not contribute to cyclic references and is no longer a candidate for garbage collection. Also, commit 6e5e6d274956 ("io_uring: drop any code related to SCM_RIGHTS") cleaned up SCM_RIGHTS code in io_uring. Let's do it in AF_UNIX as well by reverting commit 0091bfc81741 ("io_uring/af_unix: defer registered files gc to io_uring release") and commit 10369080454d ("net: reclaim skb->scm_io_uring bit"). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20240129190435.57228-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01af_unix: Replace BUG_ON() with WARN_ON_ONCE().Kuniyuki Iwashima2-8/+8
This is a prep patch for the last patch in this series so that checkpatch will not warn about BUG_ON(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Jens Axboe <axboe@kernel.dk> Link: https://lore.kernel.org/r/20240129190435.57228-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: bridge: Use KMEM_CACHE instead of kmem_cache_createKunwu Chan1-4/+1
commit 0a31bd5f2bbb ("KMEM_CACHE(): simplify slab cache creation") introduces a new macro. Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240130092536.73623-1-chentao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: ipv4: Simplify the allocation of slab caches in inet_initpeersKunwu Chan1-4/+1
commit 0a31bd5f2bbb ("KMEM_CACHE(): simplify slab cache creation") introduces a new macro. Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240130092255.73078-1-chentao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01Merge branch 'net-phy-split-at803x'Jakub Kicinski10-2766/+2927
Christian Marangi says: ==================== net: phy: split at803x This is the last patchset of a long series of cleanup and preparation to make at803x more maintainable and permit the addition of other QCOM PHY families. A shared library module is created since many QCOM PHYs share similar or identical implementations that can be reused. This series doesn't introduce any new code but just moves functions around and introduces a new module for all the functions that are shared between the 3 different PHY families. Since the drivers are actually detached, new probe functions are introduced that allocate the specific priv struct for the PHYs. After this patch, qca808x will be further generalized as LED and cable test functions are also used by the QCA807x PHYs. This is just for reference and the additional function move will be done in the related specific series. This is also needed in preparation for the introduction of the qca807x PHY family and the PHY package concept. ==================== Link: https://lore.kernel.org/r/20240129141600.2592-1-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: qcom: detach qca808x PHY driver from at803xChristian Marangi4-896/+942
Almost all the QCA8081 PHY driver OPs are specific and only some of them use the generic at803x. To make the at803x code slimmer, move all the specific qca808x regs and functions to a dedicated PHY driver. Probe function and priv struct is reworked to allocate and use only the qca808x specific data. Unused data from at803x PHY driver are dropped from at803x priv struct. Also a new Kconfig is introduced QCA808X_PHY, to compile the newly introduced PHY driver for QCA8081 PHY. As the Kconfig name starts with Qualcomm the same order is kept. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240129141600.2592-6-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: qcom: move additional functions to shared libraryChristian Marangi3-425/+463
Move additional functions to shared library in preparation for qca808x PHY Family to be detached from at803x driver. Only the shared defines are moved to the shared qcom.h header. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240129141600.2592-5-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: qcom: deatch qca83xx PHY driver from at803xChristian Marangi4-238/+284
Detach qca83xx PHY driver from at803x. The QCA83xx PHYs implement specific functions and don't use generic at803x, so the driver can be detached and moved to a dedicated one. Probe function and priv struct are reimplemented to allocate and use only the qca83xx specific data. Unused data from at803x PHY driver are dropped from at803x priv struct. This is to make slimmer PHY drivers instead of including lots of bloat that would never be used in specific SoC. A new Kconfig flag QCA83XX_PHY is introduced to compile the newly introduced PHY driver. As the Kconfig name starts with Qualcomm the same order is kept. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240129141600.2592-4-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: qcom: create and move functions to shared libraryChristian Marangi5-67/+94
Create and move functions to shared library in preparation for qca83xx PHY Family to be detached from at803x driver. Only the shared defines are moved to the shared qcom.h header. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240129141600.2592-3-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: move at803x PHY driver to dedicated directoryChristian Marangi5-7/+11
In preparation for addition of other Qcom PHY and to tidy things up, move the at803x PHY driver to dedicated directory. The same order in the Kconfig selection is saved. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240129141600.2592-2-ansuelsmth@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01Merge branch 'prevent-nullptr-exceptions-in-isr'Jakub Kicinski2-5/+63
Andre Werner says: ==================== Prevent nullptr exceptions in ISR In case phydev->irq is modified unconditionally to a valid IRQ, handling the IRQ may lead to a nullptr exception if no interrupt handler is registered to the phy driver. phy_interrupt calls a phy_device->handle_interrupt unconditionally. And interrupts are enabled in phy_connect_direct if phydev->irq is not equal to PHY_POLL or PHY_MAC_INTERRUPT, so it does not check for a phy driver providing an ISR. Adding an additional check for a valid interrupt handler in phy_attach_direct function, and falling back to polling mode if not, should prevent such nullptr exceptions. Moreover, the ADIN1100 phy driver is extended with an interrupt handler for changes in the link status. ==================== Link: https://lore.kernel.org/r/20240129135734.18975-1-andre.werner@systec-electronic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: adin1100: Add interrupt support for link changeAndre Werner1-0/+55
An interrupt handler was added to the driver as well as functions to enable interrupts at the phy. There are several interrupts maskable at the phy, but only link change interrupts are handled by the driver yet. Signed-off-by: Andre Werner <andre.werner@systec-electronic.com> Link: https://lore.kernel.org/r/20240129135734.18975-3-andre.werner@systec-electronic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01net: phy: phy_device: Prevent nullptr exceptions on ISRAndre Werner1-5/+8
If phydev->irq is set unconditionally, check for valid interrupt handler or fall back to polling mode to prevent nullptr exceptions in interrupt service routine. Signed-off-by: Andre Werner <andre.werner@systec-electronic.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240129135734.18975-2-andre.werner@systec-electronic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
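The fallback rule described here can be modelled as a single decision. This is a Python sketch of the logic only, not the kernel C code: the function name is hypothetical, and the two sentinel values are assumed to mirror the kernel's PHY_POLL and PHY_MAC_INTERRUPT constants.

```python
# Assumed sentinel values (believed to match include/linux/phy.h).
PHY_POLL = -1
PHY_MAC_INTERRUPT = -2


def effective_irq(requested_irq: int, driver_has_handler: bool) -> int:
    """Return the IRQ mode to use after the attach-time check."""
    irq_is_real = requested_irq not in (PHY_POLL, PHY_MAC_INTERRUPT)
    if irq_is_real and not driver_has_handler:
        # No interrupt handler to call: poll instead of risking a
        # NULL dereference in the interrupt service routine.
        return PHY_POLL
    return requested_irq


print(effective_irq(42, driver_has_handler=False))
print(effective_irq(42, driver_has_handler=True))
```

A driver with a handler keeps the requested IRQ; a driver without one is quietly moved to polling, and the two sentinel modes pass through unchanged.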
2024-02-01ptp: lan743x: Use spin_lock instead of spin_lock_bhLucas Tanure1-2/+2
lan743x_ptp_request_tx_timestamp uses spin_lock_bh, but it is only called from lan743x_tx_xmit_frame where all IRQs are already disabled. This fixes the "IRQs not enabled as expected" warning. Signed-off-by: Lucas Tanure <tanure@linux.com> Link: https://lore.kernel.org/r/20240128101849.107298-1-tanure@linux.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-01dpll: move xa_erase() call in to match dpll_pin_alloc() error path orderJiri Pirko1-1/+1
This is cosmetics. Move the call of xa_erase() in dpll_pin_put() so the order of cleanup calls matches the error path of dpll_pin_alloc(). Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20240130155814.268622-1-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-31selftests/net: calibrate txtimestampWillem de Bruijn1-5/+7
The test sends packets and compares enqueue, transmit and Ack timestamps with expected values. It installs netem delays to increase latency between these points. The test proves flaky in virtual environment (vng). Increase the delays to reduce variance. Scale measurement tolerance accordingly. Time sensitive tests are difficult to calibrate. Increasing delays 10x also increases runtime 10x, for one. And it may still prove flaky at some rate. Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240127023212.3746239-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-31Merge tag 'nf-next-24-01-29' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextDavid S. Miller12-53/+94
Florian Westphal says: ==================== nf-next pr 2024-01-29 This batch contains updates for your *next* tree. First three changes, from Phil Sutter, allow userspace to define a table that is exclusively owned by a daemon (via netlink socket aliveness) without auto-removing this table when the userspace program exits. Such table gets marked as orphaned and a restarting management daemon may re-attach/reassume ownership. Next patch, from Pablo, passes already-validated flags variable around rather than having called code re-fetch it from netlink message. Patches 5 and 6 update ipvs and nf_conncount to use the recently introduced KMEM_CACHE() macro. Last three patches, from myself, tweak kconfig logic a little to permit a kernel configuration that can run iptables-over-nftables but not classic (setsockopt) iptables. Such builds lack the builtin-filter/mangle/raw/nat/security tables, the set/getsockopt interface and the "old blob format" interpreter/traverser. For now, this is 'oldconfig friendly', users need to manually deselect existing config options for this. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>