Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 0b4eeee2876f2b08442eb32081451bf130e01a4c ]
Currently the TBU driver will only probe when CONFIG_ARM_SMMU_QCOM_DEBUG
is enabled. The driver not probing would prevent the platform to reach
sync_state and the system will remain in sub-optimal power consumption
mode while waiting for all consumer drivers to probe. To address this,
let's register the TBU driver in qcom_smmu_impl_init(), so that it can
probe, but still enable its functionality only when the debug option in
Kconfig is enabled.
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Closes: https://lore.kernel.org/r/CAA8EJppcXVu72OSo+OiYEiC1HQjP3qCwKMumOsUhcn6Czj0URg@mail.gmail.com
Fixes: 414ecb030870 ("iommu/arm-smmu-qcom-debug: Add support for TBUs")
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/20240704010759.507798-1-quic_c_gdjako@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d3867e7148318e12b5d69b64950622f5ed06fe86 ]
Static checker is complaining about the ASID possibly set uninitialized.
This only happens in case of error and this value would be ignored anyway.
A simple fix would be just to initialize the local variable to zero,
this path will only be reached on the first attach to a domain where
the CD is already initialized to zero.
This avoids having to bloat the function with an error path.
Closes: https://lore.kernel.org/linux-iommu/849e3d77-0a3c-43c4-878d-a0e061c8cd61@moroto.mountain/T/#u
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Fixes: 04905c17f648 ("iommu/arm-smmu-v3: Build the whole CD in arm_smmu_make_s1_cd()")
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240604185218.2602058-1-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
into next
|
|
It turns out kconfig has problems ensuring the SMMU module and the KUNIT
module are consistently y/m to allow linking. It will permit KUNIT to be a
module while SMMU is built in.
Also, Fedora apparently enables kunit on production kernels.
So, put the entire kunit in its own module using the
VISIBLE_IF_KUNIT/EXPORT_SYMBOL_IF_KUNIT machinery. This keeps it out of
vmlinus on Fedora and makes the kconfig work in the normal way. There is
no cost if kunit is disabled.
Fixes: 56e1a4cc2588 ("iommu/arm-smmu-v3: Add unit tests for arm_smmu_write_entry")
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/all/aeea8546-5bce-4c51-b506-5d2008e52fef@leemhuis.info
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/0-v1-24cba6c0f404+2ae-smmu_kunit_module_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
This was missed because of the function pointer indirection.
nvidia_smmu_context_fault() is also installed as a irq function, and the
'void *' was changed to a struct arm_smmu_domain. Since the iommu_domain
is embedded at a non-zero offset this causes nvidia_smmu_context_fault()
to miscompute the offset. Fixup the types.
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000120
Mem abort info:
ESR = 0x0000000096000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=0000000107c9f000
[0000000000000120] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in:
CPU: 1 PID: 47 Comm: kworker/u25:0 Not tainted 6.9.0-0.rc7.58.eln136.aarch64 #1
Hardware name: Unknown NVIDIA Jetson Orin NX/NVIDIA Jetson Orin NX, BIOS 3.1-32827747 03/19/2023
Workqueue: events_unbound deferred_probe_work_func
pstate: 604000c9 (nZCv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : nvidia_smmu_context_fault+0x1c/0x158
lr : __free_irq+0x1d4/0x2e8
sp : ffff80008044b6f0
x29: ffff80008044b6f0 x28: ffff000080a60b18 x27: ffffd32b5172e970
x26: 0000000000000000 x25: ffff0000802f5aac x24: ffff0000802f5a30
x23: ffff0000802f5b60 x22: 0000000000000057 x21: 0000000000000000
x20: ffff0000802f5a00 x19: ffff000087d4cd80 x18: ffffffffffffffff
x17: 6234362066666666 x16: 6630303078302d30 x15: ffff00008156d888
x14: 0000000000000000 x13: ffff0000801db910 x12: ffff00008156d6d0
x11: 0000000000000003 x10: ffff0000801db918 x9 : ffffd32b50f94d9c
x8 : 1fffe0001032fda1 x7 : ffff00008197ed00 x6 : 000000000000000f
x5 : 000000000000010e x4 : 000000000000010e x3 : 0000000000000000
x2 : ffffd32b51720cd8 x1 : ffff000087e6f700 x0 : 0000000000000057
Call trace:
nvidia_smmu_context_fault+0x1c/0x158
__free_irq+0x1d4/0x2e8
free_irq+0x3c/0x80
devm_free_irq+0x64/0xa8
arm_smmu_domain_free+0xc4/0x158
iommu_domain_free+0x44/0xa0
iommu_deinit_device+0xd0/0xf8
__iommu_group_remove_device+0xcc/0xe0
iommu_bus_notifier+0x64/0xa8
notifier_call_chain+0x78/0x148
blocking_notifier_call_chain+0x4c/0x90
bus_notify+0x44/0x70
device_del+0x264/0x3e8
pci_remove_bus_device+0x84/0x120
pci_remove_root_bus+0x5c/0xc0
dw_pcie_host_deinit+0x38/0xe0
tegra_pcie_config_rp+0xc0/0x1f0
tegra_pcie_dw_probe+0x34c/0x700
platform_probe+0x70/0xe8
really_probe+0xc8/0x3a0
__driver_probe_device+0x84/0x160
driver_probe_device+0x44/0x130
__device_attach_driver+0xc4/0x170
bus_for_each_drv+0x90/0x100
__device_attach+0xa8/0x1c8
device_initial_probe+0x1c/0x30
bus_probe_device+0xb0/0xc0
deferred_probe_work_func+0xbc/0x120
process_one_work+0x194/0x490
worker_thread+0x284/0x3b0
kthread+0xf4/0x108
ret_from_fork+0x10/0x20
Code: a9b97bfd 910003fd a9025bf5 f85a0035 (b94122a1)
Cc: stable@vger.kernel.org
Fixes: e0976331ad11 ("iommu/arm-smmu: Pass arm_smmu_domain to internal functions")
Reported-by: Jerry Snitselaar <jsnitsel@redhat.com>
Closes: https://lore.kernel.org/all/jto5e3ili4auk6sbzpnojdvhppgwuegir7mpd755anfhwcbkfz@2u5gh7bxb4iv
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
Acked-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/0-v1-24ce064de41f+4ac-nvidia_smmu_fault_jgg@nvidia.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Add tests for some of the more common STE update operations that we expect
to see, as well as some artificial STE updates to test the edges of
arm_smmu_write_entry. These also serve as a record of which common
operation is expected to be hitless, and how many syncs they require.
arm_smmu_write_entry implements a generic algorithm that updates an STE/CD
to any other abritrary STE/CD configuration. The update requires a
sequence of write+sync operations with some invariants that must be held
true after each sync. arm_smmu_write_entry lends itself well to
unit-testing since the function's interaction with the STE/CD is already
abstracted by input callbacks that we can hook to introspect into the
sequence of operations. We can use these hooks to guarantee that
invariants are held throughout the entire update operation.
Link: https://lore.kernel.org/r/20240106083617.1173871-3-mshavit@google.com
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/9-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Half the code was living in arm_smmu_domain_finalise_s1(), just move it
here and take the values directly from the pgtbl_ops instead of storing
copies.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Pull all the calculations for building the CD table entry for a mmu_struct
into arm_smmu_make_sva_cd().
Call it in the two places installing the SVA CD table entry.
Open code the last caller of arm_smmu_update_ctx_desc_devices() and remove
the function.
Remove arm_smmu_write_ctx_desc() since all callers are gone. Add the
locking assertions to arm_smmu_alloc_cd_ptr() since
arm_smmu_update_ctx_desc_devices() was the last problematic caller.
Remove quiet_cd since all users are gone, arm_smmu_make_sva_cd() creates
the same value.
The behavior of quiet_cd changes slightly, the old implementation edited
the CD in place to set CTXDESC_CD_0_TCR_EPD0 assuming it was a SVA CD
entry. This version generates a full CD entry with a 0 TTB0 and relies on
arm_smmu_write_cd_entry() to install it hitlessly.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Avoid arm_smmu_attach_dev() having to undo the changes to the
smmu_domain->devices list, acquire the cdptr earlier so we don't need to
handle that error.
Now there is a clear break in arm_smmu_attach_dev() where all the
prep-work has been done non-disruptively and we commit to making the HW
change, which cannot fail.
This completes transforming arm_smmu_attach_dev() so that it does not
disturb the HW if it fails.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Only the attach callers can perform an allocation for the CD table entry,
the other callers must not do so, they do not have the correct locking and
they cannot sleep. Split up the functions so this is clear.
arm_smmu_get_cd_ptr() will return pointer to a CD table entry without
doing any kind of allocation.
arm_smmu_alloc_cd_ptr() will allocate the table and any required
leaf.
A following patch will add lockdep assertions to arm_smmu_alloc_cd_ptr()
once the restructuring is completed and arm_smmu_alloc_cd_ptr() is never
called in the wrong context.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
A cleared entry is all 0's. Make arm_smmu_clear_cd() do this sequence.
If we are clearing an entry and for some reason it is not already
allocated in the CD table then something has gone wrong.
Remove case (5) from arm_smmu_write_ctx_desc().
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Introduce arm_smmu_make_s1_cd() to build the CD from the paging S1 domain,
and reorganize all the places programming S1 domain CD table entries to
call it.
Split arm_smmu_update_s1_domain_cd_entry() from
arm_smmu_update_ctx_desc_devices() so that the S1 path has its own call
chain separate from the unrelated SVA path.
arm_smmu_update_s1_domain_cd_entry() only works on S1 domains attached to
RIDs and refreshes all their CDs. Remove case (3) from
arm_smmu_write_ctx_desc() as it is now handled by directly calling
arm_smmu_write_cd_entry().
Remove the forced clear of the CD during S1 domain attach,
arm_smmu_write_cd_entry() will do this automatically if necessary.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
[will: Drop unused arm_smmu_clean_cd_entry() function]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
CD table entries and STE's have the same essential programming sequence,
just with different types. Use the new ops indirection to link CD
programming to the common writer.
In a few more patches all CD writers will call an appropriate make
function and then directly call arm_smmu_write_cd_entry().
arm_smmu_write_ctx_desc() will be removed.
Until then lightly tweak arm_smmu_write_ctx_desc() to also use the new
programmer by using the same logic as right now to build the target CD on
the stack, sanitizing it to meet the used rules, and then using the
writer.
Sanitizing is necessary because the writer expects that the currently
programmed CD follows the used rules. Next patches add new make functions
and new direct calls to arm_smmu_write_cd_entry() which will require this.
Signed-off-by: Michael Shavit <mshavit@google.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Prepare to put the CD code into the same mechanism. Add an ops indirection
around all the STE specific code and make the worker functions independent
of the entry content being processed.
get_used and sync ops are provided to hook the correct code.
Signed-off-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v9-5040dc602008+177d7-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The TBU support is now available, so let's allow it to be used on other
platforms that have the Qualcomm SMMU-500 implementation with TBUs. This
will allow the context fault handler to query the TBUs when a context
fault occurs.
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/20240417133731.2055383-7-quic_c_gdjako@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The sdm845 platform now supports TBUs, so let's get additional debug
info from the TBUs when a context fault occurs. Implement a custom
context fault handler that does both software + hardware page table
walks and TLB Invalidate All.
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/20240417133731.2055383-5-quic_c_gdjako@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Threaded IRQ handlers run in a less critical context compared to normal
IRQs, so they can perform more complex and time-consuming operations
without causing significant delays in other parts of the kernel.
During a context fault, it might be needed to do more processing and
gather debug information from TBUs in the handler. These operations may
sleep, so add an option to use a threaded IRQ handler in these cases.
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/20240417133731.2055383-4-quic_c_gdjako@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Operating the TBUs (Translation Buffer Units) from Linux on Qualcomm
platforms can help with debugging context faults. To help with that,
the TBUs can run ATOS (Address Translation Operations) to manually
trigger address translation of IOVA to physical address in hardware
and provide more details when a context fault happens.
The driver will control the resources needed by the TBU to allow
running the debug operations such as ATOS, check for outstanding
transactions, do snapshot capture etc.
Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com>
Link: https://lore.kernel.org/r/20240417133731.2055383-3-quic_c_gdjako@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
If devm_add_action() returns -ENOMEM, then MSIs are allocated but not
not freed on teardown. Use devm_add_action_or_reset() instead to keep
the static analyser happy.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Signed-off-by: Aleksandr Aprelkov <aaprelkov@usergate.com>
Link: https://lore.kernel.org/r/20240403053759.643164-1-aaprelkov@usergate.com
[will: Tweak commit message, remove warning message]
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Now that the BLOCKED and IDENTITY behaviors are managed with their own
domains change to the domain_alloc_paging() op.
The check for using_legacy_binding is now redundant,
arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
mode, so the core code will never attempt to create a DMA domain in the
first place.
Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
not pass in IDENTITY or BLOCKED if the global statics exist, so the test
for DMA is also redundant now too.
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-3632c65678e0+2f1-smmu_alloc_paging_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Existing remove_dev_pasid() callbacks of the underlying iommu drivers
get the attached domain from the group->pasid_array. However, the domain
stored in group->pasid_array is not always correct in all scenarios.
A wrong domain may result in failure in remove_dev_pasid() callback.
To avoid such problems, it is more reliable to pass the domain to the
remove_dev_pasid() op.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240328122958.83332-3-yi.l.liu@intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Instead of passing a naked __le16 * around to represent a CD table entry
wrap it in a "struct arm_smmu_cd" with an array of the correct size. This
makes it much clearer which functions will comprise the "CD API".
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
At this point we know which master we are going to change the PCI config
on, this is the only device we need to invalidate. Switch
arm_smmu_atc_inv_domain() for arm_smmu_atc_inv_master().
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The SVA code is wired to assume that the SVA is programmed onto the
mm->pasid. The current core code always does this, so it is fine.
Add a check for clarity.
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v6-228e7adf25eb+4155-smmuv3_newapi_p2_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The disable_bypass parameter has been mostly meaningless for a long time
since the introduction of default domains. Its original intent is now
fulfilled by the controls users have over the default domain type, and
its remaining effect in the brief window between Stream Table
initialisation and default domain creation hardly seems worth the
complication. Furthermore, thanks to 2-level Stream Tables, disabling
disable_bypass (there's another reason not to like it right there) has
never guaranteed that any particular StreamID *will* bypass anyway - any
device which might actually care about that wants RMRs - so there's not
really much lost by taking away that option (which has already been
non-default for nearing 6 years now).
As part of this, also remove the weird behaviour where we "successfully"
probe and register a non-functional SMMU if the DT "#iommu-cells"
property is wrong. I have no memory of what possessed me to think that
was a good idea at the time, and by now I suspect it's likely to break
things worse than simply failing probe would.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Link: https://lore.kernel.org/r/ea3ac4cd595a81b5511729601b2f7d4668178438.1712335927.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
STE attributes(NSCFG, PRIVCFG, INSTCFG) use value 0 for "Use Icomming",
for some reason SHCFG doesn't follow that, and it is defined as "0b01".
Currently the driver sets SHCFG to Use Incoming for stage-2 and bypass
domains.
However according to the User Manual (ARM IHI 0070 F.b):
When SMMU_IDR1.ATTR_TYPES_OVR == 0, this field is RES0 and the
incoming Shareability attribute is used.
This patch adds a condition for writing SHCFG to Use incoming to be
compliant with the architecture, and defines ATTR_TYPE_OVR as a new
feature discovered from IDR1.
This also required to propagate the SMMU through some functions args.
There is no need to add similar condition for the newly introduced function
arm_smmu_get_ste_used() as the values of the STE are the same before and
after any transition, so this will not trigger any change. (we already
do the same for the VMID).
Although this is a misconfiguration from the driver, this has been there
for a long time, so probably no HW running Linux is affected by it.
Reported-by: Will Deacon <will@kernel.org>
Closes: https://lore.kernel.org/all/20240215134952.GA690@willie-the-truck/
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240323134658.464743-1-smostafa@google.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
STRTAB_STE_0_V is a CPU value, it needs conversion for sparse to be clean.
The missing annotation was a mistake introduced by splitting the ops out
from the STE writer.
Fixes: 7da51af9125c ("iommu/arm-smmu-v3: Make STE programming independent of the callers")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202403011441.5WqGrYjp-lkp@intel.com/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-98b23ebb0c84+9f-smmu_cputole_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull iommu updates from Joerg Roedel:
"Core changes:
- Constification of bus_type pointer
- Preparations for user-space page-fault delivery
- Use a named kmem_cache for IOVA magazines
Intel VT-d changes from Lu Baolu:
- Add RBTree to track iommu probed devices
- Add Intel IOMMU debugfs document
- Cleanup and refactoring
ARM-SMMU Updates from Will Deacon:
- Device-tree binding updates for a bunch of Qualcomm SoCs
- SMMUv2: Support for Qualcomm X1E80100 MDSS
- SMMUv3: Significant rework of the driver's STE manipulation and
domain handling code. This is the initial part of a larger scale
rework aiming to improve the driver's implementation of the
IOMMU-API in preparation for hooking up IOMMUFD support.
AMD-Vi Updates:
- Refactor GCR3 table support for SVA
- Cleanups
Some smaller cleanups and fixes"
* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu: Fix compilation without CONFIG_IOMMU_INTEL
iommu/amd: Fix sleeping in atomic context
iommu/dma: Document min_align_mask assumption
iommu/vt-d: Remove scalabe mode in domain_context_clear_one()
iommu/vt-d: Remove scalable mode context entry setup from attach_dev
iommu/vt-d: Setup scalable mode context entry in probe path
iommu/vt-d: Fix NULL domain on device release
iommu: Add static iommu_ops->release_domain
iommu/vt-d: Improve ITE fault handling if target device isn't present
iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
PCI: Make pci_dev_is_disconnected() helper public for other drivers
iommu/vt-d: Use device rbtree in iopf reporting path
iommu/vt-d: Use rbtree to track iommu probed devices
iommu/vt-d: Merge intel_svm_bind_mm() into its caller
iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
iommu/vt-d: Add the document for Intel IOMMU debugfs
iommu/vt-d: Use kcalloc() instead of kzalloc()
iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MSI updates from Thomas Gleixner:
"Updates for the MSI interrupt subsystem and initial RISC-V MSI
support.
The core changes have been adopted from previous work which converted
ARM[64] to the new per device MSI domain model, which was merged to
support multiple MSI domain per device. The ARM[64] changes are being
worked on too, but have not been ready yet. The core and platform-MSI
changes have been split out to not hold up RISC-V and to avoid that
RISC-V builds on the scheduled for removal interfaces.
The core support provides new interfaces to handle wire to MSI bridges
in a straight forward way and introduces new platform-MSI interfaces
which are built on top of the per device MSI domain model.
Once ARM[64] is converted over the old platform-MSI interfaces and the
related ugliness in the MSI core code will be removed.
The actual MSI parts for RISC-V were finalized late and have been
post-poned for the next merge window.
Drivers:
- Add a new driver for the Andes hart-level interrupt controller
- Rework the SiFive PLIC driver to prepare for MSI suport
- Expand the RISC-V INTC driver to support the new RISC-V AIA
controller which provides the basis for MSI on RISC-V
- A few fixup for the fallout of the core changes"
* tag 'irq-msi-2024-03-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (29 commits)
irqchip/riscv-intc: Fix low-level interrupt handler setup for AIA
x86/apic/msi: Use DOMAIN_BUS_GENERIC_MSI for HPET/IO-APIC domain search
genirq/matrix: Dynamic bitmap allocation
irqchip/riscv-intc: Add support for RISC-V AIA
irqchip/sifive-plic: Improve locking safety by using irqsave/irqrestore
irqchip/sifive-plic: Parse number of interrupts and contexts early in plic_probe()
irqchip/sifive-plic: Cleanup PLIC contexts upon irqdomain creation failure
irqchip/sifive-plic: Use riscv_get_intc_hwnode() to get parent fwnode
irqchip/sifive-plic: Use devm_xyz() for managed allocation
irqchip/sifive-plic: Use dev_xyz() in-place of pr_xyz()
irqchip/sifive-plic: Convert PLIC driver into a platform driver
irqchip/riscv-intc: Introduce Andes hart-level interrupt controller
irqchip/riscv-intc: Allow large non-standard interrupt number
genirq/irqdomain: Don't call ops->select for DOMAIN_BUS_ANY tokens
irqchip/imx-intmux: Handle pure domain searches correctly
genirq/msi: Provide MSI_FLAG_PARENT_PM_DEV
genirq/irqdomain: Reroute device MSI create_mapping
genirq/msi: Provide allocation/free functions for "wired" MSI interrupts
genirq/msi: Optionally use dev->fwnode for device domain
genirq/msi: Provide DOMAIN_BUS_WIRED_TO_MSI
...
|
|
'x86/amd' and 'core' into next
|
|
The xlate callbacks are supposed to translate of_phandle_args to proper
provider without modifying the of_phandle_args. Make the argument
pointer to const for code safety and readability.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240216144027.185959-2-krzysztof.kozlowski@linaro.org
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|
|
Now that the BLOCKED and IDENTITY behaviors are managed with their own
domains change to the domain_alloc_paging() op.
For now SVA remains using the old interface, eventually it will get its
own op that can pass in the device and mm_struct which will let us have a
sane lifetime for the mmu_notifier.
Call arm_smmu_domain_finalise() early if dev is available.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/16-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Instead of putting container_of() casts in the internals, use the proper
type in this call chain. This makes it easier to check that the two global
static domains are not leaking into call chains they should not.
Passing the smmu avoids the only caller from having to set it and unset it
in the error path.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/15-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Consolidate some more code by having release call
arm_smmu_attach_dev_identity/blocked() instead of open coding this.
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/14-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Using the same design as the IDENTITY domain install an
STRTAB_STE_0_CFG_ABORT STE.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/13-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Move to the new static global for identity domains. Move all the logic out
of arm_smmu_attach_dev into an identity only function.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/12-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The SVA code only works if the RID domain is a S1 domain and has already
installed the cdtable.
Originally the check for this was in arm_smmu_sva_bind() but when the op
was removed the test didn't get copied over to the new
arm_smmu_sva_set_dev_pasid().
Without the test wrong usage usually will hit a WARN_ON() in
arm_smmu_write_ctx_desc() due to a missing ctx table.
However, the next patches wil change things so that an IDENTITY domain is
not a struct arm_smmu_domain and this will get into memory corruption if
the struct is wrongly casted.
Fail in arm_smmu_sva_set_dev_pasid() if the STE does not have a S1, which
is a proxy for the STE having a pointer to the CD table. Write it in a way
that will be compatible with the next patches.
Fixes: 386fa64fd52b ("arm-smmu-v3/sva: Add SVA domain support")
Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Closes: https://lore.kernel.org/linux-iommu/2a828e481416405fb3a4cceb9e075a59@huawei.com/
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/11-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Introducing global statics which are of type struct iommu_domain, not
struct arm_smmu_domain makes it difficult to retain
arm_smmu_master->domain, as it can no longer point to an IDENTITY or
BLOCKED domain.
The only place that uses the value is arm_smmu_detach_dev(). Change things
to work like other drivers and call iommu_get_domain_for_dev() to obtain
the current domain.
The master->domain is subtly protecting the master->domain_head against
being unused as only PAGING domains will set master->domain and only
paging domains use the master->domain_head. To make it simple keep the
master->domain_head initialized so that the list_del() logic just does
nothing for attached non-PAGING domains.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/10-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The caller already has the domain, just pass it in. A following patch will
remove master->domain.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/9-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Get closer to the IOMMU API ideal that changes between domains can be
hitless. The ordering for the CD table entry is not entirely clean from
this perspective.
When switching away from a STE with a CD table programmed in it we should
write the new STE first, then clear any old data in the CD entry.
If we are programming a CD table for the first time to a STE then the CD
entry should be programmed before the STE is loaded.
If we are replacing a CD table entry when the STE already points at the CD
entry then we just need to do the make/break sequence.
Lift this code out of arm_smmu_detach_dev() so it can all be sequenced
properly. The only other caller is arm_smmu_release_device() and it is
going to free the cdtable anyhow, so it doesn't matter what is in it.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/8-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This was needed because the STE code required the STE to be in
ABORT/BYPASS inorder to program a cdtable or S2 STE. Now that the STE code
can automatically handle all transitions we can remove this step
from the attach_dev flow.
A few small bugs exist because of this:
1) If the core code does BLOCKED -> UNMANAGED with disable_bypass=false
then there will be a moment where the STE points at BYPASS. Since
this can be done by VFIO/IOMMUFD it is a small security race.
2) If the core code does IDENTITY -> DMA then any IOMMU_RESV_DIRECT
regions will temporarily become BLOCKED. We'd like drivers to
work in a way that allows IOMMU_RESV_DIRECT to be continuously
functional during these transitions.
Make arm_smmu_release_device() put the STE back to the correct
ABORT/BYPASS setting. Fix a bug where a IOMMU_RESV_DIRECT was ignored on
this path.
As noted before the reordering of the linked list/STE/CD changes is OK
against concurrent arm_smmu_share_asid() because of the
arm_smmu_asid_lock.
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/7-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Currently arm_smmu_install_ste_for_dev() iterates over every SID and
computes from scratch an identical STE. Every SID should have the same STE
contents. Turn this inside out so that the STE is supplied by the caller
and arm_smmu_install_ste_for_dev() simply installs it to every SID.
This is possible now that the STE generation does not inform what sequence
should be used to program it.
This allows splitting the STE calculation up according to the call site,
which following patches will make use of, and removes the confusing NULL
domain special case that only supported arm_smmu_detach_dev().
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/6-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The BTM support wants to be able to change the ASID of any smmu_domain.
When it goes to do this it holds the arm_smmu_asid_lock and iterates over
the target domain's devices list.
During attach of a S1 domain we must ensure that the devices list and
CD are in sync, otherwise we could miss CD updates or a parallel CD update
could push an out of date CD.
This is pretty complicated, and almost works today because
arm_smmu_detach_dev() removes the master from the linked list before
working on the CD entries, preventing parallel update of the CD.
However, it does have an issue where the CD can remain programed while the
domain appears to be unattached. arm_smmu_share_asid() will then not clear
any CD entriess and install its own CD entry with the same ASID
concurrently. This creates a small race window where the IOMMU can see two
ASIDs pointing to different translations.
CPU0 CPU1
arm_smmu_attach_dev()
arm_smmu_detach_dev()
spin_lock_irqsave(&smmu_domain->devices_lock, flags);
list_del(&master->domain_head);
spin_unlock_irqrestore(&smmu_domain->devices_lock, flags);
arm_smmu_mmu_notifier_get()
arm_smmu_alloc_shared_cd()
arm_smmu_share_asid():
// Does nothing due to list_del above
arm_smmu_update_ctx_desc_devices()
arm_smmu_tlb_inv_asid()
arm_smmu_write_ctx_desc()
** Now the ASID is in two CDs
with different translation
arm_smmu_write_ctx_desc(master, IOMMU_NO_PASID, NULL);
Solve this by wrapping most of the attach flow in the
arm_smmu_asid_lock. This locks more than strictly needed to prepare for
the next patch which will reorganize the order of the linked list, STE and
CD changes.
Move arm_smmu_detach_dev() till after we have initialized the domain so
the lock can be held for less time.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/5-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Half the code was living in arm_smmu_domain_finalise_s2(), just move it
here and take the values directly from the pgtbl_ops instead of storing
copies.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/4-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This is preparation to move the STE calculation higher up in to the call
chain and remove arm_smmu_write_strtab_ent(). These new functions will be
called directly from attach_dev.
Reviewed-by: Moritz Fischer <mdf@kernel.org>
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/3-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This allows writing the flow of arm_smmu_write_strtab_ent() around abort
and bypass domains more naturally.
Note that the core code no longer supplies NULL domains, though there is
still a flow in the driver that end up in arm_smmu_write_strtab_ent() with
NULL. A later patch will remove it.
Remove the duplicate calculation of the STE in arm_smmu_init_bypass_stes()
and remove the force parameter. arm_smmu_rmr_install_bypass_ste() can now
simply invoke arm_smmu_make_bypass_ste() directly.
Rename arm_smmu_init_bypass_stes() to arm_smmu_init_initial_stes() to
better reflect its purpose.
Reviewed-by: Michael Shavit <mshavit@google.com>
Reviewed-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Mostafa Saleh <smostafa@google.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Tested-by: Moritz Fischer <moritzf@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/2-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
As the comment in arm_smmu_write_strtab_ent() explains, this routine has
been limited to only work correctly in certain scenarios that the caller
must ensure. Generally the caller must put the STE into ABORT or BYPASS
before attempting to program it to something else.
The iommu core APIs would ideally expect the driver to do a hitless change
of iommu_domain in a number of cases:
- RESV_DIRECT support wants IDENTITY -> DMA -> IDENTITY to be hitless
for the RESV ranges
- PASID upgrade has IDENTIY on the RID with no PASID then a PASID paging
domain installed. The RID should not be impacted
- PASID downgrade has IDENTIY on the RID and all PASID's removed.
The RID should not be impacted
- RID does PAGING -> BLOCKING with active PASID, PASID's should not be
impacted
- NESTING -> NESTING for carrying all the above hitless cases in a VM
into the hypervisor. To comprehensively emulate the HW in a VM we
should assume the VM OS is running logic like this and expecting
hitless updates to be relayed to real HW.
For CD updates arm_smmu_write_ctx_desc() has a similar comment explaining
how limited it is, and the driver does have a need for hitless CD updates:
- SMMUv3 BTM S1 ASID re-label
- SVA mm release should change the CD to answert not-present to all
requests without allowing logging (EPD0)
The next patches/series are going to start removing some of this logic
from the callers, and add more complex state combinations than currently.
At the end everything that can be hitless will be hitless, including all
of the above.
Introduce arm_smmu_write_ste() which will run through the multi-qword
programming sequence to avoid creating an incoherent 'torn' STE in the HW
caches. It automatically detects which of two algorithms to use:
1) The disruptive V=0 update described in the spec which disrupts the
entry and does three syncs to make the change:
- Write V=0 to QWORD 0
- Write the entire STE except QWORD 0
- Write QWORD 0
2) A hitless update algorithm that follows the same rational that the driver
already uses. It is safe to change IGNORED bits that HW doesn't use:
- Write the target value into all currently unused bits
- Write a single QWORD, this makes the new STE live atomically
- Ensure now unused bits are 0
The detection of which path to use and the implementation of the hitless
update rely on a "used bitmask" describing what bits the HW is actually
using based on the V/CFG/etc bits. This flows from the spec language,
typically indicated as IGNORED.
Knowing which bits the HW is using we can update the bits it does not use
and then compute how many QWORDS need to be changed. If only one qword
needs to be updated the hitless algorithm is possible.
Later patches will include CD updates in this mechanism so make the
implementation generic using a struct arm_smmu_entry_writer and struct
arm_smmu_entry_writer_ops to abstract the differences between STE and CD
to be plugged in.
At this point it generates the same sequence of updates as the current
code, except that zeroing the VMID on entry to BYPASS/ABORT will do an
extra sync (this seems to be an existing bug).
Going forward this will use a V=0 transition instead of cycling through
ABORT if a hitfull change is required. This seems more appropriate as ABORT
will fail DMAs without any logging, but dropping a DMA due to transient
V=0 is probably signaling a bug, so the C_BAD_STE is valuable.
Add STRTAB_STE_1_SHCFG_INCOMING to s2_cfg, this was editing the STE in
place and subtly inherited the value of data[1] from abort/bypass.
Signed-off-by: Michael Shavit <mshavit@google.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/1-v6-96275f25c39d+2d4-smmuv3_newapi_p1_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add the X1E80100 MDSS compatible to clients compatible list, as it also
needs the workarounds.
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
Link: https://lore.kernel.org/r/20240131-x1e80100-iommu-arm-smmu-qcom-v1-1-c1240419c718@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
If the SMMU is configured to use a two level CD table then
arm_smmu_write_ctx_desc() allocates a CD table leaf internally using
GFP_KERNEL. Due to recent changes this is being done under a spinlock to
iterate over the device list - thus it will trigger a sleeping while
atomic warning:
arm_smmu_sva_set_dev_pasid()
mutex_lock(&sva_lock);
__arm_smmu_sva_bind()
arm_smmu_mmu_notifier_get()
spin_lock_irqsave()
arm_smmu_write_ctx_desc()
arm_smmu_get_cd_ptr()
arm_smmu_alloc_cd_leaf_table()
dmam_alloc_coherent(GFP_KERNEL)
This is a 64K high order allocation and really should not be done
atomically.
At the moment the rework of the SVA to follow the new API is half
finished. Recently the CD table memory was moved from the domain to the
master, however we have the confusing situation where the SVA code is
wrongly using the RID domains device's list to track which CD tables the
SVA is installed in.
Remove the logic to replicate the CD across all the domain's masters
during attach. We know which master and which CD table the PASID should be
installed in.
Right now SVA only works when dma-iommu.c is in control of the RID
translation, which means we have a single iommu_domain shared across the
entire group and that iommu_domain is not shared outside the group.
Critically this means that the iommu_group->devices list and RID's
smmu_domain->devices list describe the same set of masters.
For PCI cases the core code also insists on singleton groups so there is
only one entry in the smmu_domain->devices list that is equal to the
master being passed in to arm_smmu_sva_set_dev_pasid().
Only non-PCI cases may have multi-device groups. However, the core code
will repeat the calls to arm_smmu_sva_set_dev_pasid() across the entire
iommu_group->devices list.
Instead of having arm_smmu_mmu_notifier_get() indirectly loop over all the
devices in the group via the RID's smmu_domain, rely on
__arm_smmu_sva_bind() to be called for each device in the group and
install the repeated CD entry that way.
This avoids taking the spinlock to access the devices list and permits the
arm_smmu_write_ctx_desc() to use a sleeping allocation. Leave the
arm_smmu_mm_release() as a confusing situation, this requires tracking
attached masters inside the SVA domain.
Removing the loop allows arm_smmu_write_ctx_desc() to be called outside
the spinlock and thus is safe to use GFP_KERNEL.
Move the clearing of the CD into arm_smmu_sva_remove_dev_pasid() so that
arm_smmu_mmu_notifier_get/put() remain paired functions.
Fixes: 24503148c545 ("iommu/arm-smmu-v3: Refactor write_ctx_desc")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/all/4e25d161-0cf8-4050-9aa3-dfa21cd63e56@moroto.mountain/
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Michael Shavit <mshavit@google.com>
Link: https://lore.kernel.org/r/0-v3-11978fc67151+112-smmu_cd_atomic_jgg@nvidia.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
As the iommu_report_device_fault() has been converted to auto-respond a
page fault if it fails to enqueue it, there's no need to return a code
in any case. Make it return void.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/20240212012227.119381-17-baolu.lu@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
|