summaryrefslogtreecommitdiff
path: root/drivers/iommu/intel
AgeCommit message (Collapse)AuthorFilesLines
2022-03-25Merge tag 'iommu-updates-v5.18' of ↵Linus Torvalds6-1260/+147
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: - IOMMU Core changes: - Removal of aux domain related code as it is basically dead and will be replaced by iommu-fd framework - Split of iommu_ops to carry domain-specific call-backs separatly - Cleanup to remove useless ops->capable implementations - Improve 32-bit free space estimate in iova allocator - Intel VT-d updates: - Various cleanups of the driver - Support for ATS of SoC-integrated devices listed in ACPI/SATC table - ARM SMMU updates: - Fix SMMUv3 soft lockup during continuous stream of events - Fix error path for Qualcomm SMMU probe() - Rework SMMU IRQ setup to prepare the ground for PMU support - Minor cleanups and refactoring - AMD IOMMU driver: - Some minor cleanups and error-handling fixes - Rockchip IOMMU driver: - Use standard driver registration - MSM IOMMU driver: - Minor cleanup and change to standard driver registration - Mediatek IOMMU driver: - Fixes for IOTLB flushing logic * tag 'iommu-updates-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (47 commits) iommu/amd: Improve amd_iommu_v2_exit() iommu/amd: Remove unused struct fault.devid iommu/amd: Clean up function declarations iommu/amd: Call memunmap in error path iommu/arm-smmu: Account for PMU interrupts iommu/vt-d: Enable ATS for the devices in SATC table iommu/vt-d: Remove unused function intel_svm_capable() iommu/vt-d: Add missing "__init" for rmrr_sanity_check() iommu/vt-d: Move intel_iommu_ops to header file iommu/vt-d: Fix indentation of goto labels iommu/vt-d: Remove unnecessary prototypes iommu/vt-d: Remove unnecessary includes iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFO iommu/vt-d: Remove domain and devinfo mempool iommu/vt-d: Remove iova_cache_get/put() iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info() iommu/vt-d: Remove intel_iommu::domains iommu/mediatek: Always tlb_flush_all when each PM resume iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Remove the power status checking in tlb flush all ...
2022-03-21Merge tag 'x86-pasid-2022-03-21' of ↵Linus Torvalds3-12/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 PASID support from Thomas Gleixner: "Reenable ENQCMD/PASID support: - Simplify the PASID handling to allocate the PASID once, associate it to the mm of a process and free it on mm_exit(). The previous attempt of refcounted PASIDs and dynamic alloc()/free() turned out to be error prone and too complex. The PASID space is 20bits, so the case of resource exhaustion is a pure academic concern. - Populate the PASID MSR on demand via #GP to avoid racy updates via IPIs. - Reenable ENQCMD and let objtool check for the forbidden usage of ENQCMD in the kernel. - Update the documentation for Shared Virtual Addressing accordingly" * tag 'x86-pasid-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Documentation/x86: Update documentation for SVA (Shared Virtual Addressing) tools/objtool: Check for use of the ENQCMD instruction in the kernel x86/cpufeatures: Re-enable ENQCMD x86/traps: Demand-populate PASID MSR via #GP sched: Define and initialize a flag to identify valid PASID in the task x86/fpu: Clear PASID when copying fpstate iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit kernel/fork: Initialize mm's PASID iommu/ioasid: Introduce a helper to check for valid PASIDs mm: Change CONFIG option for mm->pasid field iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA
2022-03-08Merge branches 'arm/mediatek', 'arm/msm', 'arm/renesas', 'arm/rockchip', ↵Joerg Roedel6-1260/+147
'arm/smmu', 'x86/vt-d' and 'x86/amd' into next
2022-03-04iommu/vt-d: Enable ATS for the devices in SATC tableYian Chen1-2/+38
Starting from Intel VT-d v3.2, Intel platform BIOS can provide additional SATC table structure. SATC table includes a list of SoC integrated devices that support ATC (Address translation cache). Enabling ATC (via ATS capability) can be a functional requirement for SATC device operation or optional to enhance device performance/functionality. This is determined by the bit of ATC_REQUIRED in SATC table. When IOMMU is working in scalable mode, software chooses to always enable ATS for every device in SATC table because Intel SoC devices in SATC table are trusted to use ATS. On the other hand, if IOMMU is in legacy mode, ATS of SATC capable devices can work transparently to software and be automatically enabled by IOMMU hardware. As the result, there is no need for software to enable ATS on these devices. This also removes dmar_find_matched_atsr_unit() helper as it becomes dead code now. Signed-off-by: Yian Chen <yian.chen@intel.com> Link: https://lore.kernel.org/r/20220222185416.1722611-1-yian.chen@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220301020159.633356-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove unused function intel_svm_capable()YueHaibing1-5/+0
This is unused since commit 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers"). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Link: https://lore.kernel.org/r/20220216113851.25004-1-yuehaibing@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220301020159.633356-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Add missing "__init" for rmrr_sanity_check()Marco Bonelli1-1/+1
rmrr_sanity_check() calls arch_rmrr_sanity_check(), which is marked as "__init". Add the annotation for rmrr_sanity_check() too. This function is currently only called from dmar_parse_one_rmrr() which is also "__init". Signed-off-by: Marco Bonelli <marco@mebeim.net> Link: https://lore.kernel.org/r/20220210023141.9208-1-marco@mebeim.net Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220301020159.633356-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Move intel_iommu_ops to header fileAndy Shevchenko1-2/+0
Compiler is not happy about hidden declaration of intel_iommu_ops. .../iommu.c:414:24: warning: symbol 'intel_iommu_ops' was not declared. Should it be static? Move declaration to header file to make compiler happy. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20220207141240.8253-1-andriy.shevchenko@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220301020159.633356-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Fix indentation of goto labelsLu Baolu1-2/+2
Remove blanks before goto labels. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove unnecessary prototypesLu Baolu1-5/+0
Some prototypes in iommu.c are unnecessary. Delete them. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove unnecessary includesLu Baolu1-27/+7
Remove unnecessary include files and sort the remaining alphabetically. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove DEFER_DEVICE_DOMAIN_INFOLu Baolu4-128/+79
Allocate and set the per-device iommu private data during iommu device probe. This makes the per-device iommu private data always available during iommu_probe_device() and iommu_release_device(). With this changed, the dummy DEFER_DEVICE_DOMAIN_INFO pointer could be removed. The wrappers for getting the private data and domain are also cleaned. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove domain and devinfo mempoolLu Baolu1-99/+5
The domain and devinfo memory blocks are only allocated during device probe and released during remove. There's no hot-path context, hence no need for memory pools. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove iova_cache_get/put()Lu Baolu1-5/+0
These have been done in drivers/iommu/dma-iommu.c. Remove this duplicate code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove finding domain in dmar_insert_one_dev_info()Lu Baolu1-21/+0
The Intel IOMMU driver has already converted to use default domain framework in iommu core. There's no need to find a domain for the device in the domain attaching path. Cleanup that code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04iommu/vt-d: Remove intel_iommu::domainsLu Baolu1-65/+3
The "domains" field of the intel_iommu structure keeps the mapping of domain_id to dmar_domain. This information is not used anywhere. Remove and cleanup it to avoid unnecessary memory consumption. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220214025704.3184654-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20220301020159.633356-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-03-04Merge branch 'core' into x86/vt-dJoerg Roedel5-902/+15
2022-02-28iommu/vt-d: Fix double list_add when enabling VMD in scalable modeAdrian Huang1-1/+1
When enabling VMD and IOMMU scalable mode, the following kernel panic call trace/kernel log is shown in Eagle Stream platform (Sapphire Rapids CPU) during booting: pci 0000:59:00.5: Adding to iommu group 42 ... vmd 0000:59:00.5: PCI host bridge to bus 10000:80 pci 10000:80:01.0: [8086:352a] type 01 class 0x060400 pci 10000:80:01.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit] pci 10000:80:01.0: enabling Extended Tags pci 10000:80:01.0: PME# supported from D0 D3hot D3cold pci 10000:80:01.0: DMAR: Setup RID2PASID failed pci 10000:80:01.0: Failed to add to iommu group 42: -16 pci 10000:80:03.0: [8086:352b] type 01 class 0x060400 pci 10000:80:03.0: reg 0x10: [mem 0x00000000-0x0001ffff 64bit] pci 10000:80:03.0: enabling Extended Tags pci 10000:80:03.0: PME# supported from D0 D3hot D3cold ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:29! invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 7 Comm: kworker/0:1 Not tainted 5.17.0-rc3+ #7 Hardware name: Lenovo ThinkSystem SR650V3/SB27A86647, BIOS ESE101Y-1.00 01/13/2022 Workqueue: events work_for_cpu_fn RIP: 0010:__list_add_valid.cold+0x26/0x3f Code: 9a 4a ab ff 4c 89 c1 48 c7 c7 40 0c d9 9e e8 b9 b1 fe ff 0f 0b 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 f0 0c d9 9e e8 a2 b1 fe ff <0f> 0b 48 89 d1 4c 89 c6 4c 89 ca 48 c7 c7 98 0c d9 9e e8 8b b1 fe RSP: 0000:ff5ad434865b3a40 EFLAGS: 00010246 RAX: 0000000000000058 RBX: ff4d61160b74b880 RCX: ff4d61255e1fffa8 RDX: 0000000000000000 RSI: 00000000fffeffff RDI: ffffffff9fd34f20 RBP: ff4d611d8e245c00 R08: 0000000000000000 R09: ff5ad434865b3888 R10: ff5ad434865b3880 R11: ff4d61257fdc6fe8 R12: ff4d61160b74b8a0 R13: ff4d61160b74b8a0 R14: ff4d611d8e245c10 R15: ff4d611d8001ba70 FS: 0000000000000000(0000) GS:ff4d611d5ea00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ff4d611fa1401000 CR3: 0000000aa0210001 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> intel_pasid_alloc_table+0x9c/0x1d0 dmar_insert_one_dev_info+0x423/0x540 ? device_to_iommu+0x12d/0x2f0 intel_iommu_attach_device+0x116/0x290 __iommu_attach_device+0x1a/0x90 iommu_group_add_device+0x190/0x2c0 __iommu_probe_device+0x13e/0x250 iommu_probe_device+0x24/0x150 iommu_bus_notifier+0x69/0x90 blocking_notifier_call_chain+0x5a/0x80 device_add+0x3db/0x7b0 ? arch_memremap_can_ram_remap+0x19/0x50 ? memremap+0x75/0x140 pci_device_add+0x193/0x1d0 pci_scan_single_device+0xb9/0xf0 pci_scan_slot+0x4c/0x110 pci_scan_child_bus_extend+0x3a/0x290 vmd_enable_domain.constprop.0+0x63e/0x820 vmd_probe+0x163/0x190 local_pci_probe+0x42/0x80 work_for_cpu_fn+0x13/0x20 process_one_work+0x1e2/0x3b0 worker_thread+0x1c4/0x3a0 ? rescuer_thread+0x370/0x370 kthread+0xc7/0xf0 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- ... Kernel panic - not syncing: Fatal exception Kernel Offset: 0x1ca00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]--- The following 'lspci' output shows devices '10000:80:*' are subdevices of the VMD device 0000:59:00.5: $ lspci ... 0000:59:00.5 RAID bus controller: Intel Corporation Volume Management Device NVMe RAID Controller (rev 20) ... 10000:80:01.0 PCI bridge: Intel Corporation Device 352a (rev 03) 10000:80:03.0 PCI bridge: Intel Corporation Device 352b (rev 03) 10000:80:05.0 PCI bridge: Intel Corporation Device 352c (rev 03) 10000:80:07.0 PCI bridge: Intel Corporation Device 352d (rev 03) 10000:81:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] 10000:82:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] The symptom 'list_add double add' is caused by the following failure message: pci 10000:80:01.0: DMAR: Setup RID2PASID failed pci 10000:80:01.0: Failed to add to iommu group 42: -16 pci 10000:80:03.0: [8086:352b] type 01 class 0x060400 Device 10000:80:01.0 is the subdevice of the VMD device 0000:59:00.5, so invoking intel_pasid_alloc_table() gets the pasid_table of the VMD device 0000:59:00.5. Here is call path: intel_pasid_alloc_table pci_for_each_dma_alias get_alias_pasid_table search_pasid_table pci_real_dma_dev() in pci_for_each_dma_alias() gets the real dma device which is the VMD device 0000:59:00.5. However, pte of the VMD device 0000:59:00.5 has been configured during this message "pci 0000:59:00.5: Adding to iommu group 42". So, the status -EBUSY is returned when configuring pasid entry for device 10000:80:01.0. It then invokes dmar_remove_one_dev_info() to release 'struct device_domain_info *' from iommu_devinfo_cache. But, the pasid table is not released because of the following statement in __dmar_remove_one_dev_info(): if (info->dev && !dev_is_real_dma_subdevice(info->dev)) { ... intel_pasid_free_table(info->dev); } The subsequent dmar_insert_one_dev_info() operation of device 10000:80:03.0 allocates 'struct device_domain_info *' from iommu_devinfo_cache. The allocated address is the same address that is released previously for device 10000:80:01.0. Finally, invoking device_attach_pasid_table() causes the issue. `git bisect` points to the offending commit 474dd1c65064 ("iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries"), which releases the pasid table if the device is not the subdevice by checking the returned status of dev_is_real_dma_subdevice(). Reverting the offending commit can work around the issue. The solution is to prevent from allocating pasid table if those devices are subdevices of the VMD device. Fixes: 474dd1c65064 ("iommu/vt-d: Fix clearing real DMA device's scalable-mode context entries") Cc: stable@vger.kernel.org # v5.14+ Signed-off-by: Adrian Huang <ahuang12@lenovo.com> Link: https://lore.kernel.org/r/20220216091307.703-1-adrianhuang0701@gmail.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220221053348.262724-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Split struct iommu_opsLu Baolu1-9/+11
Move the domain specific operations out of struct iommu_ops into a new structure that only has domain specific operations. This solves the problem of needing to know if the method vector for a given operation needs to be retrieved from the device or the domain. Logically the domain ops are the ones that make sense for external subsystems and endpoint drivers to use, while device ops, with the sole exception of domain_alloc, are IOMMU API internals. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu: Remove unused argument in is_attach_deferredLu Baolu1-2/+1
The is_attach_deferred iommu_ops callback is a device op. The domain argument is unnecessary and never used. Remove it to make code clean. Suggested-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu/vt-d: Remove aux-domain related callbacksLu Baolu2-309/+3
The aux-domain related callbacks are not called in the tree. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-28iommu/vt-d: Remove guest pasid related callbacksLu Baolu4-582/+0
The guest pasid related callbacks are not called in the tree. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20220216025249.3459465-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-02-15iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exitFenghua Yu2-11/+2
PASIDs are process-wide. It was attempted to use refcounted PASIDs to free them when the last thread drops the refcount. This turned out to be complex and error prone. Given the fact that the PASID space is 20 bits, which allows up to 1M processes to have a PASID associated concurrently, PASID resource exhaustion is not a realistic concern. Therefore, it was decided to simplify the approach and stick with lazy on demand PASID allocation, but drop the eager free approach and make an allocated PASID's lifetime bound to the lifetime of the process. Get rid of the refcounting mechanisms and replace/rename the interfaces to reflect this new approach. [ bp: Massage commit message. ] Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lore.kernel.org/r/20220207230254.3342514-6-fenghua.yu@intel.com
2022-02-14iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVAFenghua Yu1-1/+1
This CONFIG option originally only referred to the Shared Virtual Address (SVA) library. But it is now also used for non-library portions of code. Drop the "_LIB" suffix so that there is just one configuration option for all code relating to SVA. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220207230254.3342514-2-fenghua.yu@intel.com
2022-02-14iommu/vtd: Replace acpi_bus_get_device()Rafael J. Wysocki1-1/+2
Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1807113.tdWV9SEqCh@kreacher Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-01-31iommu/vt-d: Fix potential memory leak in intel_setup_irq_remapping()Guoqing Jiang1-3/+10
After commit e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated"). For tear down scenario, fn is only freed after fail to allocate ir_domain, though it also should be freed in case dmar_enable_qi returns error. Besides free fn, irq_domain and ir_msi_domain need to be removed as well if intel_setup_irq_remapping fails to enable queued invalidation. Improve the rewinding path by add out_free_ir_domain and out_free_fwnode lables per Baolu's suggestion. Fixes: e3beca48a45b ("irqdomain/treewide: Keep firmware node unconditionally allocated") Suggested-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev> Link: https://lore.kernel.org/r/20220119063640.16864-1-guoqing.jiang@linux.dev Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20220128031002.2219155-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-01-04Merge branches 'arm/smmu', 'virtio', 'x86/amd', 'x86/vt-d' and 'core' into nextJoerg Roedel1-74/+37
2021-12-20iommu/vt-d: Use put_pages_listMatthew Wilcox (Oracle)1-58/+31
page->freelist is for the use of slab. We already have the ability to free a list of pages in the core mm, but it requires the use of a list_head and for the pages to be chained together through page->lru. Switch the Intel IOMMU and IOVA code over to using free_pages_list(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> [rm: split from original patch, cosmetic tweaks, fix fq entries] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2115b560d9a0ce7cd4b948bd51a2b7bde8fdfd59.1639753638.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17iommu/vt-d: Remove unused dma_to_mm_pfn functionMaíra Canal1-5/+0
Remove dma_to_buf_pfn function, which is not used in the codebase. This was pointed by clang with the following warning: 'dma_to_mm_pfn' [-Wunused-function] static inline unsigned long dma_to_mm_pfn(unsigned long dma_pfn) ^ https://lore.kernel.org/r/YYhY7GqlrcTZlzuA@fedora drivers/iommu/intel/iommu.c:136:29: warning: unused function Signed-off-by: Maíra Canal <maira.canal@usp.br> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211217083817.1745419-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17iommu/vt-d: Drop duplicate check in dma_pte_free_pagetable()Kefeng Wang1-4/+0
The BUG_ON check exists in dma_pte_clear_range(), kill the duplicate check. Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Link: https://lore.kernel.org/r/20211025032307.182974-1-wangkefeng.wang@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211217083817.1745419-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17iommu/vt-d: Use bitmap_zalloc() when applicableChristophe JAILLET1-5/+4
'iommu->domain_ids' is a bitmap. So use 'bitmap_zalloc()' to simplify code and improve the semantic. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/cb7a3e0a8d522447a06298a4f244c3df072f948b.1635018498.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211217083817.1745419-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-12-17iommu/vt-d: Use correctly sized arguments for bit fieldKees Cook1-2/+2
The find.h APIs are designed to be used only on unsigned long arguments. This can technically result in a over-read, but it is harmless in this case. Regardless, fix it to avoid the warning seen under -Warray-bounds, which we'd like to enable globally: In file included from ./include/linux/bitmap.h:9, from drivers/iommu/intel/iommu.c:17: drivers/iommu/intel/iommu.c: In function 'domain_context_mapping_one': ./include/linux/find.h:119:37: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'int[1]' [-Warray-bounds] 119 | unsigned long val = *addr & GENMASK(size - 1, 0); | ^~~~~ drivers/iommu/intel/iommu.c:2115:18: note: while referencing 'max_pde' 2115 | int pds, max_pde; | ^~~~~~~ Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20211215232432.2069605-1-keescook@chromium.org Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-11-27iommu/vt-d: Fix unmap_pages supportAlex Williamson1-4/+2
When supporting only the .map and .unmap callbacks of iommu_ops, the IOMMU driver can make assumptions about the size and alignment used for mappings based on the driver provided pgsize_bitmap. VT-d previously used essentially PAGE_MASK for this bitmap as any power of two mapping was acceptably filled by native page sizes. However, with the .map_pages and .unmap_pages interface we're now getting page-size and count arguments. If we simply combine these as (page-size * count) and make use of the previous map/unmap functions internally, any size and alignment assumptions are very different. As an example, a given vfio device assignment VM will often create a 4MB mapping at IOVA pfn [0x3fe00 - 0x401ff]. On a system that does not support IOMMU super pages, the unmap_pages interface will ask to unmap 1024 4KB pages at the base IOVA. dma_pte_clear_level() will recurse down to level 2 of the page table where the first half of the pfn range exactly matches the entire pte level. We clear the pte, increment the pfn by the level size, but (oops) the next pte is on a new page, so we exit the loop an pop back up a level. When we then update the pfn based on that higher level, we seem to assume that the previous pfn value was at the start of the level. In this case the level size is 256K pfns, which we add to the base pfn and get a results of 0x7fe00, which is clearly greater than 0x401ff, so we're done. Meanwhile we never cleared the ptes for the remainder of the range. When the VM remaps this range, we're overwriting valid ptes and the VT-d driver complains loudly, as reported by the user report linked below. The fix for this seems relatively simple, if each iteration of the loop in dma_pte_clear_level() is assumed to clear to the end of the level pte page, then our next pfn should be calculated from level_pfn rather than our working pfn. Fixes: 3f34f1259776 ("iommu/vt-d: Implement map/unmap_pages() iommu_ops callback") Reported-by: Ajay Garg <ajaygargnsit@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Tested-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Link: https://lore.kernel.org/all/20211002124012.18186-1-ajaygargnsit@gmail.com/ Link: https://lore.kernel.org/r/163659074748.1617923.12716161410774184024.stgit@omen Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211126135556.397932-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-11-27iommu/vt-d: Fix an unbalanced rcu_read_lock/rcu_read_unlock()Christophe JAILLET1-2/+3
If we return -EOPNOTSUPP, the rcu lock remains lock. This is spurious. Go through the end of the function instead. This way, the missing 'rcu_read_unlock()' is called. Fixes: 7afd7f6aa21a ("iommu/vt-d: Check FL and SL capability sanity in scalable mode") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/40cc077ca5f543614eab2a10e84d29dd190273f6.1636217517.git.christophe.jaillet@wanadoo.fr Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211126135556.397932-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Avoid duplicate removing in __domain_mapping()Longpeng(Mike)1-5/+6
The __domain_mapping() always removes the pages in the range from 'iov_pfn' to 'end_pfn', but the 'end_pfn' is always the last pfn of the range that the caller wants to map. This would introduce too many duplicated removing and leads the map operation take too long, for example: Map iova=0x100000,nr_pages=0x7d61800 iov_pfn: 0x100000, end_pfn: 0x7e617ff iov_pfn: 0x140000, end_pfn: 0x7e617ff iov_pfn: 0x180000, end_pfn: 0x7e617ff iov_pfn: 0x1c0000, end_pfn: 0x7e617ff iov_pfn: 0x200000, end_pfn: 0x7e617ff ... it takes about 50ms in total. We can reduce the cost by recalculate the 'end_pfn' and limit it to the boundary of the end of this pte page. Map iova=0x100000,nr_pages=0x7d61800 iov_pfn: 0x100000, end_pfn: 0x13ffff iov_pfn: 0x140000, end_pfn: 0x17ffff iov_pfn: 0x180000, end_pfn: 0x1bffff iov_pfn: 0x1c0000, end_pfn: 0x1fffff iov_pfn: 0x200000, end_pfn: 0x23ffff ... it only need 9ms now. This also removes a meaningless BUG_ON() in __domain_mapping(). Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com> Tested-by: Liujunjie <liujunjie23@huawei.com> Link: https://lore.kernel.org/r/20211008000433.1115-1-longpeng2@huawei.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Clean up unused PASID updating functionsFenghua Yu1-23/+1
update_pasid() and its call chain are currently unused in the tree because Thomas disabled the ENQCMD feature. The feature will be re-enabled shortly using a different approach and update_pasid() and its call chain will not be used in the new approach. Remove the useless functions. Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20210920192349.2602141-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Delete dev_has_feat callbackLu Baolu1-54/+5
The commit 262948f8ba573 ("iommu: Delete iommu_dev_has_feature()") has deleted the iommu_dev_has_feature() interface. Remove the dev_has_feat callback to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210929072030.1330225-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Use second level for GPA->HPA translationLu Baolu1-3/+12
The IOMMU VT-d implementation uses the first level for GPA->HPA translation by default. Although both the first level and the second level could handle the DMA translation, they're different in some way. For example, the second level translation has separate controls for the Access/Dirty page tracking. With the first level translation, there's no such control. On the other hand, the second level translation has the page-level control for forcing snoop, but the first level only has global control with pasid granularity. This uses the second level for GPA->HPA translation so that we can provide a consistent hardware interface for use cases like dirty page tracking for live migration. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Check FL and SL capability sanity in scalable modeLu Baolu2-0/+14
An iommu domain could be allocated and mapped before it's attached to any device. This requires that in scalable mode, when the domain is allocated, the format (FL or SL) of the page table must be determined. In order to achieve this, the platform should support consistent SL or FL capabilities on all IOMMU's. This adds a check for this and aborts IOMMU probing if it doesn't meet this requirement. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Remove duplicate identity domain flagLu Baolu1-5/+4
The iommu_domain data structure already has the "type" field to keep the type of a domain. It's unnecessary to have the DOMAIN_FLAG_STATIC_IDENTITY flag in the vt-d implementation. This cleans it up with no functionality change. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20210926114535.923263-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20211014053839.727419-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Dump DMAR translation structure when DMA fault occursKyung Min Park3-2/+125
When the dmar translation fault happens, the kernel prints a single line fault reason with corresponding hexadecimal code defined in the Intel VT-d specification. Currently, when user wants to debug the translation fault in detail, debugfs is used for dumping the dmar_translation_struct, which is not available when the kernel failed to boot. Dump the DMAR translation structure, pagewalk the IO page table and print the page table entry when the fault happens. This takes effect only when CONFIG_DMAR_DEBUG is enabled. Signed-off-by: Kyung Min Park <kyung.min.park@intel.com> Link: https://lore.kernel.org/r/20210815203845.31287-1-kyung.min.park@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-10-18iommu/vt-d: Do not falsely log intel_iommu is unsupported kernel optionTvrtko Ursulin1-1/+5
Handling of intel_iommu kernel command line option should return "true" to indicate option is valid and so avoid logging it as unknown by the core parsing code. Also log unknown sub-options at the notice level to let user know of potential typos or similar. Reported-by: Eero Tamminen <eero.t.tamminen@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://lore.kernel.org/r/20210831112947.310080-1-tvrtko.ursulin@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20211014053839.727419-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-28iommu/vt-d: Drop "0x" prefix from PCI bus & device addressesBjorn Helgaas1-3/+3
719a19335692 ("iommu/vt-d: Tweak the description of a DMA fault") changed the DMA fault reason from hex to decimal. It also added "0x" prefixes to the PCI bus/device, e.g., - DMAR: [INTR-REMAP] Request device [00:00.5] + DMAR: [INTR-REMAP] Request device [0x00:0x00.5] These no longer match dev_printk() and other similar messages in dmar_match_pci_path() and dmar_acpi_insert_dev_scope(). Drop the "0x" prefixes from the bus and device addresses. Fixes: 719a19335692 ("iommu/vt-d: Tweak the description of a DMA fault") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20210903193711.483999-1-helgaas@kernel.org Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210922054726.499110-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09iommu/vt-d: Fix a deadlock in intel_svm_drain_prq()Fenghua Yu1-0/+12
pasid_mutex and dev->iommu->param->lock are held while unbinding mm is flushing IO page fault workqueue and waiting for all page fault works to finish. But an in-flight page fault work also need to hold the two locks while unbinding mm are holding them and waiting for the work to finish. This may cause an ABBA deadlock issue as shown below: idxd 0000:00:0a.0: unbind PASID 2 ====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc7+ #549 Not tainted [ 186.615245] ---------- dsa_test/898 is trying to acquire lock: ffff888100d854e8 (&param->lock){+.+.}-{3:3}, at: iopf_queue_flush_dev+0x29/0x60 but task is already holding lock: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (pasid_mutex){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 intel_svm_page_response+0x8e/0x260 iommu_page_response+0x122/0x200 iopf_handle_group+0x1c2/0x240 process_one_work+0x2a5/0x5a0 worker_thread+0x55/0x400 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #1 (&param->fault_param->lock){+.+.}-{3:3}: __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iommu_report_device_fault+0xc2/0x170 prq_event_thread+0x28a/0x580 irq_thread_fn+0x28/0x60 irq_thread+0xcf/0x180 kthread+0x13b/0x160 ret_from_fork+0x22/0x30 -> #0 (&param->lock){+.+.}-{3:3}: __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 __mutex_lock+0x75/0x730 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 [idxd] __fput+0x9c/0x250 ____fput+0xe/0x10 task_work_run+0x64/0xa0 exit_to_user_mode_prepare+0x227/0x230 syscall_exit_to_user_mode+0x2c/0x60 do_syscall_64+0x48/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae other info that might help us debug this: Chain exists of: &param->lock --> &param->fault_param->lock --> pasid_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(pasid_mutex); lock(&param->fault_param->lock); lock(pasid_mutex); lock(&param->lock); *** DEADLOCK *** 2 locks held by dsa_test/898: #0: ffff888100cc1cc0 (&group->mutex){+.+.}-{3:3}, at: iommu_sva_unbind_device+0x53/0x80 #1: ffffffff82b2f7c8 (pasid_mutex){+.+.}-{3:3}, at: intel_svm_unbind+0x34/0x1e0 stack backtrace: CPU: 2 PID: 898 Comm: dsa_test Not tainted 5.14.0-rc7+ #549 Hardware name: Intel Corporation Kabylake Client platform/KBL S DDR4 UD IMM CRB, BIOS KBLSE2R1.R00.X050.P01.1608011715 08/01/2016 Call Trace: dump_stack_lvl+0x5b/0x74 dump_stack+0x10/0x12 print_circular_bug.cold+0x13d/0x142 check_noncircular+0xf1/0x110 __lock_acquire+0x1134/0x1d60 lock_acquire+0xc6/0x2e0 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xde/0x240 __mutex_lock+0x75/0x730 ? iopf_queue_flush_dev+0x29/0x60 ? pci_mmcfg_read+0xfd/0x240 ? iopf_queue_flush_dev+0x29/0x60 mutex_lock_nested+0x1b/0x20 iopf_queue_flush_dev+0x29/0x60 intel_svm_drain_prq+0x127/0x210 ? intel_pasid_tear_down_entry+0x22e/0x240 intel_svm_unbind+0xc5/0x1e0 iommu_sva_unbind_device+0x62/0x80 idxd_cdev_release+0x15a/0x200 pasid_mutex protects pasid and svm data mapping data. It's unnecessary to hold pasid_mutex while flushing the workqueue. To fix the deadlock issue, unlock pasid_pasid during flushing the workqueue to allow the works to be handled. Fixes: d5b9e4bfe0d8 ("iommu/vt-d: Report prq to io-pgfault framework") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-3-baolu.lu@linux.intel.com [joro: Removed timing information from kernel log messages] Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-09-09iommu/vt-d: Fix PASID leak in intel_svm_unbind_mm()Fenghua Yu1-3/+0
The mm->pasid will be used in intel_svm_free_pasid() after load_pasid() during unbinding mm. Clearing it in load_pasid() will cause PASID cannot be freed in intel_svm_free_pasid(). Additionally mm->pasid was updated already before load_pasid() during pasid allocation. No need to update it again in load_pasid() during binding mm. Don't update mm->pasid to avoid the issues in both binding mm and unbinding mm. Fixes: 4048377414162 ("iommu/vt-d: Use iommu_sva_alloc(free)_pasid() helpers") Reported-and-tested-by: Dave Jiang <dave.jiang@intel.com> Co-developed-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Link: https://lore.kernel.org/r/20210826215918.4073446-1-fenghua.yu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210828070622.2437559-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-20Merge branches 'apple/dart', 'arm/smmu', 'iommu/fixes', 'x86/amd', ↵Joerg Roedel7-136/+135
'x86/vt-d' and 'core' into next
2021-08-19iommu/vt-d: Add present bit check in pasid entry setup helpersLiu Yi L1-0/+16
The helper functions should not modify the pasid entries which are still in use. Add a check against present bit. Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19iommu/vt-d: Use pasid_pte_is_present() helper functionLiu Yi L1-1/+1
Use the pasid_pte_is_present() helper for present bit check in the intel_pasid_tear_down_entry(). Signed-off-by: Liu Yi L <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20210817042425.1784279-1-yi.l.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19iommu/vt-d: Drop the kernel doc annotationAndy Shevchenko1-1/+1
Kernel doc validator is unhappy with the following .../perf.c:16: warning: Function parameter or member 'latency_lock' not described in 'DEFINE_SPINLOCK' .../perf.c:16: warning: expecting prototype for perf.c(). Prototype was for DEFINE_SPINLOCK() instead Drop kernel doc annotation since the top comment is not in the required format. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20210729163538.40101-1-andriy.shevchenko@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210818134852.1847070-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19iommu/vt-d: Allow devices to have more than 32 outstanding PRsLu Baolu2-5/+2
The minimum per-IOMMU PRQ queue size is one 4K page, this is more entries than the hardcoded limit of 32 in the current VT-d code. Some devices can support up to 512 outstanding PRQs but underutilized by this limit of 32. Although, 32 gives some rough fairness when multiple devices share the same IOMMU PRQ queue, but far from optimal for customized use case. This extends the per-IOMMU PRQ queue size to four 4K pages and let the devices have as many outstanding page requests as they can. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2021-08-19iommu/vt-d: Preset A/D bits for user space DMA usageLu Baolu1-7/+3
We preset the access and dirty bits for IOVA over first level usage only for the kernel DMA (i.e., when domain type is IOMMU_DOMAIN_DMA). We should also preset the FL A/D for user space DMA usage. The idea is that even the user space A/D bit memory write is unnecessary. We should avoid it to minimize the overhead. Suggested-by: Sanjay Kumar <sanjay.k.kumar@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20210720013856.4143880-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20210818134852.1847070-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>