summaryrefslogtreecommitdiff
path: root/drivers/vfio
AgeCommit message (Collapse)AuthorFilesLines
2023-01-14vfio: platform: Do not pass return buffer to ACPI _RST methodRafael Mendonca1-2/+1
[ Upstream commit e67e070632a665c932d534b8b800477bb3111449 ] The ACPI _RST method has no return value, there's no need to pass a return buffer to acpi_evaluate_object(). Fixes: d30daa33ec1d ("vfio: platform: call _RST method when using ACPI") Signed-off-by: Rafael Mendonca <rafaelmendsr@gmail.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20221018152825.891032-1-rafaelmendsr@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28vfio/type1: fix vaddr_get_pfns() return in vfio_pin_page_external()Daniel Jordan1-1/+7
commit 4ab4fcfce5b540227d80eb32f1db45ab615f7c92 upstream. vaddr_get_pfns() now returns the positive number of pfns successfully gotten instead of zero. vfio_pin_page_external() might return 1 to vfio_iommu_type1_pin_pages(), which will treat it as an error, if vaddr_get_pfns() is successful but vfio_pin_page_external() doesn't reach vfio_lock_acct(). Fix it up in vfio_pin_page_external(). Found by inspection. Fixes: be16c1fd99f4 ("vfio/type1: Change success value of vaddr_get_pfn()") Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Message-Id: <20210308172452.38864-1-daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-09-28vfio/type1: Unpin zero pagesAlex Williamson1-0/+12
[ Upstream commit 873aefb376bbc0ed1dd2381ea1d6ec88106fdbd4 ] There's currently a reference count leak on the zero page. We increment the reference via pin_user_pages_remote(), but the page is later handled as an invalid/reserved page, therefore it's not accounted against the user and not unpinned by our put_pfn(). Introducing special zero page handling in put_pfn() would resolve the leak, but without accounting of the zero page, a single user could still create enough mappings to generate a reference count overflow. The zero page is always resident, so for our purposes there's no reason to keep it pinned. Therefore, add a loop to walk pages returned from pin_user_pages_remote() and unpin any zero pages. Cc: stable@vger.kernel.org Reported-by: Luboslav Pivarc <lpivarc@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/166182871735.3518559.8884121293045337358.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28vfio/type1: Prepare for batched pinning with struct vfio_batchDaniel Jordan1-13/+58
[ Upstream commit 4b6c33b3229678e38a6b0bbd4367d4b91366b523 ] Get ready to pin more pages at once with struct vfio_batch, which represents a batch of pinned pages. The struct has a fallback page pointer to avoid two unlikely scenarios: pointlessly allocating a page if disable_hugepages is enabled or failing the whole pinning operation if the kernel can't allocate memory. vaddr_get_pfn() becomes vaddr_get_pfns() to prepare for handling multiple pages, though for now only one page is stored in the pages array. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Stable-dep-of: 873aefb376bb ("vfio/type1: Unpin zero pages") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-09-28vfio/type1: Change success value of vaddr_get_pfn()Daniel Jordan1-7/+14
[ Upstream commit be16c1fd99f41abebc0bf965d5d29cd18c9d271e ] vaddr_get_pfn() simply returns 0 on success. Have it report the number of pfns successfully gotten instead, whether from page pinning or follow_fault_pfn(), which will be used later when batching pinning. Change the last check in vfio_pin_pages_remote() for consistency with the other two. Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Stable-dep-of: 873aefb376bb ("vfio/type1: Unpin zero pages") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-25vfio: Clear the caps->buf to NULL after freeSchspa Shi1-0/+1
[ Upstream commit 6641085e8d7b3f061911517f79a2a15a0a21b97b ] On buffer resize failure, vfio_info_cap_add() will free the buffer, report zero for the size, and return -ENOMEM. As additional hardening, also clear the buffer pointer to prevent any chance of a double free. Signed-off-by: Schspa Shi <schspa@gmail.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20220629022948.55608-1-schspa@gmail.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio/mdev: Make to_mdev_device() into a static inlineJason Gunthorpe1-1/+4
[ Upstream commit 66873b5fa738ca02b5c075ca4a410b13d88e6e9a ] The macro wrongly uses 'dev' as both the macro argument and the member name, which means it fails compilation if any caller uses a word other than 'dev' as the single argument. Fix this defect by making it into proper static inline, which is more clear and typesafe anyhow. Fixes: 99e3123e3d72 ("vfio-mdev: Make mdev_device private and abstract interfaces") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <11-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio: Split creation of a vfio_device into init and register opsJason Gunthorpe1-60/+65
[ Upstream commit 0bfc6a4ea63c2adac71a824397ef48f28dbc5e47 ] This makes the struct vfio_device part of the public interface so it can be used with container_of and so forth, as is typical for a Linux subystem. This is the first step to bring some type-safety to the vfio interface by allowing the replacement of 'void *' and 'struct device *' inputs with a simple and clear 'struct vfio_device *' For now the self-allocating vfio_add_group_dev() interface is kept so each user can be updated as a separate patch. The expected usage pattern is driver core probe() function: my_device = kzalloc(sizeof(*mydevice)); vfio_init_group_dev(&my_device->vdev, dev, ops, mydevice); /* other driver specific prep */ vfio_register_group_dev(&my_device->vdev); dev_set_drvdata(dev, my_device); driver core remove() function: my_device = dev_get_drvdata(dev); vfio_unregister_group_dev(&my_device->vdev); /* other driver specific tear down */ kfree(my_device); Allowing the driver to be able to use the drvdata and vfio_device to go to/from its own data. The pattern also makes it clear that vfio_register_group_dev() must be last in the sequence, as once it is called the core code can immediately start calling ops. The init/register gap is provided to allow for the driver to do setup before ops can be called and thus avoid races. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Liu Yi L <yi.l.liu@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <3-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio: Simplify the lifetime logic for vfio_deviceJason Gunthorpe1-54/+25
[ Upstream commit 5e42c999445bd0ae86e35affeb3e7c473d74a893 ] The vfio_device is using a 'sleep until all refs go to zero' pattern for its lifetime, but it is indirectly coded by repeatedly scanning the group list waiting for the device to be removed on its own. Switch this around to be a direct representation, use a refcount to count the number of places that are blocking destruction and sleep directly on a completion until that counter goes to zero. kfree the device after other accesses have been excluded in vfio_del_group_dev(). This is a fairly common Linux idiom. Due to this we can now remove kref_put_mutex(), which is very rarely used in the kernel. Here it is being used to prevent a zero ref device from being seen in the group list. Instead allow the zero ref device to continue to exist in the device_list and use refcount_inc_not_zero() to exclude it once refs go to zero. This patch is organized so the next patch will be able to alter the API to allow drivers to provide the kfree. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <2-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21vfio: Remove extra put/gets around vfio_device->groupJason Gunthorpe1-19/+2
[ Upstream commit e572bfb2b6a83b05acd30c03010e661b1967960f ] The vfio_device->group value has a get obtained during vfio_add_group_dev() which gets moved from the stack to vfio_device->group in vfio_group_create_device(). The reference remains until we reach the end of vfio_del_group_dev() when it is put back. Thus anything that already has a kref on the vfio_device is guaranteed a valid group pointer. Remove all the extra reference traffic. It is tricky to see, but the get at the start of vfio_del_group_dev() is actually pairing with the put hidden inside vfio_device_put() a few lines below. A later patch merges vfio_group_create_device() into vfio_add_group_dev() which makes the ownership and error flow on the create side easier to follow. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <1-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08amba: Make the remove callback return voidUwe Kleine-König1-2/+1
[ Upstream commit 3fd269e74f2feec973f45ee11d822faeda4fe284 ] All amba drivers return 0 in their remove callback. Together with the driver core ignoring the return value anyhow, it doesn't make sense to return a value here. Change the remove prototype to return void, which makes it explicit that returning an error value doesn't work as expected. This simplifies changing the core remove callback to return void, too. Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alexandre Belloni <alexandre.belloni@bootlin.com> Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Acked-by: Krzysztof Kozlowski <krzk@kernel.org> # for drivers/memory Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Suzuki K Poulose <suzuki.poulose@arm.com> # for hwtracing/coresight Acked-By: Vinod Koul <vkoul@kernel.org> # for dmaengine Acked-by: Guenter Roeck <linux@roeck-us.net> # for watchdog Acked-by: Wolfram Sang <wsa@kernel.org> # for I2C Acked-by: Takashi Iwai <tiwai@suse.de> # for sound Acked-by: Vladimir Zapolskiy <vz@mleia.com> # for memory/pl172 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20210126165835.687514-5-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08vfio: platform: simplify device removalUwe Kleine-König1-9/+5
[ Upstream commit 5b495ac8fe03b9e0d2e775f9064c3e2a340ff440 ] vfio_platform_remove_common() cannot return non-NULL in vfio_amba_remove() as the latter is only called if vfio_amba_probe() returned success. Diagnosed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Eric Auger <eric.auger@redhat.com> Link: https://lore.kernel.org/r/20210126165835.687514-4-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-18vfio: Use config not menuconfig for VFIO_NOIOMMUJason Gunthorpe1-1/+1
[ Upstream commit 26c22cfde5dd6e63f25c48458b0185dcb0fbb2fd ] VFIO_NOIOMMU is supposed to be an element in the VFIO menu, not start a new menu. Correct this copy-paste mistake. Fixes: 03a76b60f8ba ("vfio: Include No-IOMMU mode") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/0-v1-3f0b685c3679+478-vfio_menuconfig_jgg@nvidia.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-07-14vfio/pci: Handle concurrent vma faultsAlex Williamson1-8/+21
[ Upstream commit 6a45ece4c9af473555f01f0f8b97eba56e3c7d0d ] io_remap_pfn_range() will trigger a BUG_ON if it encounters a populated pte within the mapping range. This can occur because we map the entire vma on fault and multiple faults can be blocked behind the vma_lock. This leads to traces like the one reported below. We can use our vma_list to test whether a given vma is mapped to avoid this issue. [ 1591.733256] kernel BUG at mm/memory.c:2177! [ 1591.739515] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 1591.747381] Modules linked in: vfio_iommu_type1 vfio_pci vfio_virqfd vfio pv680_mii(O) [ 1591.760536] CPU: 2 PID: 227 Comm: lcore-worker-2 Tainted: G O 5.11.0-rc3+ #1 [ 1591.770735] Hardware name: , BIOS HixxxxFPGA 1P B600 V121-1 [ 1591.778872] pstate: 40400009 (nZcv daif +PAN -UAO -TCO BTYPE=--) [ 1591.786134] pc : remap_pfn_range+0x214/0x340 [ 1591.793564] lr : remap_pfn_range+0x1b8/0x340 [ 1591.799117] sp : ffff80001068bbd0 [ 1591.803476] x29: ffff80001068bbd0 x28: 0000042eff6f0000 [ 1591.810404] x27: 0000001100910000 x26: 0000001300910000 [ 1591.817457] x25: 0068000000000fd3 x24: ffffa92f1338e358 [ 1591.825144] x23: 0000001140000000 x22: 0000000000000041 [ 1591.832506] x21: 0000001300910000 x20: ffffa92f141a4000 [ 1591.839520] x19: 0000001100a00000 x18: 0000000000000000 [ 1591.846108] x17: 0000000000000000 x16: ffffa92f11844540 [ 1591.853570] x15: 0000000000000000 x14: 0000000000000000 [ 1591.860768] x13: fffffc0000000000 x12: 0000000000000880 [ 1591.868053] x11: ffff0821bf3d01d0 x10: ffff5ef2abd89000 [ 1591.875932] x9 : ffffa92f12ab0064 x8 : ffffa92f136471c0 [ 1591.883208] x7 : 0000001140910000 x6 : 0000000200000000 [ 1591.890177] x5 : 0000000000000001 x4 : 0000000000000001 [ 1591.896656] x3 : 0000000000000000 x2 : 0168044000000fd3 [ 1591.903215] x1 : ffff082126261880 x0 : fffffc2084989868 [ 1591.910234] Call trace: [ 1591.914837] remap_pfn_range+0x214/0x340 [ 1591.921765] vfio_pci_mmap_fault+0xac/0x130 [vfio_pci] [ 1591.931200] __do_fault+0x44/0x12c [ 1591.937031] handle_mm_fault+0xcc8/0x1230 [ 1591.942475] do_page_fault+0x16c/0x484 [ 1591.948635] do_translation_fault+0xbc/0xd8 [ 1591.954171] do_mem_abort+0x4c/0xc0 [ 1591.960316] el0_da+0x40/0x80 [ 1591.965585] el0_sync_handler+0x168/0x1b0 [ 1591.971608] el0_sync+0x174/0x180 [ 1591.978312] Code: eb1b027f 540000c0 f9400022 b4fffe02 (d4210000) Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Reported-by: Zeng Tao <prime.zeng@hisilicon.com> Suggested-by: Zeng Tao <prime.zeng@hisilicon.com> Link: https://lore.kernel.org/r/162497742783.3883260.3282953006487785034.stgit@omen Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10vfio/platform: fix module_put call in error flowMax Gurtovoy1-1/+1
[ Upstream commit dc51ff91cf2d1e9a2d941da483602f71d4a51472 ] The ->parent_module is the one that use in try_module_get. It should also be the one the we use in module_put during vfio_platform_open(). Fixes: 32a2d71c4e80 ("vfio: platform: introduce vfio-platform-base module") Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20210518192133.59195-1-mgurtovoy@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10vfio/pci: zap_vma_ptes() needs MMURandy Dunlap1-0/+1
[ Upstream commit 2a55ca37350171d9b43d561528f23d4130097255 ] zap_vma_ptes() is only available when CONFIG_MMU is set/enabled. Without CONFIG_MMU, vfio_pci.o has build errors, so make VFIO_PCI depend on MMU. riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `vfio_pci_mmap_open': vfio_pci.c:(.text+0x1ec): undefined reference to `zap_vma_ptes' riscv64-linux-ld: drivers/vfio/pci/vfio_pci.o: in function `.L0 ': vfio_pci.c:(.text+0x165c): undefined reference to `zap_vma_ptes' Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: kernel test robot <lkp@intel.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Cornelia Huck <cohuck@redhat.com> Cc: kvm@vger.kernel.org Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Eric Auger <eric.auger@redhat.com> Message-Id: <20210515190856.2130-1-rdunlap@infradead.org> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-06-10vfio/pci: Fix error return code in vfio_ecap_init()Zhen Lei1-1/+1
[ Upstream commit d1ce2c79156d3baf0830990ab06d296477b93c26 ] The error code returned from vfio_ext_cap_len() is stored in 'len', not in 'ret'. Fixes: 89e1f7d4c66d ("vfio: Add PCI device driver") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Message-Id: <20210515020458.6771-1-thunder.leizhen@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/mdev: Do not allow a mdev_type to have a NULL parent pointerJason Gunthorpe1-1/+1
[ Upstream commit b5a1f8921d5040bb788492bf33a66758021e4be5 ] There is a small race where the parent is NULL even though the kobj has already been made visible in sysfs. For instance the attribute_group is made visible in sysfs_create_files() and the mdev_type_attr_show() does: ret = attr->show(kobj, type->parent->dev, buf); Which will crash on NULL parent. Move the parent setup to before the type pointer leaves the stack frame. Fixes: 7b96953bc640 ("vfio: Mediated device Core driver") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <2-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/pci: Re-order vfio_pci_probe()Jason Gunthorpe1-8/+9
[ Upstream commit 4aeec3984ddc853f7c65903bde472ffdef738bae ] vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_pci_reflck_attach() sets vdev->reflck and vfio_pci_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. Fixes: cc20d7999000 ("vfio/pci: Introduce VF token") Fixes: e309df5b0c9e ("vfio/pci: Parallelize device open and release") Fixes: 6eb7018705de ("vfio-pci: Move idle devices to D3hot power state") Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <8-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/pci: Move VGA and VF initialization to functionsJason Gunthorpe1-42/+74
[ Upstream commit 61e90817482871b614133c0f20feb1aba2faec86 ] vfio_pci_probe() is quite complicated, with optional VF and VGA sub components. Move these into clear init/uninit functions and have a linear flow in probe/remove. This fixes a few little buglets: - vfio_pci_remove() is in the wrong order, vga_client_register() removes a notifier and is after kfree(vdev), but the notifier refers to vdev, so it can use after free in a race. - vga_client_register() can fail but was ignored Organize things so destruction order is the reverse of creation order. Fixes: ecaa1f6a0154 ("vfio-pci: Add VGA arbiter client") Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <7-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-14vfio/fsl-mc: Re-order vfio_fsl_mc_probe()Jason Gunthorpe1-27/+47
[ Upstream commit 2b1fe162e584a88ec7f12a651a2a50f94dd8cfac ] vfio_add_group_dev() must be called only after all of the private data in vdev is fully setup and ready, otherwise there could be races with user space instantiating a device file descriptor and starting to call ops. For instance vfio_fsl_mc_reflck_attach() sets vdev->reflck and vfio_fsl_mc_open(), called by fops open, unconditionally derefs it, which will crash if things get out of order. This driver started life with the right sequence, but two commits added stuff after vfio_add_group_dev(). Fixes: 2e0d29561f59 ("vfio/fsl-mc: Add irq infrastructure for fsl-mc devices") Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling") Co-developed-by: Diana Craciun OSS <diana.craciun@oss.nxp.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <5-v3-225de1400dfc+4e074-vfio1_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-07vfio: Depend on MMUJason Gunthorpe1-1/+1
commit b2b12db53507bc97d96f6b7cb279e831e5eafb00 upstream. VFIO_IOMMU_TYPE1 does not compile with !MMU: ../drivers/vfio/vfio_iommu_type1.c: In function 'follow_fault_pfn': ../drivers/vfio/vfio_iommu_type1.c:536:22: error: implicit declaration of function 'pte_write'; did you mean 'vfs_write'? [-Werror=implicit-function-declaration] So require it. Suggested-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-02cb5500df6e+78-vfio_no_mmu_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-21vfio/pci: Add missing range check in vfio_pci_mmapChristian A. Ehrhardt1-1/+3
commit 909290786ea335366e21d7f1ed5812b90f2f0a92 upstream. When mmaping an extra device region verify that the region index derived from the mmap offset is valid. Fixes: a15b1883fee1 ("vfio_pci: Allow mapping extra regions") Cc: stable@vger.kernel.org Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> Message-Id: <20210412214124.GA241759@lisa.in-ulm.de> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07vfio/nvlink: Add missing SPAPR_TCE_IOMMU dependsJason Gunthorpe1-1/+1
commit e0146a108ce4d2c22b9510fd12268e3ee72a0161 upstream. Compiling the nvlink stuff relies on the SPAPR_TCE_IOMMU otherwise there are compile errors: drivers/vfio/pci/vfio_pci_nvlink2.c:101:10: error: implicit declaration of function 'mm_iommu_put' [-Werror,-Wimplicit-function-declaration] ret = mm_iommu_put(data->mm, data->mem); As PPC only defines these functions when the config is set. Previously this wasn't a problem by chance as SPAPR_TCE_IOMMU was the only IOMMU that could have satisfied IOMMU_API on POWERNV. Fixes: 179209fa1270 ("vfio: IOMMU_API should be selected") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <0-v1-83dba9768fc3+419-vfio_nvlink2_kconfig_jgg@nvidia.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-25vfio: IOMMU_API should be selectedJason Gunthorpe1-1/+1
commit 179209fa12709a3df8888c323b37315da2683c24 upstream. As IOMMU_API is a kconfig without a description (eg does not show in the menu) the correct operator is select not 'depends on'. Using 'depends on' for this kind of symbol means VFIO is not selectable unless some other random kconfig has already enabled IOMMU_API for it. Fixes: cba3345cc494 ("vfio: VFIO core") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Message-Id: <1-v1-df057e0f92c3+91-vfio_arm_compile_test_jgg@nvidia.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04vfio/type1: Use follow_pte()Alex Williamson1-2/+13
[ Upstream commit 07956b6269d3ed05d854233d5bb776dca91751dd ] follow_pfn() doesn't make sure that we're using the correct page protections, get the pte with follow_pte() so that we can test protections and get the pfn from the pte. Fixes: 5cbf3264bc71 ("vfio/type1: Fix VA->PA translation for PFNMAP VMAs in vaddr_get_pfn()") Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vfio-pci/zdev: fix possible segmentation fault issueMax Gurtovoy1-0/+4
[ Upstream commit 7e31d6dc2c78b2a0ba0039ca97ca98a581e8db82 ] In case allocation fails, we must behave correctly and exit with error. Fixes: e6b817d4b821 ("vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO") Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vfio/iommu_type1: Fix some sanity checks in detach groupKeqian Zhu1-23/+8
[ Upstream commit 4a19f37a3dd3f29997735e61b25ddad24b8abe73 ] vfio_sanity_check_pfn_list() is used to check whether pfn_list and notifier are empty when remove the external domain, so it makes a wrong assumption that only external domain will use the pinning interface. Now we apply the pfn_list check when a vfio_dma is removed and apply the notifier check when all domains are removed. Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04vfio/iommu_type1: Populate full dirty when detach non-pinned groupKeqian Zhu1-1/+16
[ Upstream commit d0a78f91761fcd837da1e7a4b0f8368873adc646 ] If a group with non-pinned-page dirty scope is detached with dirty logging enabled, we should fully populate the dirty bitmaps at the time it's removed since we don't know the extent of its previous DMA, nor will the group be present to trigger the full bitmap when the user retrieves the dirty bitmap. Fixes: d6a4c185660c ("vfio iommu: Implementation of ioctl for dirty pages tracking") Suggested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30vfio/pci/nvlink2: Do not attempt NPU2 setup on POWER8NVL NPUAlexey Kardashevskiy1-2/+5
commit d22f9a6c92de96304c81792942ae7c306f08ac77 upstream. We execute certain NPU2 setup code (such as mapping an LPID to a device in NPU2) unconditionally if an Nvlink bridge is detected. However this cannot succeed on POWER8NVL machines as the init helpers return an error other than ENODEV which means the device is there is and setup failed so vfio_pci_enable() fails and pass through is not possible. This changes the two NPU2 related init helpers to return -ENODEV if there is no "memory-region" device tree property as this is the distinction between NPU and NPU2. Tested on - POWER9 pvr=004e1201, Ubuntu 19.04 host, Ubuntu 18.04 vm, NVIDIA GV100 10de:1db1 driver 418.39 - POWER8 pvr=004c0100, RHEL 7.6 host, Ubuntu 16.10 vm, NVIDIA P100 10de:15f9 driver 396.47 Fixes: 7f92891778df ("vfio_pci: Add NVIDIA GV100GL [Tesla V100 SXM2] subdriver") Cc: stable@vger.kernel.org # 5.0 Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30vfio/pci: Move dummy_resources_list init in vfio_pci_probe()Eric Auger1-2/+1
commit 16b8fe4caf499ae8e12d2ab1b1324497e36a7b83 upstream. In case an error occurs in vfio_pci_enable() before the call to vfio_pci_probe_mmaps(), vfio_pci_disable() will try to iterate on an uninitialized list and cause a kernel panic. Lets move to the initialization to vfio_pci_probe() to fix the issue. Signed-off-by: Eric Auger <eric.auger@redhat.com> Fixes: 05f0c03fbac1 ("vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive") CC: Stable <stable@vger.kernel.org> # v4.7+ Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30vfio-pci: Use io_remap_pfn_range() for PCI IO memoryJason Gunthorpe1-2/+2
[ Upstream commit 7b06a56d468b756ad6bb43ac21b11e474ebc54a0 ] commit f8f6ae5d077a ("mm: always have io_remap_pfn_range() set pgprot_decrypted()") allows drivers using mmap to put PCI memory mapped BAR space into userspace to work correctly on AMD SME systems that default to all memory encrypted. Since vfio_pci_mmap_fault() is working with PCI memory mapped BAR space it should be calling io_remap_pfn_range() otherwise it will not work on SME systems. Fixes: 11c4cd07ba11 ("vfio-pci: Fault mmaps to enable vma tracking") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Acked-by: Peter Xu <peterx@redhat.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-03vfio/pci: Bypass IGD init in case of -ENODEVFred Gao1-1/+1
Bypass the IGD initialization when -ENODEV returns, that should be the case if opregion is not available for IGD or within discrete graphics device's option ROM, or host/lpc bridge is not found. Then use of -ENODEV here means no special device resources found which needs special care for VFIO, but we still allow other normal device resource access. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Xiong Zhang <xiong.y.zhang@intel.com> Cc: Hang Yuan <hang.yuan@linux.intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Fred Gao <fred.gao@intel.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio: platform: fix reference leak in vfio_platform_openZhang Qilong1-2/+1
pm_runtime_get_sync() will increment pm usage counter even it failed. Forgetting to call pm_runtime_put will result in reference leak in vfio_platform_open, so we should fix it. Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com> Acked-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/pci: Implement ioeventfd thread handler for contended memory lockAlex Williamson1-8/+35
The ioeventfd is called under spinlock with interrupts disabled, therefore if the memory lock is contended defer code that might sleep to a thread context. Fixes: bc93b9ae0151 ("vfio-pci: Avoid recursive read-lock usage") Link: https://bugzilla.kernel.org/show_bug.cgi?id=209253#c1 Reported-by: Ian Pilcher <arequipeno@gmail.com> Tested-by: Ian Pilcher <arequipeno@gmail.com> Tested-by: Justin Gatzen <justin.gatzen@gmail.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/fsl-mc: Make vfio_fsl_mc_irqs_allocate staticDiana Craciun1-1/+1
Fixed compiler warning: drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c:16:5: warning: no previous prototype for function 'vfio_fsl_mc_irqs_allocate' [-Wmissing-prototypes] ^ drivers/vfio/fsl-mc/vfio_fsl_mc_intr.c:16:1: note: declare 'static' if the function is not intended to be used outside of this translation unit int vfio_fsl_mc_irqs_allocate(struct vfio_fsl_mc_device *vdev) Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/fsl-mc: prevent underflow in vfio_fsl_mc_mmap()Dan Carpenter1-1/+1
My static analsysis tool complains that the "index" can be negative. There are some checks in do_mmap() which try to prevent underflows but I don't know if they are sufficient for this situation. Either way, making "index" unsigned is harmless so let's do it just to be safe. Fixes: 67247289688d ("vfio/fsl-mc: Allow userspace to MMAP fsl-mc device MMIO regions") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/fsl-mc: return -EFAULT if copy_to_user() failsDan Carpenter1-2/+6
The copy_to_user() function returns the number of bytes remaining to be copied, but this code should return -EFAULT. Fixes: df747bcd5b21 ("vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-11-03vfio/type1: Use the new helper to find vfio_groupZenghui Yu1-12/+5
When attaching a new group to the container, let's use the new helper vfio_iommu_find_iommu_group() to check if it's already attached. There is no functional change. Also take this chance to add a missing blank line. Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-22Merge tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfioLinus Torvalds16-16/+1200
Pull VFIO updates from Alex Williamson: - New fsl-mc vfio bus driver supporting userspace drivers of objects within NXP's DPAA2 architecture (Diana Craciun) - Support for exposing zPCI information on s390 (Matthew Rosato) - Fixes for "detached" VFs on s390 (Matthew Rosato) - Fixes for pin-pages and dma-rw accesses (Yan Zhao) - Cleanups and optimize vconfig regen (Zenghui Yu) - Fix duplicate irq-bypass token registration (Alex Williamson) * tag 'vfio-v5.10-rc1' of git://github.com/awilliam/linux-vfio: (30 commits) vfio iommu type1: Fix memory leak in vfio_iommu_type1_pin_pages vfio/pci: Clear token on bypass registration failure vfio/fsl-mc: fix the return of the uninitialized variable ret vfio/fsl-mc: Fix the dead code in vfio_fsl_mc_set_irq_trigger vfio/fsl-mc: Fixed vfio-fsl-mc driver compilation on 32 bit MAINTAINERS: Add entry for s390 vfio-pci vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFO vfio/fsl-mc: Add support for device reset vfio/fsl-mc: Add read/write support for fsl-mc devices vfio/fsl-mc: trigger an interrupt via eventfd vfio/fsl-mc: Add irq infrastructure for fsl-mc devices vfio/fsl-mc: Added lock support in preparation for interrupt handling vfio/fsl-mc: Allow userspace to MMAP fsl-mc device MMIO regions vfio/fsl-mc: Implement VFIO_DEVICE_GET_REGION_INFO ioctl call vfio/fsl-mc: Implement VFIO_DEVICE_GET_INFO ioctl vfio/fsl-mc: Scan DPRC objects on vfio-fsl-mc driver bind vfio: Introduce capability definitions for VFIO_DEVICE_GET_INFO s390/pci: track whether util_str is valid in the zpci_dev s390/pci: stash version in the zpci_dev vfio/fsl-mc: Add VFIO framework skeleton for fsl-mc devices ...
2020-10-20vfio iommu type1: Fix memory leak in vfio_iommu_type1_pin_pagesXiaoyang Xu1-1/+2
pfn is not added to pfn_list when vfio_add_to_pfn_list fails. vfio_unpin_page_external will exit directly without calling vfio_iova_put_vfio_pfn. This will lead to a memory leak. Fixes: a54eb55045ae ("vfio iommu type1: Add support for mediated devices") Signed-off-by: Xiaoyang Xu <xuxiaoyang2@huawei.com> [aw: simplified logic, add Fixes] Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-19vfio/pci: Clear token on bypass registration failureAlex Williamson1-1/+3
The eventfd context is used as our irqbypass token, therefore if an eventfd is re-used, our token is the same. The irqbypass code will return an -EBUSY in this case, but we'll still attempt to unregister the producer, where if that duplicate token still exists, results in removing the wrong object. Clear the token of failed producers so that they harmlessly fall out when unregistered. Fixes: 6d7425f109d2 ("vfio: Register/unregister irq_bypass_producer") Reported-by: guomin chen <guomin_chen@sina.com> Tested-by: guomin chen <guomin_chen@sina.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-19vfio/fsl-mc: fix the return of the uninitialized variable retDiana Craciun1-1/+1
The vfio_fsl_mc_reflck_attach function may return, on success path, an uninitialized variable. Fix the problem by initializing the return variable to 0. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: f2ba7e8c947b ("vfio/fsl-mc: Added lock support in preparation for interrupt handling") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-16mm: remove the now-unnecessary mmget_still_valid() hackJann Horn1-20/+18
The preceding patches have ensured that core dumping properly takes the mmap_lock. Thanks to that, we can now remove mmget_still_valid() and all its users. Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: "Eric W . Biederman" <ebiederm@xmission.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Hugh Dickins <hughd@google.com> Link: http://lkml.kernel.org/r/20200827114932.3572699-8-jannh@google.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-10-15vfio/fsl-mc: Fix the dead code in vfio_fsl_mc_set_irq_triggerDiana Craciun1-3/+3
Static analysis discovered that some code in vfio_fsl_mc_set_irq_trigger is dead code. Fixed the code by changing the conditions order. Fixes: cc0ee20bd969 ("vfio/fsl-mc: trigger an interrupt via eventfd") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-13vfio/fsl-mc: Fixed vfio-fsl-mc driver compilation on 32 bitDiana Craciun1-0/+1
The FSL_MC_BUS on which the VFIO-FSL-MC driver is dependent on can be compiled on other architectures as well (not only ARM64) including 32 bit architectures. Include linux/io-64-nonatomic-hi-lo.h to make writeq/readq used in the driver available on 32bit platforms. Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-12Merge branches 'v5.10/vfio/fsl-mc-v6' and 'v5.10/vfio/zpci-info-v3' into ↵Alex Williamson12-0/+1151
v5.10/vfio/next
2020-10-12vfio-pci/zdev: Add zPCI capabilities to VFIO_DEVICE_GET_INFOMatthew Rosato5-0/+205
Define a new configuration entry VFIO_PCI_ZDEV for VFIO/PCI. When this s390-only feature is configured we add capabilities to the VFIO_DEVICE_GET_INFO ioctl that describe features of the associated zPCI device and its underlying hardware. This patch is based on work previously done by Pierre Morel. Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-12vfio/fsl-mc: Add support for device resetDiana Craciun1-1/+17
Currently only resetting the DPRC container is supported which will reset all the objects inside it. Resetting individual objects is possible from the userspace by issueing commands towards MC firmware. Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2020-10-12vfio/fsl-mc: Add read/write support for fsl-mc devicesDiana Craciun2-3/+116
The software uses a memory-mapped I/O command interface (MC portals) to communicate with the MC hardware. This command interface is used to discover, enumerate, configure and remove DPAA2 objects. The DPAA2 objects use MSIs, so the command interface needs to be emulated such that the correct MSI is configured in the hardware (the guest has the virtual MSIs). This patch is adding read/write support for fsl-mc devices. The mc commands are emulated by the userspace. The host is just passing the correct command to the hardware. Also the current patch limits userspace to write complete 64byte command once and read 64byte response by one ioctl. Signed-off-by: Bharat Bhushan <Bharat.Bhushan@nxp.com> Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>