summaryrefslogtreecommitdiff
path: root/drivers/vfio/pci/pds/pci_drv.c
AgeCommit message (Collapse)AuthorFilesLines
2024-03-11vfio/pds: Refactor/simplify reset logicBrett Creeley1-22/+5
The current logic for handling resets is more complicated than it needs to be. The deferred_reset flag is used to indicate a reset is needed and the deferred_reset_state is the requested, post-reset, state. Also, the deferred_reset logic was added to vfio migration drivers to prevent a circular locking dependency with respect to mm_lock and state mutex. This is mainly because of the copy_to/from_user() functions(which takes mm_lock) invoked under state mutex. Remove all of the deferred reset logic and just pass the requested next state to pds_vfio_reset() so it can be used for VMM and DSC initiated resets. This removes the need for pds_vfio_state_mutex_lock(), so remove that and replace its use with a simple mutex_unlock(). Also, remove the reset_mutex as it's no longer needed since the state_mutex can be the driver's primary protector. Suggested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Link: https://lore.kernel.org/r/20240308182149.22036-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-11-27vfio/pds: Fix possible sleep while in atomic contextBrett Creeley1-2/+2
The driver could possibly sleep while in atomic context resulting in the following call trace while CONFIG_DEBUG_ATOMIC_SLEEP=y is set: BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2817, name: bash preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 Call Trace: <TASK> dump_stack_lvl+0x36/0x50 __might_resched+0x123/0x170 mutex_lock+0x1e/0x50 pds_vfio_put_lm_file+0x1e/0xa0 [pds_vfio_pci] pds_vfio_put_save_file+0x19/0x30 [pds_vfio_pci] pds_vfio_state_mutex_unlock+0x2e/0x80 [pds_vfio_pci] pci_reset_function+0x4b/0x70 reset_store+0x5b/0xa0 kernfs_fop_write_iter+0x137/0x1d0 vfs_write+0x2de/0x410 ksys_write+0x5d/0xd0 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 This can happen if pds_vfio_put_restore_file() and/or pds_vfio_put_save_file() grab the mutex_lock(&lm_file->lock) while the spin_lock(&pds_vfio->reset_lock) is held, which can happen during while calling pds_vfio_state_mutex_unlock(). Fix this by changing the reset_lock to reset_mutex so there are no such conerns. Also, make sure to destroy the reset_mutex in the driver specific VFIO device release function. This also fixes a spinlock bad magic BUG that was caused by not calling spinlock_init() on the reset_lock. Since, the lock is being changed to a mutex, make sure to call mutex_init() on it. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/kvm/1f9bc27b-3de9-4891-9687-ba2820c1b390@moroto.mountain/ Fixes: bb500dbe2ac6 ("vfio/pds: Add VFIO live migration support") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20231122192532.25791-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-10-24iommufd/iova_bitmap: Move symbols to IOMMUFD namespaceJoao Martins1-0/+1
Have the IOVA bitmap exported symbols adhere to the IOMMUFD symbol export convention i.e. using the IOMMUFD namespace. In doing so, import the namespace in the current users. This means VFIO and the vfio-pci drivers that use iova_bitmap_set(). Link: https://lore.kernel.org/r/20231024135109.73787-4-joao.m.martins@oracle.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-08-16vfio/pds: Add support for firmware recoveryBrett Creeley1-0/+114
It's possible that the device firmware crashes and is able to recover due to some configuration and/or other issue. If a live migration is in progress while the firmware crashes, the live migration will fail. However, the VF PCI device should still be functional post crash recovery and subsequent migrations should go through as expected. When the pds_core device notices that firmware crashes it sends an event to all its client drivers. When the pds_vfio driver receives this event while migration is in progress it will request a deferred reset on the next migration state transition. This state transition will report failure as well as any subsequent state transition requests from the VMM/VFIO. Based on uapi/vfio.h the only way out of VFIO_DEVICE_STATE_ERROR is by issuing VFIO_DEVICE_RESET. Once this reset is done, the migration state will be reset to VFIO_DEVICE_STATE_RUNNING and migration can be performed. If the event is received while no migration is in progress (i.e. the VM is in normal operating mode), then no actions are taken and the migration state remains VFIO_DEVICE_STATE_RUNNING. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-8-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-08-16vfio/pds: Add VFIO live migration supportBrett Creeley1-0/+13
Add live migration support via the VFIO subsystem. The migration implementation aligns with the definition from uapi/vfio.h and uses the pds_core PF's adminq for device configuration. The ability to suspend, resume, and transfer VF device state data is included along with the required admin queue command structures and implementations. PDS_LM_CMD_SUSPEND and PDS_LM_CMD_SUSPEND_STATUS are added to support the VF device suspend operation. PDS_LM_CMD_RESUME is added to support the VF device resume operation. PDS_LM_CMD_STATE_SIZE is added to determine the exact size of the VF device state data. PDS_LM_CMD_SAVE is added to get the VF device state data. PDS_LM_CMD_RESTORE is added to restore the VF device with the previously saved data from PDS_LM_CMD_SAVE. PDS_LM_CMD_HOST_VF_STATUS is added to notify the DSC/firmware when a migration is in/not-in progress from the host's perspective. The DSC/firmware can use this to clear/setup any necessary state related to a migration. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-6-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-08-16vfio/pds: register with the pds_core PFBrett Creeley1-0/+14
The pds_core driver will supply adminq services, so find the PF and register with the DSC services. Use the following commands to enable a VF: echo 1 > /sys/bus/pci/drivers/pds_core/$PF_BDF/sriov_numvfs Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-5-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2023-08-16vfio/pds: Initial support for pds VFIO driverBrett Creeley1-0/+68
This is the initial framework for the new pds-vfio-pci device driver. This does the very basics of registering the PDS PCI device and configuring it as a VFIO PCI device. With this change, the VF device can be bound to the pds-vfio-pci driver on the host and presented to the VM as an ethernet VF. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20230807205755.29579-3-brett.creeley@amd.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com>