path: root/arch/s390/pci/pci_dma.c
Age | Commit message | Author | Files | Lines
2023-10-02 | s390/pci: Use dma-iommu layer | Niklas Schnelle | 1 | -735/+0
While s390 already has a standard IOMMU driver and previous changes have added I/O TLB flushing operations, this driver is currently only used for user-space PCI access such as vfio-pci. For the DMA API, s390 instead uses its own implementation in arch/s390/pci/pci_dma.c which drives the same hardware and shares some code, but requires a complex and fragile hand-over between DMA API and IOMMU API use of a device and, despite the code sharing, still leads to significant duplication and maintenance effort. Let's use the common-code DMA API implementation from drivers/iommu/dma-iommu.c instead, allowing us to get rid of arch/s390/pci/pci_dma.c. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Link: https://lore.kernel.org/r/20230928-dma_iommu-v13-3-9e5fc4dacc36@linux.ibm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25 | iommu/s390: Use GFP_KERNEL in sleepable contexts | Jason Gunthorpe | 1 | -1/+1
These contexts are sleepable, so use the proper annotation. The GFP_ATOMIC was added mechanically in the prior patches. Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-01-25 | iommu/s390: Push the gfp parameter to the kmem_cache_alloc()'s | Jason Gunthorpe | 1 | -14/+17
dma_alloc_cpu_table() and dma_alloc_page_table() are eventually called by iommufd through s390_iommu_map_pages() and should not be forced to be atomic. Thread the gfp parameter through the call chain starting from s390_iommu_map_pages(). Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/9-v3-76b587fe28df+6e3-iommu_map_gfp_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
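A minimal userspace model of the idea, with the allocation flags threaded from the mapping path down to the table allocator; names and flag values here are illustrative, not the kernel's:

    #include <stdlib.h>

    typedef unsigned int gfp_t;
    #define MY_GFP_ATOMIC 0x1u          /* placeholder flag values */
    #define MY_GFP_KERNEL 0x2u

    /* the real dma_alloc_cpu_table()/dma_alloc_page_table() would hand
     * gfp on to kmem_cache_alloc(); here we just accept it */
    static unsigned long *alloc_table(gfp_t gfp)
    {
        (void)gfp;
        return calloc(512, sizeof(unsigned long));
    }

    /* models the map path: the caller, which knows it may sleep,
     * chooses the flags instead of the allocator forcing ATOMIC */
    static int map_page(unsigned long *dir, unsigned long idx,
                        unsigned long pa, gfp_t gfp)
    {
        unsigned long *pt = alloc_table(gfp);

        if (!pt)
            return -1;                  /* -ENOMEM in the kernel */
        pt[idx & 0x1ff] = pa | 1;
        dir[idx >> 9] = (unsigned long)pt;
        return 0;
    }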
2022-11-19 | s390/pci: use lock-free I/O translation updates | Niklas Schnelle | 1 | -29/+45
I/O translation tables on s390 use 8 byte page table entries and tables which are allocated lazily but only freed when the entire I/O translation table is torn down. Also, each IOVA can at any time only translate to one physical address. Furthermore, I/O table accesses by the IOMMU hardware are cache coherent. With a bit of care we can thus use atomic updates to manipulate the translation table without having to use a global lock at all. This is done analogously to the existing I/O translation table handling code used on Intel and AMD x86 systems. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Link: https://lore.kernel.org/r/20221109142903.4080275-6-schnelle@linux.ibm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
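A sketch in userspace C11 of the lock-free pattern this describes: because an entry is 8 bytes, maps to a single physical address, and its table is never freed while in use, one atomic compare-and-swap per entry is enough and no global lock is needed. Flag encodings and names below are made up for illustration.

    #include <stdatomic.h>
    #include <stdint.h>

    #define PTE_FLAG_VALID   0x0ULL     /* illustrative encodings */
    #define PTE_FLAG_INVALID 0x1ULL
    #define PTE_ADDR_MASK    (~0xfffULL)

    /* atomically switch one I/O PTE between invalid and valid;
     * concurrent mappers only ever touch distinct IOVAs, so a single
     * cmpxchg keeps each 8-byte entry consistent without locking */
    static int pte_update(_Atomic uint64_t *pte, uint64_t phys, int valid)
    {
        uint64_t old = atomic_load(pte);
        uint64_t new = valid ? ((phys & PTE_ADDR_MASK) | PTE_FLAG_VALID)
                             : PTE_FLAG_INVALID;

        return atomic_compare_exchange_strong(pte, &old, new) ? 0 : -1;
    }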
2022-11-19 | iommu/s390: Make attach succeed even if the device is in error state | Niklas Schnelle | 1 | -1/+2
If a zPCI device is in the error state while switching IOMMU domains zpci_register_ioat() will fail and we would end up with the device not attached to any domain. In this state since zdev->dma_table == NULL a reset via zpci_hot_reset_device() would wrongfully re-initialize the device for DMA API usage using zpci_dma_init_device(). As automatic recovery is currently disabled while attached to an IOMMU domain this only affects slot resets triggered through other means but will affect automatic recovery once we switch to using dma-iommu. Additionally with that switch common code expects attaching to the default domain to always work so zpci_register_ioat() should only fail if there is no chance to recover anyway, e.g. if the device has been unplugged. Improve the robustness of attach by specifically looking at the status returned by zpci_mod_fc() to determine if the device is unavailable and in this case simply ignore the error. Once the device is reset zpci_hot_reset_device() will then correctly set the domain's DMA translation tables. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20221109142903.4080275-2-schnelle@linux.ibm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-09-14 | s390/pci: convert high_memory to physical address | Niklas Schnelle | 1 | -1/+1
We use high_memory as a measure for the amount of memory available when determining the required minimum size of our IOVA space, with the assumption that one rarely maps more than the available memory for DMA. In special cases like mapping significant amounts of memory more than once this can still be tuned with the s390_iommu_aperture kernel parameter. In this use case high_memory is treated as a physical address. As high_memory is a virtual address, however, this means we need to convert it using virt_to_phys() before use. Note that at the moment physical and virtual addresses are identical so this mismatch does not currently cause trouble. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-12-06 | s390/pci: use physical addresses in DMA tables | Niklas Schnelle | 1 | -13/+13
The entries in the DMA translation tables for our IOMMU must specify physical addresses of either the next level table or the final page to be mapped for DMA. Currently however the code simply passes the virtual addresses of both. On the other hand we still need to walk the tables via their virtual addresses, so we need to do a virt_to_phys() when setting the entries and a phys_to_virt() when getting them. Similarly when passing the I/O translation anchor to the hardware we must also specify its physical address. As the DMA and IOMMU APIs we are implementing already use the correct phys_addr_t type for the address to be mapped, let's also thread this through instead of treating it as just an unsigned long. Note: this currently doesn't fix a real bug, since virtual addresses are identical to physical ones. Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
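Roughly, the convention this establishes looks like the sketch below: entries store physical addresses while the CPU keeps walking the tables through virtual pointers, converting on the way in and out (modeled here with identity conversions, which is also why the old code happened to work):

    #include <stdint.h>

    #define TABLE_ENTRY_VALID 0x1ULL            /* illustrative flag */
    #define TABLE_ADDR_MASK   (~0xfffULL)

    /* identity conversions stand in for virt_to_phys()/phys_to_virt() */
    static uint64_t my_virt_to_phys(void *v) { return (uint64_t)(uintptr_t)v; }
    static void *my_phys_to_virt(uint64_t p) { return (void *)(uintptr_t)p; }

    /* setting an entry: store the *physical* address of the next table */
    static void entry_set_next(uint64_t *entry, void *next_table)
    {
        *entry = (my_virt_to_phys(next_table) & TABLE_ADDR_MASK)
                 | TABLE_ENTRY_VALID;
    }

    /* walking: convert back to a *virtual* pointer the CPU can follow */
    static uint64_t *entry_next(uint64_t entry)
    {
        return my_phys_to_virt(entry & TABLE_ADDR_MASK);
    }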
2021-10-26 | s390/pci: add s390_iommu_aperture kernel parameter | Niklas Schnelle | 1 | -2/+23
Some applications map the same memory area for DMA multiple times while also mapping significant amounts of memory. With our current DMA code these applications will run out of DMA addresses after mapping half of the available memory because the number of DMA mappings is constrained by the number of concurrently active DMA addresses we support which in turn is limited by the minimum of hardware constraints and high_memory. Limiting the number of active DMA addresses to high_memory is only a heuristic to save memory used by the iommu_bitmap and DMA page tables however. This was added under the assumption that it rarely makes sense to DMA map more than system memory. To accommodate special applications which insist on double mapping, which works on other platforms, allow specifying a factor of how many times installed memory is available as DMA address space. Use 0 as a special value to apply no constraints beyond what hardware dictates at the expense of significantly more memory use. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
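The sizing logic this adds can be modeled like this (names are illustrative; in the kernel the hardware limit comes from the CLP query and the memory size from high_memory):

    #include <stdint.h>

    /* s390_iommu_aperture = 0 means "hardware limit only"; otherwise the
     * IOVA space is sized to aperture_factor * installed memory, capped
     * by what the hardware supports */
    static uint64_t iova_space_size(uint64_t hw_limit, uint64_t mem_size,
                                    unsigned int aperture_factor)
    {
        uint64_t wanted;

        if (aperture_factor == 0)
            return hw_limit;
        wanted = (uint64_t)aperture_factor * mem_size;
        return wanted < hw_limit ? wanted : hw_limit;
    }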
2021-09-02 | Merge tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping | Linus Torvalds | 1 | -6/+7
Pull dma-mapping updates from Christoph Hellwig: - fix debugfs initialization order (Anthony Iliopoulos) - use memory_intersects() directly (Kefeng Wang) - allow to return specific errors from ->map_sg (Logan Gunthorpe, Martin Oliveira) - turn the dma_map_sg return value into an unsigned int (me) - provide a common global coherent pool implementation (me) * tag 'dma-mapping-5.15' of git://git.infradead.org/users/hch/dma-mapping: (31 commits) hexagon: use the generic global coherent pool dma-mapping: make the global coherent pool conditional dma-mapping: add a dma_init_global_coherent helper dma-mapping: simplify dma_init_coherent_memory dma-mapping: allow using the global coherent pool for !ARM ARM/nommu: use the generic dma-direct code for non-coherent devices dma-direct: add support for dma_coherent_default_memory dma-mapping: return an unsigned int from dma_map_sg{,_attrs} dma-mapping: disallow .map_sg operations from returning zero on error dma-mapping: return error code from dma_dummy_map_sg() x86/amd_gart: don't set failed sg dma_address to DMA_MAPPING_ERROR x86/amd_gart: return error code from gart_map_sg() xen: swiotlb: return error code from xen_swiotlb_map_sg() parisc: return error code from .map_sg() ops sparc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR sparc/iommu: return error codes from .map_sg() ops s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR s390/pci: return error code from s390_dma_map_sg() powerpc/iommu: don't set failed sg dma_address to DMA_MAPPING_ERROR powerpc/iommu: return error code from .map_sg() ops ...
2021-08-25 | s390/pci: improve DMA translation init and exit | Niklas Schnelle | 1 | -8/+17
Currently zpci_dma_init_device()/zpci_dma_exit_device() is called as part of zpci_enable_device()/zpci_disable_device() and errors for zpci_dma_exit_device() are always ignored even if we could abort. Improve upon this by moving zpci_dma_exit_device() out of zpci_disable_device() and check for errors whenever we have a way to abort the current operation. Note that for example in zpci_event_hard_deconfigured() the device is expected to be gone so we really can't abort and proceed even in case of error. Similarly move the cc == 3 special case out of zpci_unregister_ioat() and into the callers, allowing them to abort when finding an already disabled device precludes proceeding with the operation. While we are at it, log IOAT register/unregister errors in the s390 debugfs log. Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-09 | s390/pci: don't set failed sg dma_address to DMA_MAPPING_ERROR | Logan Gunthorpe | 1 | -1/+0
Setting the ->dma_address to DMA_MAPPING_ERROR is not part of the ->map_sg calling convention, so remove it. Link: https://lore.kernel.org/linux-mips/20210716063241.GC13345@lst.de/ Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Cc: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-08-09 | s390/pci: return error code from s390_dma_map_sg() | Martin Oliveira | 1 | -5/+7
The .map_sg() op now expects an error code instead of zero on failure. So propagate the error from __s390_dma_map_sg() up. __s390_dma_map_sg() returns either -ENOMEM on allocation failure or -EINVAL which is the same as what's expected by dma_map_sgtable(). Signed-off-by: Martin Oliveira <martin.oliveira@eideticom.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
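A simplified model of the calling convention these two patches move to: a failed .map_sg() unwinds what it already mapped and returns a negative errno (here -EINVAL or -ENOMEM) rather than 0, and it no longer poisons dma_address with a sentinel. The struct below is a stand-in, not the kernel's scatterlist.

    #include <errno.h>

    struct fake_sg { unsigned long dma_address; unsigned int length; };

    /* stand-in for the per-entry mapping step */
    static int map_one(struct fake_sg *s, unsigned long dma)
    {
        if (!s->length)
            return -EINVAL;
        s->dma_address = dma;
        return 0;
    }

    static int model_map_sg(struct fake_sg *sgl, int nents)
    {
        int i, ret;

        for (i = 0; i < nents; i++) {
            ret = map_one(&sgl[i], 0x1000UL * (i + 1));
            if (ret)
                goto unwind;
        }
        return nents;                   /* success: number of mapped entries */

    unwind:
        while (--i >= 0)
            sgl[i].dma_address = 0;     /* undo, but don't set a sentinel */
        return ret;                     /* propagate -EINVAL / -ENOMEM */
    }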
2020-10-06 | dma-mapping: split <linux/dma-mapping.h> | Christoph Hellwig | 1 | -1/+1
Split out all the bits that are purely for dma_map_ops implementations and related code into a new <linux/dma-map-ops.h> header so that they don't get pulled into all the drivers. That also means the architecture specific <asm/dma-mapping.h> is not pulled in by <linux/dma-mapping.h> any more, which exposes missing includes that were previously pulled in by the x86 or arm versions in a few not overly portable drivers. Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-09-25 | dma-mapping: add a new dma_alloc_pages API | Christoph Hellwig | 1 | -0/+2
This API is the equivalent of alloc_pages, except that the returned memory is guaranteed to be DMA addressable by the passed in device. The implementation will also be used to provide a more sensible replacement for the DMA_ATTR_NON_CONSISTENT flag. Additionally, dma_alloc_noncoherent is switched over to use dma_alloc_pages as its backend. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> (MIPS part)
2020-09-03 | dma-mapping: introduce dma_get_seg_boundary_nr_pages() | Nicolin Chen | 1 | -4/+2
We found that callers of dma_get_seg_boundary mostly do an ALIGN with page mask and then do a page shift to get number of pages: ALIGN(boundary + 1, 1 << shift) >> shift However, the boundary might be as large as ULONG_MAX, which means that a device has no specific boundary limit. So either "+ 1" or passing it to ALIGN() would potentially overflow. According to kernel defines: #define ALIGN_MASK(x, mask) (((x) + (mask)) & ~(mask)) #define ALIGN(x, a) ALIGN_MASK(x, (typeof(x))(a) - 1) We can simplify the logic here into a helper function doing: ALIGN(boundary + 1, 1 << shift) >> shift = ALIGN_MASK(b + 1, (1 << s) - 1) >> s = {[b + 1 + (1 << s) - 1] & ~[(1 << s) - 1]} >> s = [b + 1 + (1 << s) - 1] >> s = [b + (1 << s)] >> s = (b >> s) + 1 This patch introduces and applies dma_get_seg_boundary_nr_pages() as an overflow-free helper for the dma_get_seg_boundary() callers to get numbers of pages. It also takes care of the NULL dev case for non-DMA API callers. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Christoph Hellwig <hch@lst.de>
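The simplification spelled out above can be checked with a few lines of C; the point is that (boundary >> shift) + 1 never overflows, while ALIGN(boundary + 1, 1 << shift) does when boundary is ULONG_MAX:

    #include <assert.h>
    #include <limits.h>

    static unsigned long seg_boundary_nr_pages(unsigned long boundary_mask,
                                               unsigned int page_shift)
    {
        return (boundary_mask >> page_shift) + 1;
    }

    int main(void)
    {
        /* 64 KiB boundary with 4 KiB pages -> 16 pages */
        assert(seg_boundary_nr_pages(0xffff, 12) == 16);
        /* "no boundary" (ULONG_MAX) no longer wraps around */
        assert(seg_boundary_nr_pages(ULONG_MAX, 12) == (ULONG_MAX >> 12) + 1);
        return 0;
    }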
2019-09-19 | Merge tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping | Linus Torvalds | 1 | -0/+2
Pull dma-mapping updates from Christoph Hellwig: - add dma-mapping and block layer helpers to take care of IOMMU merging for mmc plus subsequent fixups (Yoshihiro Shimoda) - rework handling of the pgprot bits for remapping (me) - take care of the dma direct infrastructure for swiotlb-xen (me) - improve the dma noncoherent remapping infrastructure (me) - better defaults for ->mmap, ->get_sgtable and ->get_required_mask (me) - cleanup mmaping of coherent DMA allocations (me) - various misc cleanups (Andy Shevchenko, me) * tag 'dma-mapping-5.4' of git://git.infradead.org/users/hch/dma-mapping: (41 commits) mmc: renesas_sdhi_internal_dmac: Add MMC_CAP2_MERGE_CAPABLE mmc: queue: Fix bigger segments usage arm64: use asm-generic/dma-mapping.h swiotlb-xen: merge xen_unmap_single into xen_swiotlb_unmap_page swiotlb-xen: simplify cache maintainance swiotlb-xen: use the same foreign page check everywhere swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable xen: remove the exports for xen_{create,destroy}_contiguous_region xen/arm: remove xen_dma_ops xen/arm: simplify dma_cache_maint xen/arm: use dev_is_dma_coherent xen/arm: consolidate page-coherent.h xen/arm: use dma-noncoherent.h calls for xen-swiotlb cache maintainance arm: remove wrappers for the generic dma remap helpers dma-mapping: introduce a dma_common_find_pages helper dma-mapping: always use VM_DMA_COHERENT for generic DMA remap vmalloc: lift the arm flag for coherent mappings to common code dma-mapping: provide a better default ->get_required_mask dma-mapping: remove the dma_declare_coherent_memory export remoteproc: don't allow modular build ...
2019-09-04 | dma-mapping: explicitly wire up ->mmap and ->get_sgtable | Christoph Hellwig | 1 | -0/+2
While the default ->mmap and ->get_sgtable implementations work for the majority of our dma_map_ops implementations, they are inherently unsafe for others that don't use the page allocator or CMA and/or use their own way of remapping not covered by the common code. So remove the defaults if these methods are not wired up, but instead wire up the default implementations for all safe instances. Fixes: e1c7e324539a ("dma-mapping: always provide the dma_map_ops based implementation") Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-08-29 | s390/pci: avoid using strncmp with hardcoded length | Vasily Gorbik | 1 | -2/+2
Command line option values passed to __setup callbacks are always null-terminated and "s390_iommu=" may only accept "strict" as value. So replace strncmp with strcmp. While at it also make s390_iommu_setup return 1, which means this command line option is handled by this callback. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2018-12-20 | dma-mapping: zero memory returned from dma_alloc_* | Christoph Hellwig | 1 | -1/+1
If we want to map memory from the DMA allocator to userspace it must be zeroed at allocation time to prevent stale data leaks. We already do this on most common architectures, but some architectures don't do this yet, fix them up, either by passing GFP_ZERO when we use the normal page allocator or doing a manual memset otherwise. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Sam Ravnborg <sam@ravnborg.org> [sparc]
2018-12-06 | s390: remove the mapping_error dma_map_ops method | Christoph Hellwig | 1 | -13/+5
S390 already returns (~(dma_addr_t)0x0) on mapping failures, so we can switch over to returning DMA_MAPPING_ERROR and let the core dma-mapping code handle the rest. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
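In effect the mapping path can now report errors with the shared sentinel and the generic error check, roughly like this sketch (types simplified):

    #include <stdint.h>

    typedef uint64_t dma_addr_t;
    #define DMA_MAPPING_ERROR (~(dma_addr_t)0)   /* the generic sentinel */

    /* the arch map routine simply returns the sentinel on failure ... */
    static dma_addr_t model_map_page(int fail)
    {
        return fail ? DMA_MAPPING_ERROR : (dma_addr_t)0x1000;
    }

    /* ... and the core check needs no per-ops ->mapping_error hook */
    static int model_mapping_error(dma_addr_t addr)
    {
        return addr == DMA_MAPPING_ERROR;
    }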
2018-05-08 | dma-debug: move initialization to common code | Christoph Hellwig | 1 | -9/+0
Most mainstream architectures are using 65536 entries, so let's stick to that. If someone is really desperate to override it that can still be done through <asm/dma-mapping.h>, but I'd rather see a really good rationale for that. dma_debug_init is now called as a core_initcall, which for many architectures means much earlier, and provides dma-debug functionality earlier in the boot process. This should be safe as it only relies on the memory allocator already being available. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com>
2018-05-07 | PCI: remove PCI_DMA_BUS_IS_PHYS | Christoph Hellwig | 1 | -2/+0
This was used by the ide, scsi and networking code in the past to determine if they should bounce payloads. Now that dma mapping always has to support dma to all physical memory (thanks to swiotlb for non-iommu systems) there is no need for this crude hack any more. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Palmer Dabbelt <palmer@sifive.com> (for riscv) Reviewed-by: Jens Axboe <axboe@kernel.dk>
2017-12-13 | s390/pci: handle insufficient resources during dma tlb flush | Sebastian Ott | 1 | -2/+19
In a virtualized setup lazy flushing can lead to the hypervisor running out of resources when lots of guest pages need to be pinned. In this situation simply trigger a global flush to give the hypervisor a chance to free some of these resources. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
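The recovery path described above boils down to something like the following sketch; the function names are placeholders standing in for the s390-internal flush primitives:

    #include <errno.h>

    /* stand-ins for the range and global TLB flush primitives */
    static int flush_range(unsigned long dma_addr, unsigned long size)
    {
        (void)dma_addr; (void)size;
        return -ENOMEM;                 /* pretend the hypervisor is full */
    }
    static int flush_global(void) { return 0; }

    static int purge_tlb(unsigned long dma_addr, unsigned long size)
    {
        int rc = flush_range(dma_addr, size);

        /* out of pinning resources: one global flush lets the hypervisor
         * release resources, then the targeted flush is retried */
        if (rc == -ENOMEM) {
            rc = flush_global();
            if (!rc)
                rc = flush_range(dma_addr, size);
        }
        return rc;
    }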
2017-11-24 | s390: pci: add SPDX identifiers to the remaining files | Greg Kroah-Hartman | 1 | -0/+1
It's good to have SPDX identifiers in all files to make it easier to audit the kernel tree for correct licenses. Update the arch/s390/pci/ files with the correct SPDX license identifier based on the license text in the file itself. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This work is based on a script and data from Thomas Gleixner, Philippe Ombredanne, and Kate Stewart. Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-07 | Merge tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping | Linus Torvalds | 1 | -5/+13
Pull dma-mapping infrastructure from Christoph Hellwig: "This is the first pull request for the new dma-mapping subsystem In this new subsystem we'll try to properly maintain all the generic code related to dma-mapping, and will further consolidate arch code into common helpers. This pull request contains: - removal of the DMA_ERROR_CODE macro, replacing it with calls to ->mapping_error so that the dma_map_ops instances are more self contained and can be shared across architectures (me) - removal of the ->set_dma_mask method, which duplicates the ->dma_capable one in terms of functionality, but requires more duplicate code. - various updates for the coherent dma pool and related arm code (Vladimir) - various smaller cleanups (me)" * tag 'dma-mapping-4.13' of git://git.infradead.org/users/hch/dma-mapping: (56 commits) ARM: dma-mapping: Remove traces of NOMMU code ARM: NOMMU: Set ARM_DMA_MEM_BUFFERABLE for M-class cpus ARM: NOMMU: Introduce dma operations for noMMU drivers: dma-mapping: allow dma_common_mmap() for NOMMU drivers: dma-coherent: Introduce default DMA pool drivers: dma-coherent: Account dma_pfn_offset when used with device tree dma: Take into account dma_pfn_offset dma-mapping: replace dmam_alloc_noncoherent with dmam_alloc_attrs dma-mapping: remove dmam_free_noncoherent crypto: qat - avoid an uninitialized variable warning au1100fb: remove a bogus dma_free_nonconsistent call MAINTAINERS: add entry for dma mapping helpers powerpc: merge __dma_set_mask into dma_set_mask dma-mapping: remove the set_dma_mask method powerpc/cell: use the dma_supported method for ops switching powerpc/cell: clean up fixed mapping dma_ops initialization tile: remove dma_supported and mapping_error methods xen-swiotlb: remove xen_swiotlb_set_dma_mask arm: implement ->dma_supported instead of ->set_dma_mask mips/loongson64: implement ->dma_supported instead of ->set_dma_mask ...
2017-06-28 | s390: implement ->mapping_error | Christoph Hellwig | 1 | -5/+13
s390 can also use noop_dma_ops, and while that currently does not return errors it will do so in the future. Implementing the mapping_error method is the proper way to have per-ops error conditions. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
2017-06-28 | s390/pci: improve unreg_ioat error handling | Sebastian Ott | 1 | -1/+3
DMA tables are freed in zpci_dma_exit_device regardless of the return code of zpci_unregister_ioat. This could lead to a use after free. On the other hand, during function hot-unplug zpci_unregister_ioat will always fail since the function is already gone. So let zpci_unregister_ioat report success when the function is gone, but don't clean up the dma table when a function could still access it. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-24 | treewide: Constify most dma_map_ops structures | Bart Van Assche | 1 | -1/+1
Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. This patch has been generated as follows: git grep -l 'struct dma_map_ops' | xargs -d\\n sed -i \ -e 's/struct dma_map_ops/const struct dma_map_ops/g' \ -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \ -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \ -e 's/const const struct dma_map_ops /const struct dma_map_ops /g'; sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops intel_dma_ops'); sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc); sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \ -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \ -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \ drivers/pci/host/*.c sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Russell King <linux@armlinux.org.uk> Cc: x86@kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux | Linus Torvalds | 1 | -17/+19
Pull s390 updates from Martin Schwidefsky: "The main bulk of the s390 patches for the 4.10 merge window: - Add support for the contiguous memory allocator. - The recovery for I/O errors in the dasd device driver is improved, the driver will now remove channel paths that are not working properly. - Additional fields are added to /proc/sysinfo, the extended partition name and the partition UUID. - New naming for PCI devices with system defined UIDs. - The last few remaining alloc_bootmem calls are converted to memblock. - The thread_info structure is stripped down and moved to the task_struct. The only field left in thread_info is the flags field. - Rework of the arch topology code to fix a fake numa issue. - Refactoring of the atomic primitives and add a new preempt_count implementation. - Clocksource steering for the STP sync check offsets. - The s390 specific headers are changed to make them usable with CLANG. - Bug fixes and cleanup" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits) s390/cpumf: Use configuration level indication for sampling data s390: provide memmove implementation s390: cleanup arch/s390/kernel Makefile s390: fix initrd corruptions with gcov/kcov instrumented kernels s390: exclude early C code from gcov profiling s390/dasd: channel path aware error recovery s390/dasd: extend dasd path handling s390: remove unused labels from entry.S s390/vmlogrdr: fix IUCV buffer allocation s390/crypto: unlock on error in prng_tdes_read() s390/sysinfo: show partition extended name and UUID if available s390/numa: pin all possible cpus to nodes early s390/numa: establish cpu to node mapping early s390/topology: use cpu_topology array instead of per cpu variable s390/smp: initialize cpu_present_mask in setup_arch s390/topology: always use s390 specific sched_domain_topology_level s390/smp: use smp_get_base_cpu() helper function s390/numa: always use logical cpu and core ids s390: Remove VLAIS in ptff() and clear_table() s390: fix machine check panic stack switch ...
2016-11-17 | s390/pci_dma: remove memset from dma_alloc | Sebastian Ott | 1 | -2/+0
Get rid of a useless memset from dma_alloc. Users of dma_alloc who want zero initialized memory can get it by specifying __GFP_ZERO or use one of the zalloc variants. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 | s390/pci_dma: make lazy flush independent from the tlb_refresh bit | Sebastian Ott | 1 | -8/+11
We have 2 strategies to reduce the number of RPCIT instructions: * A HW feature indicated via the tlb_refresh bit allows us to omit RPCIT for invalid -> valid translation-table entry updates. * With "lazy flush" we omit RPCIT for valid -> invalid updates until we run out of dma addresses. When we have to reuse dma addresses we issue a global tlb flush using only one RPCIT instruction. Currently lazy flushing depends on tlb_refresh. Since there is no technical reason for this remove this dependency. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-17 | s390/pci: fix dma address calculation in map_sg | Sebastian Ott | 1 | -7/+8
__s390_dma_map_sg maps a dma-contiguous area. Although we only map whole pages we have to take into account that the area doesn't start or stop at a page boundary because we use the dma address to loop over the individual sg entries. Failing to do that might lead to an access of the wrong sg entry. Fixes: ee877b81c6b9 ("s390/pci_dma: improve map_sg") Reported-and-tested-by: Christoph Raisch <raisch@de.ibm.com> Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-11-11 | s390: pci: don't print uninitialized data for debugging | Arnd Bergmann | 1 | -1/+1
gcc correctly warns about an incorrect use of the 'pa' variable in case we pass an empty scatterlist to __s390_dma_map_sg: arch/s390/pci/pci_dma.c: In function '__s390_dma_map_sg': arch/s390/pci/pci_dma.c:309:13: warning: 'pa' may be used uninitialized in this function [-Wmaybe-uninitialized] This adds a bogus initialization to the function to sanitize the debug output. I would have preferred a solution without the initialization, but I only got the report from the kbuild bot after turning on the warning again, and didn't manage to reproduce it myself. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-09-22 | s390/pci_dma: improve lazy flush for unmap | Sebastian Ott | 1 | -14/+33
Lazy unmap (defer tlb flush after unmap until dma address reuse) can greatly reduce the number of RPCIT instructions in the best case. In reality we are often far away from the best case scenario because our implementation suffers from the following problem: To create dma addresses we maintain an iommu bitmap and a pointer into that bitmap to mark the start of the next search. That pointer moves from the start to the end of that bitmap and we issue a global tlb flush once that pointer wraps around. To prevent address reuse before we issue the tlb flush we even have to move the next pointer during unmaps - when clearing a bit > next. This could lead to a situation where we only use the rear part of that bitmap and issue more tlb flushes than expected. To fix this we no longer clear bits during unmap but maintain a 2nd bitmap which we use to mark addresses that can't be reused until we issue the global tlb flush after wrap around. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
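A compact userspace model of the two-bitmap scheme: unmap only marks the address in a second bitmap, and both bitmaps are cleared together right after the single global flush that happens on wrap-around. Sizes and names are made up for illustration.

    #include <stdbool.h>

    #define NR_IOVA 64

    static bool iommu_bitmap[NR_IOVA];  /* addresses that may not be handed out */
    static bool lazy_bitmap[NR_IOVA];   /* freed, but TLB flush still pending */
    static unsigned int next;

    static void global_tlb_flush(void) { /* one RPCIT in the real code */ }

    static void free_iova(unsigned int i)
    {
        lazy_bitmap[i] = true;          /* keep iommu_bitmap set until the flush */
    }

    static long alloc_iova(void)
    {
        for (unsigned int n = 0; n < NR_IOVA; n++) {
            unsigned int pos = next + n;
            unsigned int i = pos % NR_IOVA;

            if (pos == NR_IOVA) {       /* wrapped: flush once, then reuse */
                global_tlb_flush();
                for (unsigned int j = 0; j < NR_IOVA; j++)
                    if (lazy_bitmap[j])
                        iommu_bitmap[j] = lazy_bitmap[j] = false;
            }
            if (!iommu_bitmap[i]) {
                iommu_bitmap[i] = true;
                next = i + 1;
                return i;
            }
        }
        return -1;                      /* genuinely out of DMA addresses */
    }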
2016-09-22 | s390/pci_dma: split dma_update_trans | Sebastian Ott | 1 | -21/+44
Split dma_update_trans into __dma_update_trans which handles updating the dma translation tables and __dma_purge_tlb which takes care of purging associated entries in the dma translation lookaside buffer. The map_sg API makes use of this split approach by calling __dma_update_trans once per physically contiguous address range but __dma_purge_tlb only once per dma contiguous address range. This results in less invocations of the expensive RPCIT instruction when using map_sg. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 | s390/pci_dma: improve map_sg | Sebastian Ott | 1 | -24/+82
Our map_sg implementation mapped sg entries independently of each other. For ease of use and possible performance improvements this patch changes the implementation to try to map as many (likely physically non-contiguous) sglist entries as possible into a contiguous DMA segment. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 | s390/pci_dma: simplify dma address calculation | Sebastian Ott | 1 | -19/+18
Simplify the code we use to calculate dma addresses by putting everything related in a dma_alloc_address function. Also provide a dma_free_address counterpart. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-09-22 | s390/pci_dma: remove dma address range check | Sebastian Ott | 1 | -5/+0
We calculate dma addresses using an iommu bitmap. Since commit 69eea95c ("s390/pci_dma: fix DMA table corruption with > 4 TB main memory") we've made sure that addresses created using that bitmap are below the maximum reported by firmware. Thus the additional check for that address to be within range can be removed. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-04 | dma-mapping: use unsigned long for dma_attrs | Krzysztof Kozlowski | 1 | -11/+12
The dma-mapping core and the implementations do not change the DMA attributes passed by pointer. Thus the pointer can point to const data. However the attributes do not have to be a bitfield. Instead unsigned long will do fine: 1. This is just simpler. Both in terms of reading the code and setting attributes. Instead of initializing local attributes on the stack and passing pointer to it to dma_set_attr(), just set the bits. 2. It brings safeness and checking for const correctness because the attributes are passed by value. Semantic patches for this change (at least most of them): virtual patch virtual context @r@ identifier f, attrs; @@ f(..., - struct dma_attrs *attrs + unsigned long attrs , ...) { ... } @@ identifier r.f; @@ f(..., - NULL + 0 ) and // Options: --all-includes virtual patch virtual context @r@ identifier f, attrs; type t; @@ t f(..., struct dma_attrs *attrs); @@ identifier r.f; @@ f(..., - NULL + 0 ) Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> Acked-by: Mark Salter <msalter@redhat.com> [c6x] Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris] Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm] Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp] Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core] Acked-by: David Vrabel <david.vrabel@citrix.com> [xen] Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb] Acked-by: Joerg Roedel <jroedel@suse.de> [iommu] Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390] Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org> Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32] Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc] Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-06-13 | s390/pci: ensure to not cross a dma segment boundary | Sebastian Ott | 1 | -1/+2
When we use the iommu_area_alloc helper to get dma addresses we specify the boundary_size parameter but not the offset (called shift in this context). As long as the offset (start_dma) is a multiple of the boundary we're ok (on current machines start_dma always seems to be 4GB). Don't leave this to chance and specify the offset for iommu_area_alloc. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
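The reason the offset matters can be seen with a little arithmetic: an allocation that does not cross a boundary relative to 0 can still cross one once start_dma is added, so the allocator has to be told the offset too. A small self-contained check:

    #include <stdbool.h>
    #include <stdint.h>

    /* true if [addr, addr + size) straddles a boundary_size-aligned border
     * (boundary_size is a power of two) */
    static bool crosses_boundary(uint64_t addr, uint64_t size,
                                 uint64_t boundary_size)
    {
        return (addr / boundary_size) != ((addr + size - 1) / boundary_size);
    }

    int main(void)
    {
        uint64_t boundary = 1ULL << 32;         /* 4 GiB segment boundary */
        uint64_t start_dma = 0xC0000000ULL;     /* not a boundary multiple */
        uint64_t rel = 0x30000000ULL, size = 0x20000000ULL;

        /* fine relative to 0, but crosses 4 GiB once start_dma is added */
        return (!crosses_boundary(rel, size, boundary) &&
                crosses_boundary(start_dma + rel, size, boundary)) ? 0 : 1;
    }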
2016-06-13 | s390/pci: ensure page aligned dma start address | Sebastian Ott | 1 | -0/+1
We don't have an architectural guarantee on the value of the dma offset but rely on it to be at least page aligned. Enforce page alignment of start_dma. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-04-21 | s390/pci: fix use after free in dma_init | Sebastian Ott | 1 | -6/+10
After a failure during registration of the dma_table (because of the function being in error state) we free its memory but don't reset the associated pointer to zero. When we then receive a notification from firmware (about the function being in error state) we'll try to walk and free the dma_table again. Fix this by resetting the dma_table pointer. In addition to that make sure that we free the iommu_bitmap when appropriate. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-03-20 | Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost | Linus Torvalds | 1 | -2/+2
Pull virtio/vhost updates from Michael Tsirkin: "New features, performance improvements, cleanups: - basic polling support for vhost - rework virtio to optionally use DMA API, fixing it on Xen - balloon stats gained a new entry - using the new napi_alloc_skb speeds up virtio net - virtio blk stats can now be read while another VCPU is busy inflating or deflating the balloon plus misc cleanups in various places" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: virtio_net: replace netdev_alloc_skb_ip_align() with napi_alloc_skb() vhost_net: basic polling support vhost: introduce vhost_vq_avail_empty() vhost: introduce vhost_has_work() virtio_balloon: Allow to resize and update the balloon stats in parallel virtio_balloon: Use a workqueue instead of "vballoon" kthread virtio/s390: size of SET_IND payload virtio/s390: use dev_to_virtio vhost: rename vhost_init_used() vhost: rename cross-endian helpers virtio_blk: VIRTIO_BLK_F_WCE->VIRTIO_BLK_F_FLUSH vring: Use the DMA API on Xen virtio_pci: Use the DMA API if enabled virtio_mmio: Use the DMA API if enabled virtio: Add improved queue allocation API virtio_ring: Support DMA APIs vring: Introduce vring_use_dma_api() s390/dma: Allow per device dma ops alpha/dma: use common noop dma ops dma: Provide simple noop dma ops
2016-03-02 | s390/dma: Allow per device dma ops | Christian Borntraeger | 1 | -2/+2
As virtio-ccw will have dma ops, we can no longer default to the zPCI ones. Make use of dev_archdata to keep the dma_ops per device. The pci devices now use that to override the default, and the default is changed to use the noop ops for everything that does not specify a device specific one. To compile without PCI support we will enable HAS_DMA all the time, via the default config in lib/Kconfig. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Joerg Roedel <jroedel@suse.de> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23 | s390/pci: remove pdev pointer from arch data | Sebastian Ott | 1 | -9/+12
For each PCI function we need to maintain arch specific data in struct zpci_dev which also contains a pointer to struct pci_dev. When a function is registered or deregistered (which is triggered by PCI common code) we need to adjust that pointer which could interfere with the machine check handler (triggered by FW) using zpci_dev->pdev. Since multiple instances of the same pdev could exist at a time this can't be solved with locking. Fix that by ditching the pdev pointer and use a bus walk to reach struct pci_dev (only one instance of a pdev can be registered at the bus at a time). Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-01-14 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux | Linus Torvalds | 1 | -5/+14
Pull s390 updates from Martin Schwidefsky: "Among the traditional bug fixes and cleanups are some improvements: - A tool to generated the facility lists, generating the bit fields by hand has been a source of bugs in the past - The spinlock loop is reordered to avoid bursts of hypervisor calls - Add support for the open-for-business interface to the service element - The get_cpu call is added to the vdso - A set of tracepoints is defined for the common I/O layer - The deprecated sclp_cpi module is removed - Update default configuration" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (56 commits) s390/sclp: fix possible control register corruption s390: fix normalization bug in exception table sorting s390/configs: update default configurations s390/vdso: optimize getcpu system call s390: drop smp_mb in vdso_init s390: rename struct _lowcore to struct lowcore s390/mem_detect: use unsigned longs s390/ptrace: get rid of long longs in psw_bits s390/sysinfo: add missing SYSIB 1.2.2 multithreading fields s390: get rid of CONFIG_SCHED_MC and CONFIG_SCHED_BOOK s390/Kconfig: remove pointless 64 bit dependencies s390/dasd: fix failfast for disconnected devices s390/con3270: testing return kzalloc retval s390/hmcdrv: constify hmcdrv_ftp_ops structs s390/cio: add NULL test s390/cio: Change I/O instructions from inline to normal functions s390/cio: Introduce common I/O layer tracepoints s390/cio: Consolidate inline assemblies and related data definitions s390/cio: Fix incorrect xsch opcode specification s390/cio: Remove unused inline assemblies ...
2016-01-09 | [s390] page_to_phys() always returns a multiple of PAGE_SIZE | Al Viro | 1 | -2/+1
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2015-11-27 | s390/pci_dma: fix DMA table corruption with > 4 TB main memory | Gerald Schaefer | 1 | -5/+14
DMA addresses returned from map_page() are calculated by using an iommu bitmap plus a start_dma offset. The size of this bitmap is based on the main memory size. If we have more than (4 TB - start_dma) main memory, the DMA address calculation will also produce addresses > 4 TB. Such addresses cannot be inserted in the 3-level DMA page table, instead the entries modulo 4 TB will be overwritten. Fix this by restricting the iommu bitmap size to (4 TB - start_dma). Also set zdev->end_dma to the actual end address of the usable range, instead of the theoretical maximum as reported by the hardware, which fixes a sanity check in dma_map() and also the IOMMU API domain geometry aperture calculation. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
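The fix's arithmetic, roughly: a 3-level I/O translation table can address at most 4 TB of IOVA space above start_dma, so both the iommu bitmap size and the reported end_dma must be capped to that, whatever the hardware advertises. Field and constant names below are illustrative.

    #include <stdint.h>

    #define TABLE_LIMIT_4TB (1ULL << 42)    /* what a 3-level table can address */

    struct zdev_model { uint64_t start_dma, end_dma, iommu_size; };

    static void init_iommu_size(struct zdev_model *zdev, uint64_t mem_size)
    {
        uint64_t cap = TABLE_LIMIT_4TB - zdev->start_dma;

        zdev->iommu_size = mem_size < cap ? mem_size : cap;
        /* report the actually usable end, not the hardware maximum */
        zdev->end_dma = zdev->start_dma + zdev->iommu_size - 1;
    }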
2015-11-09 | s390/pci_dma: improve debugging of errors during dma map | Sebastian Ott | 1 | -11/+30
Improve debugging to find out what went wrong during a failed dma map/unmap operation. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-09 | s390/pci_dma: handle dma table failures | Sebastian Ott | 1 | -14/+24
We use lazy allocation for translation table entries but don't handle allocation (and other) failures during translation table updates. Handle these failures and undo translation table updates when it's meaningful. Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com> Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
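The failure handling amounts to the pattern sketched below: if updating entry k of a range fails (for example because a lazily allocated table could not be obtained), the k entries already made valid are flipped back before the error is returned. Types are simplified stand-ins.

    #include <errno.h>
    #include <stdint.h>

    /* stand-in for the per-entry translation table update */
    static int entry_update(uint64_t *table, unsigned long idx, int valid)
    {
        if (valid && !table)
            return -ENOMEM;             /* models a failed table allocation */
        if (table)
            table[idx] = valid ? ((uint64_t)idx << 12) | 1 : 0;
        return 0;
    }

    static int map_range(uint64_t *table, unsigned long first, unsigned long nr)
    {
        unsigned long i;
        int rc = 0;

        for (i = 0; i < nr; i++) {
            rc = entry_update(table, first + i, 1);
            if (rc)
                break;
        }
        if (rc)                         /* undo the partial update */
            while (i-- > 0)
                entry_update(table, first + i, 0);
        return rc;
    }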