path: root/arch/riscv/include/asm/tlbflush.h
Age | Commit message | Author | Files | Lines
2024-01-11 | riscv: Add support for BATCHED_UNMAP_TLB_FLUSH | Alexandre Ghiti | 1 | -0/+8
Allow deferring the TLB flush when unmapping pages, which reduces both
the number of IPIs and the number of sfence.vma instructions. The
microbenchmark used in commit 43b3dfdd0455 ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration"),
made multithreaded here to force the use of IPIs, shows a good
performance improvement on all platforms:

* Unmatched: ~34%
* TH1520   : ~78%
* Qemu     : ~81%

In addition, perf on qemu reports a large decrease in the time spent
dealing with IPIs:

Before: 68.17%  main  [kernel.kallsyms]  [k] __sbi_rfence_v02_call
After :  8.64%  main  [kernel.kallsyms]  [k] __sbi_rfence_v02_call

* Benchmark (the SIZE value below is assumed; the original snippet
  leaves it undefined):

#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

#define SIZE (1UL << 28)	/* assumed; not given in the original */

int stick_this_thread_to_core(int core_id)
{
	int num_cores = sysconf(_SC_NPROCESSORS_ONLN);

	if (core_id < 0 || core_id >= num_cores)
		return EINVAL;

	cpu_set_t cpuset;
	CPU_ZERO(&cpuset);
	CPU_SET(core_id, &cpuset);

	pthread_t current_thread = pthread_self();
	return pthread_setaffinity_np(current_thread,
				      sizeof(cpu_set_t), &cpuset);
}

static void *fn_thread(void *p_data)
{
	/* pin this thread to the core passed as the argument */
	stick_this_thread_to_core((int)(long)p_data);

	while (1)
		sleep(1);

	return NULL;
}

int main()
{
	volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
					 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	pthread_t threads[4];
	int ret;

	for (int i = 0; i < 4; ++i) {
		ret = pthread_create(&threads[i], NULL, fn_thread,
				     (void *)(long)i);
		if (ret)
			printf("%s", strerror(ret));
	}

	memset((void *)p, 0x88, SIZE);

	for (int k = 0; k < 10000; k++) {
		/* swap in */
		for (int i = 0; i < SIZE; i += 4096)
			(void)p[i];

		/* swap out */
		madvise((void *)p, SIZE, MADV_PAGEOUT);
	}

	for (int i = 0; i < 4; i++)
		pthread_cancel(threads[i]);

	for (int i = 0; i < 4; i++)
		pthread_join(threads[i], NULL);

	return 0;
}

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Jisheng Zhang <jszhang@kernel.org> # Tested on TH1520
Tested-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20240108193640.344929-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
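For context, BATCHED_UNMAP_TLB_FLUSH hinges on a small set of arch
hooks called from the generic rmap code. A minimal sketch of their
shape, assuming a cpumask-based batch structure (the riscv
implementation may differ in detail, and flush_tlb_cpumask_all() is a
hypothetical helper):

struct arch_tlbflush_unmap_batch {
	struct cpumask cpumask;	/* harts that may hold stale entries */
};

static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
{
	/* sketch: always defer; a real port may gate this on features */
	return true;
}

static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
					     struct mm_struct *mm,
					     unsigned long uaddr)
{
	/* accumulate every hart that has run this mm */
	cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
}

static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
{
	/* one fence/IPI round for the whole batch, instead of one per page */
	flush_tlb_cpumask_all(&batch->cpumask);
	cpumask_clear(&batch->cpumask);
}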
2023-11-06 | riscv: Improve flush_tlb_kernel_range() | Alexandre Ghiti | 1 | -5/+6
This function used to simply flush the whole TLB on all harts; be more
subtle and try to flush only the requested range. The problem is that
we can only use PAGE_SIZE as the stride, since we don't know the size
of the underlying mapping, so this is an improvement only when the
size of the region to flush is smaller than threshold * PAGE_SIZE.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231030133027.19542-5-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
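A minimal sketch of the reworked helper under the constraint described
above (the __flush_tlb_range() parameters are illustrative; a NULL
cpumask here stands for "all harts"):

void flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
	/* PAGE_SIZE is the only safe stride: the mapping size is unknown */
	__flush_tlb_range(NULL, start, end - start, PAGE_SIZE);
}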
2023-11-06 | riscv: Make __flush_tlb_range() loop over pte instead of flushing the whole tlb | Alexandre Ghiti | 1 | -0/+3
Currently, when the range to flush covers more than one page (a 4K
page or a hugepage), __flush_tlb_range() flushes the whole TLB.
Flushing the whole TLB comes at a greater cost than flushing a single
entry, so we should flush single entries up to a certain threshold,
chosen so that:

threshold * cost of flushing a single entry < cost of flushing the whole TLB.

Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20231030133027.19542-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
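A sketch of the resulting heuristic, with an assumed threshold (the
symbol name and the default value of 64 are illustrative):

static void local_flush_tlb_range_sketch(unsigned long start,
					 unsigned long size,
					 unsigned long stride)
{
	unsigned long tlb_flush_all_threshold = 64;	/* assumed tunable */

	if (size / stride > tlb_flush_all_threshold) {
		local_flush_tlb_all();			/* one sfence.vma */
		return;
	}

	for (unsigned long addr = start; addr < start + size; addr += stride)
		local_flush_tlb_page(addr);		/* sfence.vma addr */
}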
2023-11-06 | riscv: Improve tlb_flush() | Alexandre Ghiti | 1 | -0/+3
For now, tlb_flush() simply calls flush_tlb_mm(), which results in a
flush of the whole TLB. So let's use the mmu_gather fields to provide
a more fine-grained flush of the TLB.

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> # On RZ/Five SMARC
Link: https://lore.kernel.org/r/20231030133027.19542-2-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
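A sketch of what the finer-grained tlb_flush() can look like using the
mmu_gather fields; flush_tlb_mm_range() is the range-with-stride
helper, and its exact signature should be treated as illustrative:

static inline void tlb_flush(struct mmu_gather *tlb)
{
	if (tlb->fullmm || tlb->need_flush_all)
		flush_tlb_mm(tlb->mm);		/* whole address space */
	else
		flush_tlb_mm_range(tlb->mm, tlb->start, tlb->end,
				   tlb_get_unmap_size(tlb));
}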
2023-03-22 | riscv: mm: Fix incorrect ASID argument when flushing TLB | Dylan Jhong | 1 | -0/+2
Currently, we pass the CONTEXTID instead of the ASID to the TLB flush
function. We should take only the ASID field, to prevent touching the
reserved bit field.

Fixes: 3f1e782998cd ("riscv: add ASID-based tlbflushing methods")
Signed-off-by: Dylan Jhong <dylan@andestech.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Link: https://lore.kernel.org/r/20230313034906.2401730-1-dylan@andestech.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
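The shape of the fix, sketched: mask the context id down to its ASID
field before handing it to the flush path (the helper name is
illustrative):

static inline unsigned long get_mm_asid(struct mm_struct *mm)
{
	/* context.id packs a generation count above the ASID bits */
	return atomic_long_read(&mm->context.id) & asid_mask;
}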
2023-03-10 | Revert "riscv: mm: notify remote harts about mmu cache updates" | Sergey Matyukevich | 1 | -18/+0
This reverts the remaining bits of commit 4bd1d80efb5a ("riscv: mm:
notify remote harts about mmu cache updates"). According to bug
reports, the suggested approach to fix stale TLB entries is not
sufficient. It needs to be replaced by a more robust solution.

Fixes: 4bd1d80efb5a ("riscv: mm: notify remote harts about mmu cache updates")
Reported-by: Zong Li <zong.li@sifive.com>
Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: stable@vger.kernel.org
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20230226150137.1919750-2-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-12-09 | riscv: mm: notify remote harts about mmu cache updates | Sergey Matyukevich | 1 | -0/+18
The current implementation of the update_mmu_cache function performs a
local TLB flush. It does not take ASID information into account, nor
does it consider other harts currently running the same mm context, or
possible migration of the running context to other harts. Meanwhile, a
TLB flush is not performed on every context switch if ASID support is
enabled.

Patch [1] proposed adding ASID support to update_mmu_cache to avoid
flushing the local TLB entirely. This patch additionally takes into
account other harts currently running the same mm context, as well as
possible migration of this context to other harts. For this purpose
the approach from flush_icache_mm is reused: remote harts currently
running the same mm context are informed via SBI calls that they need
to flush their local TLBs, and all the other harts are marked as
needing a deferred TLB flush when this mm context next runs on them.

[1] https://lore.kernel.org/linux-riscv/20220821013926.8968-1-tjytimi@163.com/

Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/#t
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
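A sketch of the flush_icache_mm-style scheme described above, assuming
a per-mm stale mask as in the original patch (the field and helper
names are illustrative):

static void notify_harts_sketch(struct mm_struct *mm, unsigned long asid)
{
	cpumask_t *stale = &mm->context.tlb_stale_mask;
	cpumask_t others;

	preempt_disable();

	/* Harts not handled now must flush before next running this mm. */
	cpumask_setall(stale);
	cpumask_andnot(stale, stale, mm_cpumask(mm));

	/* Harts currently running this mm are fenced via SBI... */
	cpumask_andnot(&others, mm_cpumask(mm),
		       cpumask_of(smp_processor_id()));
	if (!cpumask_empty(&others))
		sbi_remote_sfence_vma_asid(&others, 0, -1UL, asid);

	/* ...and this hart flushes locally, scoped to the ASID. */
	local_flush_tlb_all_asid(asid);

	preempt_enable();
}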
2021-06-09 | riscv: fix build error when CONFIG_SMP is disabled | Bixuan Cui | 1 | -0/+5
Fix the build error seen when CONFIG_SMP is disabled:

mm/pgtable-generic.o: In function `.L19':
pgtable-generic.c:(.text+0x42): undefined reference to `flush_pmd_tlb_range'
mm/pgtable-generic.o: In function `pmdp_huge_clear_flush':
pgtable-generic.c:(.text+0x6c): undefined reference to `flush_pmd_tlb_range'
mm/pgtable-generic.o: In function `pmdp_invalidate':
pgtable-generic.c:(.text+0x162): undefined reference to `flush_pmd_tlb_range'

Fixes: e88b333142e4 ("riscv: mm: add THP support on 64-bit")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Acked-by: Nanyong Sun <sunnanyong@huawei.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
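The kind of !CONFIG_SMP fallback the fix provides, sketched (the exact
macro bodies are illustrative; the point is giving flush_pmd_tlb_range
a definition when the out-of-line SMP version isn't built):

#ifndef CONFIG_SMP
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define flush_pmd_tlb_range(vma, start, end) flush_tlb_range(vma, start, end)
#endif
#endif /* !CONFIG_SMP */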
2021-04-26 | riscv: sifive: Apply errata "cip-1200" patch | Vincent Chen | 1 | -1/+2
For certain SiFive CPUs, "sfence.vma addr" cannot reliably flush addr
from the TLB in particular cases. The details can be found here:
https://sifive.cdn.prismic.io/sifive/167a1a56-03f4-4615-a79e-b2a86153148f_FU740_errata_20210205.pdf
In order to ensure correct functionality, this patch uses the
Alternative scheme to replace all "sfence.vma addr" instances with
"sfence.vma" at runtime.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
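A sketch of how the Alternative scheme can express this: the
address-specific fence is patched into a full fence on affected parts
(the macro spelling follows the riscv ALTERNATIVE convention; treat
the details as illustrative):

static inline void local_flush_tlb_page(unsigned long addr)
{
	asm volatile(ALTERNATIVE(
		"sfence.vma %0",	/* default: flush one address */
		"sfence.vma",		/* cip-1200: flush everything */
		SIFIVE_VENDOR_ID, ERRATA_SIFIVE_CIP_1200,
		CONFIG_ERRATA_SIFIVE_CIP_1200)
		: : "r" (addr) : "memory");
}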
2019-11-18 | riscv: add nommu support | Christoph Hellwig | 1 | -3/+9
The kernel runs in M-mode without using page tables, and can thus run
bare metal without help from additional firmware. Most of the patch is
just stubbing out code not needed without page tables, but there is an
interesting detail in the signals implementation:

 - The normal RISC-V syscall ABI only implements rt_sigreturn as a
   VDSO entry point, but the ELF VDSO is not supported for nommu
   Linux. We instead copy the code that calls the syscall onto the
   stack.

In addition to enabling the nommu code, a new defconfig for a small
kernel image that can run in nommu mode on qemu is also provided. To
run a kernel in qemu you can use the following command line:

qemu-system-riscv64 -smp 2 -m 64 -machine virt -nographic \
	-kernel arch/riscv/boot/loader \
	-drive file=rootfs.ext2,format=raw,id=hd0 \
	-device virtio-blk-device,drive=hd0

Contains contributions from Damien Le Moal <Damien.LeMoal@wdc.com>.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Anup Patel <anup@brainfault.org>
[paul.walmsley@sifive.com: updated to apply; add CONFIG_MMU guards
around PCI_IOBASE definition to fix build issues; fixed checkpatch
issues; move the PCI_IO_* and VMEMMAP address space macros along with
the others; resolve sparse warning]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-10-14 | riscv: tlbflush: remove confusing comment on local_flush_tlb_all() | Paul Walmsley | 1 | -4/+0
Remove a confusing comment on our local_flush_tlb_all()
implementation. Per an internal discussion with Andrew, while it's
true that the fence.i is not necessary, it's not the case that an
sfence.vma implies a fence.i. We also drop the section about
"flush[ing] the entire local TLB" to better align with the language in
section 4.2.1 "Supervisor Memory-Management Fence Instruction" of the
RISC-V Privileged Specification v20190608.

Fixes: c901e45a999a1 ("RISC-V: `sfence.vma` orderes the instruction cache")
Reported-by: Alan Kao <alankao@andestech.com>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Andrew Waterman <andrew@sifive.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-09-05 | riscv: move the TLB flush logic out of line | Christoph Hellwig | 1 | -30/+7
The TLB flush logic is going to become more complex. Start moving it
out of line.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
[paul.walmsley@sifive.com: fixed checkpatch whitespace warnings]
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-09-05 | riscv: cleanup riscv_cpuid_to_hartid_mask | Christoph Hellwig | 1 | -1/+0
Move the initial clearing of the mask from the callers to
riscv_cpuid_to_hartid_mask, and remove the unused !CONFIG_SMP stub.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
2019-08-14 | riscv: fix flush_tlb_range() end address for flush_tlb_page() | Paul Walmsley | 1 | -2/+9
The RISC-V kernel implementation of flush_tlb_page() when CONFIG_SMP
is set is wrong: it passes zero to flush_tlb_range() as the final
address to flush, but it should be at least 'addr'. Some other Linux
architecture ports use the beginning address to flush, plus PAGE_SIZE,
as the final address to flush. This might flush slightly more than
what's needed, but it seems unlikely that being more clever would
improve anything, so let's just take that implementation for now.
While here, convert the macro into a static inline function, primarily
to avoid unintentional multiple evaluations of 'addr'. This second
version of the patch fixes a coding style issue found by Christoph
Hellwig <hch@lst.de>.

Reported-by: Andreas Schwab <schwab@suse.de>
Signed-off-by: Paul Walmsley <paul.walmsley@sifive.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
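A sketch of the resulting helper; the static inline evaluates 'addr'
exactly once, which the old macro could not guarantee:

static inline void flush_tlb_page(struct vm_area_struct *vma,
				  unsigned long addr)
{
	/* flush exactly one page: [addr, addr + PAGE_SIZE) */
	flush_tlb_range(vma, addr, addr + PAGE_SIZE);
}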
2019-06-05 | treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 286 | Thomas Gleixner | 1 | -9/+1
Based on 1 normalized pattern(s):

    this program is free software you can redistribute it and or modify
    it under the terms of the gnu general public license as published by
    the free software foundation version 2 this program is distributed
    in the hope that it will be useful but without any warranty without
    even the implied warranty of merchantability or fitness for a
    particular purpose see the gnu general public license for more
    details

extracted by the scancode license scanner the SPDX license identifier

    GPL-2.0-only

has been chosen to replace the boilerplate/reference in 97 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.025053186@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-10-23 | RISC-V: Use Linux logical CPU number instead of hartid | Atish Patra | 1 | -3/+13
Set up the cpu_logical_map during boot. Moreover, every SBI call and
PLIC context is based on the physical hartid. Use the logical CPU to
hartid mapping to pass the correct hartid to the respective functions.

Signed-off-by: Atish Patra <atish.patra@wdc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
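A sketch of the mapping, following the kernel's cpuid_to_hartid_map()
convention (treat the exact spellings as illustrative):

/* boot-time table: Linux logical CPU number -> physical hartid */
extern unsigned long __cpuid_to_hartid_map[NR_CPUS];
#define cpuid_to_hartid_map(cpu) __cpuid_to_hartid_map[(cpu)]

SBI calls and PLIC context lookups translate through this table rather
than assuming logical CPU numbers equal hartids.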
2018-06-07 | riscv: use NULL instead of a plain 0 | Luc Van Oostenryck | 1 | -1/+1
sbi_remote_sfence_vma() & sbi_remote_fence_i() take a pointer as their
first argument, but some macros call them with a plain 0, which, while
legal C, is frowned upon in the kernel. Change this by replacing the 0
with NULL.

Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2018-01-31 | RISC-V: Limit the scope of TLB shootdowns | Andrew Waterman | 1 | -8/+12
RISC-V systems perform TLB shootdowns via the SBI, which currently
performs an IPI to each of the remote harts, which then perform a
local TLB flush. This process is a bit on the slow side, but we can at
least speed it up for some common cases by restricting the set of
harts to shoot down to the set of harts currently participating in the
given mm context, as opposed to the entire system. This should provide
a measurable performance increase, but we haven't measured it.
Regardless, it seems like the obviously right thing to do here.

Signed-off-by: Andrew Waterman <andrew@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>
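The narrowing, sketched against the era's SBI interface that took a
raw hart mask (the macro bodies are illustrative):

/* before: fence every hart in the system */
#define flush_tlb_all() sbi_remote_sfence_vma(NULL, 0, -1)

/* after: fence only harts that have participated in this mm */
#define flush_tlb_mm(mm) \
	sbi_remote_sfence_vma(mm_cpumask(mm)->bits, 0, -1)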
2018-01-08 | riscv: remove CONFIG_MMU ifdefs | Christoph Hellwig | 1 | -4/+0
The RISC-V port doesn't support a nommu mode, so there is no reason to
provide some code only under a CONFIG_MMU ifdef.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-12-02 | RISC-V: User-Visible Changes | Palmer Dabbelt | 1 | -0/+2
This merge contains the user-visible, ABI-breaking changes that we
want to make sure we have in Linux before our first release.
Highlights include:

* VDSO entries for clock_get/gettimeofday/getcpu have been added.
  These are simple syscalls now, but we want to let glibc use them
  from the start so we can make them faster later.
* A VDSO entry for instruction cache flushing has been added so
  userspace can flush the instruction cache.
* The VDSO symbol versions for __vdso_cmpxchg{32,64} have been
  removed, as those VDSO entries don't actually exist.

Conflicts:
	arch/riscv/include/asm/tlbflush.h
2017-11-30 | RISC-V: Flush I$ when making a dirty page executable | Andrew Waterman | 1 | -0/+2
The RISC-V ISA allows for instruction caches that are not coherent
with respect to stores, even on a single hart. As a result, we need to
explicitly flush the instruction cache whenever marking a dirty page
as executable in order to preserve correct system behavior. Local
instruction caches aren't that scary (our implementations actually
flush the cache, but RISC-V is defined to allow higher-performance
implementations to exist), but RISC-V defines no way to perform an
instruction cache shootdown. When explicitly asked to do so we can
shoot down remote instruction caches via an IPI, but this is a bit on
the slow side. Instead of requiring an IPI to all harts whenever
marking a page as executable, we simply flush the currently running
harts. In order to maintain correct behavior, we additionally mark
every other hart as needing a deferred instruction cache flush, which
will be performed before anything runs on it.

Signed-off-by: Andrew Waterman <andrew@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
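A sketch of that deferred bookkeeping, assuming a per-mm stale mask
like the one flush_icache_mm keeps in later kernels (the names are
illustrative):

static void flush_icache_mm_sketch(struct mm_struct *mm)
{
	cpumask_t *stale = &mm->context.icache_stale_mask;

	preempt_disable();

	/* every other hart owes a fence.i before next running this mm */
	cpumask_setall(stale);

	/* this hart pays up immediately */
	cpumask_clear_cpu(smp_processor_id(), stale);
	local_flush_icache_all();

	preempt_enable();
}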
2017-11-29 | RISC-V: `sfence.vma` orderes the instruction cache | Palmer Dabbelt | 1 | -1/+4
This is just a comment change, but it's one that bit me on the mailing
list. It turns out that issuing an `sfence.vma` enforces instruction
cache ordering in addition to TLB ordering. This isn't explicitly
called out in the ISA manual, but Andrew will be making that more
clear in a future revision.

CC: Andrew Waterman <andrew@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
2017-09-27 | RISC-V: Atomic and Locking Code | Palmer Dabbelt | 1 | -0/+64
This contains all the code that directly interfaces with the RISC-V
memory model. While this code conforms to the current RISC-V ISA
specifications (user 2.2 and priv 1.10), the memory model is somewhat
underspecified in those documents. There is a working group that hopes
to produce a formal memory model by the end of the year, but my
understanding is that the basic definitions we're relying on here
won't change significantly.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Palmer Dabbelt <palmer@dabbelt.com>