summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-09-19perf tools: Handle old data in PERF_RECORD_ATTRNamhyung Kim1-5/+6
commit 9bf63282ea77a531ea58acb42fb3f40d2d1e4497 upstream. The PERF_RECORD_ATTR is used for a pipe mode to describe an event with attribute and IDs. The ID table comes after the attr and it calculate size of the table using the total record size and the attr size. n_ids = (total_record_size - end_of_the_attr_field) / sizeof(u64) This is fine for most use cases, but sometimes it saves the pipe output in a file and then process it later. And it becomes a problem if there is a change in attr size between the record and report. $ perf record -o- > perf-pipe.data # old version $ perf report -i- < perf-pipe.data # new version For example, if the attr size is 128 and it has 4 IDs, then it would save them in 168 byte like below: 8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 }, 128 byte: perf event attr { .size = 128, ... }, 32 byte: event IDs [] = { 1234, 1235, 1236, 1237 }, But when report later, it thinks the attr size is 136 then it only read the last 3 entries as ID. 8 byte: perf event header { .type = PERF_RECORD_ATTR, .size = 168 }, 136 byte: perf event attr { .size = 136, ... }, 24 byte: event IDs [] = { 1235, 1236, 1237 }, // 1234 is missing So it should use the recorded version of the attr. The attr has the size field already then it should honor the size when reading data. Fixes: 2c46dbb517a10b18 ("perf: Convert perf header attrs into attr events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tom Zanussi <zanussi@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230825152552.112913-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19perf test shell stat_bpf_counters: Fix test on IntelNamhyung Kim1-2/+2
commit 68ca249c964f520af7f8763e22f12bd26b57b870 upstream. As of now, bpf counters (bperf) don't support event groups. But the default perf stat includes topdown metrics if supported (on recent Intel machines) which require groups. That makes perf stat exiting. $ sudo perf stat --bpf-counter true bpf managed perf events do not yet support groups. Actually the test explicitly uses cycles event only, but it missed to pass the option when it checks the availability of the command. Fixes: 2c0cb9f56020d2ea ("perf test: Add a shell test for 'perf stat --bpf-counters' new option") Reviewed-by: Song Liu <song@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: bpf@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19perf hists browser: Fix hierarchy mode headerNamhyung Kim1-1/+1
commit e2cabf2a44791f01c21f8d5189b946926e34142e upstream. The commit ef9ff6017e3c4593 ("perf ui browser: Move the extra title lines from the hists browser") introduced ui_browser__gotorc_title() to help moving non-title lines easily. But it missed to update the title for the hierarchy mode so it won't print the header line on TUI at all. $ perf report --hierarchy Fixes: ef9ff6017e3c4593 ("perf ui browser: Move the extra title lines from the hists browser") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230731094934.1616495-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19MIPS: Fix CONFIG_CPU_DADDI_WORKAROUNDS `modules_install' regressionMaciej W. Rozycki1-2/+2
commit a79a404e6c2241ebc528b9ebf4c0832457b498c3 upstream. Remove a build-time check for the presence of the GCC `-msym32' option. This option has been there since GCC 4.1.0, which is below the minimum required as at commit 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed"), when an error message: arch/mips/Makefile:306: *** CONFIG_CPU_DADDI_WORKAROUNDS unsupported without -msym32. Stop. started to trigger for the `modules_install' target with configurations such as `decstation_64_defconfig' that set CONFIG_CPU_DADDI_WORKAROUNDS, because said commit has made `cc-option-yn' an undefined function for non-build targets. Reported-by: Jan-Benedict Glaw <jbglaw@lug-owl.de> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Fixes: 805b2e1d427a ("kbuild: include Makefile.compiler only when compiler is needed") Cc: stable@vger.kernel.org # v5.13+ Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: SVM: Skip VMSA init in sev_es_init_vmcb() if pointer is NULLSean Christopherson1-2/+5
commit 1952e74da96fb3e48b72a2d0ece78c688a5848c1 upstream. Skip initializing the VMSA physical address in the VMCB if the VMSA is NULL, which occurs during intrahost migration as KVM initializes the VMCB before copying over state from the source to the destination (including the VMSA and its physical address). In normal builds, __pa() is just math, so the bug isn't fatal, but with CONFIG_DEBUG_VIRTUAL=y, the validity of the virtual address is verified and passing in NULL will make the kernel unhappy. Fixes: 6defa24d3b12 ("KVM: SEV: Init target VMCBs in sev_migrate_from") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Reviewed-by: Peter Gonda <pgonda@google.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20230825022357.2852133-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: SVM: Set target pCPU during IRTE update if target vCPU is runningSean Christopherson1-0/+28
commit f3cebc75e7425d6949d726bb8e937095b0aef025 upstream. Update the target pCPU for IOMMU doorbells when updating IRTE routing if KVM is actively running the associated vCPU. KVM currently only updates the pCPU when loading the vCPU (via avic_vcpu_load()), and so doorbell events will be delayed until the vCPU goes through a put+load cycle (which might very well "never" happen for the lifetime of the VM). To avoid inserting a stale pCPU, e.g. due to racing between updating IRTE routing and vCPU load/put, get the pCPU information from the vCPU's Physical APIC ID table entry (a.k.a. avic_physical_id_cache in KVM) and update the IRTE while holding ir_list_lock. Add comments with --verbose enabled to explain exactly what is and isn't protected by ir_list_lock. Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt") Reported-by: dengqiao.joey <dengqiao.joey@bytedance.com> Cc: stable@vger.kernel.org Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20230808233132.2499764-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: nSVM: Load L1's TSC multiplier based on L1 state, not L2 stateSean Christopherson1-2/+2
commit 0c94e2468491cbf0754f49a5136ab51294a96b69 upstream. When emulating nested VM-Exit, load L1's TSC multiplier if L1's desired ratio doesn't match the current ratio, not if the ratio L1 is using for L2 diverges from the default. Functionally, the end result is the same as KVM will run L2 with L1's multiplier if L2's multiplier is the default, i.e. checking that L1's multiplier is loaded is equivalent to checking if L2 has a non-default multiplier. However, the assertion that TSC scaling is exposed to L1 is flawed, as userspace can trigger the WARN at will by writing the MSR and then updating guest CPUID to hide the feature (modifying guest CPUID is allowed anytime before KVM_RUN). E.g. hacking KVM's state_test selftest to do vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0); vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR); after restoring state in a new VM+vCPU yields an endless supply of: ------------[ cut here ]------------ WARNING: CPU: 10 PID: 206939 at arch/x86/kvm/svm/nested.c:1105 nested_svm_vmexit+0x6af/0x720 [kvm_amd] Call Trace: nested_svm_exit_handled+0x102/0x1f0 [kvm_amd] svm_handle_exit+0xb9/0x180 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm] kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm] ? trace_hardirqs_off+0x4d/0xa0 __se_sys_ioctl+0x7a/0xc0 __x64_sys_ioctl+0x21/0x30 do_syscall_64+0x41/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd Unlike the nested VMRUN path, hoisting the svm->tsc_scaling_enabled check into the if-statement is wrong as KVM needs to ensure L1's multiplier is loaded in the above scenario. Alternatively, the WARN_ON() could simply be deleted, but that would make KVM's behavior even more subtle, e.g. it's not immediately obvious why it's safe to write MSR_AMD64_TSC_RATIO when checking only tsc_ratio_msr. Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling") Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230729011608.1065019-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: nSVM: Check instead of asserting on nested TSC scaling supportSean Christopherson1-3/+2
commit 7cafe9b8e22bb3d77f130c461aedf6868c4aaf58 upstream. Check for nested TSC scaling support on nested SVM VMRUN instead of asserting that TSC scaling is exposed to L1 if L1's MSR_AMD64_TSC_RATIO has diverged from KVM's default. Userspace can trigger the WARN at will by writing the MSR and then updating guest CPUID to hide the feature (modifying guest CPUID is allowed anytime before KVM_RUN). E.g. hacking KVM's state_test selftest to do vcpu_set_msr(vcpu, MSR_AMD64_TSC_RATIO, 0); vcpu_clear_cpuid_feature(vcpu, X86_FEATURE_TSCRATEMSR); after restoring state in a new VM+vCPU yields an endless supply of: ------------[ cut here ]------------ WARNING: CPU: 164 PID: 62565 at arch/x86/kvm/svm/nested.c:699 nested_vmcb02_prepare_control+0x3d6/0x3f0 [kvm_amd] Call Trace: <TASK> enter_svm_guest_mode+0x114/0x560 [kvm_amd] nested_svm_vmrun+0x260/0x330 [kvm_amd] vmrun_interception+0x29/0x30 [kvm_amd] svm_invoke_exit_handler+0x35/0x100 [kvm_amd] svm_handle_exit+0xe7/0x180 [kvm_amd] kvm_arch_vcpu_ioctl_run+0x1eab/0x2570 [kvm] kvm_vcpu_ioctl+0x4c9/0x5b0 [kvm] __se_sys_ioctl+0x7a/0xc0 __x64_sys_ioctl+0x21/0x30 do_syscall_64+0x41/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x45ca1b Note, the nested #VMEXIT path has the same flaw, but needs a different fix and will be handled separately. Fixes: 5228eb96a487 ("KVM: x86: nSVM: implement nested TSC scaling") Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230729011608.1065019-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: SVM: Get source vCPUs from source VM for SEV-ES intrahost migrationSean Christopherson1-1/+1
commit f1187ef24eb8f36e8ad8106d22615ceddeea6097 upstream. Fix a goof where KVM tries to grab source vCPUs from the destination VM when doing intrahost migration. Grabbing the wrong vCPU not only hoses the guest, it also crashes the host due to the VMSA pointer being left NULL. BUG: unable to handle page fault for address: ffffe38687000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 39 PID: 17143 Comm: sev_migrate_tes Tainted: GO 6.5.0-smp--fff2e47e6c3b-next #151 Hardware name: Google, Inc. Arcadia_IT_80/Arcadia_IT_80, BIOS 34.28.0 07/10/2023 RIP: 0010:__free_pages+0x15/0xd0 RSP: 0018:ffff923fcf6e3c78 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffe38687000000 RCX: 0000000000000100 RDX: 0000000000000100 RSI: 0000000000000000 RDI: ffffe38687000000 RBP: ffff923fcf6e3c88 R08: ffff923fcafb0000 R09: 0000000000000000 R10: 0000000000000000 R11: ffffffff83619b90 R12: ffff923fa9540000 R13: 0000000000080007 R14: ffff923f6d35d000 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff929d0d7c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffe38687000000 CR3: 0000005224c34005 CR4: 0000000000770ee0 PKRU: 55555554 Call Trace: <TASK> sev_free_vcpu+0xcb/0x110 [kvm_amd] svm_vcpu_free+0x75/0xf0 [kvm_amd] kvm_arch_vcpu_destroy+0x36/0x140 [kvm] kvm_destroy_vcpus+0x67/0x100 [kvm] kvm_arch_destroy_vm+0x161/0x1d0 [kvm] kvm_put_kvm+0x276/0x560 [kvm] kvm_vm_release+0x25/0x30 [kvm] __fput+0x106/0x280 ____fput+0x12/0x20 task_work_run+0x86/0xb0 do_exit+0x2e3/0x9c0 do_group_exit+0xb1/0xc0 __x64_sys_exit_group+0x1b/0x20 do_syscall_64+0x41/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd </TASK> CR2: ffffe38687000000 Fixes: 6defa24d3b12 ("KVM: SEV: Init target VMCBs in sev_migrate_from") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Reviewed-by: Peter Gonda <pgonda@google.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/r/20230825022357.2852133-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: SVM: Don't inject #UD if KVM attempts to skip SEV guest insnSean Christopherson1-8/+27
commit cb49631ad111570f1bad37702c11c2ae07fa2e3c upstream. Don't inject a #UD if KVM attempts to "emulate" to skip an instruction for an SEV guest, and instead resume the guest and hope that it can make forward progress. When commit 04c40f344def ("KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn buffer") added the completely arbitrary #UD behavior, there were no known scenarios where a well-behaved guest would induce a VM-Exit that triggered emulation, i.e. it was thought that injecting #UD would be helpful. However, now that KVM (correctly) attempts to re-inject INT3/INTO, e.g. if a #NPF is encountered when attempting to deliver the INT3/INTO, an SEV guest can trigger emulation without a buffer, through no fault of its own. Resuming the guest and retrying the INT3/INTO is architecturally wrong, e.g. the vCPU will incorrectly re-hit code #DBs, but for SEV guests there is literally no other option that has a chance of making forward progress. Drop the #UD injection for all "skip" emulation, not just those related to INT3/INTO, even though that means that the guest will likely end up in an infinite loop instead of getting a #UD (the vCPU may also crash, e.g. if KVM emulated everything about an instruction except for advancing RIP). There's no evidence that suggests that an unexpected #UD is actually better than hanging the vCPU, e.g. a soft-hung vCPU can still respond to IRQs and NMIs to generate a backtrace. Reported-by: Wu Zongyo <wuzongyo@mail.ustc.edu.cn> Closes: https://lore.kernel.org/all/8eb933fd-2cf3-d7a9-32fe-2a1d82eac42a@mail.ustc.edu.cn Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction") Cc: stable@vger.kernel.org Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/r/20230825013621.2845700-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19KVM: SVM: Take and hold ir_list_lock when updating vCPU's Physical ID entrySean Christopherson1-8/+23
commit 4c08e737f056fec930b416a2bd37ed266d724f95 upstream. Hoist the acquisition of ir_list_lock from avic_update_iommu_vcpu_affinity() to its two callers, avic_vcpu_load() and avic_vcpu_put(), specifically to encapsulate the write to the vCPU's entry in the AVIC Physical ID table. This will allow a future fix to pull information from the Physical ID entry when updating the IRTE, without potentially consuming stale information, i.e. without racing with the vCPU being (un)loaded. Add a comment to call out that ir_list_lock does NOT protect against multiple writers, specifically that reading the Physical ID entry in avic_vcpu_put() outside of the lock is safe. To preserve some semblance of independence from ir_list_lock, keep the READ_ONCE() in avic_vcpu_load() even though acuiring the spinlock effectively ensures the load(s) will be generated after acquiring the lock. Cc: stable@vger.kernel.org Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Link: https://lore.kernel.org/r/20230808233132.2499764-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19drm/amd/display: prevent potential division by zero errorsHamza Mahfooz1-3/+6
commit 07e388aab042774f284a2ad75a70a194517cdad4 upstream. There are two places in apply_below_the_range() where it's possible for a divide by zero error to occur. So, to fix this make sure the divisor is non-zero before attempting the computation in both cases. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2637 Fixes: a463b263032f ("drm/amd/display: Fix frames_to_insert math") Fixes: ded6119e825a ("drm/amd/display: Reinstate LFC optimization") Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gammaMelissa Wen1-0/+7
commit 57a943ebfcdb4a97fbb409640234bdb44bfa1953 upstream. For DRM legacy gamma, AMD display manager applies implicit sRGB degamma using a pre-defined sRGB transfer function. It works fine for DCN2 family where degamma ROM and custom curves go to the same color block. But, on DCN3+, degamma is split into two blocks: degamma ROM for pre-defined TFs and `gamma correction` for user/custom curves and degamma ROM settings doesn't apply to cursor plane. To get DRM legacy gamma working as expected, enable cursor degamma ROM for implict sRGB degamma on HW with this configuration. Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803 Fixes: 96b020e2163f ("drm/amd/display: check attr flag before set cursor degamma on DCN3+") Signed-off-by: Melissa Wen <mwen@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controllerWilliam Zhang1-33/+41
commit 2ec2839a9062db8a592525a3fdabd42dcd9a3a9b upstream. v7.2 controller has different ECC level field size and shift in the acc control register than its predecessor and successor controller. It needs to be set specifically. Fixes: decba6d47869 ("mtd: brcmnand: Add v7.2 controller support") Signed-off-by: William Zhang <william.zhang@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-2-william.zhang@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19mtd: rawnand: brcmnand: Fix potential false time out warningWilliam Zhang1-0/+8
commit 9cc0a598b944816f2968baf2631757f22721b996 upstream. If system is busy during the command status polling function, the driver may not get the chance to poll the status register till the end of time out and return the premature status. Do a final check after time out happens to ensure reading the correct status. Fixes: 9d2ee0a60b8b ("mtd: nand: brcmnand: Check flash #WP pin status before nand erase/program") Signed-off-by: William Zhang <william.zhang@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-3-william.zhang@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19mtd: spi-nor: Correct flags for Winbond w25q128Linus Walleij1-2/+3
commit 83e824a4a595132f9bd7ac4f5afff857bfc5991e upstream. The Winbond "w25q128" (actual vendor name W25Q128JV) has exactly the same flags as the sibling device "w25q128jv". The devices both require unlocking to enable write access. The actual product naming between devices vs the Linux strings in winbond.c: 0xef4018: "w25q128" W25Q128JV-IN/IQ/JQ 0xef7018: "w25q128jv" W25Q128JV-IM/JM The latter device, "w25q128jv" supports features named DTQ and QPI, otherwise it is the same. Not having the right flags has the annoying side effect that write access does not work. After this patch I can write to the flash on the Inteno XG6846 router. The flash memory also supports dual and quad SPI modes. This does not currently manifest, but by turning on SFDP parsing, the right SPI modes are emitted in /sys/kernel/debug/spi-nor/spi1.0/capabilities for this chip, so we also turn on this. Since we now have determined that SFDP parsing works on the device, we also detect the geometry using SFDP. After this dmesg and sysfs says: [ 1.062401] spi-nor spi1.0: w25q128 (16384 Kbytes) cat erasesize 65536 (16384*1024)/65536 = 256 sectors spi-nor sysfs: cat jedec_id ef4018 cat manufacturer winbond cat partname w25q128 hexdump -v -C sfdp 00000000 53 46 44 50 05 01 00 ff 00 05 01 10 80 00 00 ff 00000010 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000020 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000030 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000040 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000050 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000060 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff 00000080 e5 20 f9 ff ff ff ff 07 44 eb 08 6b 08 3b 42 bb 00000090 fe ff ff ff ff ff 00 00 ff ff 40 eb 0c 20 0f 52 000000a0 10 d8 00 00 36 02 a6 00 82 ea 14 c9 e9 63 76 33 000000b0 7a 75 7a 75 f7 a2 d5 5c 19 f7 4d ff e9 30 f8 80 Cc: stable@vger.kernel.org Suggested-by: Michael Walle <michael@walle.cc> Reviewed-by: Michael Walle <michael@walle.cc> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20230718-spi-nor-winbond-w25q128-v5-1-a73653ee46c3@linaro.org Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19mtd: rawnand: brcmnand: Fix potential out-of-bounds access in oob writeWilliam Zhang1-2/+16
commit 5d53244186c9ac58cb88d76a0958ca55b83a15cd upstream. When the oob buffer length is not in multiple of words, the oob write function does out-of-bounds read on the oob source buffer at the last iteration. Fix that by always checking length limit on the oob buffer read and fill with 0xff when reaching the end of the buffer to the oob registers. Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Signed-off-by: William Zhang <william.zhang@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-5-william.zhang@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19mtd: rawnand: brcmnand: Fix crash during the panic_writeWilliam Zhang1-1/+11
commit e66dd317194daae0475fe9e5577c80aa97f16cb9 upstream. When executing a NAND command within the panic write path, wait for any pending command instead of calling BUG_ON to avoid crashing while already crashing. Fixes: 27c5b17cd1b1 ("mtd: nand: add NAND driver "library" for Broadcom STB NAND controller") Signed-off-by: William Zhang <william.zhang@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Kursad Oney <kursad.oney@broadcom.com> Reviewed-by: Kamal Dasu <kamal.dasu@broadcom.com> Cc: stable@vger.kernel.org Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-4-william.zhang@broadcom.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19drm/mxsfb: Disable overlay plane in mxsfb_plane_overlay_atomic_disable()Liu Ying1-0/+9
commit aa656d48e871a1b062e1bbf9474d8b831c35074c upstream. When disabling overlay plane in mxsfb_plane_overlay_atomic_update(), overlay plane's framebuffer pointer is NULL. So, dereferencing it would cause a kernel Oops(NULL pointer dereferencing). Fix the issue by disabling overlay plane in mxsfb_plane_overlay_atomic_disable() instead. Fixes: cb285a5348e7 ("drm: mxsfb: Replace mxsfb_get_fb_paddr() with drm_fb_cma_get_gem_addr()") Cc: stable@vger.kernel.org # 5.19+ Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Marek Vasut <marex@denx.de> Signed-off-by: Marek Vasut <marex@denx.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230612092359.784115-1-victor.liu@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: use the correct superblock to compare fsid in btrfs_validate_superAnand Jain1-3/+2
commit d167aa76dc0683828588c25767da07fb549e4f48 upstream. The function btrfs_validate_super() should verify the fsid in the provided superblock argument. Because, all its callers expect it to do that. Such as in the following stack: write_all_supers() sb = fs_info->super_for_commit; btrfs_validate_write_super(.., sb) btrfs_validate_super(.., sb, ..) scrub_one_super() btrfs_validate_super(.., sb, ..) And check_dev_super() btrfs_validate_super(.., sb, ..) However, it currently verifies the fs_info::super_copy::fsid instead, which is not correct. Fix this using the correct fsid in the superblock argument. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: zoned: re-enable metadata over-commit for zoned modeNaohiro Aota1-5/+1
commit 5b135b382a360f4c87cf8896d1465b0b07f10cb0 upstream. Now that, we can re-enable metadata over-commit. As we moved the activation from the reservation time to the write time, we no longer need to ensure all the reserved bytes is properly activated. Without the metadata over-commit, it suffers from lower performance because it needs to flush the delalloc items more often and allocate more block groups. Re-enabling metadata over-commit will solve the issue. Fixes: 79417d040f4f ("btrfs: zoned: disable metadata overcommit for zoned") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: set page extent mapped after read_folio in relocate_one_pageJosef Bacik1-3/+9
commit e7f1326cc24e22b38afc3acd328480a1183f9e79 upstream. One of the CI runs triggered the following panic assertion failed: PagePrivate(page) && page->private, in fs/btrfs/subpage.c:229 ------------[ cut here ]------------ kernel BUG at fs/btrfs/subpage.c:229! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP CPU: 0 PID: 923660 Comm: btrfs Not tainted 6.5.0-rc3+ #1 pstate: 61400005 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : btrfs_subpage_assert+0xbc/0xf0 lr : btrfs_subpage_assert+0xbc/0xf0 sp : ffff800093213720 x29: ffff800093213720 x28: ffff8000932138b4 x27: 000000000c280000 x26: 00000001b5d00000 x25: 000000000c281000 x24: 000000000c281fff x23: 0000000000001000 x22: 0000000000000000 x21: ffffff42b95bf880 x20: ffff42b9528e0000 x19: 0000000000001000 x18: ffffffffffffffff x17: 667274622f736620 x16: 6e69202c65746176 x15: 0000000000000028 x14: 0000000000000003 x13: 00000000002672d7 x12: 0000000000000000 x11: ffffcd3f0ccd9204 x10: ffffcd3f0554ae50 x9 : ffffcd3f0379528c x8 : ffff800093213428 x7 : 0000000000000000 x6 : ffffcd3f091771e8 x5 : ffff42b97f333948 x4 : 0000000000000000 x3 : 0000000000000000 x2 : 0000000000000000 x1 : ffff42b9556cde80 x0 : 000000000000004f Call trace: btrfs_subpage_assert+0xbc/0xf0 btrfs_subpage_set_dirty+0x38/0xa0 btrfs_page_set_dirty+0x58/0x88 relocate_one_page+0x204/0x5f0 relocate_file_extent_cluster+0x11c/0x180 relocate_data_extent+0xd0/0xf8 relocate_block_group+0x3d0/0x4e8 btrfs_relocate_block_group+0x2d8/0x490 btrfs_relocate_chunk+0x54/0x1a8 btrfs_balance+0x7f4/0x1150 btrfs_ioctl+0x10f0/0x20b8 __arm64_sys_ioctl+0x120/0x11d8 invoke_syscall.constprop.0+0x80/0xd8 do_el0_svc+0x6c/0x158 el0_svc+0x50/0x1b0 el0t_64_sync_handler+0x120/0x130 el0t_64_sync+0x194/0x198 Code: 91098021 b0007fa0 91346000 97e9c6d2 (d4210000) This is the same problem outlined in 17b17fcd6d44 ("btrfs: set_page_extent_mapped after read_folio in btrfs_cont_expand") , and the fix is the same. I originally looked for the same pattern elsewhere in our code, but mistakenly skipped over this code because I saw the page cache readahead before we set_page_extent_mapped, not realizing that this was only in the !page case, that we can still end up with a !uptodate page and then do the btrfs_read_folio further down. The fix here is the same as the above mentioned patch, move the set_page_extent_mapped call to after the btrfs_read_folio() block to make sure that we have the subpage blocksize stuff setup properly before using the page. CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: don't start transaction when joining with TRANS_JOIN_NOSTARTFilipe Manana1-3/+4
commit 4490e803e1fe9fab8db5025e44e23b55df54078b upstream. When joining a transaction with TRANS_JOIN_NOSTART, if we don't find a running transaction we end up creating one. This goes against the purpose of TRANS_JOIN_NOSTART which is to join a running transaction if its state is at or below the state TRANS_STATE_COMMIT_START, otherwise return an -ENOENT error and don't start a new transaction. So fix this to not create a new transaction if there's no running transaction at or below that state. CC: stable@vger.kernel.org # 4.14+ Fixes: a6d155d2e363 ("Btrfs: fix deadlock between fiemap and transaction commits") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: free qgroup rsv on io failureBoris Burkov1-0/+7
commit e28b02118b94e42be3355458a2406c6861e2dd32 upstream. If we do a write whose bio suffers an error, we will never reclaim the qgroup reserved space for it. We allocate the space in the write_iter codepath, then release the reservation as we allocate the ordered extent, but we only create a delayed ref if the ordered extent finishes. If it has an error, we simply leak the rsv. This is apparent in running any error injecting (dmerror) fstests like btrfs/146 or btrfs/160. Such tests fail due to dmesg on umount complaining about the leaked qgroup data space. When we clean up other aspects of space on failed ordered_extents, also free the qgroup rsv. Reviewed-by: Josef Bacik <josef@toxicpanda.com> CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: fix start transaction qgroup rsv double freeBoris Burkov1-3/+16
commit a6496849671a5bc9218ecec25a983253b34351b1 upstream. btrfs_start_transaction reserves metadata space of the PERTRANS type before it identifies a transaction to start/join. This allows flushing when reserving that space without a deadlock. However, it results in a race which temporarily breaks qgroup rsv accounting. T1 T2 start_transaction do_stuff start_transaction qgroup_reserve_meta_pertrans commit_transaction qgroup_free_meta_all_pertrans hit an error starting txn goto reserve_fail qgroup_free_meta_pertrans (already freed!) The basic issue is that there is nothing preventing another commit from committing before start_transaction finishes (in fact sometimes we intentionally wait for it) so any error path that frees the reserve is at risk of this race. While this exact space was getting freed anyway, and it's not a huge deal to double free it (just a warning, the free code catches this), it can result in incorrectly freeing some other pertrans reservation in this same reservation, which could then lead to spuriously granting reservations we might not have the space for. Therefore, I do believe it is worth fixing. To fix it, use the existing prealloc->pertrans conversion mechanism. When we first reserve the space, we reserve prealloc space and only when we are sure we have a transaction do we convert it to pertrans. This way any racing commits do not blow away our reservation, but we still get a pertrans reservation that is freed when _this_ transaction gets committed. This issue can be reproduced by running generic/269 with either qgroups or squotas enabled via mkfs on the scratch device. Reviewed-by: Josef Bacik <josef@toxicpanda.com> CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19btrfs: zoned: do not zone finish data relocation block groupNaohiro Aota2-23/+36
commit 332581bde2a419d5f12a93a1cdc2856af649a3cc upstream. When multiple writes happen at once, we may need to sacrifice a currently active block group to be zone finished for a new allocation. We choose a block group with the least free space left, and zone finish it. To do the finishing, we need to send IOs for already allocated region and wait for them and on-going IOs. Otherwise, these IOs fail because the zone is already finished at the time the IO reach a device. However, if a block group dedicated to the data relocation is zone finished, there is a chance that finishing it before an ongoing write IO reaches the device. That is because there is timing gap between an allocation is done (block_group->reservations == 0, as pre-allocation is done) and an ordered extent is created when the relocation IO starts. Thus, if we finish the zone between them, we can fail the IOs. We cannot simply use "fs_info->data_reloc_bg == block_group->start" to avoid the zone finishing. Because, the data_reloc_bg may already switch to a new block group, while there are still ongoing write IOs to the old data_reloc_bg. So, this patch reworks the BLOCK_GROUP_FLAG_ZONED_DATA_RELOC bit to indicate there is a data relocation allocation and/or ongoing write to the block group. The bit is set on allocation and cleared in end_io function of the last IO for the currently allocated region. To change the timing of the bit setting also solves the issue that the bit being left even after there is no IO going on. With the current code, if the data_reloc_bg switches after the last IO to the current data_reloc_bg, the bit is set at this timing and there is no one clearing that bit. As a result, that block group is kept unallocatable for anything. Fixes: 343d8a30851c ("btrfs: zoned: prevent allocation from previous data relocation BG") Fixes: 74e91b12b115 ("btrfs: zoned: zone finish unused block group") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19fuse: nlookup missing decrement in fuse_direntplus_linkruanmeisi1-1/+9
commit b8bd342d50cbf606666488488f9fea374aceb2d5 upstream. During our debugging of glusterfs, we found an Assertion failed error: inode_lookup >= nlookup, which was caused by the nlookup value in the kernel being greater than that in the FUSE file system. The issue was introduced by fuse_direntplus_link, where in the function, fuse_iget increments nlookup, and if d_splice_alias returns failure, fuse_direntplus_link returns failure without decrementing nlookup https://github.com/gluster/glusterfs/pull/4081 Signed-off-by: ruanmeisi <ruan.meisi@zte.com.cn> Fixes: 0b05b18381ee ("fuse: implement NFS-like readdirplus support") Cc: <stable@vger.kernel.org> # v3.9 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: pata_ftide010: Add missing MODULE_DESCRIPTIONDamien Le Moal1-0/+1
commit 7274eef5729037300f29d14edeb334a47a098f65 upstream. Add the missing MODULE_DESCRIPTION() to avoid warnings such as: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/pata_ftide010.o when compiling with W=1. Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: sata_gemini: Add missing MODULE_DESCRIPTIONDamien Le Moal1-0/+1
commit 8566572bf3b4d6e416a4bf2110dbb4817d11ba59 upstream. Add the missing MODULE_DESCRIPTION() to avoid warnings such as: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ata/sata_gemini.o when compiling with W=1. Fixes: be4e456ed3a5 ("ata: Add driver for Faraday Technology FTIDE010") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: pata_falcon: fix IO base selection for Q40Michael Schmitz1-21/+29
commit 8a1f00b753ecfdb117dc1a07e68c46d80e7923ea upstream. With commit 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide"), the Q40 IDE driver was replaced by pata_falcon.c. Both IO and memory resources were defined for the Q40 IDE platform device, but definition of the IDE register addresses was modeled after the Falcon case, both in use of the memory resources and in including register shift and byte vs. word offset in the address. This was correct for the Falcon case, which does not apply any address translation to the register addresses. In the Q40 case, all of device base address, byte access offset and register shift is included in the platform specific ISA access translation (in asm/mm_io.h). As a consequence, such address translation gets applied twice, and register addresses are mangled. Use the device base address from the platform IO resource for Q40 (the IO address translation will then add the correct ISA window base address and byte access offset), with register shift 1. Use MMIO base address and register shift 2 as before for Falcon. Encode PIO_OFFSET into IO port addresses for all registers for Q40 except the data transfer register. Encode the MMIO offset there (pata_falcon_data_xfer() directly uses raw IO with no address translation). Reported-by: William R Sowerbutts <will@sowerbutts.com> Closes: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com Link: https://lore.kernel.org/r/CAMuHMdUU62jjunJh9cqSqHT87B0H0A4udOOPs=WN7WZKpcagVA@mail.gmail.com Fixes: 44b1fbc0f5f3 ("m68k/q40: Replace q40ide driver with pata_falcon and falconide") Cc: stable@vger.kernel.org Cc: Finn Thain <fthain@linux-m68k.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: William R Sowerbutts <will@sowerbutts.com> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ata: ahci: Add Elkhart Lake AHCI controllerWerner Fischer1-0/+2
commit 2a2df98ec592667927b5c1351afa6493ea125c9f upstream. Elkhart Lake is the successor of Apollo Lake and Gemini Lake. These CPUs and their PCHs are used in mobile and embedded environments. With this patch I suggest that Elkhart Lake SATA controllers [1] should use the default LPM policy for mobile chipsets. The disadvantage of missing hot-plug support with this setting should not be an issue, as those CPUs are used in embedded environments and not in servers with hot-plug backplanes. We discovered that the Elkhart Lake SATA controllers have been missing in ahci.c after a customer reported the throttling of his SATA SSD after a short period of higher I/O. We determined the high temperature of the SSD controller in idle mode as the root cause for that. Depending on the used SSD, we have seen up to 1.8 Watt lower system idle power usage and up to 30°C lower SSD controller temperatures in our tests, when we set med_power_with_dipm manually. I have provided a table showing seven different SATA SSDs from ATP, Intel/Solidigm and Samsung [2]. Intel lists a total of 3 SATA controller IDs (4B60, 4B62, 4B63) in [1] for those mobile PCHs. This commit just adds 0x4b63 as I do not have test systems with 0x4b60 and 0x4b62 SATA controllers. I have tested this patch with a system which uses 0x4b63 as SATA controller. [1] https://sata-io.org/product/8803 [2] https://www.thomas-krenn.com/en/wiki/SATA_Link_Power_Management#Example_LES_v4 Signed-off-by: Werner Fischer <devlists@wefi.net> Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19hwspinlock: qcom: add missing regmap config for SFPB MMIO implementationChristian Marangi1-0/+9
commit 23316be8a9d450f33a21f1efe7d89570becbec58 upstream. Commit 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older SoCs") introduced and made regmap_config mandatory in the of_data struct but didn't add the regmap_config for sfpb based devices. SFPB based devices can both use the legacy syscon way to probe or the new MMIO way and currently device that use the MMIO way are broken as they lack the definition of the now required regmap_config and always return -EINVAL (and indirectly makes fail probing everything that depends on it, smem, nandc with smem-parser...) Fix this by correctly adding the missing regmap_config and restore function of hwspinlock on SFPB based devices with MMIO implementation. Cc: stable@vger.kernel.org Fixes: 5d4753f741d8 ("hwspinlock: qcom: add support for MMIO on older SoCs") Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Link: https://lore.kernel.org/r/20230716022804.21239-1-ansuelsmth@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19lib: test_scanf: Add explicit type cast to result initialization in ↵Nathan Chancellor1-1/+1
test_number_prefix() commit 92382d744176f230101d54f5c017bccd62770f01 upstream. A recent change in clang allows it to consider more expressions as compile time constants, which causes it to point out an implicit conversion in the scanf tests: lib/test_scanf.c:661:2: warning: implicit conversion from 'int' to 'unsigned char' changes value from -168 to 88 [-Wconstant-conversion] 661 | test_number_prefix(unsigned char, "0xA7", "%2hhx%hhx", 0, 0xa7, 2, check_uchar); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ lib/test_scanf.c:609:29: note: expanded from macro 'test_number_prefix' 609 | T result[2] = {~expect[0], ~expect[1]}; \ | ~ ^~~~~~~~~~ 1 warning generated. The result of the bitwise negation is the type of the operand after going through the integer promotion rules, so this truncation is expected but harmless, as the initial values in the result array get overwritten by _test() anyways. Add an explicit cast to the expected type in test_number_prefix() to silence the warning. There is no functional change, as all the tests still pass with GCC 13.1.0 and clang 18.0.0. Cc: stable@vger.kernel.org Link: https://github.com/ClangBuiltLinux/linuxq/issues/1899 Link: https://github.com/llvm/llvm-project/commit/610ec954e1f81c0e8fcadedcd25afe643f5a094e Suggested-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20230807-test_scanf-wconstant-conversion-v2-1-839ca39083e1@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19f2fs: avoid false alarm of circular lockingJaegeuk Kim2-10/+17
commit 5c13e2388bf3426fd69a89eb46e50469e9624e56 upstream. ====================================================== WARNING: possible circular locking dependency detected 6.5.0-rc5-syzkaller-00353-gae545c3283dc #0 Not tainted ------------------------------------------------------ syz-executor273/5027 is trying to acquire lock: ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_down_write fs/f2fs/f2fs.h:2133 [inline] ffff888077fe1fb0 (&fi->i_sem){+.+.}-{3:3}, at: f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644 but task is already holding lock: ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_down_read fs/f2fs/f2fs.h:2108 [inline] ffff888077fe07c8 (&fi->i_xattr_sem){.+.+}-{3:3}, at: f2fs_add_dentry+0x92/0x230 fs/f2fs/dir.c:783 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&fi->i_xattr_sem){.+.+}-{3:3}: down_read+0x9c/0x470 kernel/locking/rwsem.c:1520 f2fs_down_read fs/f2fs/f2fs.h:2108 [inline] f2fs_getxattr+0xb1e/0x12c0 fs/f2fs/xattr.c:532 __f2fs_get_acl+0x5a/0x900 fs/f2fs/acl.c:179 f2fs_acl_create fs/f2fs/acl.c:377 [inline] f2fs_init_acl+0x15c/0xb30 fs/f2fs/acl.c:420 f2fs_init_inode_metadata+0x159/0x1290 fs/f2fs/dir.c:558 f2fs_add_regular_entry+0x79e/0xb90 fs/f2fs/dir.c:740 f2fs_add_dentry+0x1de/0x230 fs/f2fs/dir.c:788 f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827 f2fs_add_link fs/f2fs/f2fs.h:3554 [inline] f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781 vfs_mkdir+0x532/0x7e0 fs/namei.c:4117 do_mkdirat+0x2a9/0x330 fs/namei.c:4140 __do_sys_mkdir fs/namei.c:4160 [inline] __se_sys_mkdir fs/namei.c:4158 [inline] __x64_sys_mkdir+0xf2/0x140 fs/namei.c:4158 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #0 (&fi->i_sem){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x2e3d/0x5de0 kernel/locking/lockdep.c:5144 lock_acquire kernel/locking/lockdep.c:5761 [inline] lock_acquire+0x1ae/0x510 kernel/locking/lockdep.c:5726 down_write+0x93/0x200 kernel/locking/rwsem.c:1573 f2fs_down_write fs/f2fs/f2fs.h:2133 [inline] f2fs_add_inline_entry+0x300/0x6f0 fs/f2fs/inline.c:644 f2fs_add_dentry+0xa6/0x230 fs/f2fs/dir.c:784 f2fs_do_add_link+0x190/0x280 fs/f2fs/dir.c:827 f2fs_add_link fs/f2fs/f2fs.h:3554 [inline] f2fs_mkdir+0x377/0x620 fs/f2fs/namei.c:781 vfs_mkdir+0x532/0x7e0 fs/namei.c:4117 ovl_do_mkdir fs/overlayfs/overlayfs.h:196 [inline] ovl_mkdir_real+0xb5/0x370 fs/overlayfs/dir.c:146 ovl_workdir_create+0x3de/0x820 fs/overlayfs/super.c:309 ovl_make_workdir fs/overlayfs/super.c:711 [inline] ovl_get_workdir fs/overlayfs/super.c:864 [inline] ovl_fill_super+0xdab/0x6180 fs/overlayfs/super.c:1400 vfs_get_super+0xf9/0x290 fs/super.c:1152 vfs_get_tree+0x88/0x350 fs/super.c:1519 do_new_mount fs/namespace.c:3335 [inline] path_mount+0x1492/0x1ed0 fs/namespace.c:3662 do_mount fs/namespace.c:3675 [inline] __do_sys_mount fs/namespace.c:3884 [inline] __se_sys_mount fs/namespace.c:3861 [inline] __x64_sys_mount+0x293/0x310 fs/namespace.c:3861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(&fi->i_xattr_sem); lock(&fi->i_sem); lock(&fi->i_xattr_sem); lock(&fi->i_sem); Cc: <stable@vger.kernel.org> Reported-and-tested-by: syzbot+e5600587fa9cbf8e3826@syzkaller.appspotmail.com Fixes: 5eda1ad1aaff "f2fs: fix deadlock in i_xattr_sem and inode page lock" Tested-by: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19f2fs: flush inode if atomic file is abortedJaegeuk Kim1-0/+2
commit a3ab55746612247ce3dcaac6de66f5ffc055b9df upstream. Let's flush the inode being aborted atomic operation to avoid stale dirty inode during eviction in this call stack: f2fs_mark_inode_dirty_sync+0x22/0x40 [f2fs] f2fs_abort_atomic_write+0xc4/0xf0 [f2fs] f2fs_evict_inode+0x3f/0x690 [f2fs] ? sugov_start+0x140/0x140 evict+0xc3/0x1c0 evict_inodes+0x17b/0x210 generic_shutdown_super+0x32/0x120 kill_block_super+0x21/0x50 deactivate_locked_super+0x31/0x90 cleanup_mnt+0x100/0x160 task_work_run+0x59/0x90 do_exit+0x33b/0xa50 do_group_exit+0x2d/0x80 __x64_sys_exit_group+0x14/0x20 do_syscall_64+0x3b/0x90 entry_SYSCALL_64_after_hwframe+0x63/0xcd This triggers f2fs_bug_on() in f2fs_evict_inode: f2fs_bug_on(sbi, is_inode_flag_set(inode, FI_DIRTY_INODE)); This fixes the syzbot report: loop0: detected capacity change from 0 to 131072 F2FS-fs (loop0): invalid crc value F2FS-fs (loop0): Found nat_bits in checkpoint F2FS-fs (loop0): Mounted with checkpoint version = 48b305e4 ------------[ cut here ]------------ kernel BUG at fs/f2fs/inode.c:869! invalid opcode: 0000 [#1] PREEMPT SMP KASAN CPU: 0 PID: 5014 Comm: syz-executor220 Not tainted 6.4.0-syzkaller-11479-g6cd06ab12d1a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/27/2023 RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869 Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007 RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000 R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0 Call Trace: <TASK> evict+0x2ed/0x6b0 fs/inode.c:665 dispose_list+0x117/0x1e0 fs/inode.c:698 evict_inodes+0x345/0x440 fs/inode.c:748 generic_shutdown_super+0xaf/0x480 fs/super.c:478 kill_block_super+0x64/0xb0 fs/super.c:1417 kill_f2fs_super+0x2af/0x3c0 fs/f2fs/super.c:4704 deactivate_locked_super+0x98/0x160 fs/super.c:330 deactivate_super+0xb1/0xd0 fs/super.c:361 cleanup_mnt+0x2ae/0x3d0 fs/namespace.c:1254 task_work_run+0x16f/0x270 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0xa9a/0x29a0 kernel/exit.c:874 do_group_exit+0xd4/0x2a0 kernel/exit.c:1024 __do_sys_exit_group kernel/exit.c:1035 [inline] __se_sys_exit_group kernel/exit.c:1033 [inline] __x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1033 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f309be71a09 Code: Unable to access opcode bytes at 0x7f309be719df. RSP: 002b:00007fff171df518 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 00007f309bef7330 RCX: 00007f309be71a09 RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001 RBP: 0000000000000001 R08: ffffffffffffffc0 R09: 00007f309bef1e40 R10: 0000000000010600 R11: 0000000000000246 R12: 00007f309bef7330 R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- RIP: 0010:f2fs_evict_inode+0x172d/0x1e00 fs/f2fs/inode.c:869 Code: ff df 48 c1 ea 03 80 3c 02 00 0f 85 6a 06 00 00 8b 75 40 ba 01 00 00 00 4c 89 e7 e8 6d ce 06 00 e9 aa fc ff ff e8 63 22 e2 fd <0f> 0b e8 5c 22 e2 fd 48 c7 c0 a8 3a 18 8d 48 ba 00 00 00 00 00 fc RSP: 0018:ffffc90003a6fa00 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000 RDX: ffff8880273b8000 RSI: ffffffff83a2bd0d RDI: 0000000000000007 RBP: ffff888077db91b0 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888029a3c000 R13: ffff888077db9660 R14: ffff888029a3c0b8 R15: ffff888077db9c50 FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f1909bb9000 CR3: 00000000276a9000 CR4: 0000000000350ef0 Cc: <stable@vger.kernel.org> Reported-and-tested-by: syzbot+e1246909d526a9d470fa@syzkaller.appspotmail.com Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ext4: fix memory leaks in ext4_fname_{setup_filename,prepare_lookup}Luís Henriques1-0/+4
commit 7ca4b085f430f3774c3838b3da569ceccd6a0177 upstream. If the filename casefolding fails, we'll be leaking memory from the fscrypt_name struct, namely from the 'crypto_buf.name' member. Make sure we free it in the error path on both ext4_fname_setup_filename() and ext4_fname_prepare_lookup() functions. Cc: stable@kernel.org Fixes: 1ae98e295fa2 ("ext4: optimize match for casefolded encrypted dirs") Signed-off-by: Luís Henriques <lhenriques@suse.de> Reviewed-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20230803091713.13239-1-lhenriques@suse.de Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ext4: add correct group descriptors and reserved GDT blocks to system zoneWang Jianjian3-8/+17
commit 68228da51c9a436872a4ef4b5a7692e29f7e5bc7 upstream. When setup_system_zone, flex_bg is not initialized so it is always 1. Use a new helper function, ext4_num_base_meta_blocks() which does not depend on sbi->s_log_groups_per_flex being initialized. [ Squashed two patches in the Link URL's below together into a single commit, which is simpler to review/understand. Also fix checkpatch warnings. --TYT ] Cc: stable@kernel.org Signed-off-by: Wang Jianjian <wangjianjian0@foxmail.com> Link: https://lore.kernel.org/r/tencent_21AF0D446A9916ED5C51492CC6C9A0A77B05@qq.com Link: https://lore.kernel.org/r/tencent_D744D1450CC169AEA77FCF0A64719909ED05@qq.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19jbd2: correct the end of the journal recovery scan rangeZhang Yi1-9/+3
commit 2dfba3bb40ad8536b9fa802364f2d40da31aa88e upstream. We got a filesystem inconsistency issue below while running generic/475 I/O failure pressure test with fast_commit feature enabled. Symlink /p3/d3/d1c/d6c/dd6/dce/l101 (inode #132605) is invalid. If fast_commit feature is enabled, a special fast_commit journal area is appended to the end of the normal journal area. The journal->j_last point to the first unused block behind the normal journal area instead of the whole log area, and the journal->j_fc_last point to the first unused block behind the fast_commit journal area. While doing journal recovery, do_one_pass(PASS_SCAN) should first scan the normal journal area and turn around to the first block once it meet journal->j_last, but the wrap() macro misuse the journal->j_fc_last, so the recovering could not read the next magic block (commit block perhaps) and would end early mistakenly and missing tN and every transaction after it in the following example. Finally, it could lead to filesystem inconsistency. | normal journal area | fast commit area | +-------------------------------------------------+------------------+ | tN(rere) | tN+1 |~| tN-x |...| tN-1 | tN(front) | .... | +-------------------------------------------------+------------------+ / / / start journal->j_last journal->j_fc_last This patch fix it by use the correct ending journal->j_last. Fixes: 5b849b5f96b4 ("jbd2: fast commit recovery path") Cc: stable@kernel.org Reported-by: Theodore Ts'o <tytso@mit.edu> Link: https://lore.kernel.org/linux-ext4/20230613043120.GB1584772@mit.edu/ Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230626073322.3956567-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19jbd2: check 'jh->b_transaction' before removing it from checkpointZhihao Cheng1-0/+2
commit 590a809ff743e7bd890ba5fb36bc38e20a36de53 upstream. Following process will corrupt ext4 image: Step 1: jbd2_journal_commit_transaction __jbd2_journal_insert_checkpoint(jh, commit_transaction) // Put jh into trans1->t_checkpoint_list journal->j_checkpoint_transactions = commit_transaction // Put trans1 into journal->j_checkpoint_transactions Step 2: do_get_write_access test_clear_buffer_dirty(bh) // clear buffer dirty,set jbd dirty __jbd2_journal_file_buffer(jh, transaction) // jh belongs to trans2 Step 3: drop_cache journal_shrink_one_cp_list jbd2_journal_try_remove_checkpoint if (!trylock_buffer(bh)) // lock bh, true if (buffer_dirty(bh)) // buffer is not dirty __jbd2_journal_remove_checkpoint(jh) // remove jh from trans1->t_checkpoint_list Step 4: jbd2_log_do_checkpoint trans1 = journal->j_checkpoint_transactions // jh is not in trans1->t_checkpoint_list jbd2_cleanup_journal_tail(journal) // trans1 is done Step 5: Power cut, trans2 is not committed, jh is lost in next mounting. Fix it by checking 'jh->b_transaction' before remove it from checkpoint. Cc: stable@kernel.org Fixes: 46f881b5b175 ("jbd2: fix a race when checking checkpoint buffer busy") Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230714025528.564988-3-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19jbd2: fix checkpoint cleanup performance regressionZhang Yi1-6/+14
commit 373ac521799d9e97061515aca6ec6621789036bb upstream. journal_clean_one_cp_list() has been merged into journal_shrink_one_cp_list(), but do chekpoint buffer cleanup from the committing process is just a best effort, it should stop scan once it meet a busy buffer, or else it will cause a lot of invalid buffer scan and checks. We catch a performance regression when doing fs_mark tests below. Test cmd: ./fs_mark -d scratch -s 1024 -n 10000 -t 1 -D 100 -N 100 Before merging checkpoint buffer cleanup: FSUse% Count Size Files/sec App Overhead 95 10000 1024 8304.9 49033 After merging checkpoint buffer cleanup: FSUse% Count Size Files/sec App Overhead 95 10000 1024 7649.0 50012 FSUse% Count Size Files/sec App Overhead 95 10000 1024 2107.1 50871 After merging checkpoint buffer cleanup, the total loop count in journal_shrink_one_cp_list() could be up to 6,261,600+ (50,000+ ~ 100,000+ in general), most of them are invalid. This patch fix it through passing 'shrink_type' into journal_shrink_one_cp_list() and add a new 'SHRINK_BUSY_STOP' to indicate it should stop once meet a busy buffer. After fix, the loop count descending back to 10,000+. After this fix: FSUse% Count Size Files/sec App Overhead 95 10000 1024 8558.4 49109 Cc: stable@kernel.org Fixes: b98dba273a0e ("jbd2: remove journal_clean_one_cp_list()") Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230714025528.564988-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19dmaengine: sh: rz-dmac: Fix destination and source data size settingHien Huynh1-4/+7
commit c6ec8c83a29fb3aec3efa6fabbf5344498f57c7f upstream. Before setting DDS and SDS values, we need to clear its value first otherwise, we get incorrect results when we change/update the DMA bus width several times due to the 'OR' expression. Fixes: 5000d37042a6 ("dmaengine: sh: Add DMAC driver for RZ/G2L SoC") Cc: stable@kernel.org Signed-off-by: Hien Huynh <hien.huynh.px@renesas.com> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20230706112150.198941-3-biju.das.jz@bp.renesas.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19clocksource/drivers/arm_arch_timer: Disable timer before programming CVALWalter Chang1-0/+7
commit e7d65e40ab5a5940785c5922f317602d0268caaf upstream. Due to the fact that the use of `writeq_relaxed()` to program CVAL is not guaranteed to be atomic, it is necessary to disable the timer before programming CVAL. However, if the MMIO timer is already enabled and has not yet expired, there is a possibility of unexpected behavior occurring: when the CPU enters the idle state during this period, and if the CPU's local event is earlier than the broadcast event, the following process occurs: tick_broadcast_enter() tick_broadcast_oneshot_control(TICK_BROADCAST_ENTER) __tick_broadcast_oneshot_control() ___tick_broadcast_oneshot_control() tick_broadcast_set_event() clockevents_program_event() set_next_event_mem() During this process, the MMIO timer remains enabled while programming CVAL. To prevent such behavior, disable timer explicitly prior to programming CVAL. Fixes: 8b82c4f883a7 ("clocksource/drivers/arm_arch_timer: Move MMIO timer programming over to CVAL") Cc: stable@vger.kernel.org Signed-off-by: Walter Chang <walter.chang@mediatek.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20230717090735.19370-1-walter.chang@mediatek.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19ARC: atomics: Add compiler barrier to atomic operations...Pavel Kozlov2-6/+6
commit 42f51fb24fd39cc547c086ab3d8a314cc603a91c upstream. ... to avoid unwanted gcc optimizations SMP kernels fail to boot with commit 596ff4a09b89 ("cpumask: re-introduce constant-sized cpumask optimizations"). | | percpu: BUG: failure at mm/percpu.c:2981/pcpu_build_alloc_info()! | The write operation performed by the SCOND instruction in the atomic inline asm code is not properly passed to the compiler. The compiler cannot correctly optimize a nested loop that runs through the cpumask in the pcpu_build_alloc_info() function. Fix this by add a compiler barrier (memory clobber in inline asm). Apparently atomic ops used to have memory clobber implicitly via surrounding smp_mb(). However commit b64be6836993c431e ("ARC: atomics: implement relaxed variants") removed the smp_mb() for the relaxed variants, but failed to add the explicit compiler barrier. Link: https://github.com/foss-for-synopsys-dwc-arc-processors/linux/issues/135 Cc: <stable@vger.kernel.org> # v6.3+ Fixes: b64be6836993c43 ("ARC: atomics: implement relaxed variants") Signed-off-by: Pavel Kozlov <pavel.kozlov@synopsys.com> Signed-off-by: Vineet Gupta <vgupta@kernel.org> [vgupta: tweaked the changelog and added Fixes tag] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19net/mlx5: Free IRQ rmap and notifier on kernel shutdownSaeed Mahameed1-6/+20
commit 314ded538e5f22e7610b1bf621402024a180ec80 upstream. The kernel IRQ system needs the irq affinity notifier to be clear before attempting to free the irq, see WARN_ON log below. On a normal driver unload we don't have this issue since we do the complete cleanup of the irq resources. To fix this, put the important resources cleanup in a helper function and use it in both normal driver unload and shutdown flows. [ 4497.498434] ------------[ cut here ]------------ [ 4497.498726] WARNING: CPU: 0 PID: 9 at kernel/irq/manage.c:2034 free_irq+0x295/0x340 [ 4497.499193] Modules linked in: [ 4497.499386] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.4.0-rc4+ #10 [ 4497.499876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 [ 4497.500518] Workqueue: events do_poweroff [ 4497.500849] RIP: 0010:free_irq+0x295/0x340 [ 4497.501132] Code: 85 c0 0f 84 1d ff ff ff 48 89 ef ff d0 0f 1f 00 e9 10 ff ff ff 0f 0b e9 72 ff ff ff 49 8d 7f 28 ff d0 0f 1f 00 e9 df fd ff ff <0f> 0b 48 c7 80 c0 008 [ 4497.502269] RSP: 0018:ffffc90000053da0 EFLAGS: 00010282 [ 4497.502589] RAX: ffff888100949600 RBX: ffff88810330b948 RCX: 0000000000000000 [ 4497.503035] RDX: ffff888100949600 RSI: ffff888100400490 RDI: 0000000000000023 [ 4497.503472] RBP: ffff88810330c7e0 R08: ffff8881004005d0 R09: ffffffff8273a260 [ 4497.503923] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881009ae000 [ 4497.504359] R13: ffff8881009ae148 R14: 0000000000000000 R15: ffff888100949600 [ 4497.504804] FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [ 4497.505302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4497.505671] CR2: 00007fce98806298 CR3: 000000000262e005 CR4: 0000000000370ef0 [ 4497.506104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4497.506540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4497.507002] Call Trace: [ 4497.507158] <TASK> [ 4497.507299] ? free_irq+0x295/0x340 [ 4497.507522] ? __warn+0x7c/0x130 [ 4497.507740] ? free_irq+0x295/0x340 [ 4497.507963] ? report_bug+0x171/0x1a0 [ 4497.508197] ? handle_bug+0x3c/0x70 [ 4497.508417] ? exc_invalid_op+0x17/0x70 [ 4497.508662] ? asm_exc_invalid_op+0x1a/0x20 [ 4497.508926] ? free_irq+0x295/0x340 [ 4497.509146] mlx5_irq_pool_free_irqs+0x48/0x90 [ 4497.509421] mlx5_irq_table_free_irqs+0x38/0x50 [ 4497.509714] mlx5_core_eq_free_irqs+0x27/0x40 [ 4497.509984] shutdown+0x7b/0x100 [ 4497.510184] pci_device_shutdown+0x30/0x60 [ 4497.510440] device_shutdown+0x14d/0x240 [ 4497.510698] kernel_power_off+0x30/0x70 [ 4497.510938] process_one_work+0x1e6/0x3e0 [ 4497.511183] worker_thread+0x49/0x3b0 [ 4497.511407] ? __pfx_worker_thread+0x10/0x10 [ 4497.511679] kthread+0xe0/0x110 [ 4497.511879] ? __pfx_kthread+0x10/0x10 [ 4497.512114] ret_from_fork+0x29/0x50 [ 4497.512342] </TASK> Fixes: 9c2d08010963 ("net/mlx5: Free irqs only on shutdown callback") Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Mathieu Tortuyaux <mtortuyaux@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19Multi-gen LRU: avoid race in inc_min_seq()Kalesh Singh1-5/+7
commit bb5e7f234eacf34b65be67ebb3613e3b8cf11b87 upstream. inc_max_seq() will try to inc_min_seq() if nr_gens == MAX_NR_GENS. This is because the generations are reused (the last oldest now empty generation will become the next youngest generation). inc_min_seq() is retried until successful, dropping the lru_lock and yielding the CPU on each failure, and retaking the lock before trying again: while (!inc_min_seq(lruvec, type, can_swap)) { spin_unlock_irq(&lruvec->lru_lock); cond_resched(); spin_lock_irq(&lruvec->lru_lock); } However, the initial condition that required incrementing the min_seq (nr_gens == MAX_NR_GENS) is not retested. This can change by another call to inc_max_seq() from run_aging() with force_scan=true from the debugfs interface. Since the eviction stalls when the nr_gens == MIN_NR_GENS, avoid unnecessarily incrementing the min_seq by rechecking the number of generations before each attempt. This issue was uncovered in previous discussion on the list by Yu Zhao and Aneesh Kumar [1]. [1] https://lore.kernel.org/linux-mm/CAOUHufbO7CaVm=xjEb1avDhHVvnC8pJmGyKcFf2iY_dpf+zR3w@mail.gmail.com/ Link: https://lkml.kernel.org/r/20230802025606.346758-2-kaleshsingh@google.com Fixes: d6c3af7d8a2b ("mm: multi-gen LRU: debugfs interface") Signed-off-by: Kalesh Singh <kaleshsingh@google.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> [mediatek] Tested-by: Charan Teja Kalla <quic_charante@quicinc.com> Cc: Yu Zhao <yuzhao@google.com> Cc: Aneesh Kumar K V <aneesh.kumar@linux.ibm.com> Cc: Barry Song <baohua@kernel.org> Cc: Brian Geffon <bgeffon@google.com> Cc: Jan Alexander Steffens (heftig) <heftig@archlinux.org> Cc: Lecopzer Chen <lecopzer.chen@mediatek.com> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: Oleksandr Natalenko <oleksandr@natalenko.name> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Steven Barrett <steven@liquorix.net> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-09-19sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory()Petr Tesarik5-11/+7
[ Upstream commit fb60211f377b69acffead3147578f86d0092a7a5 ] In all these cases, the last argument to dma_declare_coherent_memory() is the buffer end address, but the expected value should be the size of the reserved region. Fixes: 39fb993038e1 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver") Fixes: c2f9b05fd5c1 ("media: arch: sh: ecovec: Use new renesas-ceu camera driver") Fixes: f3590dc32974 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver") Fixes: 186c446f4b84 ("media: arch: sh: migor: Use new renesas-ceu camera driver") Fixes: 1a3c230b4151 ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver") Signed-off-by: Petr Tesarik <petr.tesarik.ext@huawei.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Reviewed-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com Signed-off-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19net: hns3: remove GSO partial feature bitJie Wang1-2/+0
[ Upstream commit 60326634f6c54528778de18bfef1e8a7a93b3771 ] HNS3 NIC does not support GSO partial packets segmentation. Actually tunnel packets for example NvGRE packets segment offload and checksum offload is already supported. There is no need to keep gso partial feature bit. So this patch removes it. Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19net: hns3: fix the port information display when sfp is absentYisen Zhuang1-1/+3
[ Upstream commit 674d9591a32d01df75d6b5fffed4ef942a294376 ] When sfp is absent or unidentified, the port type should be displayed as PORT_OTHERS, rather than PORT_FIBRE. Fixes: 88d10bd6f730 ("net: hns3: add support for multiple media type") Signed-off-by: Yisen Zhuang <yisen.zhuang@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19net: hns3: fix invalid mutex between tc qdisc and dcb ets command issueJijie Shao4-19/+9
[ Upstream commit fa5564945f7d15ae2390b00c08b6abaef0165cda ] We hope that tc qdisc and dcb ets commands can not be used crosswise. If we want to use any of the commands to configure tc, We must use the other command to clear the existing configuration. However, when we configure a single tc with tc qdisc, we can still configure it with dcb ets. Because we use mqprio_active as the tag of tc qdisc configuration, but with dcb ets, we do not check mqprio_active. This patch fix this issue by check mqprio_active before executing the dcb ets command. and add dcb_ets_active to replace HCLGE_FLAG_DCB_ENABLE and HCLGE_FLAG_MQPRIO_ENABLE at the hclge layer, Fixes: cacde272dd00 ("net: hns3: Add hclge_dcb module for the support of DCB feature") Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-19net: hns3: fix debugfs concurrency issue between kfree buffer and readHao Chen1-3/+4
[ Upstream commit c295160b1d95e885f1af4586a221cb221d232d10 ] Now in hns3_dbg_uninit(), there may be concurrency between kfree buffer and read, it may result in memory error. Moving debugfs_remove_recursive() in front of kfree buffer to ensure they don't happen at the same time. Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Chen <chenhao418@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>