summaryrefslogtreecommitdiff
path: root/arch/s390/boot/kaslr.c
AgeCommit message (Collapse)AuthorFilesLines
2023-04-13s390/kaslr: generalize and improve random base distributionVasily Gorbik1-9/+101
Improve the distribution algorithm of random base address to ensure a uniformity among all suitable addresses. To generate a random value once, and to build a continuous range in which every value is suitable, count all the suitable addresses (referred to as positions) that can be used as a base address. The positions are counted by iterating over the usable memory ranges. For each range that is big enough to accommodate the image, count all the suitable addresses where the image can be placed, while taking reserved memory ranges into consideration. A new function "iterate_valid_positions()" has dual purpose. Firstly, it is called to count the positions in a given memory range, and secondly, to convert a random position back to an address. "get_random_base()" has been replaced with more generic "randomize_within_range()" which now could be called for randomizing base addresses not just for the kernel image. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2023-03-20s390/boot: rework decompressor reserved trackingVasily Gorbik1-105/+8
Currently several approaches for finding unused memory in decompressor are utilized. While "safe_addr" grows towards higher addresses, vmem code allocates paging structures top down. The former requires careful ordering. In addition to that ipl report handling code verifies potential intersections with secure boot certificates on its own. Neither of two approaches are memory holes aware and consistent with each other in low memory conditions. To solve that, existing approaches are generalized and combined together, as well as online memory ranges are now taken into consideration. physmem_info has been extended to contain reserved memory ranges. New set of functions allow to handle reserves and find unused memory. All reserves and memory allocations are "typed". In case of out of memory condition decompressor fails with detailed info on current reserved ranges and usable online memory. Linux version 6.2.0 ... Kernel command line: ... mem=100M Our of memory allocating 100000 bytes 100000 aligned in range 0:5800000 Reserved memory ranges: 0000000000000000 0000000003e33000 DECOMPRESSOR 0000000003f00000 00000000057648a3 INITRD 00000000063e0000 00000000063e8000 VMEM 00000000063eb000 00000000063f4000 VMEM 00000000063f7800 0000000006400000 VMEM 0000000005800000 0000000006300000 KASAN Usable online memory ranges (info source: sclp read info [3]): 0000000000000000 0000000006400000 Usable online memory total: 6400000 Reserved: 61b10a3 Free: 24ef5d Call Trace: (sp:000000000002bd58 [<0000000000012a70>] physmem_alloc_top_down+0x60/0x14c) sp:000000000002bdc8 [<0000000000013756>] _pa+0x56/0x6a sp:000000000002bdf0 [<0000000000013bcc>] pgtable_populate+0x45c/0x65e sp:000000000002be90 [<00000000000140aa>] setup_vmem+0x2da/0x424 sp:000000000002bec8 [<0000000000011c20>] startup_kernel+0x428/0x8b4 sp:000000000002bf60 [<00000000000100f4>] startup_normal+0xd4/0xd4 physmem_alloc_range allows to find free memory in specified range. It should be used for one time allocations only like finding position for amode31 and vmlinux. physmem_alloc_top_down can be used just like physmem_alloc_range, but it also allows multiple allocations per type and tries to merge sequential allocations together. Which is useful for paging structures allocations. If sequential allocations cannot be merged together they are "chained", allowing easy per type reserved ranges enumeration and migration to memblock later. Extra "struct reserved_range" allocated for chaining are not tracked or reserved but rely on the fact that both physmem_alloc_range and physmem_alloc_top_down search for free memory only below current top down allocator position. All reserved ranges should be transferred to memblock before memblock allocations are enabled. The startup code has been reordered to delay any memory allocations until online memory ranges are detected and occupied memory ranges are marked as reserved to be excluded from follow-up allocations. Ipl report certificates are a special case, ipl report certificates list is checked together with other memory reserves until certificates are saved elsewhere. KASAN required memory for shadow memory allocation and mapping is reserved as 1 large chunk which is later passed to KASAN early initialization code. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-03-20s390/boot: rename mem_detect to physmem_infoVasily Gorbik1-7/+7
In preparation to extending mem_detect with additional information like reserved ranges rename it to more generic physmem_info. This new naming also help to avoid confusion by using more exact terms like "physmem online ranges", etc. Acked-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-14s390/mem_detect: do not truncate online memory ranges infoVasily Gorbik1-5/+5
Commit bf64f0517e5d ("s390/mem_detect: handle online memory limit just once") introduced truncation of mem_detect online ranges based on identity mapping size. For kdump case however the full set of online memory ranges has to be feed into memblock_physmem_add so that crashed system memory could be extracted. Instead of truncating introduce a "usable limit" which is respected by mem_detect api. Also add extra online memory ranges iterator which still provides full set of online memory ranges disregarding the "usable limit". Fixes: bf64f0517e5d ("s390/mem_detect: handle online memory limit just once") Reported-by: Alexander Egorenkov <egorenar@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06s390/kasan: avoid mapping KASAN shadow for standby memoryVasily Gorbik1-1/+1
KASAN common code is able to handle memory hotplug and create KASAN shadow memory on a fly. Online memory ranges are available from mem_detect, use this information to avoid mapping KASAN shadow for standby memory. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06s390/boot: avoid page tables memory in kaslrVasily Gorbik1-2/+4
If kernel is build without KASAN support there is a chance that kernel image is going to be positioned by KASLR code to overlap with identity mapping page tables. When kernel is build with KASAN support enabled memory which is potentially going to be used for page tables and KASAN shadow mapping is accounted for in KASLR with the use of kasan_estimate_memory_needs(). Split this function and introduce vmem_estimate_memory_needs() to cover decompressor's vmem identity mapping page tables. Fixes: bb1520d581a3 ("s390/mm: start kernel with DAT enabled") Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06s390/mem_detect: handle online memory limit just onceVasily Gorbik1-2/+0
Introduce mem_detect_truncate() to cut any online memory ranges above established identity mapping size, so that mem_detect users wouldn't have to do it over and over again. Suggested-by: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2023-02-06s390/boot: fix mem_detect extended area allocationVasily Gorbik1-6/+0
Allocation of mem_detect extended area was not considered neither in commit 9641b8cc733f ("s390/ipl: read IPL report at early boot") nor in commit b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)"). As a result mem_detect extended theoretically may overlap with ipl report or randomized kernel image position. But as mem_detect code will allocate extended area only upon exceeding 255 online regions (which should alternate with offline memory regions) it is not seen in practice. To make sure mem_detect extended area does not overlap with ipl report or randomized kernel position extend usage of "safe_addr". Make initrd handling and mem_detect extended area allocation code move it further right and make KASLR takes in into consideration as well. Fixes: 9641b8cc733f ("s390/ipl: read IPL report at early boot") Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2022-05-06s390/boot: get rid of startup archiveHeiko Carstens1-1/+1
The final kernel image is created by linking decompressor object files with a startup archive. The startup archive file however does not contain only optional code and data which can be discarded if not referenced. It also contains mandatory object data like head.o which must never be discarded, even if not referenced. Move the decompresser code and linker script to the boot directory and get rid of the startup archive so everything is kept during link time. Acked-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-07-27s390/boot: introduce boot data 'initrd_data'Alexander Egorenkov1-3/+3
The new boot data struct shall replace global variables INITRD_START and INITRD_SIZE. It is initialized in the decompressor and passed to the decompressed kernel. In comparison to the old solution, this one doesn't access data at fixed physical addresses which will become important when the decompressor becomes relocatable. Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-20s390: unify identity mapping limits handlingVasily Gorbik1-2/+1
Currently we have to consider too many different values which in the end only affect identity mapping size. These are: 1. max_physmem_end - end of physical memory online or standby. Always <= end of the last online memory block (get_mem_detect_end()). 2. CONFIG_MAX_PHYSMEM_BITS - the maximum size of physical memory the kernel is able to support. 3. "mem=" kernel command line option which limits physical memory usage. 4. OLDMEM_BASE which is a kdump memory limit when the kernel is executed as crash kernel. 5. "hsa" size which is a memory limit when the kernel is executed during zfcp/nvme dump. Through out kernel startup and run we juggle all those values at once but that does not bring any amusement, only confusion and complexity. Unify all those values to a single one we should really care, that is our identity mapping size. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-11-09s390/kasan: move memory needs estimation into a functionVasily Gorbik1-22/+8
Also correct rounding downs in estimation calculations. Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-09-29s390/kaslr: correct and explain randomization base generationVasily Gorbik1-38/+92
Currently there are several minor problems with randomization base generation code: 1. It might misbehave in low memory conditions. In particular there might be enough space for the kernel on [0, block_sum] but after if (base < safe_addr) base = safe_addr; it might not be enough anymore. 2. It does not correctly handle minimal address constraint. In condition if (base < safe_addr) base = safe_addr; a synthetic value is compared with an address. If we have a memory setup with memory holes due to offline memory regions, and safe_addr is close to the end of the first online memory block - we might position the kernel in invalid memory. 3. block_sum calculation logic contains off-by-one error. Let's say we have a memory block in which the kernel fits perfectly (end - start == kernel_size). In this case: if (end - start < kernel_size) continue; block_sum += end - start - kernel_size; block_sum is not increased, while it is a valid kernel position. So, address problems listed and explain algorithm used. Besides that restructuring the code makes it possible to extend kernel positioning algorithm further. Currently we pick position in between single [min, max] range (min = safe_addr, max = memory_limit). In future we can do that for multiple ranges as well (by calling count_valid_kernel_positions for each range). Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-09-29s390/kaslr: avoid mixing valid random value and an error codeVasily Gorbik1-5/+5
0 is a valid random value. To avoid mixing it with error code 0 as an return code make get_random() take extra argument to output random value and return an error code. Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-06-09mm: reorder includes after introduction of linux/pgtable.hMike Rapoport1-1/+1
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include of the latter in the middle of asm includes. Fix this up with the aid of the below script and manual adjustments here and there. import sys import re if len(sys.argv) is not 3: print "USAGE: %s <file> <header>" % (sys.argv[0]) sys.exit(1) hdr_to_move="#include <linux/%s>" % sys.argv[2] moved = False in_hdrs = False with open(sys.argv[1], "r") as f: lines = f.readlines() for _line in lines: line = _line.rstrip(' ') if line == hdr_to_move: continue if line.startswith("#include <linux/"): in_hdrs = True elif not moved and in_hdrs: moved = True print hdr_to_move print line Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09mm: introduce include/linux/pgtable.hMike Rapoport1-1/+1
The include/linux/pgtable.h is going to be the home of generic page table manipulation functions. Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and make the latter include asm/pgtable.h. Signed-off-by: Mike Rapoport <rppt@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Cain <bcain@codeaurora.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Chris Zankel <chris@zankel.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Greentime Hu <green.hu@gmail.com> Cc: Greg Ungerer <gerg@linux-m68k.org> Cc: Guan Xuetao <gxt@pku.edu.cn> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ley Foon Tan <ley.foon.tan@intel.com> Cc: Mark Salter <msalter@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Nick Hu <nickhu@andestech.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: Vincent Chen <deanbo422@gmail.com> Cc: Vineet Gupta <vgupta@synopsys.com> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-02-17s390/kaslr: Fix casts in get_randomNathan Chancellor1-1/+1
Clang warns: ../arch/s390/boot/kaslr.c:78:25: warning: passing 'char *' to parameter of type 'const u8 *' (aka 'const unsigned char *') converts between pointers to integer types with different sign [-Wpointer-sign] (char *) entropy, (char *) entropy, ^~~~~~~~~~~~~~~~ ../arch/s390/include/asm/cpacf.h:280:28: note: passing argument to parameter 'src' here u8 *dest, const u8 *src, long src_len) ^ 2 warnings generated. Fix the cast to match what else is done in this function. Fixes: b2d24b97b2a9 ("s390/kernel: add support for kernel address space layout randomization (KASLR)") Link: https://github.com/ClangBuiltLinux/linux/issues/862 Link: https://lkml.kernel.org/r/20200208141052.48476-1-natechancellor@gmail.com Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-09-18Merge tag 's390-5.4-1' of ↵Linus Torvalds1-8/+33
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Vasily Gorbik: - Add support for IBM z15 machines. - Add SHA3 and CCA AES cipher key support in zcrypt and pkey refactoring. - Move to arch_stack_walk infrastructure for the stack unwinder. - Various kasan fixes and improvements. - Various command line parsing fixes. - Improve decompressor phase debuggability. - Lift no bss usage restriction for the early code. - Use refcount_t for reference counters for couple of places in mm code. - Logging improvements and return code fix in vfio-ccw code. - Couple of zpci fixes and minor refactoring. - Remove some outdated documentation. - Fix secure boot detection. - Other various minor code clean ups. * tag 's390-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (48 commits) s390: remove pointless drivers-y in drivers/s390/Makefile s390/cpum_sf: Fix line length and format string s390/pci: fix MSI message data s390: add support for IBM z15 machines s390/crypto: Support for SHA3 via CPACF (MSA6) s390/startup: add pgm check info printing s390/crypto: xts-aes-s390 fix extra run-time crypto self tests finding vfio-ccw: fix error return code in vfio_ccw_sch_init() s390: vfio-ap: fix warning reset not completed s390/base: remove unused s390_base_mcck_handler s390/sclp: Fix bit checked for has_sipl s390/zcrypt: fix wrong handling of cca cipher keygenflags s390/kasan: add kdump support s390/setup: avoid using strncmp with hardcoded length s390/sclp: avoid using strncmp with hardcoded length s390/module: avoid using strncmp with hardcoded length s390/pci: avoid using strncmp with hardcoded length s390/kaslr: reserve memory for kasan usage s390/mem_detect: provide single get_mem_detect_end s390/cmma: reuse kstrtobool for option value parsing ...
2019-08-26s390/kaslr: reserve memory for kasan usageVasily Gorbik1-8/+33
Sometimes the kernel fails to boot with: "The Linux kernel failed to boot with the KernelAddressSanitizer: out of memory during initialisation" even with big amounts of memory when both kaslr and kasan are enabled. The problem is that kasan initialization code requires 1/8 of physical memory plus some for page tables. To keep as much code instrumented as possible kasan avoids using memblock for memory allocations. Instead kasan uses trivial memory allocator which simply chops off the memory from the end of online physical memory. For that reason when kaslr is enabled together with kasan avoid positioning kernel into upper memory region which would be utilized during kasan initialization. Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-29s390/boot: add missing declarations and includesVasily Gorbik1-0/+1
Add __swsusp_reset_dma declaration to avoid the following sparse warnings: arch/s390/kernel/setup.c:107:15: warning: symbol '__swsusp_reset_dma' was not declared. Should it be static? arch/s390/boot/startup.c:52:15: warning: symbol '__swsusp_reset_dma' was not declared. Should it be static? Add verify_facilities declaration to avoid the following sparse warning: arch/s390/boot/als.c:105:6: warning: symbol 'verify_facilities' was not declared. Should it be static? Include "boot.h" into arch/s390/boot/kaslr.c to expose get_random_base function declaration and avoid the following sparse warning: arch/s390/boot/kaslr.c:90:15: warning: symbol 'get_random_base' was not declared. Should it be static? Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-04-29s390/kernel: add support for kernel address space layout randomization (KASLR)Gerald Schaefer1-0/+144
This patch adds support for relocating the kernel to a random address. The random kernel offset is obtained from cpacf, using either TRNG, PRNO, or KMC_PRNG, depending on supported MSA level. KERNELOFFSET is added to vmcoreinfo, for crash --kaslr support. Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> Reviewed-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>