diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2023-02-20 23:33:41 +0300 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2023-02-20 23:33:41 +0300 |
commit | 6639c3ce7fd217c22b26aa9f2a3cb69dc19221f8 (patch) | |
tree | 743eadc88bc0422c227484805f97d2b23b21fb3b /Documentation/filesystems | |
parent | f18f9845f2f10d3d1fc63e4ad16ee52d2d9292fa (diff) | |
parent | 51e4e3153ebc32d3280d5d17418ae6f1a44f1ec1 (diff) | |
download | linux-6639c3ce7fd217c22b26aa9f2a3cb69dc19221f8.tar.xz |
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull fsverity updates from Eric Biggers:
"Fix the longstanding implementation limitation that fsverity was only
supported when the Merkle tree block size, filesystem block size, and
PAGE_SIZE were all equal.
Specifically, add support for Merkle tree block sizes less than
PAGE_SIZE, and make ext4 support fsverity on filesystems where the
filesystem block size is less than PAGE_SIZE.
Effectively, this means that fsverity can now be used on systems with
non-4K pages, at least on ext4. These changes have been tested using
the verity group of xfstests, newly updated to cover the new code
paths.
Also update fs/verity/ to support verifying data from large folios.
There's also a similar patch for fs/crypto/, to support decrypting
data from large folios, which I'm including in here to avoid a merge
conflict between the fscrypt and fsverity branches"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fscrypt: support decrypting data from large folios
fsverity: support verifying data from large folios
fsverity.rst: update git repo URL for fsverity-utils
ext4: allow verity with fs block size < PAGE_SIZE
fs/buffer.c: support fsverity in block_read_full_folio()
f2fs: simplify f2fs_readpage_limit()
ext4: simplify ext4_readpage_limit()
fsverity: support enabling with tree block size < PAGE_SIZE
fsverity: support verification with tree block size < PAGE_SIZE
fsverity: replace fsverity_hash_page() with fsverity_hash_block()
fsverity: use EFBIG for file too large to enable verity
fsverity: store log2(digest_size) precomputed
fsverity: simplify Merkle tree readahead size calculation
fsverity: use unsigned long for level_start
fsverity: remove debug messages and CONFIG_FS_VERITY_DEBUG
fsverity: pass pos and size to ->write_merkle_tree_block
fsverity: optimize fsverity_cleanup_inode() on non-verity files
fsverity: optimize fsverity_prepare_setattr() on non-verity files
fsverity: optimize fsverity_file_open() on non-verity files
Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- | Documentation/filesystems/fscrypt.rst | 4 | ||||
-rw-r--r-- | Documentation/filesystems/fsverity.rst | 96 |
2 files changed, 49 insertions, 51 deletions
diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst index ef183387da20..eccd327e6df5 100644 --- a/Documentation/filesystems/fscrypt.rst +++ b/Documentation/filesystems/fscrypt.rst @@ -1277,8 +1277,8 @@ the file contents themselves, as described below: For the read path (->read_folio()) of regular files, filesystems can read the ciphertext into the page cache and decrypt it in-place. The -page lock must be held until decryption has finished, to prevent the -page from becoming visible to userspace prematurely. +folio lock must be held until decryption has finished, to prevent the +folio from becoming visible to userspace prematurely. For the write path (->writepage()) of regular files, filesystems cannot encrypt data in-place in the page cache, since the cached diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst index cb8e7573882a..ede672dedf11 100644 --- a/Documentation/filesystems/fsverity.rst +++ b/Documentation/filesystems/fsverity.rst @@ -118,10 +118,11 @@ as follows: - ``hash_algorithm`` must be the identifier for the hash algorithm to use for the Merkle tree, such as FS_VERITY_HASH_ALG_SHA256. See ``include/uapi/linux/fsverity.h`` for the list of possible values. -- ``block_size`` must be the Merkle tree block size. Currently, this - must be equal to the system page size, which is usually 4096 bytes. - Other sizes may be supported in the future. This value is not - necessarily the same as the filesystem block size. +- ``block_size`` is the Merkle tree block size, in bytes. In Linux + v6.3 and later, this can be any power of 2 between (inclusively) + 1024 and the minimum of the system page size and the filesystem + block size. In earlier versions, the page size was the only allowed + value. - ``salt_size`` is the size of the salt in bytes, or 0 if no salt is provided. The salt is a value that is prepended to every hashed block; it can be used to personalize the hashing for a particular @@ -161,6 +162,7 @@ FS_IOC_ENABLE_VERITY can fail with the following errors: - ``EBUSY``: this ioctl is already running on the file - ``EEXIST``: the file already has verity enabled - ``EFAULT``: the caller provided inaccessible memory +- ``EFBIG``: the file is too large to enable verity on - ``EINTR``: the operation was interrupted by a fatal signal - ``EINVAL``: unsupported version, hash algorithm, or block size; or reserved bits are set; or the file descriptor refers to neither a @@ -495,9 +497,11 @@ To create verity files on an ext4 filesystem, the filesystem must have been formatted with ``-O verity`` or had ``tune2fs -O verity`` run on it. "verity" is an RO_COMPAT filesystem feature, so once set, old kernels will only be able to mount the filesystem readonly, and old -versions of e2fsck will be unable to check the filesystem. Moreover, -currently ext4 only supports mounting a filesystem with the "verity" -feature when its block size is equal to PAGE_SIZE (often 4096 bytes). +versions of e2fsck will be unable to check the filesystem. + +Originally, an ext4 filesystem with the "verity" feature could only be +mounted when its block size was equal to the system page size +(typically 4096 bytes). In Linux v6.3, this limitation was removed. ext4 sets the EXT4_VERITY_FL on-disk inode flag on verity files. It can only be set by `FS_IOC_ENABLE_VERITY`_, and it cannot be cleared. @@ -518,9 +522,7 @@ support paging multi-gigabyte xattrs into memory, and to support encrypting xattrs. Note that the verity metadata *must* be encrypted when the file is, since it contains hashes of the plaintext data. -Currently, ext4 verity only supports the case where the Merkle tree -block size, filesystem block size, and page size are all the same. It -also only supports extent-based files. +ext4 only allows verity on extent-based files. f2fs ---- @@ -538,11 +540,10 @@ Like ext4, f2fs stores the verity metadata (Merkle tree and fsverity_descriptor) past the end of the file, starting at the first 64K boundary beyond i_size. See explanation for ext4 above. Moreover, f2fs supports at most 4096 bytes of xattr entries per inode -which wouldn't be enough for even a single Merkle tree block. +which usually wouldn't be enough for even a single Merkle tree block. -Currently, f2fs verity only supports a Merkle tree block size of 4096. -Also, f2fs doesn't support enabling verity on files that currently -have atomic or volatile writes pending. +f2fs doesn't support enabling verity on files that currently have +atomic or volatile writes pending. btrfs ----- @@ -567,51 +568,48 @@ Pagecache ~~~~~~~~~ For filesystems using Linux's pagecache, the ``->read_folio()`` and -``->readahead()`` methods must be modified to verify pages before they -are marked Uptodate. Merely hooking ``->read_iter()`` would be +``->readahead()`` methods must be modified to verify folios before +they are marked Uptodate. Merely hooking ``->read_iter()`` would be insufficient, since ``->read_iter()`` is not used for memory maps. -Therefore, fs/verity/ provides a function fsverity_verify_page() which -verifies a page that has been read into the pagecache of a verity -inode, but is still locked and not Uptodate, so it's not yet readable -by userspace. As needed to do the verification, -fsverity_verify_page() will call back into the filesystem to read -Merkle tree pages via fsverity_operations::read_merkle_tree_page(). +Therefore, fs/verity/ provides the function fsverity_verify_blocks() +which verifies data that has been read into the pagecache of a verity +inode. The containing folio must still be locked and not Uptodate, so +it's not yet readable by userspace. As needed to do the verification, +fsverity_verify_blocks() will call back into the filesystem to read +hash blocks via fsverity_operations::read_merkle_tree_page(). -fsverity_verify_page() returns false if verification failed; in this -case, the filesystem must not set the page Uptodate. Following this, +fsverity_verify_blocks() returns false if verification failed; in this +case, the filesystem must not set the folio Uptodate. Following this, as per the usual Linux pagecache behavior, attempts by userspace to -read() from the part of the file containing the page will fail with -EIO, and accesses to the page within a memory map will raise SIGBUS. - -fsverity_verify_page() currently only supports the case where the -Merkle tree block size is equal to PAGE_SIZE (often 4096 bytes). +read() from the part of the file containing the folio will fail with +EIO, and accesses to the folio within a memory map will raise SIGBUS. -In principle, fsverity_verify_page() verifies the entire path in the -Merkle tree from the data page to the root hash. However, for -efficiency the filesystem may cache the hash pages. Therefore, -fsverity_verify_page() only ascends the tree reading hash pages until -an already-verified hash page is seen, as indicated by the PageChecked -bit being set. It then verifies the path to that page. +In principle, verifying a data block requires verifying the entire +path in the Merkle tree from the data block to the root hash. +However, for efficiency the filesystem may cache the hash blocks. +Therefore, fsverity_verify_blocks() only ascends the tree reading hash +blocks until an already-verified hash block is seen. It then verifies +the path to that block. This optimization, which is also used by dm-verity, results in excellent sequential read performance. This is because usually (e.g. -127 in 128 times for 4K blocks and SHA-256) the hash page from the +127 in 128 times for 4K blocks and SHA-256) the hash block from the bottom level of the tree will already be cached and checked from -reading a previous data page. However, random reads perform worse. +reading a previous data block. However, random reads perform worse. Block device based filesystems ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Block device based filesystems (e.g. ext4 and f2fs) in Linux also use the pagecache, so the above subsection applies too. However, they -also usually read many pages from a file at once, grouped into a +also usually read many data blocks from a file at once, grouped into a structure called a "bio". To make it easier for these types of filesystems to support fs-verity, fs/verity/ also provides a function -fsverity_verify_bio() which verifies all pages in a bio. +fsverity_verify_bio() which verifies all data blocks in a bio. ext4 and f2fs also support encryption. If a verity file is also -encrypted, the pages must be decrypted before being verified. To +encrypted, the data must be decrypted before being verified. To support this, these filesystems allocate a "post-read context" for each bio and store it in ``->bi_private``:: @@ -626,14 +624,14 @@ each bio and store it in ``->bi_private``:: verity, or both is enabled. After the bio completes, for each needed postprocessing step the filesystem enqueues the bio_post_read_ctx on a workqueue, and then the workqueue work does the decryption or -verification. Finally, pages where no decryption or verity error -occurred are marked Uptodate, and the pages are unlocked. +verification. Finally, folios where no decryption or verity error +occurred are marked Uptodate, and the folios are unlocked. On many filesystems, files can contain holes. Normally, -``->readahead()`` simply zeroes holes and sets the corresponding pages -Uptodate; no bios are issued. To prevent this case from bypassing -fs-verity, these filesystems use fsverity_verify_page() to verify hole -pages. +``->readahead()`` simply zeroes hole blocks and considers the +corresponding data to be up-to-date; no bios are issued. To prevent +this case from bypassing fs-verity, filesystems use +fsverity_verify_blocks() to verify hole blocks. Filesystems also disable direct I/O on verity files, since otherwise direct I/O would bypass fs-verity. @@ -644,7 +642,7 @@ Userspace utility This document focuses on the kernel, but a userspace utility for fs-verity can be found at: - https://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/fsverity-utils.git + https://git.kernel.org/pub/scm/fs/fsverity/fsverity-utils.git See the README.md file in the fsverity-utils source tree for details, including examples of setting up fs-verity protected files. @@ -793,9 +791,9 @@ weren't already directly answered in other parts of this document. :A: There are many reasons why this is not possible or would be very difficult, including the following: - - To prevent bypassing verification, pages must not be marked + - To prevent bypassing verification, folios must not be marked Uptodate until they've been verified. Currently, each - filesystem is responsible for marking pages Uptodate via + filesystem is responsible for marking folios Uptodate via ``->readahead()``. Therefore, currently it's not possible for the VFS to do the verification on its own. Changing this would require significant changes to the VFS and all filesystems. |