Diffstat (limited to 'Documentation/filesystems')
-rw-r--r-- Documentation/filesystems/api-summary.rst | 3
-rw-r--r-- Documentation/filesystems/ext4/journal.rst | 6
-rw-r--r-- Documentation/filesystems/ext4/super.rst | 7
-rw-r--r-- Documentation/filesystems/files.rst | 8
-rw-r--r-- Documentation/filesystems/fsverity.rst | 68
-rw-r--r-- Documentation/filesystems/index.rst | 2
-rw-r--r-- Documentation/filesystems/journalling.rst | 6
-rw-r--r-- Documentation/filesystems/mount_api.rst | 4
-rw-r--r-- Documentation/filesystems/nfs/exporting.rst | 52
-rw-r--r-- Documentation/filesystems/proc.rst | 3
-rw-r--r-- Documentation/filesystems/tmpfs.rst | 8
11 files changed, 113 insertions, 54 deletions
diff --git a/Documentation/filesystems/api-summary.rst b/Documentation/filesystems/api-summary.rst
index bbb0c1c0e5cf..a94f17d9b836 100644
--- a/Documentation/filesystems/api-summary.rst
+++ b/Documentation/filesystems/api-summary.rst
@@ -86,9 +86,6 @@ Other Functions
.. kernel-doc:: fs/dax.c
:export:
-.. kernel-doc:: fs/direct-io.c
- :export:
-
.. kernel-doc:: fs/libfs.c
:export:
diff --git a/Documentation/filesystems/ext4/journal.rst b/Documentation/filesystems/ext4/journal.rst
index 805a1e9ea3a5..849d5b119eb8 100644
--- a/Documentation/filesystems/ext4/journal.rst
+++ b/Documentation/filesystems/ext4/journal.rst
@@ -256,6 +256,10 @@ which is 1024 bytes long:
- s\_padding2
-
* - 0x54
+ - \_\_be32
+ - s\_num\_fc\_blocks
+ - Number of fast commit blocks in the journal.
+ * - 0x58
- \_\_u32
- s\_padding[42]
-
@@ -310,6 +314,8 @@ The journal incompat features are any combination of the following:
- This journal uses v3 of the checksum on-disk format. This is the same as
v2, but the journal block tag size is fixed regardless of the size of
block numbers. (JBD2\_FEATURE\_INCOMPAT\_CSUM\_V3)
+ * - 0x20
+ - Journal has fast commit blocks. (JBD2\_FEATURE\_INCOMPAT\_FAST\_COMMIT)
.. _jbd2_checksum_type:
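
Reviewer's sketch (not part of the patch): the journal.rst hunks above add a big-endian ``s_num_fc_blocks`` field at offset 0x54 and an incompat flag 0x20. The snippet below shows one way a userspace tool might read both from a raw 1024-byte journal superblock. The 0x28 offset of the incompat feature word is not shown in these hunks and is an assumption here, as are the helper name and the synthetic test buffer.

```python
import struct

JBD2_FEATURE_INCOMPAT_FAST_COMMIT = 0x20  # flag value from the hunk above
S_FEATURE_INCOMPAT_OFFSET = 0x28          # assumption: not shown in this diff
S_NUM_FC_BLOCKS_OFFSET = 0x54             # offset from the hunk above
JOURNAL_SUPERBLOCK_SIZE = 1024            # "which is 1024 bytes long"


def fast_commit_info(superblock: bytes):
    """Return (fast_commit_flag_set, s_num_fc_blocks) for a raw journal superblock.

    Hypothetical helper for illustration; journal fields are big-endian.
    """
    if len(superblock) < JOURNAL_SUPERBLOCK_SIZE:
        raise ValueError("journal superblock must be at least 1024 bytes")
    (incompat,) = struct.unpack_from(">I", superblock, S_FEATURE_INCOMPAT_OFFSET)
    (num_fc_blocks,) = struct.unpack_from(">I", superblock, S_NUM_FC_BLOCKS_OFFSET)
    return bool(incompat & JBD2_FEATURE_INCOMPAT_FAST_COMMIT), num_fc_blocks


# Synthetic superblock used only to exercise the parser.
_sb = bytearray(JOURNAL_SUPERBLOCK_SIZE)
struct.pack_into(">I", _sb, S_FEATURE_INCOMPAT_OFFSET, 0x20)
struct.pack_into(">I", _sb, S_NUM_FC_BLOCKS_OFFSET, 256)
example_result = fast_commit_info(bytes(_sb))
```

Reading the flag and the block count together mirrors the documented relationship: the block count is only meaningful to a reader that also understands the fast commit incompat feature.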
diff --git a/Documentation/filesystems/ext4/super.rst b/Documentation/filesystems/ext4/super.rst
index 93e55d7c1d40..2eb1ab20498d 100644
--- a/Documentation/filesystems/ext4/super.rst
+++ b/Documentation/filesystems/ext4/super.rst
@@ -596,6 +596,13 @@ following:
- Sparse Super Block, v2. If this flag is set, the SB field s\_backup\_bgs
points to the two block groups that contain backup superblocks
(COMPAT\_SPARSE\_SUPER2).
+ * - 0x400
+ - Fast commits supported. Although fast commits blocks are
+ backward incompatible, fast commit blocks are not always
+ present in the journal. If fast commit blocks are present in
+ the journal, JBD2 incompat feature
+ (JBD2\_FEATURE\_INCOMPAT\_FAST\_COMMIT) gets
+ set (COMPAT\_FAST\_COMMIT).
.. _super_incompat:
diff --git a/Documentation/filesystems/files.rst b/Documentation/filesystems/files.rst
index cbf8e57376bf..bcf84459917f 100644
--- a/Documentation/filesystems/files.rst
+++ b/Documentation/filesystems/files.rst
@@ -62,7 +62,7 @@ the fdtable structure -
be held.
4. To look up the file structure given an fd, a reader
- must use either fcheck() or fcheck_files() APIs. These
+ must use either lookup_fd_rcu() or files_lookup_fd_rcu() APIs. These
take care of barrier requirements due to lock-free lookup.
An example::
@@ -70,7 +70,7 @@ the fdtable structure -
struct file *file;
rcu_read_lock();
- file = fcheck(fd);
+ file = lookup_fd_rcu(fd);
if (file) {
...
}
@@ -84,7 +84,7 @@ the fdtable structure -
on ->f_count::
rcu_read_lock();
- file = fcheck_files(files, fd);
+ file = files_lookup_fd_rcu(files, fd);
if (file) {
if (atomic_long_inc_not_zero(&file->f_count))
*fput_needed = 1;
@@ -104,7 +104,7 @@ the fdtable structure -
lock-free, they must be installed using rcu_assign_pointer()
API. If they are looked up lock-free, rcu_dereference()
must be used. However it is advisable to use files_fdtable()
- and fcheck()/fcheck_files() which take care of these issues.
+ and lookup_fd_rcu()/files_lookup_fd_rcu() which take care of these issues.
7. While updating, the fdtable pointer must be looked up while
holding files->file_lock. If ->file_lock is dropped, then
diff --git a/Documentation/filesystems/fsverity.rst b/Documentation/filesystems/fsverity.rst
index 895e9711ed88..e0204a23e997 100644
--- a/Documentation/filesystems/fsverity.rst
+++ b/Documentation/filesystems/fsverity.rst
@@ -27,9 +27,9 @@ automatically verified against the file's Merkle tree. Reads of any
corrupted data, including mmap reads, will fail.
Userspace can use another ioctl to retrieve the root hash (actually
-the "file measurement", which is a hash that includes the root hash)
-that fs-verity is enforcing for the file. This ioctl executes in
-constant time, regardless of the file size.
+the "fs-verity file digest", which is a hash that includes the Merkle
+tree root hash) that fs-verity is enforcing for the file. This ioctl
+executes in constant time, regardless of the file size.
fs-verity is essentially a way to hash a file in constant time,
subject to the caveat that reads which would violate the hash will
@@ -177,9 +177,10 @@ FS_IOC_ENABLE_VERITY can fail with the following errors:
FS_IOC_MEASURE_VERITY
---------------------
-The FS_IOC_MEASURE_VERITY ioctl retrieves the measurement of a verity
-file. The file measurement is a digest that cryptographically
-identifies the file contents that are being enforced on reads.
+The FS_IOC_MEASURE_VERITY ioctl retrieves the digest of a verity file.
+The fs-verity file digest is a cryptographic digest that identifies
+the file contents that are being enforced on reads; it is computed via
+a Merkle tree and is different from a traditional full-file digest.
This ioctl takes in a pointer to a variable-length structure::
@@ -197,7 +198,7 @@ On success, 0 is returned and the kernel fills in the structure as
follows:
- ``digest_algorithm`` will be the hash algorithm used for the file
- measurement. It will match ``fsverity_enable_arg::hash_algorithm``.
+ digest. It will match ``fsverity_enable_arg::hash_algorithm``.
- ``digest_size`` will be the size of the digest in bytes, e.g. 32
for SHA-256. (This can be redundant with ``digest_algorithm``.)
- ``digest`` will be the actual bytes of the digest.
@@ -257,25 +258,24 @@ non-verity one, with the following exceptions:
with EIO (for read()) or SIGBUS (for mmap() reads).
- If the sysctl "fs.verity.require_signatures" is set to 1 and the
- file's verity measurement is not signed by a key in the fs-verity
- keyring, then opening the file will fail. See `Built-in signature
- verification`_.
+ file is not signed by a key in the fs-verity keyring, then opening
+ the file will fail. See `Built-in signature verification`_.
Direct access to the Merkle tree is not supported. Therefore, if a
verity file is copied, or is backed up and restored, then it will lose
its "verity"-ness. fs-verity is primarily meant for files like
executables that are managed by a package manager.
-File measurement computation
-============================
+File digest computation
+=======================
This section describes how fs-verity hashes the file contents using a
-Merkle tree to produce the "file measurement" which cryptographically
-identifies the file contents. This algorithm is the same for all
-filesystems that support fs-verity.
+Merkle tree to produce the digest which cryptographically identifies
+the file contents. This algorithm is the same for all filesystems
+that support fs-verity.
Userspace only needs to be aware of this algorithm if it needs to
-compute the file measurement itself, e.g. in order to sign the file.
+compute fs-verity file digests itself, e.g. in order to sign files.
.. _fsverity_merkle_tree:
@@ -325,26 +325,22 @@ can't a distinguish a large file from a small second file whose data
is exactly the top-level hash block of the first file. Ambiguities
also arise from the convention of padding to the next block boundary.
-To solve this problem, the verity file measurement is actually
-computed as a hash of the following structure, which contains the
-Merkle tree root hash as well as other fields such as the file size::
+To solve this problem, the fs-verity file digest is actually computed
+as a hash of the following structure, which contains the Merkle tree
+root hash as well as other fields such as the file size::
struct fsverity_descriptor {
__u8 version; /* must be 1 */
__u8 hash_algorithm; /* Merkle tree hash algorithm */
__u8 log_blocksize; /* log2 of size of data and tree blocks */
__u8 salt_size; /* size of salt in bytes; 0 if none */
- __le32 sig_size; /* must be 0 */
+ __le32 __reserved_0x04; /* must be 0 */
__le64 data_size; /* size of file the Merkle tree is built over */
__u8 root_hash[64]; /* Merkle tree root hash */
__u8 salt[32]; /* salt prepended to each hashed block */
__u8 __reserved[144]; /* must be 0's */
};
-Note that the ``sig_size`` field must be set to 0 for the purpose of
-computing the file measurement, even if a signature was provided (or
-will be provided) to `FS_IOC_ENABLE_VERITY`_.
-
Built-in signature verification
===============================
@@ -359,20 +355,20 @@ kernel. Specifically, it adds support for:
certificates from being added.
2. `FS_IOC_ENABLE_VERITY`_ accepts a pointer to a PKCS#7 formatted
- detached signature in DER format of the file measurement. On
- success, this signature is persisted alongside the Merkle tree.
+ detached signature in DER format of the file's fs-verity digest.
+ On success, this signature is persisted alongside the Merkle tree.
Then, any time the file is opened, the kernel will verify the
- file's actual measurement against this signature, using the
- certificates in the ".fs-verity" keyring.
+ file's actual digest against this signature, using the certificates
+ in the ".fs-verity" keyring.
3. A new sysctl "fs.verity.require_signatures" is made available.
When set to 1, the kernel requires that all verity files have a
- correctly signed file measurement as described in (2).
+ correctly signed digest as described in (2).
-File measurements must be signed in the following format, which is
-similar to the structure used by `FS_IOC_MEASURE_VERITY`_::
+fs-verity file digests must be signed in the following format, which
+is similar to the structure used by `FS_IOC_MEASURE_VERITY`_::
- struct fsverity_signed_digest {
+ struct fsverity_formatted_digest {
char magic[8]; /* must be "FSVerity" */
__le16 digest_algorithm;
__le16 digest_size;
@@ -421,8 +417,8 @@ can only be set by `FS_IOC_ENABLE_VERITY`_, and it cannot be cleared.
ext4 also supports encryption, which can be used simultaneously with
fs-verity. In this case, the plaintext data is verified rather than
-the ciphertext. This is necessary in order to make the file
-measurement meaningful, since every file is encrypted differently.
+the ciphertext. This is necessary in order to make the fs-verity file
+digest meaningful, since every file is encrypted differently.
ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
past the end of the file, starting at the first 64K boundary beyond
@@ -592,8 +588,8 @@ weren't already directly answered in other parts of this document.
:Q: Isn't fs-verity useless because the attacker can just modify the
hashes in the Merkle tree, which is stored on-disk?
:A: To verify the authenticity of an fs-verity file you must verify
- the authenticity of the "file measurement", which is basically the
- root hash of the Merkle tree. See `Use cases`_.
+ the authenticity of the "fs-verity file digest", which
+ incorporates the root hash of the Merkle tree. See `Use cases`_.
:Q: Isn't fs-verity useless because the attacker can just replace a
verity file with a non-verity one?
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index 98f59a864242..7be9b46d85d9 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -113,7 +113,7 @@ Documentation for filesystem implementations.
sysv-fs
tmpfs
ubifs
- ubifs-authentication.rst
+ ubifs-authentication
udf
virtiofs
vfat
diff --git a/Documentation/filesystems/journalling.rst b/Documentation/filesystems/journalling.rst
index 5a5f70b4063e..e18f90ffc6fd 100644
--- a/Documentation/filesystems/journalling.rst
+++ b/Documentation/filesystems/journalling.rst
@@ -136,10 +136,8 @@ Fast commits
~~~~~~~~~~~~
JBD2 to also allows you to perform file-system specific delta commits known as
-fast commits. In order to use fast commits, you first need to call
-:c:func:`jbd2_fc_init` and tell how many blocks at the end of journal
-area should be reserved for fast commits. Along with that, you will also need
-to set following callbacks that perform correspodning work:
+fast commits. In order to use fast commits, you will need to set following
+callbacks that perform correspodning work:
`journal->j_fc_cleanup_cb`: Cleanup function called after every full commit and
fast commit.
diff --git a/Documentation/filesystems/mount_api.rst b/Documentation/filesystems/mount_api.rst
index d7f53d62b5bb..eb358a00be27 100644
--- a/Documentation/filesystems/mount_api.rst
+++ b/Documentation/filesystems/mount_api.rst
@@ -774,7 +774,7 @@ process the parameters it is given.
should just be set to lie inside the low-to-high range.
If all is good, true is returned. If the table is invalid, errors are
- logged to dmesg and false is returned.
+ logged to the kernel log buffer and false is returned.
* ::
@@ -782,7 +782,7 @@ process the parameters it is given.
This performs some validation checks on a parameter description. It
returns true if the description is good and false if it is not. It will
- log errors to dmesg if validation fails.
+ log errors to the kernel log buffer if validation fails.
* ::
diff --git a/Documentation/filesystems/nfs/exporting.rst b/Documentation/filesystems/nfs/exporting.rst
index 33d588a01ace..0e98edd353b5 100644
--- a/Documentation/filesystems/nfs/exporting.rst
+++ b/Documentation/filesystems/nfs/exporting.rst
@@ -154,6 +154,11 @@ struct which has the following members:
to find potential names, and matches inode numbers to find the correct
match.
+ flags
+ Some filesystems may need to be handled differently than others. The
+ export_operations struct also includes a flags field that allows the
+ filesystem to communicate such information to nfsd. See the Export
+ Operations Flags section below for more explanation.
A filehandle fragment consists of an array of 1 or more 4byte words,
together with a one byte "type".
@@ -163,3 +168,50 @@ generated by encode_fh, in which case it will have been padded with
nuls. Rather, the encode_fh routine should choose a "type" which
indicates the decode_fh how much of the filehandle is valid, and how
it should be interpreted.
+
+Export Operations Flags
+-----------------------
+In addition to the operation vector pointers, struct export_operations also
+contains a "flags" field that allows the filesystem to communicate to nfsd
+that it may want to do things differently when dealing with it. The
+following flags are defined:
+
+ EXPORT_OP_NOWCC - disable NFSv3 WCC attributes on this filesystem
+ RFC 1813 recommends that servers always send weak cache consistency
+ (WCC) data to the client after each operation. The server should
+ atomically collect attributes about the inode, do an operation on it,
+ and then collect the attributes afterward. This allows the client to
+ skip issuing GETATTRs in some situations but means that the server
+ is calling vfs_getattr for almost all RPCs. On some filesystems
+ (particularly those that are clustered or networked) this is expensive
+ and atomicity is difficult to guarantee. This flag indicates to nfsd
+ that it should skip providing WCC attributes to the client in NFSv3
+ replies when doing operations on this filesystem. Consider enabling
+ this on filesystems that have an expensive ->getattr inode operation,
+ or when atomicity between pre and post operation attribute collection
+ is impossible to guarantee.
+
+ EXPORT_OP_NOSUBTREECHK - disallow subtree checking on this fs
+ Many NFS operations deal with filehandles, which the server must then
+ vet to ensure that they live inside of an exported tree. When the
+ export consists of an entire filesystem, this is trivial. nfsd can just
+ ensure that the filehandle live on the filesystem. When only part of a
+ filesystem is exported however, then nfsd must walk the ancestors of the
+ inode to ensure that it's within an exported subtree. This is an
+ expensive operation and not all filesystems can support it properly.
+ This flag exempts the filesystem from subtree checking and causes
+ exportfs to get back an error if it tries to enable subtree checking
+ on it.
+
+ EXPORT_OP_CLOSE_BEFORE_UNLINK - always close cached files before unlinking
+ On some exportable filesystems (such as NFS) unlinking a file that
+ is still open can cause a fair bit of extra work. For instance,
+ the NFS client will do a "sillyrename" to ensure that the file
+ sticks around while it's still open. When reexporting, that open
+ file is held by nfsd so we usually end up doing a sillyrename, and
+ then immediately deleting the sillyrenamed file just afterward when
+ the link count actually goes to zero. Sometimes this delete can race
+ with other operations (for instance an rmdir of the parent directory).
+ This flag causes nfsd to close any open files for this inode _before_
+ calling into the vfs to do an unlink or a rename that would replace
+ an existing file.
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index 533c79e8d2cd..2fa69f710e2a 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -210,6 +210,7 @@ read the file /proc/PID/status::
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
+ SpeculationIndirectBranch: conditional enabled
voluntary_ctxt_switches: 0
nonvoluntary_ctxt_switches: 1
@@ -292,6 +293,7 @@ It's slow but very precise.
NoNewPrivs no_new_privs, like prctl(PR_GET_NO_NEW_PRIV, ...)
Seccomp seccomp mode, like prctl(PR_GET_SECCOMP, ...)
Speculation_Store_Bypass speculative store bypass mitigation status
+ SpeculationIndirectBranch indirect branch speculation mode
Cpus_allowed mask of CPUs on which this process may run
Cpus_allowed_list Same as previous, but in "list format"
Mems_allowed mask of memory nodes allowed to this process
@@ -546,6 +548,7 @@ encoded manner. The codes are the following:
nh no huge page advise flag
mg mergable advise flag
bt arm64 BTI guarded page
+ mt arm64 MTE allocation tags are enabled
== =======================================
Note that there is no guarantee that every flag and associated mnemonic will
diff --git a/Documentation/filesystems/tmpfs.rst b/Documentation/filesystems/tmpfs.rst
index c44f8b1d3cab..0408c245785e 100644
--- a/Documentation/filesystems/tmpfs.rst
+++ b/Documentation/filesystems/tmpfs.rst
@@ -4,7 +4,7 @@
Tmpfs
=====
-Tmpfs is a file system which keeps all files in virtual memory.
+Tmpfs is a file system which keeps all of its files in virtual memory.
Everything in tmpfs is temporary in the sense that no files will be
@@ -35,7 +35,7 @@ tmpfs has the following uses:
memory.
This mount does not depend on CONFIG_TMPFS. If CONFIG_TMPFS is not
- set, the user visible part of tmpfs is not build. But the internal
+ set, the user visible part of tmpfs is not built. But the internal
mechanisms are always present.
2) glibc 2.2 and above expects tmpfs to be mounted at /dev/shm for
@@ -50,7 +50,7 @@ tmpfs has the following uses:
This mount is _not_ needed for SYSV shared memory. The internal
mount is used for that. (In the 2.3 kernel versions it was
necessary to mount the predecessor of tmpfs (shm fs) to use SYSV
- shared memory)
+ shared memory.)
3) Some people (including me) find it very convenient to mount it
e.g. on /tmp and /var/tmp and have a big swap partition. And now
@@ -83,7 +83,7 @@ If nr_blocks=0 (or size=0), blocks will not be limited in that instance;
if nr_inodes=0, inodes will not be limited. It is generally unwise to
mount with such options, since it allows any user with write access to
use up all the memory on the machine; but enhances the scalability of
-that instance in a system with many cpus making intensive use of it.
+that instance in a system with many CPUs making intensive use of it.
tmpfs has a mount option to set the NUMA memory allocation policy for