summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
-rw-r--r--Documentation/filesystems/caching/fscache.rst565
-rw-r--r--Documentation/filesystems/caching/fscache.txt448
-rw-r--r--Documentation/filesystems/caching/index.rst1
-rw-r--r--fs/fscache/Kconfig8
4 files changed, 570 insertions, 452 deletions
diff --git a/Documentation/filesystems/caching/fscache.rst b/Documentation/filesystems/caching/fscache.rst
new file mode 100644
index 000000000000..950602147aa5
--- /dev/null
+++ b/Documentation/filesystems/caching/fscache.rst
@@ -0,0 +1,565 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+General Filesystem Caching
+==========================
+
+Overview
+========
+
+This facility is a general purpose cache for network filesystems, though it
+could be used for caching other things such as ISO9660 filesystems too.
+
+FS-Cache mediates between cache backends (such as CacheFS) and network
+filesystems::
+
+ +---------+
+ | | +--------------+
+ | NFS |--+ | |
+ | | | +-->| CacheFS |
+ +---------+ | +----------+ | | /dev/hda5 |
+ | | | | +--------------+
+ +---------+ +-->| | |
+ | | | |--+
+ | AFS |----->| FS-Cache |
+ | | | |--+
+ +---------+ +-->| | |
+ | | | | +--------------+
+ +---------+ | +----------+ | | |
+ | | | +-->| CacheFiles |
+ | ISOFS |--+ | /var/cache |
+ | | +--------------+
+ +---------+
+
+Or to look at it another way, FS-Cache is a module that provides a caching
+facility to a network filesystem such that the cache is transparent to the
+user::
+
+ +---------+
+ | |
+ | Server |
+ | |
+ +---------+
+ | NETWORK
+ ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ |
+ | +----------+
+ V | |
+ +---------+ | |
+ | | | |
+ | NFS |----->| FS-Cache |
+ | | | |--+
+ +---------+ | | | +--------------+ +--------------+
+ | | | | | | | |
+ V +----------+ +-->| CacheFiles |-->| Ext3 |
+ +---------+ | /var/cache | | /dev/sda6 |
+ | | +--------------+ +--------------+
+ | VFS | ^ ^
+ | | | |
+ +---------+ +--------------+ |
+ | KERNEL SPACE | |
+ ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~
+ | USER SPACE | |
+ V | |
+ +---------+ +--------------+
+ | | | |
+ | Process | | cachefilesd |
+ | | | |
+ +---------+ +--------------+
+
+
+FS-Cache does not follow the idea of completely loading every netfs file
+opened in its entirety into a cache before permitting it to be accessed and
+then serving the pages out of that cache rather than the netfs inode because:
+
+ (1) It must be practical to operate without a cache.
+
+ (2) The size of any accessible file must not be limited to the size of the
+ cache.
+
+ (3) The combined size of all opened files (this includes mapped libraries)
+ must not be limited to the size of the cache.
+
+ (4) The user should not be forced to download an entire file just to do a
+ one-off access of a small portion of it (such as might be done with the
+ "file" program).
+
+It instead serves the cache out in PAGE_SIZE chunks as and when requested by
+the netfs('s) using it.
+
+
+FS-Cache provides the following facilities:
+
+ (1) More than one cache can be used at once. Caches can be selected
+ explicitly by use of tags.
+
+ (2) Caches can be added / removed at any time.
+
+ (3) The netfs is provided with an interface that allows either party to
+ withdraw caching facilities from a file (required for (2)).
+
+ (4) The interface to the netfs returns as few errors as possible, preferring
+ rather to let the netfs remain oblivious.
+
+ (5) Cookies are used to represent indices, files and other objects to the
+ netfs. The simplest cookie is just a NULL pointer - indicating nothing
+ cached there.
+
+ (6) The netfs is allowed to propose - dynamically - any index hierarchy it
+ desires, though it must be aware that the index search function is
+ recursive, stack space is limited, and indices can only be children of
+ indices.
+
+ (7) Data I/O is done direct to and from the netfs's pages. The netfs
+ indicates that page A is at index B of the data-file represented by cookie
+ C, and that it should be read or written. The cache backend may or may
+ not start I/O on that page, but if it does, a netfs callback will be
+ invoked to indicate completion. The I/O may be either synchronous or
+ asynchronous.
+
+ (8) Cookies can be "retired" upon release. At this point FS-Cache will mark
+ them as obsolete and the index hierarchy rooted at that point will get
+ recycled.
+
+ (9) The netfs provides a "match" function for index searches. In addition to
+ saying whether a match was made or not, this can also specify that an
+ entry should be updated or deleted.
+
+(10) As much as possible is done asynchronously.
+
+
+FS-Cache maintains a virtual indexing tree in which all indices, files, objects
+and pages are kept. Bits of this tree may actually reside in one or more
+caches::
+
+ FSDEF
+ |
+ +------------------------------------+
+ | |
+ NFS AFS
+ | |
+ +--------------------------+ +-----------+
+ | | | |
+ homedir mirror afs.org redhat.com
+ | | |
+ +------------+ +---------------+ +----------+
+ | | | | | |
+ 00001 00002 00007 00125 vol00001 vol00002
+ | | | | |
+ +---+---+ +-----+ +---+ +------+------+ +-----+----+
+ | | | | | | | | | | | | |
+ PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak
+ | |
+ PG0 +-------+
+ | |
+ 00001 00003
+ |
+ +---+---+
+ | | |
+ PG0 PG1 PG2
+
+In the example above, you can see two netfs's being backed: NFS and AFS. These
+have different index hierarchies:
+
+ * The NFS primary index contains per-server indices. Each server index is
+ indexed by NFS file handles to get data file objects. Each data file
+ objects can have an array of pages, but may also have further child
+ objects, such as extended attributes and directory entries. Extended
+ attribute objects themselves have page-array contents.
+
+ * The AFS primary index contains per-cell indices. Each cell index contains
+ per-logical-volume indices. Each of volume index contains up to three
+ indices for the read-write, read-only and backup mirrors of those volumes.
+ Each of these contains vnode data file objects, each of which contains an
+ array of pages.
+
+The very top index is the FS-Cache master index in which individual netfs's
+have entries.
+
+Any index object may reside in more than one cache, provided it only has index
+children. Any index with non-index object children will be assumed to only
+reside in one cache.
+
+
+The netfs API to FS-Cache can be found in:
+
+ Documentation/filesystems/caching/netfs-api.txt
+
+The cache backend API to FS-Cache can be found in:
+
+ Documentation/filesystems/caching/backend-api.txt
+
+A description of the internal representations and object state machine can be
+found in:
+
+ Documentation/filesystems/caching/object.rst
+
+
+Statistical Information
+=======================
+
+If FS-Cache is compiled with the following options enabled::
+
+ CONFIG_FSCACHE_STATS=y
+ CONFIG_FSCACHE_HISTOGRAM=y
+
+then it will gather certain statistics and display them through a number of
+proc files.
+
+/proc/fs/fscache/stats
+----------------------
+
+ This shows counts of a number of events that can happen in FS-Cache:
+
++--------------+-------+-------------------------------------------------------+
+|CLASS |EVENT |MEANING |
++==============+=======+=======================================================+
+|Cookies |idx=N |Number of index cookies allocated |
++ +-------+-------------------------------------------------------+
+| |dat=N |Number of data storage cookies allocated |
++ +-------+-------------------------------------------------------+
+| |spc=N |Number of special cookies allocated |
++--------------+-------+-------------------------------------------------------+
+|Objects |alc=N |Number of objects allocated |
++ +-------+-------------------------------------------------------+
+| |nal=N |Number of object allocation failures |
++ +-------+-------------------------------------------------------+
+| |avl=N |Number of objects that reached the available state |
++ +-------+-------------------------------------------------------+
+| |ded=N |Number of objects that reached the dead state |
++--------------+-------+-------------------------------------------------------+
+|ChkAux |non=N |Number of objects that didn't have a coherency check |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of objects that passed a coherency check |
++ +-------+-------------------------------------------------------+
+| |upd=N |Number of objects that needed a coherency data update |
++ +-------+-------------------------------------------------------+
+| |obs=N |Number of objects that were declared obsolete |
++--------------+-------+-------------------------------------------------------+
+|Pages |mrk=N |Number of pages marked as being cached |
+| |unc=N |Number of uncache page requests seen |
++--------------+-------+-------------------------------------------------------+
+|Acquire |n=N |Number of acquire cookie requests seen |
++ +-------+-------------------------------------------------------+
+| |nul=N |Number of acq reqs given a NULL parent |
++ +-------+-------------------------------------------------------+
+| |noc=N |Number of acq reqs rejected due to no cache available |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of acq reqs succeeded |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of acq reqs rejected due to error |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of acq reqs failed on ENOMEM |
++--------------+-------+-------------------------------------------------------+
+|Lookups |n=N |Number of lookup calls made on cache backends |
++ +-------+-------------------------------------------------------+
+| |neg=N |Number of negative lookups made |
++ +-------+-------------------------------------------------------+
+| |pos=N |Number of positive lookups made |
++ +-------+-------------------------------------------------------+
+| |crt=N |Number of objects created by lookup |
++ +-------+-------------------------------------------------------+
+| |tmo=N |Number of lookups timed out and requeued |
++--------------+-------+-------------------------------------------------------+
+|Updates |n=N |Number of update cookie requests seen |
++ +-------+-------------------------------------------------------+
+| |nul=N |Number of upd reqs given a NULL parent |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of upd reqs granted CPU time |
++--------------+-------+-------------------------------------------------------+
+|Relinqs |n=N |Number of relinquish cookie requests seen |
++ +-------+-------------------------------------------------------+
+| |nul=N |Number of rlq reqs given a NULL parent |
++ +-------+-------------------------------------------------------+
+| |wcr=N |Number of rlq reqs waited on completion of creation |
++--------------+-------+-------------------------------------------------------+
+|AttrChg |n=N |Number of attribute changed requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of attr changed requests queued |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of attr changed rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of attr changed failed -ENOMEM |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of attr changed ops given CPU time |
++--------------+-------+-------------------------------------------------------+
+|Allocs |n=N |Number of allocation requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of successful alloc reqs |
++ +-------+-------------------------------------------------------+
+| |wt=N |Number of alloc reqs that waited on lookup completion |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of alloc reqs rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |int=N |Number of alloc reqs aborted -ERESTARTSYS |
++ +-------+-------------------------------------------------------+
+| |ops=N |Number of alloc reqs submitted |
++ +-------+-------------------------------------------------------+
+| |owt=N |Number of alloc reqs waited for CPU time |
++ +-------+-------------------------------------------------------+
+| |abt=N |Number of alloc reqs aborted due to object death |
++--------------+-------+-------------------------------------------------------+
+|Retrvls |n=N |Number of retrieval (read) requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of successful retr reqs |
++ +-------+-------------------------------------------------------+
+| |wt=N |Number of retr reqs that waited on lookup completion |
++ +-------+-------------------------------------------------------+
+| |nod=N |Number of retr reqs returned -ENODATA |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of retr reqs rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |int=N |Number of retr reqs aborted -ERESTARTSYS |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of retr reqs failed -ENOMEM |
++ +-------+-------------------------------------------------------+
+| |ops=N |Number of retr reqs submitted |
++ +-------+-------------------------------------------------------+
+| |owt=N |Number of retr reqs waited for CPU time |
++ +-------+-------------------------------------------------------+
+| |abt=N |Number of retr reqs aborted due to object death |
++--------------+-------+-------------------------------------------------------+
+|Stores |n=N |Number of storage (write) requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of successful store reqs |
++ +-------+-------------------------------------------------------+
+| |agn=N |Number of store reqs on a page already pending storage |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of store reqs rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of store reqs failed -ENOMEM |
++ +-------+-------------------------------------------------------+
+| |ops=N |Number of store reqs submitted |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of store reqs granted CPU time |
++ +-------+-------------------------------------------------------+
+| |pgs=N |Number of pages given store req processing time |
++ +-------+-------------------------------------------------------+
+| |rxd=N |Number of store reqs deleted from tracking tree |
++ +-------+-------------------------------------------------------+
+| |olm=N |Number of store reqs over store limit |
++--------------+-------+-------------------------------------------------------+
+|VmScan |nos=N |Number of release reqs against pages with no |
+| | |pending store |
++ +-------+-------------------------------------------------------+
+| |gon=N |Number of release reqs against pages stored by |
+| | |time lock granted |
++ +-------+-------------------------------------------------------+
+| |bsy=N |Number of release reqs ignored due to in-progress store|
++ +-------+-------------------------------------------------------+
+| |can=N |Number of page stores cancelled due to release req |
++--------------+-------+-------------------------------------------------------+
+|Ops |pend=N |Number of times async ops added to pending queues |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of times async ops given CPU time |
++ +-------+-------------------------------------------------------+
+| |enq=N |Number of times async ops queued for processing |
++ +-------+-------------------------------------------------------+
+| |can=N |Number of async ops cancelled |
++ +-------+-------------------------------------------------------+
+| |rej=N |Number of async ops rejected due to object |
+| | |lookup/create failure |
++ +-------+-------------------------------------------------------+
+| |ini=N |Number of async ops initialised |
++ +-------+-------------------------------------------------------+
+| |dfr=N |Number of async ops queued for deferred release |
++ +-------+-------------------------------------------------------+
+| |rel=N |Number of async ops released |
+| | |(should equal ini=N when idle) |
++ +-------+-------------------------------------------------------+
+| |gc=N |Number of deferred-release async ops garbage collected |
++--------------+-------+-------------------------------------------------------+
+|CacheOp |alo=N |Number of in-progress alloc_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |luo=N |Number of in-progress lookup_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |luc=N |Number of in-progress lookup_complete() cache ops |
++ +-------+-------------------------------------------------------+
+| |gro=N |Number of in-progress grab_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |upo=N |Number of in-progress update_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |dro=N |Number of in-progress drop_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |pto=N |Number of in-progress put_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |syn=N |Number of in-progress sync_cache() cache ops |
++ +-------+-------------------------------------------------------+
+| |atc=N |Number of in-progress attr_changed() cache ops |
++ +-------+-------------------------------------------------------+
+| |rap=N |Number of in-progress read_or_alloc_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |ras=N |Number of in-progress read_or_alloc_pages() cache ops |
++ +-------+-------------------------------------------------------+
+| |alp=N |Number of in-progress allocate_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |als=N |Number of in-progress allocate_pages() cache ops |
++ +-------+-------------------------------------------------------+
+| |wrp=N |Number of in-progress write_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |ucp=N |Number of in-progress uncache_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |dsp=N |Number of in-progress dissociate_pages() cache ops |
++--------------+-------+-------------------------------------------------------+
+|CacheEv |nsp=N |Number of object lookups/creations rejected due to |
+| | |lack of space |
++ +-------+-------------------------------------------------------+
+| |stl=N |Number of stale objects deleted |
++ +-------+-------------------------------------------------------+
+| |rtr=N |Number of objects retired when relinquished |
++ +-------+-------------------------------------------------------+
+| |cul=N |Number of objects culled |
++--------------+-------+-------------------------------------------------------+
+
+
+
+/proc/fs/fscache/histogram
+--------------------------
+
+ ::
+
+ cat /proc/fs/fscache/histogram
+ JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS
+ ===== ===== ========= ========= ========= ========= =========
+
+ This shows the breakdown of the number of times each amount of time
+ between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The
+ columns are as follows:
+
+ ========= =======================================================
+ COLUMN TIME MEASUREMENT
+ ========= =======================================================
+ OBJ INST Length of time to instantiate an object
+ OP RUNS Length of time a call to process an operation took
+ OBJ RUNS Length of time a call to process an object event took
+ RETRV DLY Time between an requesting a read and lookup completing
+ RETRIEVLS Time between beginning and end of a retrieval
+ ========= =======================================================
+
+ Each row shows the number of events that took a particular range of times.
+ Each step is 1 jiffy in size. The JIFS column indicates the particular
+ jiffy range covered, and the SECS field the equivalent number of seconds.
+
+
+
+Object List
+===========
+
+If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
+list of all the objects currently allocated and allow them to be viewed
+through::
+
+ /proc/fs/fscache/objects
+
+This will look something like::
+
+ [root@andromeda ~]# head /proc/fs/fscache/objects
+ OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
+ ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
+ 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
+ 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
+
+where the first set of columns before the '|' describe the object:
+
+ ======= ===============================================================
+ COLUMN DESCRIPTION
+ ======= ===============================================================
+ OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
+ PARENT Debugging ID of parent object
+ STAT Object state
+ CHLDN Number of child objects of this object
+ OPS Number of outstanding operations on this object
+ OOP Number of outstanding child object management operations
+ IPR
+ EX Number of outstanding exclusive operations
+ READS Number of outstanding read operations
+ EM Object's event mask
+ EV Events raised on this object
+ F Object flags
+ S Object work item busy state mask (1:pending 2:running)
+ ======= ===============================================================
+
+and the second set of columns describe the object's cookie, if present:
+
+ ================ ======================================================
+ COLUMN DESCRIPTION
+ ================ ======================================================
+ NETFS_COOKIE_DEF Name of netfs cookie definition
+ TY Cookie type (IX - index, DT - data, hex - special)
+ FL Cookie flags
+ NETFS_DATA Netfs private data stored in the cookie
+ OBJECT_KEY Object key } 1 column, with separating comma
+ AUX_DATA Object aux data } presence may be configured
+ ================ ======================================================
+
+The data shown may be filtered by attaching the a key to an appropriate keyring
+before viewing the file. Something like::
+
+ keyctl add user fscache:objlist <restrictions> @s
+
+where <restrictions> are a selection of the following letters:
+
+ == =========================================================
+ K Show hexdump of object key (don't show if not given)
+ A Show hexdump of object aux data (don't show if not given)
+ == =========================================================
+
+and the following paired letters:
+
+ == =========================================================
+ C Show objects that have a cookie
+ c Show objects that don't have a cookie
+ B Show objects that are busy
+ b Show objects that aren't busy
+ W Show objects that have pending writes
+ w Show objects that don't have pending writes
+ R Show objects that have outstanding reads
+ r Show objects that don't have outstanding reads
+ S Show objects that have work queued
+ s Show objects that don't have work queued
+ == =========================================================
+
+If neither side of a letter pair is given, then both are implied. For example:
+
+ keyctl add user fscache:objlist KB @s
+
+shows objects that are busy, and lists their object keys, but does not dump
+their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
+not implied.
+
+By default all objects and all fields will be shown.
+
+
+Debugging
+=========
+
+If CONFIG_FSCACHE_DEBUG is enabled, the FS-Cache facility can have runtime
+debugging enabled by adjusting the value in::
+
+ /sys/module/fscache/parameters/debug
+
+This is a bitmask of debugging streams to enable:
+
+ ======= ======= =============================== =======================
+ BIT VALUE STREAM POINT
+ ======= ======= =============================== =======================
+ 0 1 Cache management Function entry trace
+ 1 2 Function exit trace
+ 2 4 General
+ 3 8 Cookie management Function entry trace
+ 4 16 Function exit trace
+ 5 32 General
+ 6 64 Page handling Function entry trace
+ 7 128 Function exit trace
+ 8 256 General
+ 9 512 Operation management Function entry trace
+ 10 1024 Function exit trace
+ 11 2048 General
+ ======= ======= =============================== =======================
+
+The appropriate set of values should be OR'd together and the result written to
+the control file. For example::
+
+ echo $((1|8|64)) >/sys/module/fscache/parameters/debug
+
+will turn on all function entry debugging.
diff --git a/Documentation/filesystems/caching/fscache.txt b/Documentation/filesystems/caching/fscache.txt
deleted file mode 100644
index 071ff50a774d..000000000000
--- a/Documentation/filesystems/caching/fscache.txt
+++ /dev/null
@@ -1,448 +0,0 @@
- ==========================
- General Filesystem Caching
- ==========================
-
-========
-OVERVIEW
-========
-
-This facility is a general purpose cache for network filesystems, though it
-could be used for caching other things such as ISO9660 filesystems too.
-
-FS-Cache mediates between cache backends (such as CacheFS) and network
-filesystems:
-
- +---------+
- | | +--------------+
- | NFS |--+ | |
- | | | +-->| CacheFS |
- +---------+ | +----------+ | | /dev/hda5 |
- | | | | +--------------+
- +---------+ +-->| | |
- | | | |--+
- | AFS |----->| FS-Cache |
- | | | |--+
- +---------+ +-->| | |
- | | | | +--------------+
- +---------+ | +----------+ | | |
- | | | +-->| CacheFiles |
- | ISOFS |--+ | /var/cache |
- | | +--------------+
- +---------+
-
-Or to look at it another way, FS-Cache is a module that provides a caching
-facility to a network filesystem such that the cache is transparent to the
-user:
-
- +---------+
- | |
- | Server |
- | |
- +---------+
- | NETWORK
- ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- |
- | +----------+
- V | |
- +---------+ | |
- | | | |
- | NFS |----->| FS-Cache |
- | | | |--+
- +---------+ | | | +--------------+ +--------------+
- | | | | | | | |
- V +----------+ +-->| CacheFiles |-->| Ext3 |
- +---------+ | /var/cache | | /dev/sda6 |
- | | +--------------+ +--------------+
- | VFS | ^ ^
- | | | |
- +---------+ +--------------+ |
- | KERNEL SPACE | |
- ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~
- | USER SPACE | |
- V | |
- +---------+ +--------------+
- | | | |
- | Process | | cachefilesd |
- | | | |
- +---------+ +--------------+
-
-
-FS-Cache does not follow the idea of completely loading every netfs file
-opened in its entirety into a cache before permitting it to be accessed and
-then serving the pages out of that cache rather than the netfs inode because:
-
- (1) It must be practical to operate without a cache.
-
- (2) The size of any accessible file must not be limited to the size of the
- cache.
-
- (3) The combined size of all opened files (this includes mapped libraries)
- must not be limited to the size of the cache.
-
- (4) The user should not be forced to download an entire file just to do a
- one-off access of a small portion of it (such as might be done with the
- "file" program).
-
-It instead serves the cache out in PAGE_SIZE chunks as and when requested by
-the netfs('s) using it.
-
-
-FS-Cache provides the following facilities:
-
- (1) More than one cache can be used at once. Caches can be selected
- explicitly by use of tags.
-
- (2) Caches can be added / removed at any time.
-
- (3) The netfs is provided with an interface that allows either party to
- withdraw caching facilities from a file (required for (2)).
-
- (4) The interface to the netfs returns as few errors as possible, preferring
- rather to let the netfs remain oblivious.
-
- (5) Cookies are used to represent indices, files and other objects to the
- netfs. The simplest cookie is just a NULL pointer - indicating nothing
- cached there.
-
- (6) The netfs is allowed to propose - dynamically - any index hierarchy it
- desires, though it must be aware that the index search function is
- recursive, stack space is limited, and indices can only be children of
- indices.
-
- (7) Data I/O is done direct to and from the netfs's pages. The netfs
- indicates that page A is at index B of the data-file represented by cookie
- C, and that it should be read or written. The cache backend may or may
- not start I/O on that page, but if it does, a netfs callback will be
- invoked to indicate completion. The I/O may be either synchronous or
- asynchronous.
-
- (8) Cookies can be "retired" upon release. At this point FS-Cache will mark
- them as obsolete and the index hierarchy rooted at that point will get
- recycled.
-
- (9) The netfs provides a "match" function for index searches. In addition to
- saying whether a match was made or not, this can also specify that an
- entry should be updated or deleted.
-
-(10) As much as possible is done asynchronously.
-
-
-FS-Cache maintains a virtual indexing tree in which all indices, files, objects
-and pages are kept. Bits of this tree may actually reside in one or more
-caches.
-
- FSDEF
- |
- +------------------------------------+
- | |
- NFS AFS
- | |
- +--------------------------+ +-----------+
- | | | |
- homedir mirror afs.org redhat.com
- | | |
- +------------+ +---------------+ +----------+
- | | | | | |
- 00001 00002 00007 00125 vol00001 vol00002
- | | | | |
- +---+---+ +-----+ +---+ +------+------+ +-----+----+
- | | | | | | | | | | | | |
-PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak
- | |
- PG0 +-------+
- | |
- 00001 00003
- |
- +---+---+
- | | |
- PG0 PG1 PG2
-
-In the example above, you can see two netfs's being backed: NFS and AFS. These
-have different index hierarchies:
-
- (*) The NFS primary index contains per-server indices. Each server index is
- indexed by NFS file handles to get data file objects. Each data file
- objects can have an array of pages, but may also have further child
- objects, such as extended attributes and directory entries. Extended
- attribute objects themselves have page-array contents.
-
- (*) The AFS primary index contains per-cell indices. Each cell index contains
- per-logical-volume indices. Each of volume index contains up to three
- indices for the read-write, read-only and backup mirrors of those volumes.
- Each of these contains vnode data file objects, each of which contains an
- array of pages.
-
-The very top index is the FS-Cache master index in which individual netfs's
-have entries.
-
-Any index object may reside in more than one cache, provided it only has index
-children. Any index with non-index object children will be assumed to only
-reside in one cache.
-
-
-The netfs API to FS-Cache can be found in:
-
- Documentation/filesystems/caching/netfs-api.txt
-
-The cache backend API to FS-Cache can be found in:
-
- Documentation/filesystems/caching/backend-api.txt
-
-A description of the internal representations and object state machine can be
-found in:
-
- Documentation/filesystems/caching/object.rst
-
-
-=======================
-STATISTICAL INFORMATION
-=======================
-
-If FS-Cache is compiled with the following options enabled:
-
- CONFIG_FSCACHE_STATS=y
- CONFIG_FSCACHE_HISTOGRAM=y
-
-then it will gather certain statistics and display them through a number of
-proc files.
-
- (*) /proc/fs/fscache/stats
-
- This shows counts of a number of events that can happen in FS-Cache:
-
- CLASS EVENT MEANING
- ======= ======= =======================================================
- Cookies idx=N Number of index cookies allocated
- dat=N Number of data storage cookies allocated
- spc=N Number of special cookies allocated
- Objects alc=N Number of objects allocated
- nal=N Number of object allocation failures
- avl=N Number of objects that reached the available state
- ded=N Number of objects that reached the dead state
- ChkAux non=N Number of objects that didn't have a coherency check
- ok=N Number of objects that passed a coherency check
- upd=N Number of objects that needed a coherency data update
- obs=N Number of objects that were declared obsolete
- Pages mrk=N Number of pages marked as being cached
- unc=N Number of uncache page requests seen
- Acquire n=N Number of acquire cookie requests seen
- nul=N Number of acq reqs given a NULL parent
- noc=N Number of acq reqs rejected due to no cache available
- ok=N Number of acq reqs succeeded
- nbf=N Number of acq reqs rejected due to error
- oom=N Number of acq reqs failed on ENOMEM
- Lookups n=N Number of lookup calls made on cache backends
- neg=N Number of negative lookups made
- pos=N Number of positive lookups made
- crt=N Number of objects created by lookup
- tmo=N Number of lookups timed out and requeued
- Updates n=N Number of update cookie requests seen
- nul=N Number of upd reqs given a NULL parent
- run=N Number of upd reqs granted CPU time
- Relinqs n=N Number of relinquish cookie requests seen
- nul=N Number of rlq reqs given a NULL parent
- wcr=N Number of rlq reqs waited on completion of creation
- AttrChg n=N Number of attribute changed requests seen
- ok=N Number of attr changed requests queued
- nbf=N Number of attr changed rejected -ENOBUFS
- oom=N Number of attr changed failed -ENOMEM
- run=N Number of attr changed ops given CPU time
- Allocs n=N Number of allocation requests seen
- ok=N Number of successful alloc reqs
- wt=N Number of alloc reqs that waited on lookup completion
- nbf=N Number of alloc reqs rejected -ENOBUFS
- int=N Number of alloc reqs aborted -ERESTARTSYS
- ops=N Number of alloc reqs submitted
- owt=N Number of alloc reqs waited for CPU time
- abt=N Number of alloc reqs aborted due to object death
- Retrvls n=N Number of retrieval (read) requests seen
- ok=N Number of successful retr reqs
- wt=N Number of retr reqs that waited on lookup completion
- nod=N Number of retr reqs returned -ENODATA
- nbf=N Number of retr reqs rejected -ENOBUFS
- int=N Number of retr reqs aborted -ERESTARTSYS
- oom=N Number of retr reqs failed -ENOMEM
- ops=N Number of retr reqs submitted
- owt=N Number of retr reqs waited for CPU time
- abt=N Number of retr reqs aborted due to object death
- Stores n=N Number of storage (write) requests seen
- ok=N Number of successful store reqs
- agn=N Number of store reqs on a page already pending storage
- nbf=N Number of store reqs rejected -ENOBUFS
- oom=N Number of store reqs failed -ENOMEM
- ops=N Number of store reqs submitted
- run=N Number of store reqs granted CPU time
- pgs=N Number of pages given store req processing time
- rxd=N Number of store reqs deleted from tracking tree
- olm=N Number of store reqs over store limit
- VmScan nos=N Number of release reqs against pages with no pending store
- gon=N Number of release reqs against pages stored by time lock granted
- bsy=N Number of release reqs ignored due to in-progress store
- can=N Number of page stores cancelled due to release req
- Ops pend=N Number of times async ops added to pending queues
- run=N Number of times async ops given CPU time
- enq=N Number of times async ops queued for processing
- can=N Number of async ops cancelled
- rej=N Number of async ops rejected due to object lookup/create failure
- ini=N Number of async ops initialised
- dfr=N Number of async ops queued for deferred release
- rel=N Number of async ops released (should equal ini=N when idle)
- gc=N Number of deferred-release async ops garbage collected
- CacheOp alo=N Number of in-progress alloc_object() cache ops
- luo=N Number of in-progress lookup_object() cache ops
- luc=N Number of in-progress lookup_complete() cache ops
- gro=N Number of in-progress grab_object() cache ops
- upo=N Number of in-progress update_object() cache ops
- dro=N Number of in-progress drop_object() cache ops
- pto=N Number of in-progress put_object() cache ops
- syn=N Number of in-progress sync_cache() cache ops
- atc=N Number of in-progress attr_changed() cache ops
- rap=N Number of in-progress read_or_alloc_page() cache ops
- ras=N Number of in-progress read_or_alloc_pages() cache ops
- alp=N Number of in-progress allocate_page() cache ops
- als=N Number of in-progress allocate_pages() cache ops
- wrp=N Number of in-progress write_page() cache ops
- ucp=N Number of in-progress uncache_page() cache ops
- dsp=N Number of in-progress dissociate_pages() cache ops
- CacheEv nsp=N Number of object lookups/creations rejected due to lack of space
- stl=N Number of stale objects deleted
- rtr=N Number of objects retired when relinquished
- cul=N Number of objects culled
-
-
- (*) /proc/fs/fscache/histogram
-
- cat /proc/fs/fscache/histogram
- JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS
- ===== ===== ========= ========= ========= ========= =========
-
- This shows the breakdown of the number of times each amount of time
- between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The
- columns are as follows:
-
- COLUMN TIME MEASUREMENT
- ======= =======================================================
- OBJ INST Length of time to instantiate an object
- OP RUNS Length of time a call to process an operation took
- OBJ RUNS Length of time a call to process an object event took
- RETRV DLY Time between an requesting a read and lookup completing
- RETRIEVLS Time between beginning and end of a retrieval
-
- Each row shows the number of events that took a particular range of times.
- Each step is 1 jiffy in size. The JIFS column indicates the particular
- jiffy range covered, and the SECS field the equivalent number of seconds.
-
-
-===========
-OBJECT LIST
-===========
-
-If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
-list of all the objects currently allocated and allow them to be viewed
-through:
-
- /proc/fs/fscache/objects
-
-This will look something like:
-
- [root@andromeda ~]# head /proc/fs/fscache/objects
- OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
- ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
- 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
- 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
-
-where the first set of columns before the '|' describe the object:
-
- COLUMN DESCRIPTION
- ======= ===============================================================
- OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
- PARENT Debugging ID of parent object
- STAT Object state
- CHLDN Number of child objects of this object
- OPS Number of outstanding operations on this object
- OOP Number of outstanding child object management operations
- IPR
- EX Number of outstanding exclusive operations
- READS Number of outstanding read operations
- EM Object's event mask
- EV Events raised on this object
- F Object flags
- S Object work item busy state mask (1:pending 2:running)
-
-and the second set of columns describe the object's cookie, if present:
-
- COLUMN DESCRIPTION
- =============== =======================================================
- NETFS_COOKIE_DEF Name of netfs cookie definition
- TY Cookie type (IX - index, DT - data, hex - special)
- FL Cookie flags
- NETFS_DATA Netfs private data stored in the cookie
- OBJECT_KEY Object key } 1 column, with separating comma
- AUX_DATA Object aux data } presence may be configured
-
-The data shown may be filtered by attaching the a key to an appropriate keyring
-before viewing the file. Something like:
-
- keyctl add user fscache:objlist <restrictions> @s
-
-where <restrictions> are a selection of the following letters:
-
- K Show hexdump of object key (don't show if not given)
- A Show hexdump of object aux data (don't show if not given)
-
-and the following paired letters:
-
- C Show objects that have a cookie
- c Show objects that don't have a cookie
- B Show objects that are busy
- b Show objects that aren't busy
- W Show objects that have pending writes
- w Show objects that don't have pending writes
- R Show objects that have outstanding reads
- r Show objects that don't have outstanding reads
- S Show objects that have work queued
- s Show objects that don't have work queued
-
-If neither side of a letter pair is given, then both are implied. For example:
-
- keyctl add user fscache:objlist KB @s
-
-shows objects that are busy, and lists their object keys, but does not dump
-their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
-not implied.
-
-By default all objects and all fields will be shown.
-
-
-=========
-DEBUGGING
-=========
-
-If CONFIG_FSCACHE_DEBUG is enabled, the FS-Cache facility can have runtime
-debugging enabled by adjusting the value in:
-
- /sys/module/fscache/parameters/debug
-
-This is a bitmask of debugging streams to enable:
-
- BIT VALUE STREAM POINT
- ======= ======= =============================== =======================
- 0 1 Cache management Function entry trace
- 1 2 Function exit trace
- 2 4 General
- 3 8 Cookie management Function entry trace
- 4 16 Function exit trace
- 5 32 General
- 6 64 Page handling Function entry trace
- 7 128 Function exit trace
- 8 256 General
- 9 512 Operation management Function entry trace
- 10 1024 Function exit trace
- 11 2048 General
-
-The appropriate set of values should be OR'd together and the result written to
-the control file. For example:
-
- echo $((1|8|64)) >/sys/module/fscache/parameters/debug
-
-will turn on all function entry debugging.
diff --git a/Documentation/filesystems/caching/index.rst b/Documentation/filesystems/caching/index.rst
index e5ec95ff0be2..f488747630aa 100644
--- a/Documentation/filesystems/caching/index.rst
+++ b/Documentation/filesystems/caching/index.rst
@@ -6,4 +6,5 @@ Filesystem Caching
.. toctree::
:maxdepth: 2
+ fscache
object
diff --git a/fs/fscache/Kconfig b/fs/fscache/Kconfig
index 506c5e643f0d..5e796e6c38e5 100644
--- a/fs/fscache/Kconfig
+++ b/fs/fscache/Kconfig
@@ -8,7 +8,7 @@ config FSCACHE
Different sorts of caches can be plugged in, depending on the
resources available.
- See Documentation/filesystems/caching/fscache.txt for more information.
+ See Documentation/filesystems/caching/fscache.rst for more information.
config FSCACHE_STATS
bool "Gather statistical information on local caching"
@@ -25,7 +25,7 @@ config FSCACHE_STATS
between CPUs. On the other hand, the stats are very useful for
debugging purposes. Saying 'Y' here is recommended.
- See Documentation/filesystems/caching/fscache.txt for more information.
+ See Documentation/filesystems/caching/fscache.rst for more information.
config FSCACHE_HISTOGRAM
bool "Gather latency information on local caching"
@@ -42,7 +42,7 @@ config FSCACHE_HISTOGRAM
bouncing between CPUs. On the other hand, the histogram may be
useful for debugging purposes. Saying 'N' here is recommended.
- See Documentation/filesystems/caching/fscache.txt for more information.
+ See Documentation/filesystems/caching/fscache.rst for more information.
config FSCACHE_DEBUG
bool "Debug FS-Cache"
@@ -52,7 +52,7 @@ config FSCACHE_DEBUG
management module. If this is set, the debugging output may be
enabled by setting bits in /sys/modules/fscache/parameter/debug.
- See Documentation/filesystems/caching/fscache.txt for more information.
+ See Documentation/filesystems/caching/fscache.rst for more information.
config FSCACHE_OBJECT_LIST
bool "Maintain global object list for debugging purposes"