summaryrefslogtreecommitdiff
path: root/fs/cachefiles/internal.h
AgeCommit message (Collapse)AuthorFilesLines
2024-07-05Merge patch series "cachefiles: random bugfixes"Christian Brauner1-0/+3
libaokun@huaweicloud.com <libaokun@huaweicloud.com> says: This is the third version of this patch series, in which another patch set is subsumed into this one to avoid confusing the two patch sets. (https://patchwork.kernel.org/project/linux-fsdevel/list/?series=854914) We've been testing ondemand mode for cachefiles since January, and we're almost done. We hit a lot of issues during the testing period, and this patch series fixes some of the issues. The patches have passed internal testing without regression. The following is a brief overview of the patches, see the patches for more details. Patch 1-2: Add fscache_try_get_volume() helper function to avoid fscache_volume use-after-free on cache withdrawal. Patch 3: Fix cachefiles_lookup_cookie() and cachefiles_withdraw_cache() concurrency causing cachefiles_volume use-after-free. Patch 4: Propagate error codes returned by vfs_getxattr() to avoid endless loops. Patch 5-7: A read request waiting for reopen could be closed maliciously before the reopen worker is executing or waiting to be scheduled. So ondemand_object_worker() may be called after the info and object and even the cache have been freed and trigger use-after-free. So use cancel_work_sync() in cachefiles_ondemand_clean_object() to cancel the reopen worker or wait for it to finish. Since it makes no sense to wait for the daemon to complete the reopen request, to avoid this pointless operation blocking cancel_work_sync(), Patch 1 avoids request generation by the DROPPING state when the request has not been sent, and Patch 2 flushes the requests of the current object before cancel_work_sync(). Patch 8: Cyclic allocation of msg_id to avoid msg_id reuse misleading the daemon to cause hung. Patch 9: Hold xas_lock during polling to avoid dereferencing reqs causing use-after-free. This issue was triggered frequently in our tests, and we found that anolis 5.10 had fixed it. So to avoid failing the test, this patch is pushed upstream as well. Baokun Li (7): netfs, fscache: export fscache_put_volume() and add fscache_try_get_volume() cachefiles: fix slab-use-after-free in fscache_withdraw_volume() cachefiles: fix slab-use-after-free in cachefiles_withdraw_cookie() cachefiles: propagate errors from vfs_getxattr() to avoid infinite loop cachefiles: stop sending new request when dropping object cachefiles: cancel all requests for the object that is being dropped cachefiles: cyclic allocation of msg_id to avoid reuse Hou Tao (1): cachefiles: wait for ondemand_object_worker to finish when dropping object Jingbo Xu (1): cachefiles: add missing lock protection when polling fs/cachefiles/cache.c | 45 ++++++++++++++++++++++++++++- fs/cachefiles/daemon.c | 4 +-- fs/cachefiles/internal.h | 3 ++ fs/cachefiles/ondemand.c | 52 ++++++++++++++++++++++++++++++---- fs/cachefiles/volume.c | 1 - fs/cachefiles/xattr.c | 5 +++- fs/netfs/fscache_volume.c | 14 +++++++++ fs/netfs/internal.h | 2 -- include/linux/fscache-cache.h | 6 ++++ include/trace/events/fscache.h | 4 +++ 10 files changed, 123 insertions(+), 13 deletions(-) Link: https://lore.kernel.org/r/20240628062930.2467993-1-libaokun@huaweicloud.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: cyclic allocation of msg_id to avoid reuseBaokun Li1-0/+1
Reusing the msg_id after a maliciously completed reopen request may cause a read request to remain unprocessed and result in a hung, as shown below: t1 | t2 | t3 ------------------------------------------------- cachefiles_ondemand_select_req cachefiles_ondemand_object_is_close(A) cachefiles_ondemand_set_object_reopening(A) queue_work(fscache_object_wq, &info->work) ondemand_object_worker cachefiles_ondemand_init_object(A) cachefiles_ondemand_send_req(OPEN) // get msg_id 6 wait_for_completion(&req_A->done) cachefiles_ondemand_daemon_read // read msg_id 6 req_A cachefiles_ondemand_get_fd copy_to_user // Malicious completion msg_id 6 copen 6,-1 cachefiles_ondemand_copen complete(&req_A->done) // will not set the object to close // because ondemand_id && fd is valid. // ondemand_object_worker() is done // but the object is still reopening. // new open req_B cachefiles_ondemand_init_object(B) cachefiles_ondemand_send_req(OPEN) // reuse msg_id 6 process_open_req copen 6,A.size // The expected failed copen was executed successfully Expect copen to fail, and when it does, it closes fd, which sets the object to close, and then close triggers reopen again. However, due to msg_id reuse resulting in a successful copen, the anonymous fd is not closed until the daemon exits. Therefore read requests waiting for reopen to complete may trigger hung task. To avoid this issue, allocate the msg_id cyclically to avoid reusing the msg_id for a very short duration of time. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-9-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-03cachefiles: stop sending new request when dropping objectBaokun Li1-0/+2
Added CACHEFILES_ONDEMAND_OBJSTATE_DROPPING indicates that the cachefiles object is being dropped, and is set after the close request for the dropped object completes, and no new requests are allowed to be sent after this state. This prepares for the later addition of cancel_work_sync(). It prevents leftover reopen requests from being sent, to avoid processing unnecessary requests and to avoid cancel_work_sync() blocking by waiting for daemon to complete the reopen requests. Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240628062930.2467993-6-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-29cachefiles: flush all requests after setting CACHEFILES_DEADBaokun Li1-0/+3
In ondemand mode, when the daemon is processing an open request, if the kernel flags the cache as CACHEFILES_DEAD, the cachefiles_daemon_write() will always return -EIO, so the daemon can't pass the copen to the kernel. Then the kernel process that is waiting for the copen triggers a hung_task. Since the DEAD state is irreversible, it can only be exited by closing /dev/cachefiles. Therefore, after calling cachefiles_io_error() to mark the cache as CACHEFILES_DEAD, if in ondemand mode, flush all requests to avoid the above hungtask. We may still be able to read some of the cached data before closing the fd of /dev/cachefiles. Note that this relies on the patch that adds reference counting to the req, otherwise it may UAF. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-12-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-29cachefiles: add spin_lock for cachefiles_ondemand_infoBaokun Li1-0/+1
The following concurrency may cause a read request to fail to be completed and result in a hung: t1 | t2 --------------------------------------------------------- cachefiles_ondemand_copen req = xa_erase(&cache->reqs, id) // Anon fd is maliciously closed. cachefiles_ondemand_fd_release xa_lock(&cache->reqs) cachefiles_ondemand_set_object_close(object) xa_unlock(&cache->reqs) cachefiles_ondemand_set_object_open // No one will ever close it again. cachefiles_ondemand_daemon_read cachefiles_ondemand_select_req // Get a read req but its fd is already closed. // The daemon can't issue a cread ioctl with an closed fd, then hung. So add spin_lock for cachefiles_ondemand_info to protect ondemand_id and state, thus we can avoid the above problem in cachefiles_ondemand_copen() by using ondemand_id to determine if fd has been closed. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-8-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-29cachefiles: fix slab-use-after-free in cachefiles_ondemand_get_fd()Baokun Li1-0/+1
We got the following issue in a fuzz test of randomly issuing the restore command: ================================================================== BUG: KASAN: slab-use-after-free in cachefiles_ondemand_daemon_read+0x609/0xab0 Write of size 4 at addr ffff888109164a80 by task ondemand-04-dae/4962 CPU: 11 PID: 4962 Comm: ondemand-04-dae Not tainted 6.8.0-rc7-dirty #542 Call Trace: kasan_report+0x94/0xc0 cachefiles_ondemand_daemon_read+0x609/0xab0 vfs_read+0x169/0xb50 ksys_read+0xf5/0x1e0 Allocated by task 626: __kmalloc+0x1df/0x4b0 cachefiles_ondemand_send_req+0x24d/0x690 cachefiles_create_tmpfile+0x249/0xb30 cachefiles_create_file+0x6f/0x140 cachefiles_look_up_object+0x29c/0xa60 cachefiles_lookup_cookie+0x37d/0xca0 fscache_cookie_state_machine+0x43c/0x1230 [...] Freed by task 626: kfree+0xf1/0x2c0 cachefiles_ondemand_send_req+0x568/0x690 cachefiles_create_tmpfile+0x249/0xb30 cachefiles_create_file+0x6f/0x140 cachefiles_look_up_object+0x29c/0xa60 cachefiles_lookup_cookie+0x37d/0xca0 fscache_cookie_state_machine+0x43c/0x1230 [...] ================================================================== Following is the process that triggers the issue: mount | daemon_thread1 | daemon_thread2 ------------------------------------------------------------ cachefiles_ondemand_init_object cachefiles_ondemand_send_req REQ_A = kzalloc(sizeof(*req) + data_len) wait_for_completion(&REQ_A->done) cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req cachefiles_ondemand_get_fd copy_to_user(_buffer, msg, n) process_open_req(REQ_A) ------ restore ------ cachefiles_ondemand_restore xas_for_each(&xas, req, ULONG_MAX) xas_set_mark(&xas, CACHEFILES_REQ_NEW); cachefiles_daemon_read cachefiles_ondemand_daemon_read REQ_A = cachefiles_ondemand_select_req write(devfd, ("copen %u,%llu", msg->msg_id, size)); cachefiles_ondemand_copen xa_erase(&cache->reqs, id) complete(&REQ_A->done) kfree(REQ_A) cachefiles_ondemand_get_fd(REQ_A) fd = get_unused_fd_flags file = anon_inode_getfile fd_install(fd, file) load = (void *)REQ_A->msg.data; load->fd = fd; // load UAF !!! This issue is caused by issuing a restore command when the daemon is still alive, which results in a request being processed multiple times thus triggering a UAF. So to avoid this problem, add an additional reference count to cachefiles_req, which is held while waiting and reading, and then released when the waiting and reading is over. Note that since there is only one reference count for waiting, we need to avoid the same request being completed multiple times, so we can only complete the request if it is successfully removed from the xarray. Fixes: e73fa11a356c ("cachefiles: add restore command to recover inflight ondemand read requests") Suggested-by: Hou Tao <houtao1@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240522114308.2402121-4-libaokun@huaweicloud.com Acked-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jia Zhu <zhujia.zj@bytedance.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-01-19Merge tag 'vfs-6.8.netfs' of ↵Linus Torvalds1-1/+1
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs Pull netfs updates from Christian Brauner: "This extends the netfs helper library that network filesystems can use to replace their own implementations. Both afs and 9p are ported. cifs is ready as well but the patches are way bigger and will be routed separately once this is merged. That will remove lots of code as well. The overal goal is to get high-level I/O and knowledge of the page cache and ouf of the filesystem drivers. This includes knowledge about the existence of pages and folios The pull request converts afs and 9p. This removes about 800 lines of code from afs and 300 from 9p. For 9p it is now possible to do writes in larger than a page chunks. Additionally, multipage folio support can be turned on for 9p. Separate patches exist for cifs removing another 2000+ lines. I've included detailed information in the individual pulls I took. Summary: - Add NFS-style (and Ceph-style) locking around DIO vs buffered I/O calls to prevent these from happening at the same time. - Support for direct and unbuffered I/O. - Support for write-through caching in the page cache. - O_*SYNC and RWF_*SYNC writes use write-through rather than writing to the page cache and then flushing afterwards. - Support for write-streaming. - Support for write grouping. - Skip reads for which the server could only return zeros or EOF. - The fscache module is now part of the netfs library and the corresponding maintainer entry is updated. - Some helpers from the fscache subsystem are renamed to mark them as belonging to the netfs library. - Follow-up fixes for the netfs library. - Follow-up fixes for the 9p conversion" * tag 'vfs-6.8.netfs' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (50 commits) netfs: Fix wrong #ifdef hiding wait cachefiles: Fix signed/unsigned mixup netfs: Fix the loop that unmarks folios after writing to the cache netfs: Fix interaction between write-streaming and cachefiles culling netfs: Count DIO writes netfs: Mark netfs_unbuffered_write_iter_locked() static netfs: Fix proc/fs/fscache symlink to point to "netfs" not "../netfs" netfs: Rearrange netfs_io_subrequest to put request pointer first 9p: Use length of data written to the server in preference to error 9p: Do a couple of cleanups 9p: Fix initialisation of netfs_inode for 9p cachefiles: Fix __cachefiles_prepare_write() 9p: Use netfslib read/write_iter afs: Use the netfs write helpers netfs: Export the netfs_sreq tracepoint netfs: Optimise away reads above the point at which there can be no data netfs: Implement a write-through caching option netfs: Provide a launder_folio implementation netfs: Provide a writepages implementation netfs, cachefiles: Pass upper bound length to allow expansion ...
2023-12-28netfs, cachefiles: Pass upper bound length to allow expansionDavid Howells1-1/+1
Make netfslib pass the maximum length to the ->prepare_write() op to tell the cache how much it can expand the length of a write to. This allows a write to the server at the end of a file to be limited to a few bytes whilst writing an entire block to the cache (something required by direct I/O). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org
2023-11-25cachefiles: add restore command to recover inflight ondemand read requestsJia Zhu1-0/+3
Previously, in ondemand read scenario, if the anonymous fd was closed by user daemon, inflight and subsequent read requests would return EIO. As long as the device connection is not released, user daemon can hold and restore inflight requests by setting the request flag to CACHEFILES_REQ_NEW. Suggested-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com> Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Link: https://lore.kernel.org/r/20231120041422.75170-6-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-25cachefiles: narrow the scope of triggering EPOLLIN events in ondemand modeJia Zhu1-0/+12
Don't trigger EPOLLIN when there are only reopening read requests in xarray. Suggested-by: Xin Yin <yinxin.x@bytedance.com> Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20231120041422.75170-5-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-25cachefiles: resend an open request if the read request's object is closedJia Zhu1-0/+3
When an anonymous fd is closed by user daemon, if there is a new read request for this file comes up, the anonymous fd should be re-opened to handle that read request rather than fail it directly. 1. Introduce reopening state for objects that are closed but have inflight/subsequent read requests. 2. No longer flush READ requests but only CLOSE requests when anonymous fd is closed. 3. Enqueue the reopen work to workqueue, thus user daemon could get rid of daemon_read context and handle that request smoothly. Otherwise, the user daemon will send a reopen request and wait for itself to process the request. Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20231120041422.75170-4-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-25cachefiles: extract ondemand info field from cachefiles_objectJia Zhu1-4/+22
We'll introduce a @work_struct field for @object in subsequent patches, it will enlarge the size of @object. As the result of that, this commit extracts ondemand info field from @object. Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20231120041422.75170-3-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2023-11-25cachefiles: introduce object ondemand stateJia Zhu1-0/+21
Previously, @ondemand_id field was used not only to identify ondemand state of the object, but also to represent the index of the xarray. This commit introduces @state field to decouple the role of @ondemand_id and adds helpers to access it. Signed-off-by: Jia Zhu <zhujia.zj@bytedance.com> Link: https://lore.kernel.org/r/20231120041422.75170-2-zhujia.zj@bytedance.com Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2022-08-31cachefiles: make on-demand request distribution fairerXin Yin1-0/+1
For now, enqueuing and dequeuing on-demand requests all start from idx 0, this makes request distribution unfair. In the weighty concurrent I/O scenario, the request stored in higher idx will starve. Searching requests cyclically in cachefiles_ondemand_daemon_read, makes distribution fairer. Fixes: c8383054506c ("cachefiles: notify the user daemon when looking up cookie") Reported-by: Yongqing Li <liyongqing@bytedance.com> Signed-off-by: Xin Yin <yinxin.x@bytedance.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeffle Xu <jefflexu@linux.alibaba.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20220817065200.11543-1-yinxin.x@bytedance.com/ # v1 Link: https://lore.kernel.org/r/20220825020945.2293-1-yinxin.x@bytedance.com/ # v2
2022-05-17cachefiles: implement on-demand readJeffle Xu1-0/+9
Implement the data plane of on-demand read mode. The early implementation [1] place the entry to cachefiles_ondemand_read() in fscache_read(). However, fscache_read() can only detect if the requested file range is fully cache miss, whilst we need to notify the user daemon as long as there's a hole inside the requested file range. Thus the entry is now placed in cachefiles_prepare_read(). When working in on-demand read mode, once a hole detected, the read routine will send a READ request to the user daemon. The user daemon needs to fetch the data and write it to the cache file. After sending the READ request, the read routine will hang there, until the READ request is handled by the user daemon. Then it will retry to read from the same file range. If no progress encountered, the read routine will fail then. A new NETFS_SREQ_ONDEMAND flag is introduced to indicate that on-demand read should be done when a cache miss encountered. [1] https://lore.kernel.org/all/20220406075612.60298-6-jefflexu@linux.alibaba.com/ #v8 Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Acked-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20220425122143.56815-6-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-17cachefiles: notify the user daemon when withdrawing cookieJeffle Xu1-0/+5
Notify the user daemon that cookie is going to be withdrawn, providing a hint that the associated anonymous fd can be closed. Be noted that this is only a hint. The user daemon may close the associated anonymous fd when receiving the CLOSE request, then it will receive another anonymous fd when the cookie gets looked up. Or it may ignore the CLOSE request, and keep writing data through the anonymous fd. However the next time the cookie gets looked up, the user daemon will still receive another new anonymous fd. Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Acked-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20220425122143.56815-5-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-17cachefiles: unbind cachefiles gracefully in on-demand modeJeffle Xu1-0/+3
Add a refcount to avoid the deadlock in on-demand read mode. The on-demand read mode will pin the corresponding cachefiles object for each anonymous fd. The cachefiles object is unpinned when the anonymous fd gets closed. When the user daemon exits and the fd of "/dev/cachefiles" device node gets closed, it will wait for all cahcefiles objects getting withdrawn. Then if there's any anonymous fd getting closed after the fd of the device node, the user daemon will hang forever, waiting for all objects getting withdrawn. To fix this, add a refcount indicating if there's any object pinned by anonymous fds. The cachefiles cache gets unbound and withdrawn when the refcount is decreased to 0. It won't change the behaviour of the original mode, in which case the cachefiles cache gets unbound and withdrawn as long as the fd of the device node gets closed. Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Link: https://lore.kernel.org/r/20220509074028.74954-4-jefflexu@linux.alibaba.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-17cachefiles: notify the user daemon when looking up cookieJeffle Xu1-0/+51
Fscache/CacheFiles used to serve as a local cache for a remote networking fs. A new on-demand read mode will be introduced for CacheFiles, which can boost the scenario where on-demand read semantics are needed, e.g. container image distribution. The essential difference between these two modes is seen when a cache miss occurs: In the original mode, the netfs will fetch the data from the remote server and then write it to the cache file; in on-demand read mode, fetching the data and writing it into the cache is delegated to a user daemon. As the first step, notify the user daemon when looking up cookie. In this case, an anonymous fd is sent to the user daemon, through which the user daemon can write the fetched data to the cache file. Since the user daemon may move the anonymous fd around, e.g. through dup(), an object ID uniquely identifying the cache file is also attached. Also add one advisory flag (FSCACHE_ADV_WANT_CACHE_SIZE) suggesting that the cache file size shall be retrieved at runtime. This helps the scenario where one cache file contains multiple netfs files, e.g. for the purpose of deduplication. In this case, netfs itself has no idea the size of the cache file, whilst the user daemon should give the hint on it. Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Link: https://lore.kernel.org/r/20220509074028.74954-3-jefflexu@linux.alibaba.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-05-17cachefiles: extract write routineJeffle Xu1-0/+10
Extract the generic routine of writing data to cache files, and make it generally available. This will be used by the following patch implementing on-demand read mode. Since it's called inside CacheFiles module, make the interface generic and unrelated to netfs_cache_resources. It is worth noting that, ki->inval_counter is not initialized after this cleanup. It shall not make any visible difference, since inval_counter is no longer used in the write completion routine, i.e. cachefiles_write_complete(). Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com> Acked-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/20220425122143.56815-2-jefflexu@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2022-01-22cachefiles: Calculate the blockshift in terms of bytes, not pagesDavid Howells1-1/+1
Cachefiles keeps track of how much space is available on the backing filesystem and refuses new writes permission to start if there isn't enough (we especially don't want ENOSPC happening). It also tracks the amount of data pending in DIO writes (cache->b_writing) and reduces the amount of free space available by this amount before deciding if it can set up a new write. However, the old fscache I/O API was very much page-granularity dependent and, as such, cachefiles's cache->bshift was meant to be a multiplier to get from PAGE_SIZE to block size (ie. a blocksize of 512 would give a shift of 3 for a 4KiB page) - and this was incorrectly being used to turn the number of bytes in a DIO write into a number of blocks, leading to a massive over estimation of the amount of data in flight. Fix this by changing cache->bshift to be a multiplier from bytes to blocksize and deal with quantities of blocks, not quantities of pages. Fix also the rounding in the calculation in cachefiles_write() which needs a "- 1" inserting. Fixes: 047487c947e8 ("cachefiles: Implement the I/O routines") Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/164251398954.3435901.7138806620218474123.stgit@warthog.procyon.org.uk/ # v1
2022-01-07fscache, cachefiles: Display stats of no-space eventsDavid Howells1-2/+9
Add stat counters of no-space events that caused caching not to happen and display in /proc/fs/fscache/stats. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819653216.215744.17210522251617386509.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906958369.143852.7257100711818401748.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967166917.1823006.14842444049198947892.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021566184.640689.4417328329632709265.stgit@warthog.procyon.org.uk/ # v4
2022-01-07fscache, cachefiles: Store the volume coherency dataDavid Howells1-0/+2
Store the volume coherency data in an xattr and check it when we rebind the volume. If it doesn't match the cache volume is moved to the graveyard and rebuilt anew. Changes ======= ver #4: - Remove a couple of debugging prints. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/163967164397.1823006.2950539849831291830.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021563138.640689.15851092065380543119.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement begin and end I/O operationDavid Howells1-0/+18
Implement the methods for beginning and ending an I/O operation. When called to begin an I/O operation, we are guaranteed that the cookie has reached a certain stage (we're called by fscache after it has done a suitable wait). If a file is available, we paste a ref over into the cache resources for the I/O routines to use. This means that the object can be invalidated whilst the I/O is ongoing without the need to synchronise as the file pointer in the object is replaced, but the file pointer in the cache resources is unaffected. Ending the operation just requires ditching any refs we have and dropping the access guarantee that fscache got for us on the cookie. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819645033.215744.2199344081658268312.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906951916.143852.9531384743995679857.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967161222.1823006.4461476204800357263.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021559030.640689.3684291785218094142.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement backing file wranglingDavid Howells1-0/+9
Implement the wrangling of backing files, including the following pieces: (1) Lookup and creation of a file on disk, using a tmpfile if the file isn't yet present. The file is then opened, sized for DIO and the file handle is attached to the cachefiles_object struct. The inode is marked to indicate that it's in use by a kernel service. (2) Invalidation of an object, creating a tmpfile and switching the file pointer in the cachefiles object. (3) Committing a file to disk, including setting the coherency xattr on it and, if necessary, creating a hard link to it. Note that this would be a good place to use Omar Sandoval's vfs_link() with AT_LINK_REPLACE[1] as I may have to unlink an old file before I can link a tmpfile into place. (4) Withdrawal of open objects when a cache is being withdrawn or a cookie is relinquished. This involves committing or discarding the file. Changes ======= ver #2: - Fix logging of wrong error[1]. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/20211203094950.GA2480@kili/ [1] Link: https://lore.kernel.org/r/163819644097.215744.4505389616742411239.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906949512.143852.14222856795032602080.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967158526.1823006.17482695321424642675.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021557060.640689.16373541458119269871.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement culling daemon commandsDavid Howells1-0/+11
Implement the ability for the userspace daemon to try and cull a file or directory in the cache. Two daemon commands are implemented: (1) The "inuse" command. This queries if a file is in use or whether it can be deleted. It checks the S_KERNEL_FILE flag on the inode referred to by the specified filename. (2) The "cull" command. This asks for a file or directory to be removed, where removal means either unlinking it or moving it to the graveyard directory for userspace to dismantle. Changes ======= ver #2: - Fix logging of wrong error[1]. - Need to unmark an inode we've moved to the graveyard before unlocking. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/20211203094950.GA2480@kili/ [1] Link: https://lore.kernel.org/r/163819643179.215744.13641580295708315695.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906945705.143852.8177595531814485350.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967155792.1823006.1088936326902550910.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021555037.640689.9472627499842585255.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Mark a backing file in use with an inode flagDavid Howells1-0/+2
Use an inode flag, S_KERNEL_FILE, to mark that a backing file is in use by the kernel to prevent cachefiles or other kernel services from interfering with that file. Using S_SWAPFILE instead isn't really viable as that has other effects in the I/O paths. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819642273.215744.6414248677118690672.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906943215.143852.16972351425323967014.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967154118.1823006.13227551961786743991.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021541207.640689.564689725898537127.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/164021552299.640689.10578652796777392062.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement metadata/coherency data storage in xattrsDavid Howells1-0/+21
Use an xattr on each backing file in the cache to store some metadata, such as the content type and the coherency data. Five content types are defined: (0) No content stored. (1) The file contains a single monolithic blob and must be all or nothing. This would be used for something like an AFS directory or a symlink. (2) The file is populated with content completely up to a point with nothing beyond that. (3) The file has a map attached and is sparsely populated. This would be stored in one or more additional xattrs. (4) The file is dirty, being in the process of local modification and the contents are not necessarily represented correctly by the metadata. The file should be deleted if this is seen on binding. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819641320.215744.16346770087799536862.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906942248.143852.5423738045012094252.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967151734.1823006.9301249989443622576.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021550471.640689.553853918307994335.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement key to filename encodingDavid Howells1-0/+5
Implement a function to encode a binary cookie key as something that can be used as a filename. Four options are considered: (1) All printable chars with no '/' characters. Prepend a 'D' to indicate the encoding but otherwise use as-is. (2) Appears to be an array of __be32. Encode as 'S' plus a list of hex-encoded 32-bit ints separated by commas. If a number is 0, it is rendered as "" instead of "0". (3) Appears to be an array of __le32. Encoded as (2) but with a 'T' encoding prefix. (4) Encoded as base64 with an 'E' prefix plus a second char indicating how much padding is involved. A non-standard base64 encoding is used because '/' cannot be used in the encoded form. If (1) is not possible, whichever of (2), (3) or (4) produces the shortest string is selected (hex-encoding a number may be less dense than base64 encoding it). Note that the prefix characters have to be selected from the set [DEIJST@] lest cachefilesd remove the files because it recognise the name. Changes ======= ver #2: - Fix a short allocation that didn't allow for a string terminator[1] Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/bcefb8f2-576a-b3fc-cc29-89808ebfd7c1@linux.alibaba.com/ [1] Link: https://lore.kernel.org/r/163819640393.215744.15212364106412961104.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906940529.143852.17352132319136117053.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967149827.1823006.6088580775428487961.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021549223.640689.14762875188193982341.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement object lifecycle funcsDavid Howells1-2/+33
Implement allocate, get, see and put functions for the cachefiles_object struct. The members of the struct we're going to need are also added. Additionally, implement a lifecycle tracepoint. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819639457.215744.4600093239395728232.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906939569.143852.3594314410666551982.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967148857.1823006.6332962598220464364.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021547762.640689.8422781599594931000.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement volume supportDavid Howells1-0/+20
Implement support for creating the directory layout for a volume on disk and setting up and withdrawing volume caching. Each volume has a directory named for the volume key under the root of the cache (prefixed with an 'I' to indicate to cachefilesd that it's an index) and then creates a bunch of hash bucket subdirectories under that (named as '@' plus a hex number) in which cookie files will be created. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819635314.215744.13081522301564537723.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906936397.143852.17788457778396467161.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967143860.1823006.7185205806080225038.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021545212.640689.5064821392307582927.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement cache registration and withdrawalDavid Howells1-0/+9
Do the following: (1) Fill out cachefiles_daemon_add_cache() so that it sets up the cache directories and registers the cache with cachefiles. (2) Add a function to do the top-level part of cache withdrawal and unregistration. (3) Add a function to sync a cache. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819633175.215744.10857127598041268340.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906935445.143852.15545194974036410029.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967142904.1823006.244055483596047072.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021543872.640689.14370017789605073222.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Implement a function to get/create a directory in the cacheDavid Howells1-0/+9
Implement a function to get/create structural directories in the cache. This is used for setting up a cache and creating volume substructures. The directory in memory are marked with the S_KERNEL_FILE inode flag whilst they're in use to tell rmdir to reject attempts to remove them. Changes ======= ver #3: - Return an indication as to whether the directory was freshly created. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819631182.215744.3322471539523262619.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906933130.143852.962088616746509062.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967141952.1823006.7832985646370603833.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021542169.640689.18266858945694357839.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Provide a function to check how much space there isDavid Howells1-0/+7
Provide a function to check how much space there is. This also flips the state on the cache and will signal the daemon to inform it of the change and to ask it to do some culling if necessary. We will also need to subtract the amount of data currently being written to the cache (cache->b_writing) from the amount of available space to avoid hitting ENOSPC accidentally. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819629322.215744.13457425294680841213.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906930100.143852.1681026700865762069.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967140058.1823006.7781243664702837128.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021539957.640689.12477177372616805706.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Register a miscdev and parse commands over itDavid Howells1-0/+14
Register a misc device with which to talk to the daemon. The misc device holds a cache set up through it around and closing the device kills the cache. cachefilesd communicates with the kernel by passing it single-line text commands. Parse these and use them to parameterise the cache state. This does not implement the command to actually bring a cache online. That's left for later. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819628388.215744.17712097043607299608.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906929128.143852.14065207858943654011.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967139085.1823006.3514846391807454287.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021538400.640689.9172006906288062041.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Add security derivationDavid Howells1-0/+20
Implement code to derive a new set of creds for the cachefiles to use when making VFS or I/O calls and to change the auditing info since the application interacting with the network filesystem is not accessing the cache directly. Cachefiles uses override_creds() to change the effective creds temporarily. set_security_override_from_ctx() is called to derive the LSM 'label' that the cachefiles driver will act with. set_create_files_as() is called to determine the LSM 'label' that will be applied to files and directories created in the cache. These functions alter the new creds. Also implement a couple of functions to wrap the calls to begin/end cred overriding. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819627469.215744.3603633690679962985.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906928172.143852.15886637013364286786.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967138138.1823006.7620933448261939504.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021537001.640689.4081334436031700558.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Add cache error reporting macroDavid Howells1-0/+11
Add a macro to report a cache I/O error and to tell fscache that the cache is in trouble. Also add a pointer to the fscache cache cookie from the cachefiles_cache struct as we need that to pass to fscache_io_error(). Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819626562.215744.1503690975344731661.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906927235.143852.13694625647880837563.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967137158.1823006.2065038830569321335.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021536053.640689.5306822604644352548.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Add a couple of tracepoints for logging errorsDavid Howells1-0/+1
Add two trace points to log errors, one for vfs operations like mkdir or create, and one for I/O operations, like read, write or truncate. Also add the beginnings of a struct that is going to represent a data file and place a debugging ID in it for the tracepoints to record. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819625632.215744.17907340966178411033.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906926297.143852.18267924605548658911.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967135390.1823006.2512120406360156424.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021534029.640689.1875723624947577095.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Add some error injection supportDavid Howells1-1/+41
Add support for injecting ENOSPC or EIO errors. This needs to be enabled by CONFIG_CACHEFILES_ERROR_INJECTION=y. Once enabled, ENOSPC on things like write and mkdir can be triggered by: echo 1 >/proc/sys/cachefiles/error_injection and EIO can be triggered on most operations by: echo 2 >/proc/sys/cachefiles/error_injection Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819624706.215744.6911916249119962943.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906925343.143852.5465695512984025812.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967134412.1823006.7354285948280296595.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021532340.640689.18209494225772443698.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Define structsDavid Howells1-0/+46
Define the cachefiles_cache struct that's going to carry the cache-level parameters and state of a cache. Define the beginning of the cachefiles_object struct that's going to carry the state for a data storage object. For the moment this is just a debugging ID for logging purposes. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819623690.215744.2824739137193655547.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906924292.143852.15881439716653984905.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967131405.1823006.4480555941533935597.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021530610.640689.846094074334176928.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Introduce rewritten driverDavid Howells1-0/+115
Introduce basic skeleton of the rewritten cachefiles driver including config options so that it can be enabled for compilation. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819622766.215744.9108359326983195047.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906923341.143852.3856498104256721447.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967130320.1823006.15791456613198441566.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021528993.640689.9069695476048171884.stgit@warthog.procyon.org.uk/ # v4
2022-01-07cachefiles: Delete the cachefiles driver pending rewriteDavid Howells1-350/+0
Delete the code from the cachefiles driver to make it easier to rewrite and resubmit in a logical manner. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/163819577641.215744.12718114397770666596.stgit@warthog.procyon.org.uk/ # v1 Link: https://lore.kernel.org/r/163906883770.143852.4149714614981373410.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/163967076066.1823006.7175712134577687753.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/164021483619.640689.7586546280515844702.stgit@warthog.procyon.org.uk/ # v4
2021-08-27fscache, cachefiles: Remove the histogram stuffDavid Howells1-25/+0
Remove the histogram stuff as it's mostly going to be outdated. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/162431195953.2908479.16770977195634296638.stgit@warthog.procyon.org.uk/
2021-04-23fscache, cachefiles: Add alternate API to use kiocb for read/write to cacheDavid Howells1-0/+9
Add an alternate API by which the cache can be accessed through a kiocb, doing async DIO, rather than using the current API that tells the cache where all the pages are. The new API is intended to be used in conjunction with the netfs helper library. A filesystem must pick one or the other and not mix them. Filesystems wanting to use the new API must #define FSCACHE_USE_NEW_IO_API before #including the header. This prevents them from continuing to use the old API at the same time as there are incompatibilities in how the PG_fscache page bit is used. Changes: v6: - Provide a routine to shape a write so that the start and length can be aligned for DIO[3]. v4: - Use the vfs_iocb_iter_read/write() helpers[1] - Move initial definition of fscache_begin_read_operation() here. - Remove a commented-out line[2] - Combine ki->term_func calls in cachefiles_read_complete()[2]. - Remove explicit NULL initialiser[2]. - Remove extern on func decl[2]. - Put in param names on func decl[2]. - Remove redundant else[2]. - Fill out the kdoc comment for fscache_begin_read_operation(). - Rename fs/fscache/page2.c to io.c to match later patches. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-and-tested-by: Jeff Layton <jlayton@kernel.org> Tested-by: Dave Wysochanski <dwysocha@redhat.com> Tested-By: Marc Dionne <marc.dionne@auristor.com> cc: Christoph Hellwig <hch@lst.de> cc: linux-cachefs@redhat.com cc: linux-afs@lists.infradead.org cc: linux-nfs@vger.kernel.org cc: linux-cifs@vger.kernel.org cc: ceph-devel@vger.kernel.org cc: v9fs-developer@lists.sourceforge.net cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20210216102614.GA27555@lst.de/ [1] Link: https://lore.kernel.org/r/20210216084230.GA23669@lst.de/ [2] Link: https://lore.kernel.org/r/161781047695.463527.7463536103593997492.stgit@warthog.procyon.org.uk/ [3] Link: https://lore.kernel.org/r/161118142558.1232039.17993829899588971439.stgit@warthog.procyon.org.uk/ # rfc Link: https://lore.kernel.org/r/161161037850.2537118.8819808229350326503.stgit@warthog.procyon.org.uk/ # v2 Link: https://lore.kernel.org/r/161340402057.1303470.8038373593844486698.stgit@warthog.procyon.org.uk/ # v3 Link: https://lore.kernel.org/r/161539545919.286939.14573472672781434757.stgit@warthog.procyon.org.uk/ # v4 Link: https://lore.kernel.org/r/161653801477.2770958.10543270629064934227.stgit@warthog.procyon.org.uk/ # v5 Link: https://lore.kernel.org/r/161789084517.6155.12799689829859169640.stgit@warthog.procyon.org.uk/ # v6
2019-05-24treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36Thomas Gleixner1-5/+1
Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of the gnu general public licence as published by the free software foundation either version 2 of the licence or at your option any later version extracted by the scancode license scanner the SPDX license identifier GPL-2.0-or-later has been chosen to replace the boilerplate/reference in 114 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190520170857.552531963@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-04-04fscache: Add tracepointsDavid Howells1-0/+2
Add some tracepoints to fscache: (*) fscache_cookie - Tracks a cookie's usage count. (*) fscache_netfs - Logs registration of a network filesystem, including the pointer to the cookie allocated. (*) fscache_acquire - Logs cookie acquisition. (*) fscache_relinquish - Logs cookie relinquishment. (*) fscache_enable - Logs enablement of a cookie. (*) fscache_disable - Logs disablement of a cookie. (*) fscache_osm - Tracks execution of states in the object state machine. and cachefiles: (*) cachefiles_ref - Tracks a cachefiles object's usage count. (*) cachefiles_lookup - Logs result of lookup_one_len(). (*) cachefiles_mkdir - Logs result of vfs_mkdir(). (*) cachefiles_create - Logs result of vfs_create(). (*) cachefiles_unlink - Logs calls to vfs_unlink(). (*) cachefiles_rename - Logs calls to vfs_rename(). (*) cachefiles_mark_active - Logs an object becoming active. (*) cachefiles_wait_active - Logs a wait for an old object to be destroyed. (*) cachefiles_mark_inactive - Logs an object becoming inactive. (*) cachefiles_mark_buried - Logs the burial of an object. Signed-off-by: David Howells <dhowells@redhat.com>
2017-06-20sched/wait: Split out the wait_bit*() APIs from <linux/wait.h> into ↵Ingo Molnar1-1/+1
<linux/wait_bit.h> The wait_bit*() types and APIs are mixed into wait.h, but they are a pretty orthogonal extension of wait-queues. Furthermore, only about 50 kernel files use these APIs, while over 1000 use the regular wait-queue functionality. So clean up the main wait.h by moving the wait-bit functionality out of it, into a separate .h and .c file: include/linux/wait_bit.h for types and APIs kernel/sched/wait_bit.c for the implementation Update all header dependencies. This reduces the size of wait.h rather significantly, by about 30%. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-20sched/wait: Rename wait_queue_t => wait_queue_entry_tIngo Molnar1-1/+1
Rename: wait_queue_t => wait_queue_entry_t 'wait_queue_t' was always a slight misnomer: its name implies that it's a "queue", but in reality it's a queue *entry*. The 'real' queue is the wait queue head, which had to carry the name. Start sorting this out by renaming it to 'wait_queue_entry_t'. This also allows the real structure name 'struct __wait_queue' to lose its double underscore and become 'struct wait_queue_entry', which is the more canonical nomenclature for such data types. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-03-02sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h>Ingo Molnar1-0/+1
Add #include <linux/cred.h> dependencies to all .c files rely on sched.h doing that for them. Note that even if the count where we need to add extra headers seems high, it's still a net win, because <linux/sched.h> is included in over 2,200 files ... Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-09-28cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]David Howells1-1/+2
An NULL-pointer dereference happens in cachefiles_mark_object_inactive() when it tries to read i_blocks so that it can tell the cachefilesd daemon how much space it's making available. The problem is that cachefiles_drop_object() calls cachefiles_mark_object_inactive() after calling cachefiles_delete_object() because the object being marked active staves off attempts to (re-)use the file at that filename until after it has been deleted. This means that d_inode is NULL by the time we come to try to access it. To fix the problem, have the caller of cachefiles_mark_object_inactive() supply the number of blocks freed up. Without this, the following oops may occur: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles] ... CPU: 11 PID: 527 Comm: kworker/u64:4 Tainted: G I ------------ 3.10.0-470.el7.x86_64 #1 Hardware name: Hewlett-Packard HP Z600 Workstation/0B54h, BIOS 786G4 v03.19 03/11/2011 Workqueue: fscache_object fscache_object_work_func [fscache] task: ffff880035edaf10 ti: ffff8800b77c0000 task.ti: ffff8800b77c0000 RIP: 0010:[<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles] RSP: 0018:ffff8800b77c3d70 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8800bf6cc400 RCX: 0000000000000034 RDX: 0000000000000000 RSI: ffff880090ffc710 RDI: ffff8800bf761ef8 RBP: ffff8800b77c3d88 R08: 2000000000000000 R09: 0090ffc710000000 R10: ff51005d2ff1c400 R11: 0000000000000000 R12: ffff880090ffc600 R13: ffff8800bf6cc520 R14: ffff8800bf6cc400 R15: ffff8800bf6cc498 FS: 0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000098 CR3: 00000000019ba000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff880090ffc600 ffff8800bf6cc400 ffff8800867df140 ffff8800b77c3db0 ffffffffa06c48cb ffff880090ffc600 ffff880090ffc180 ffff880090ffc658 ffff8800b77c3df0 ffffffffa085d846 ffff8800a96b8150 ffff880090ffc600 Call Trace: [<ffffffffa06c48cb>] cachefiles_drop_object+0x6b/0xf0 [cachefiles] [<ffffffffa085d846>] fscache_drop_object+0xd6/0x1e0 [fscache] [<ffffffffa085d615>] fscache_object_work_func+0xa5/0x200 [fscache] [<ffffffff810a605b>] process_one_work+0x17b/0x470 [<ffffffff810a6e96>] worker_thread+0x126/0x410 [<ffffffff810a6d70>] ? rescuer_thread+0x460/0x460 [<ffffffff810ae64f>] kthread+0xcf/0xe0 [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81695418>] ret_from_fork+0x58/0x90 [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140 The oopsing code shows: callq 0xffffffff810af6a0 <wake_up_bit> mov 0xf8(%r12),%rax mov 0x30(%rax),%rax mov 0x98(%rax),%rax <---- oops here lock add %rax,0x130(%rbx) where this is: d_backing_inode(object->dentry)->i_blocks Fixes: a5b3a80b899bda0f456f1246c4c5a1191ea01519 (CacheFiles: Provide read-and-reset release counters for cachefilesd) Reported-by: Jianhong Yin <jiyin@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Steve Dickson <steved@redhat.com> cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-02-01CacheFiles: Provide read-and-reset release counters for cachefilesdDavid Howells1-0/+4
Provide read-and-reset objects- and blocks-released counters for cachefilesd to use to work out whether there's anything new that can be culled. One of the problems cachefilesd has is that if all the objects in the cache are pinned by inodes lying dormant in the kernel inode cache, there isn't anything for it to cull. In such a case, it just spins around walking the filesystem tree and scanning for something to cull. This eats up a lot of CPU time. By telling cachefilesd if there have been any releases, the daemon can sleep until there is the possibility of something to do. cachefilesd finds this information by the following means: (1) When the control fd is read, the kernel presents a list of values of interest. "freleased=N" and "breleased=N" are added to this list to indicate the number of files released and number of blocks released since the last read call. At this point the counters are reset. (2) POLLIN is signalled if the number of files released becomes greater than 0. Note that by 'released' it just means that the kernel has released its interest in those files for the moment, not necessarily that the files should be deleted from the cache. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Steve Dickson <steved@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>