summaryrefslogtreecommitdiff
path: root/fs/fuse
AgeCommit message (Collapse)AuthorFilesLines
2022-12-19fuse: always revalidate if exclusive createMiklos Szeredi1-1/+1
commit df8629af293493757beccac2d3168fe5a315636e upstream. Failure to do so may result in EEXIST even if the file only exists in the cache and not in the filesystem. The atomic nature of O_EXCL mandates that the cached state should be ignored and existence verified anew. Reported-by: Ken Schalk <kschalk@nvidia.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Wu Bo <bo.wu@vivo.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-02fuse: lock inode unconditionally in fuse_fallocate()Miklos Szeredi1-20/+16
commit 44361e8cf9ddb23f17bdcc40ca944abf32e83e79 upstream. file_modified() must be called with inode lock held. fuse_fallocate() didn't lock the inode in case of just FALLOC_KEEP_SIZE flags value, which resulted in a kernel Warning in notify_change(). Lock the inode unconditionally, like all other fallocate implementations do. Reported-by: Pengfei Xu <pengfei.xu@intel.com> Reported-and-tested-by: syzbot+462da39f0667b357c4b6@syzkaller.appspotmail.com Fixes: 4a6f278d4827 ("fuse: add file_modified() to fallocate") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-11-16fuse: fix readdir cache raceMiklos Szeredi1-1/+9
[ Upstream commit 9fa248c65bdbf5af0a2f74dd38575acfc8dfd2bf ] There's a race in fuse's readdir cache that can result in an uninitilized page being read. The page lock is supposed to prevent this from happening but in the following case it doesn't: Two fuse_add_dirent_to_cache() start out and get the same parameters (size=0,offset=0). One of them wins the race to create and lock the page, after which it fills in data, sets rdc.size and unlocks the page. In the meantime the page gets evicted from the cache before the other instance gets to run. That one also creates the page, but finds the size to be mismatched, bails out and leaves the uninitialized page in the cache. Fix by marking a filled page uptodate and ignoring non-uptodate pages. Reported-by: Frank Sorenson <fsorenso@redhat.com> Fixes: 5d7bc7e8680c ("fuse: allow using readdir cache") Cc: <stable@vger.kernel.org> # v4.20 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-11-10fuse: add file_modified() to fallocateMiklos Szeredi1-0/+4
commit 4a6f278d4827b59ba26ceae0ff4529ee826aa258 upstream. Add missing file_modified() call to fuse_file_fallocate(). Without this fallocate on fuse failed to clear privileges. Fixes: 05ba1f082300 ("fuse: add FALLOCATE operation") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-21fuse: Remove the control interface for virtio-fsXie Yongji1-2/+2
[ Upstream commit c64797809a64c73497082aa05e401a062ec1af34 ] The commit 15c8e72e88e0 ("fuse: allow skipping control interface and forced unmount") tries to remove the control interface for virtio-fs since it does not support aborting requests which are being processed. But it doesn't work now. This patch fixes it by skipping creating the control interface if fuse_conn->no_control is set. Fixes: 15c8e72e88e0 ("fuse: allow skipping control interface and forced unmount") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-08-21fuse: limit nsecMiklos Szeredi1-0/+6
commit 47912eaa061a6a81e4aa790591a1874c650733c0 upstream. Limit nanoseconds to 0..999999999. Fixes: d8a5ba45457e ("[PATCH] FUSE - core") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-03-16fuse: fix pipe buffer lifetime for direct_ioMiklos Szeredi3-1/+13
commit 0c4bcfdecb1ac0967619ee7ff44871d93c08c909 upstream. In FOPEN_DIRECT_IO mode, fuse_file_write_iter() calls fuse_direct_write_iter(), which normally calls fuse_direct_io(), which then imports the write buffer with fuse_get_user_pages(), which uses iov_iter_get_pages() to grab references to userspace pages instead of actually copying memory. On the filesystem device side, these pages can then either be read to userspace (via fuse_dev_read()), or splice()d over into a pipe using fuse_dev_splice_read() as pipe buffers with &nosteal_pipe_buf_ops. This is wrong because after fuse_dev_do_read() unlocks the FUSE request, the userspace filesystem can mark the request as completed, causing write() to return. At that point, the userspace filesystem should no longer have access to the pipe buffer. Fix by copying pages coming from the user address space to new pipe buffers. Reported-by: Jann Horn <jannh@google.com> Fixes: c3021629a0d8 ("fuse: support splice() reading from fuse device") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27fuse: Pass correct lend value to filemap_write_and_wait_range()Xie Yongji1-1/+1
commit e388164ea385f04666c4633f5dc4f951fca71890 upstream. The acceptable maximum value of lend parameter in filemap_write_and_wait_range() is LLONG_MAX rather than -1. And there is also some logic depending on LLONG_MAX check in write_cache_pages(). So let's pass LLONG_MAX to filemap_write_and_wait_range() in fuse_writeback_range() instead. Fixes: 59bda8ecee2f ("fuse: flush extending writes") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Cc: <stable@vger.kernel.org> # v5.15 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-22fuse: annotate lock in fuse_reverse_inval_entry()Miklos Szeredi1-1/+1
commit bda9a71980e083699a0360963c0135657b73f47a upstream. Add missing inode lock annotatation; found by syzbot. Reported-and-tested-by: syzbot+9f747458f5990eaa8d43@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-17fuse: make sure reclaim doesn't write the inodeMiklos Szeredi4-0/+27
commit 5c791fe1e2a4f401f819065ea4fc0450849f1818 upstream. In writeback cache mode mtime/ctime updates are cached, and flushed to the server using the ->write_inode() callback. Closing the file will result in a dirty inode being immediately written, but in other cases the inode can remain dirty after all references are dropped. This result in the inode being written back from reclaim, which can deadlock on a regular allocation while the request is being served. The usual mechanisms (GFP_NOFS/PF_MEMALLOC*) don't work for FUSE, because serving a request involves unrelated userspace process(es). Instead do the same as for dirty pages: make sure the inode is written before the last reference is gone. - fallocate(2)/copy_file_range(2): these call file_update_time() or file_modified(), so flush the inode before returning from the call - unlink(2), link(2) and rename(2): these call fuse_update_ctime(), so flush the ctime directly from this helper Reported-by: chenguanyou <chenguanyou@xiaomi.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: Ed Tsai <ed.tsai@mediatek.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01fuse: release pipe buf after last useMiklos Szeredi1-5/+5
commit 473441720c8616dfaf4451f9c7ea14f0eb5e5d65 upstream. Checking buf->flags should be done before the pipe_buf_release() is called on the pipe buffer, since releasing the buffer might modify the flags. This is exactly what page_cache_pipe_buf_release() does, and which results in the same VM_BUG_ON_PAGE(PageLRU(page)) that the original patch was trying to fix. Reported-by: Justin Forbes <jmforbes@linuxtx.org> Fixes: 712a951025c0 ("fuse: fix page stealing") Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-18fuse: fix page stealingMiklos Szeredi1-2/+12
commit 712a951025c0667ff00b25afc360f74e639dfabe upstream. It is possible to trigger a crash by splicing anon pipe bufs to the fuse device. The reason for this is that anon_pipe_buf_release() will reuse buf->page if the refcount is 1, but that page might have already been stolen and its flags modified (e.g. PG_lru added). This happens in the unlikely case of fuse_dev_splice_write() getting around to calling pipe_buf_release() after a page has been stolen, added to the page cache and removed from the page cache. Fix by calling pipe_buf_release() right after the page was inserted into the page cache. In this case the page has an elevated refcount so any release function will know that the page isn't reusable. Reported-by: Frank Dinoff <fdinoff@google.com> Link: https://lore.kernel.org/r/CAAmZXrsGg2xsP1CK+cbuEMumtrqdvD-NKnWzhNcvn71RV3c1yw@mail.gmail.com/ Fixes: dd3bb14f44a6 ("fuse: support splice() writing to fuse device") Cc: <stable@vger.kernel.org> # v2.6.35 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22fuse: fix use after free in fuse_read_interrupt()Miklos Szeredi1-2/+2
[ Upstream commit e1e71c168813564be0f6ea3d6740a059ca42d177 ] There is a potential race between fuse_read_interrupt() and fuse_request_end(). TASK1 in fuse_read_interrupt(): delete req->intr_entry (while holding fiq->lock) TASK2 in fuse_request_end(): req->intr_entry is empty -> skip fiq->lock wake up TASK3 TASK3 request is freed TASK1 in fuse_read_interrupt(): dereference req->in.h.unique ***BAM*** Fix by always grabbing fiq->lock if the request was ever interrupted (FR_INTERRUPTED set) thereby serializing with concurrent fuse_read_interrupt() calls. FR_INTERRUPTED is set before the request is queued on fiq->interrupts. Dequeing the request is done with list_del_init() but FR_INTERRUPTED is not cleared in this case. Reported-by: lijiazi <lijiazi@xiaomi.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-09-15fuse: flush extending writesMiklos Szeredi1-1/+1
commit 59bda8ecee2ffc6a602b7bf2b9e43ca669cdbdcd upstream. Callers of fuse_writeback_range() assume that the file is ready for modification by the server in the supplied byte range after the call returns. If there's a write that extends the file beyond the end of the supplied range, then the file needs to be extended to at least the end of the range, but currently that's not done. There are at least two cases where this can cause problems: - copy_file_range() will return short count if the file is not extended up to end of the source range. - FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE will not extend the file, hence the region may not be fully allocated. Fix by flushing writes from the start of the range up to the end of the file. This could be optimized if the writes are non-extending, etc, but it's probably not worth the trouble. Fixes: a2bc92362941 ("fuse: fix copy_file_range() in the writeback case") Fixes: 6b1bdb56b17c ("fuse: allow fallocate(FALLOC_FL_ZERO_RANGE)") Cc: <stable@vger.kernel.org> # v5.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-15fuse: truncate pagecache on atomic_o_truncMiklos Szeredi1-2/+5
commit 76224355db7570cbe6b6f75c8929a1558828dd55 upstream. fuse_finish_open() will be called with FUSE_NOWRITE in case of atomic O_TRUNC. This can deadlock with fuse_wait_on_page_writeback() in fuse_launder_page() triggered by invalidate_inode_pages2(). Fix by replacing invalidate_inode_pages2() in fuse_finish_open() with a truncate_pagecache() call. This makes sense regardless of FOPEN_KEEP_CACHE or fc->writeback cache, so do it unconditionally. Reported-by: Xie Yongji <xieyongji@bytedance.com> Reported-and-tested-by: syzbot+bea44a5189836d956894@syzkaller.appspotmail.com Fixes: e4648309b85a ("fuse: truncate pending writes on O_TRUNC") Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-08fuse: fix illegal access to inode with reused nodeidAmir Goldstein4-5/+15
commit 15db16837a35d8007cb8563358787412213db25e upstream. Server responds to LOOKUP and other ops (READDIRPLUS/CREATE/MKNOD/...) with ourarg containing nodeid and generation. If a fuse inode is found in inode cache with the same nodeid but different generation, the existing fuse inode should be unhashed and marked "bad" and a new inode with the new generation should be hashed instead. This can happen, for example, with passhrough fuse filesystem that returns the real filesystem ino/generation on lookup and where real inode numbers can get recycled due to real files being unlinked not via the fuse passthrough filesystem. With current code, this situation will not be detected and an old fuse dentry that used to point to an older generation real inode, can be used to access a completely new inode, which should be accessed only via the new dentry. Note that because the FORGET message carries the nodeid w/o generation, the server should wait to get FORGET counts for the nlookup counts of the old and reused inodes combined, before it can free the resources associated to that nodeid. Stable backport notes: * This is not a regression. The bug has been in fuse forever, but only a certain class of low level fuse filesystems can trigger this bug * Because there is no way to check if this fix is applied in runtime, libfuse test_examples.py tests this fix with hardcoded check for kernel version >= 5.14 * After backport to stable kernel(s), the libfuse test can be updated to also check minimal stable kernel version(s) * Depends on "fuse: fix bad inode" which is already applied to stable kernels v5.4.y and v5.10.y * Required backporting helper inode_wrong_type() Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxi8DymG=JO_sAU+wS8akFdzh+PuXwW3Ebgahd2Nwnh7zA@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-08new helper: inode_wrong_type()Al Viro3-5/+5
commit 6e3e2c4362e41a2f18e3f7a5ad81bd2f49a47b85 upstream. inode_wrong_type(inode, mode) returns true if setting inode->i_mode to given value would've changed the inode type. We have enough of those checks open-coded to make a helper worthwhile. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14fuse: reject internal errnoMiklos Szeredi1-1/+1
commit 49221cf86d18bb66fe95d3338cb33bd4b9880ca5 upstream. Don't allow userspace to report errors that could be kernel-internal. Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com> Fixes: 334f485df85a ("[PATCH] FUSE - device functions") Cc: <stable@vger.kernel.org> # v2.6.14 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14fuse: check connected before queueing on fpq->ioMiklos Szeredi1-0/+9
commit 80ef08670d4c28a06a3de954bd350368780bcfef upstream. A request could end up on the fpq->io list after fuse_abort_conn() has reset fpq->connected and aborted requests on that list: Thread-1 Thread-2 ======== ======== ->fuse_simple_request() ->shutdown ->__fuse_request_send() ->queue_request() ->fuse_abort_conn() ->fuse_dev_do_read() ->acquire(fpq->lock) ->wait_for(fpq->lock) ->set err to all req's in fpq->io ->release(fpq->lock) ->acquire(fpq->lock) ->add req to fpq->io After the userspace copy is done the request will be ended, but req->out.h.error will remain uninitialized. Also the copy might block despite being already aborted. Fix both issues by not allowing the request to be queued on the fpq->io list after fuse_abort_conn() has processed this list. Reported-by: Pradeep P V K <pragalla@codeaurora.org> Fixes: fd22d62ed0c3 ("fuse: no fc->lock for iqueue parts") Cc: <stable@vger.kernel.org> # v4.2 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14fuse: ignore PG_workingset after stealingMiklos Szeredi1-0/+1
commit b89ecd60d38ec042d63bdb376c722a16f92bcb88 upstream. Fix the "fuse: trying to steal weird page" warning. Description from Johannes Weiner: "Think of it as similar to PG_active. It's just another usage/heat indicator of file and anon pages on the reclaim LRU that, unlike PG_active, persists across deactivation and even reclaim (we store it in the page cache / swapper cache tree until the page refaults). So if fuse accepts pages that can legally have PG_active set, PG_workingset is fine too." Reported-by: Thomas Lindroth <thomas.lindroth@gmail.com> Fixes: 1899ad18c607 ("mm: workingset: tell cache transitions from workingset thrashing") Cc: <stable@vger.kernel.org> # v4.20 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14fuse: Fix infinite loop in sget_fc()Greg Kurz1-0/+11
commit e4a9ccdd1c03b3dc58214874399d24331ea0a3ab upstream. We don't set the SB_BORN flag on submounts. This is wrong as these superblocks are then considered as partially constructed or dying in the rest of the code and can break some assumptions. One such case is when you have a virtiofs filesystem with submounts and you try to mount it again : virtio_fs_get_tree() tries to obtain a superblock with sget_fc(). The logic in sget_fc() is to loop until it has either found an existing matching superblock with SB_BORN set or to create a brand new one. It is assumed that a superblock without SB_BORN is transient and the loop is restarted. Forgetting to set SB_BORN on submounts hence causes sget_fc() to retry forever. Setting SB_BORN requires special care, i.e. a write barrier for super_cache_count() which can check SB_BORN without taking any lock. We should call vfs_get_tree() to deal with that but this requires to have a proper ->get_tree() implementation for submounts, which is a bigger piece of work. Go for a simple bug fix in the meatime. Fixes: bf109c64040f ("fuse: implement crossmounts") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14fuse: Fix crash if superblock of submount gets killed earlyGreg Kurz1-4/+4
commit e3a43f2a95393000778f8f302d48795add2fc4a8 upstream. As soon as fuse_dentry_automount() does up_write(&sb->s_umount), the superblock can theoretically be killed. If this happens before the submount was added to the &fc->mounts list, fuse_mount_remove() later crashes in list_del_init() because it assumes the submount to be already there. Add the submount before dropping sb->s_umount to fix the inconsistency. It is okay to nest fc->killsb under sb->s_umount, we already do this on the ->kill_sb() path. Signed-off-by: Greg Kurz <groug@kaod.org> Fixes: bf109c64040f ("fuse: implement crossmounts") Cc: stable@vger.kernel.org # v5.10+ Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-07-14fuse: Fix crash in fuse_dentry_automount() error pathGreg Kurz1-1/+5
commit d92d88f0568e97c437eeb79d9c9609bd8277406f upstream. If fuse_fill_super_submount() returns an error, the error path triggers a crash: [ 26.206673] BUG: kernel NULL pointer dereference, address: 0000000000000000 [...] [ 26.226362] RIP: 0010:__list_del_entry_valid+0x25/0x90 [...] [ 26.247938] Call Trace: [ 26.248300] fuse_mount_remove+0x2c/0x70 [fuse] [ 26.248892] virtio_kill_sb+0x22/0x160 [virtiofs] [ 26.249487] deactivate_locked_super+0x36/0xa0 [ 26.250077] fuse_dentry_automount+0x178/0x1a0 [fuse] The crash happens because fuse_mount_remove() assumes that the FUSE mount was already added to list under the FUSE connection, but this only done after fuse_fill_super_submount() has returned success. This means that until fuse_fill_super_submount() has returned success, the FUSE mount isn't actually owned by the superblock. We should thus reclaim ownership by clearing sb->s_fs_info, which will skip the call to fuse_mount_remove(), and perform rollback, like virtio_fs_get_tree() already does for the root sb. Fixes: bf109c64040f ("fuse: implement crossmounts") Cc: stable@vger.kernel.org # v5.10+ Signed-off-by: Greg Kurz <groug@kaod.org> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-19cuse: prevent cloneMiklos Szeredi1-0/+2
[ Upstream commit 8217673d07256b22881127bf50dce874d0e51653 ] For cloned connections cuse_channel_release() will be called more than once, resulting in use after free. Prevent device cloning for CUSE, which does not make sense at this point, and highly unlikely to be used in real life. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19virtiofs: fix usernsMiklos Szeredi1-2/+1
[ Upstream commit 0a7419c68a45d2d066b996be5087aa2d07ce80eb ] get_user_ns() is done twice (once in virtio_fs_get_tree() and once in fuse_conn_init()), resulting in a reference leak. Also looks better to use fsc->user_ns (which *should* be the current_user_ns() at this point). Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-19fuse: invalidate attrs when page writeback completesVivek Goyal1-0/+9
[ Upstream commit 3466958beb31a8e9d3a1441a34228ed088b84f3e ] In fuse when a direct/write-through write happens we invalidate attrs because that might have updated mtime/ctime on server and cached mtime/ctime will be stale. What about page writeback path. Looks like we don't invalidate attrs there. To be consistent, invalidate attrs in writeback path as well. Only exception is when writeback_cache is enabled. In that case we strust local mtime/ctime and there is no need to invalidate attrs. Recently users started experiencing failure of xfstests generic/080, geneirc/215 and generic/614 on virtiofs. This happened only newer "stat" utility and not older one. This patch fixes the issue. So what's the root cause of the issue. Here is detailed explanation. generic/080 test does mmap write to a file, closes the file and then checks if mtime has been updated or not. When file is closed, it leads to flushing of dirty pages (and that should update mtime/ctime on server). But we did not explicitly invalidate attrs after writeback finished. Still generic/080 passed so far and reason being that we invalidated atime in fuse_readpages_end(). This is called in fuse_readahead() path and always seems to trigger before mmaped write. So after mmaped write when lstat() is called, it sees that atleast one of the fields being asked for is invalid (atime) and that results in generating GETATTR to server and mtime/ctime also get updated and test passes. But newer /usr/bin/stat seems to have moved to using statx() syscall now (instead of using lstat()). And statx() allows it to query only ctime or mtime (and not rest of the basic stat fields). That means when querying for mtime, fuse_update_get_attr() sees that mtime is not invalid (only atime is invalid). So it does not generate a new GETATTR and fill stat with cached mtime/ctime. And that means updated mtime is not seen by xfstest and tests start failing. Invalidating attrs after writeback completion should solve this problem in a generic manner. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-05-11fuse: fix write deadlockVivek Goyal2-12/+30
commit 4f06dd92b5d0a6f8eec6a34b8d6ef3e1f4ac1e10 upstream. There are two modes for write(2) and friends in fuse: a) write through (update page cache, send sync WRITE request to userspace) b) buffered write (update page cache, async writeout later) The write through method kept all the page cache pages locked that were used for the request. Keeping more than one page locked is deadlock prone and Qian Cai demonstrated this with trinity fuzzing. The reason for keeping the pages locked is that concurrent mapped reads shouldn't try to pull possibly stale data into the page cache. For full page writes, the easy way to fix this is to make the cached page be the authoritative source by marking the page PG_uptodate immediately. After this the page can be safely unlocked, since mapped/cached reads will take the written data from the cache. Concurrent mapped writes will now cause data in the original WRITE request to be updated; this however doesn't cause any data inconsistency and this scenario should be exceedingly rare anyway. If the WRITE request returns with an error in the above case, currently the page is not marked uptodate; this means that a concurrent read will always read consistent data. After this patch the page is uptodate between writing to the cache and receiving the error: there's window where a cached read will read the wrong data. While theoretically this could be a regression, it is unlikely to be one in practice, since this is normal for buffered writes. In case of a partial page write to an already uptodate page the locking is also unnecessary, with the above caveats. Partial write of a not uptodate page still needs to be handled. One way would be to read the complete page before doing the write. This is not possible, since it might break filesystems that don't expect any READ requests when the file was opened O_WRONLY. The other solution is to serialize the synchronous write with reads from the partial pages. The easiest way to do this is to keep the partial pages locked. The problem is that a write() may involve two such pages (one head and one tail). This patch fixes it by only locking the partial tail page. If there's a partial head page as well, then split that off as a separate WRITE request. Reported-by: Qian Cai <cai@lca.pw> Link: https://lore.kernel.org/linux-fsdevel/4794a3fa3742a5e84fb0f934944204b55730829b.camel@lca.pw/ Fixes: ea9b9907b82a ("fuse: implement perform_write") Cc: <stable@vger.kernel.org> # v2.6.26 Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-11virtiofs: fix memory leak in virtio_fs_probe()Luis Henriques1-0/+1
commit c79c5e0178922a9e092ec8fed026750f39dcaef4 upstream. When accidentally passing twice the same tag to qemu, kmemleak ended up reporting a memory leak in virtiofs. Also, looking at the log I saw the following error (that's when I realised the duplicated tag): virtiofs: probe of virtio5 failed with error -17 Here's the kmemleak log for reference: unreferenced object 0xffff888103d47800 (size 1024): comm "systemd-udevd", pid 118, jiffies 4294893780 (age 18.340s) hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 80 90 02 a0 ff ff ff ff ................ backtrace: [<000000000ebb87c1>] virtio_fs_probe+0x171/0x7ae [virtiofs] [<00000000f8aca419>] virtio_dev_probe+0x15f/0x210 [<000000004d6baf3c>] really_probe+0xea/0x430 [<00000000a6ceeac8>] device_driver_attach+0xa8/0xb0 [<00000000196f47a7>] __driver_attach+0x98/0x140 [<000000000b20601d>] bus_for_each_dev+0x7b/0xc0 [<00000000399c7b7f>] bus_add_driver+0x11b/0x1f0 [<0000000032b09ba7>] driver_register+0x8f/0xe0 [<00000000cdd55998>] 0xffffffffa002c013 [<000000000ea196a2>] do_one_initcall+0x64/0x2e0 [<0000000008f727ce>] do_init_module+0x5c/0x260 [<000000003cdedab6>] __do_sys_finit_module+0xb5/0x120 [<00000000ad2f48c6>] do_syscall_64+0x33/0x40 [<00000000809526b5>] entry_SYSCALL_64_after_hwframe+0x44/0xae Cc: stable@vger.kernel.org Signed-off-by: Luis Henriques <lhenriques@suse.de> Fixes: a62a8ef9d97d ("virtio-fs: add virtiofs filesystem") Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-04-07virtiofs: Fail dax mount if device does not support itVivek Goyal1-1/+8
[ Upstream commit 3f9b9efd82a84f27e95d0414f852caf1fa839e83 ] Right now "mount -t virtiofs -o dax myfs /mnt/virtiofs" succeeds even if filesystem deivce does not have a cache window and hence DAX can't be supported. This gives a false sense to user that they are using DAX with virtiofs but fact of the matter is that they are not. Fix this by returning error if dax can't be supported and user has asked for it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-20fuse: fix live lock in fuse_iget()Amir Goldstein1-0/+1
commit 775c5033a0d164622d9d10dd0f0a5531639ed3ed upstream. Commit 5d069dbe8aaf ("fuse: fix bad inode") replaced make_bad_inode() in fuse_iget() with a private implementation fuse_make_bad(). The private implementation fails to remove the bad inode from inode cache, so the retry loop with iget5_locked() finds the same bad inode and marks it bad forever. kmsg snip: [ ] rcu: INFO: rcu_sched self-detected stall on CPU ... [ ] ? bit_wait_io+0x50/0x50 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? find_inode.isra.32+0x60/0xb0 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5_nowait+0x65/0x90 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5.part.36+0x2e/0x80 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? fuse_inode_eq+0x20/0x20 [ ] iget5_locked+0x21/0x80 [ ] ? fuse_inode_eq+0x20/0x20 [ ] fuse_iget+0x96/0x1b0 Fixes: 5d069dbe8aaf ("fuse: fix bad inode") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-09fuse: fix bad inodeMiklos Szeredi7-17/+74
[ Upstream commit 5d069dbe8aaf2a197142558b6fb2978189ba3454 ] Jan Kara's analysis of the syzbot report (edited): The reproducer opens a directory on FUSE filesystem, it then attaches dnotify mark to the open directory. After that a fuse_do_getattr() call finds that attributes returned by the server are inconsistent, and calls make_bad_inode() which, among other things does: inode->i_mode = S_IFREG; This then confuses dnotify which doesn't tear down its structures properly and eventually crashes. Avoid calling make_bad_inode() on a live inode: switch to a private flag on the fuse inode. Also add the test to ops which the bad_inode_ops would have caught. This bug goes back to the initial merge of fuse in 2.6.14... Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30virtiofs fix leak in setupMiklos Szeredi1-0/+2
[ Upstream commit 66ab33bf6d4341574f88b511e856a73f6f2a921e ] This can be triggered for example by adding the "-omand" mount option, which will be rejected and virtio_fs_fill_super() will return an error. In such a case the allocations for fuse_conn and fuse_mount will leak due to s_root not yet being set and so ->put_super() not being called. Fixes: a62a8ef9d97d ("virtio-fs: add virtiofs filesystem") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-20Merge tag 'fuse-update-5.10' of ↵Linus Torvalds13-484/+2606
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse updates from Miklos Szeredi: - Support directly accessing host page cache from virtiofs. This can improve I/O performance for various workloads, as well as reducing the memory requirement by eliminating double caching. Thanks to Vivek Goyal for doing most of the work on this. - Allow automatic submounting inside virtiofs. This allows unique st_dev/ st_ino values to be assigned inside the guest to files residing on different filesystems on the host. Thanks to Max Reitz for the patches. - Fix an old use after free bug found by Pradeep P V K. * tag 'fuse-update-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits) virtiofs: calculate number of scatter-gather elements accurately fuse: connection remove fix fuse: implement crossmounts fuse: Allow fuse_fill_super_common() for submounts fuse: split fuse_mount off of fuse_conn fuse: drop fuse_conn parameter where possible fuse: store fuse_conn in fuse_req fuse: add submount support to <uapi/linux/fuse.h> fuse: fix page dereference after free virtiofs: add logic to free up a memory range virtiofs: maintain a list of busy elements virtiofs: serialize truncate/punch_hole and dax fault path virtiofs: define dax address space operations virtiofs: add DAX mmap support virtiofs: implement dax read/write operations virtiofs: introduce setupmapping/removemapping commands virtiofs: implement FUSE_INIT map_alignment field virtiofs: keep a list of free dax memory ranges virtiofs: add a mount option to enable dax virtiofs: set up virtio_fs dax_device ...
2020-10-14virtiofs: calculate number of scatter-gather elements accuratelyVivek Goyal1-5/+27
virtiofs currently maps various buffers in scatter gather list and it looks at number of pages (ap->pages) and assumes that same number of pages will be used both for input and output (sg_count_fuse_req()), and calculates total number of scatterlist elements accordingly. But looks like this assumption is not valid in all the cases. For example, Cai Qian reported that trinity, triggers warning with virtiofs sometimes. A closer look revealed that if one calls ioctl(fd, 0x5a004000, buf), it will trigger following warning. WARN_ON(out_sgs + in_sgs != total_sgs) In this case, total_sgs = 8, out_sgs=4, in_sgs=3. Number of pages is 2 (ap->pages), but out_sgs are using both the pages but in_sgs are using only one page. In this case, fuse_do_ioctl() sets different size values for input and output. args->in_args[args->in_numargs - 1].size == 6656 args->out_args[args->out_numargs - 1].size == 4096 So current method of calculating how many scatter-gather list elements will be used is not accurate. Make calculations more precise by parsing size and ap->descs. Reported-by: Qian Cai <cai@redhat.com> Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: https://lore.kernel.org/linux-fsdevel/5ea77e9f6cb8c2db43b09fbd4158ab2d8c066a0a.camel@redhat.com/ Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-10-13Merge tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-blockLinus Torvalds1-2/+2
Pull block updates from Jens Axboe: - Series of merge handling cleanups (Baolin, Christoph) - Series of blk-throttle fixes and cleanups (Baolin) - Series cleaning up BDI, seperating the block device from the backing_dev_info (Christoph) - Removal of bdget() as a generic API (Christoph) - Removal of blkdev_get() as a generic API (Christoph) - Cleanup of is-partition checks (Christoph) - Series reworking disk revalidation (Christoph) - Series cleaning up bio flags (Christoph) - bio crypt fixes (Eric) - IO stats inflight tweak (Gabriel) - blk-mq tags fixes (Hannes) - Buffer invalidation fixes (Jan) - Allow soft limits for zone append (Johannes) - Shared tag set improvements (John, Kashyap) - Allow IOPRIO_CLASS_RT for CAP_SYS_NICE (Khazhismel) - DM no-wait support (Mike, Konstantin) - Request allocation improvements (Ming) - Allow md/dm/bcache to use IO stat helpers (Song) - Series improving blk-iocost (Tejun) - Various cleanups (Geert, Damien, Danny, Julia, Tetsuo, Tian, Wang, Xianting, Yang, Yufen, yangerkun) * tag 'block-5.10-2020-10-12' of git://git.kernel.dk/linux-block: (191 commits) block: fix uapi blkzoned.h comments blk-mq: move cancel of hctx->run_work to the front of blk_exit_queue blk-mq: get rid of the dead flush handle code path block: get rid of unnecessary local variable block: fix comment and add lockdep assert blk-mq: use helper function to test hw stopped block: use helper function to test queue register block: remove redundant mq check block: invoke blk_mq_exit_sched no matter whether have .exit_sched percpu_ref: don't refer to ref->data if it isn't allocated block: ratelimit handle_bad_sector() message blk-throttle: Re-use the throtl_set_slice_end() blk-throttle: Open code __throtl_de/enqueue_tg() blk-throttle: Move service tree validation out of the throtl_rb_first() blk-throttle: Move the list operation after list validation blk-throttle: Fix IO hang for a corner case blk-throttle: Avoid tracking latency if low limit is invalid blk-throttle: Avoid getting the current time if tg->last_finish_time is 0 blk-throttle: Remove a meaningless parameter for throtl_downgrade_state() block: Remove redundant 'return' statement ...
2020-10-12fuse: connection remove fixMiklos Szeredi1-0/+7
Re-add lost removal of fc from fuse_conn_list and the control filesystem. Reported-by: kernel test robot <rong.a.chen@intel.com> Fixes: fcee216beb9c ("fuse: split fuse_mount off of fuse_conn") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-10-09fuse: implement crossmountsMax Reitz4-3/+105
FUSE servers can indicate crossmount points by setting FUSE_ATTR_SUBMOUNT in fuse_attr.flags. The inode will then be marked as S_AUTOMOUNT, and the .d_automount implementation creates a new submount at that location, so that the submount gets a distinct st_dev value. Note that all submounts get a distinct superblock and a distinct st_dev value, so for virtio-fs, even if the same filesystem is mounted more than once on the host, none of its mount points will have the same st_dev. We need distinct superblocks because the superblock points to the root node, but the different host mounts may show different trees (e.g. due to submounts in some of them, but not in others). Right now, this behavior is only enabled when fuse_conn.auto_submounts is set, which is the case only for virtio-fs. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-24bdi: invert BDI_CAP_NO_ACCT_WBChristoph Hellwig1-1/+2
Replace BDI_CAP_NO_ACCT_WB with a positive BDI_CAP_WRITEBACK_ACCT to make the checks more obvious. Also remove the pointless bdi_cap_account_writeback wrapper that just obsfucates the check. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-24bdi: initialize ->ra_pages and ->io_pages in bdi_initChristoph Hellwig1-1/+0
Set up a readahead size by default, as very few users have a good reason to change it. This means code, ecryptfs, and orangefs now set up the values while they were previously missing it, while ubifs, mtd and vboxsf manually set it to 0 to avoid readahead. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: David Sterba <dsterba@suse.com> [btrfs] Acked-by: Richard Weinberger <richard@nod.at> [ubifs, mtd] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-09-18fuse: Allow fuse_fill_super_common() for submountsMax Reitz2-21/+96
Submounts have their own superblock, which needs to be initialized. However, they do not have a fuse_fs_context associated with them, and the root node's attributes should be taken from the mountpoint's node. Extend fuse_fill_super_common() to work for submounts by making the @ctx parameter optional, and by adding a @submount_finode parameter. (There is a plain "unsigned" in an existing code block that is being indented by this commit. Extend it to "unsigned int" so checkpatch does not complain.) Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18fuse: split fuse_mount off of fuse_connMax Reitz11-358/+526
We want to allow submounts for the same fuse_conn, but with different superblocks so that each of the submounts has its own device ID. To do so, we need to split all mount-specific information off of fuse_conn into a new fuse_mount structure, so that multiple mounts can share a single fuse_conn. We need to take care only to perform connection-level actions once (i.e. when the fuse_conn and thus the first fuse_mount are established, or when the last fuse_mount and thus the fuse_conn are destroyed). For example, fuse_sb_destroy() must invoke fuse_send_destroy() until the last superblock is released. To do so, we keep track of which fuse_mount is the root mount and perform all fuse_conn-level actions only when this fuse_mount is involved. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18fuse: drop fuse_conn parameter where possibleMax Reitz3-37/+43
With the last commit, all functions that handle some existing fuse_req no longer need to be given the associated fuse_conn, because they can get it from the fuse_req object. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18fuse: store fuse_conn in fuse_reqMax Reitz2-6/+10
Every fuse_req belongs to a fuse_conn. Right now, we always know which fuse_conn that is based on the respective device, but we want to allow multiple (sub)mounts per single connection, and then the corresponding filesystem is not going to be so trivial to obtain. Storing a pointer to the associated fuse_conn in every fuse_req will allow us to trivially find any request's superblock (and thus filesystem) even then. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18fuse: fix page dereference after freeMiklos Szeredi1-10/+18
After unlock_request() pages from the ap->pages[] array may be put (e.g. by aborting the connection) and the pages can be freed. Prevent use after free by grabbing a reference to the page before calling unlock_request(). The original patch was created by Pradeep P V K. Reported-by: Pradeep P V K <ppvk@codeaurora.org> Cc: <stable@vger.kernel.org> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-18fuse: fix the ->direct_IO() treatment of iov_iterAl Viro1-13/+12
the callers rely upon having any iov_iter_truncate() done inside ->direct_IO() countered by iov_iter_reexpand(). Reported-by: Qian Cai <cai@redhat.com> Tested-by: Qian Cai <cai@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2020-09-10virtiofs: add logic to free up a memory rangeVivek Goyal3-6/+524
Add logic to free up a busy memory range. Freed memory range will be returned to free pool. Add a worker which can be started to select and free some busy memory ranges. Process can also steal one of its busy dax ranges if free range is not available. I will refer it to as direct reclaim. If free range is not available and nothing can't be stolen from same inode, caller waits on a waitq for free range to become available. For reclaiming a range, as of now we need to hold following locks in specified order. down_write(&fi->i_mmap_sem); down_write(&fi->dax->sem); We look for a free range in following order. A. Try to get a free range. B. If not, try direct reclaim. C. If not, wait for a memory range to become free Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: maintain a list of busy elementsVivek Goyal1-0/+25
This list will be used selecting fuse_dax_mapping to free when number of free mappings drops below a threshold. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: serialize truncate/punch_hole and dax fault pathVivek Goyal5-10/+110
Currently in fuse we don't seem have any lock which can serialize fault path with truncate/punch_hole path. With dax support I need one for following reasons. 1. Dax requirement DAX fault code relies on inode size being stable for the duration of fault and want to serialize with truncate/punch_hole and they explicitly mention it. static vm_fault_t dax_iomap_pmd_fault(struct vm_fault *vmf, pfn_t *pfnp, const struct iomap_ops *ops) /* * Check whether offset isn't beyond end of file now. Caller is * supposed to hold locks serializing us with truncate / punch hole so * this is a reliable test. */ max_pgoff = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE); 2. Make sure there are no users of pages being truncated/punch_hole get_user_pages() might take references to page and then do some DMA to said pages. Filesystem might truncate those pages without knowing that a DMA is in progress or some I/O is in progress. So use dax_layout_busy_page() to make sure there are no such references and I/O is not in progress on said pages before moving ahead with truncation. 3. Limitation of kvm page fault error reporting If we are truncating file on host first and then removing mappings in guest lateter (truncate page cache etc), then this could lead to a problem with KVM. Say a mapping is in place in guest and truncation happens on host. Now if guest accesses that mapping, then host will take a fault and kvm will either exit to qemu or spin infinitely. IOW, before we do truncation on host, we need to make sure that guest inode does not have any mapping in that region or whole file. 4. virtiofs memory range reclaim Soon I will introduce the notion of being able to reclaim dax memory ranges from a fuse dax inode. There also I need to make sure that no I/O or fault is going on in the reclaimed range and nobody is using it so that range can be reclaimed without issues. Currently if we take inode lock, that serializes read/write. But it does not do anything for faults. So I add another semaphore fuse_inode->i_mmap_sem for this purpose. It can be used to serialize with faults. As of now, I am adding taking this semaphore only in dax fault path and not regular fault path because existing code does not have one. May be existing code can benefit from it as well to take care of some races, but that we can fix later if need be. For now, I am just focussing only on DAX path which is new path. Also added logic to take fuse_inode->i_mmap_sem in truncate/punch_hole/open(O_TRUNC) path to make sure file truncation and fuse dax fault are mutually exlusive and avoid all the above problems. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Cc: Dave Chinner <david@fromorbit.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: define dax address space operationsVivek Goyal1-0/+18
This is done along the lines of ext4 and xfs. I primarily wanted ->writepages hook at this time so that I could call into dax_writeback_mapping_range(). This in turn will decide which pfns need to be written back. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2020-09-10virtiofs: add DAX mmap supportStefan Hajnoczi2-0/+64
Add DAX mmap() support. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>