diff options
Diffstat (limited to 'Documentation/vm')
-rw-r--r-- | Documentation/vm/hmm.rst | 36 | ||||
-rw-r--r-- | Documentation/vm/memory-model.rst | 9 | ||||
-rw-r--r-- | Documentation/vm/page_owner.rst | 3 | ||||
-rw-r--r-- | Documentation/vm/slub.rst | 2 | ||||
-rw-r--r-- | Documentation/vm/transhuge.rst | 4 |
5 files changed, 22 insertions, 32 deletions
diff --git a/Documentation/vm/hmm.rst b/Documentation/vm/hmm.rst index 4e3e9362afeb..6f9e000757fa 100644 --- a/Documentation/vm/hmm.rst +++ b/Documentation/vm/hmm.rst @@ -161,7 +161,7 @@ device must complete the update before the driver callback returns. When the device driver wants to populate a range of virtual addresses, it can use:: - long hmm_range_fault(struct hmm_range *range); + int hmm_range_fault(struct hmm_range *range); It will trigger a page fault on missing or read-only entries if write access is requested (see below). Page faults use the generic mm page fault code path just @@ -184,25 +184,22 @@ The usage pattern is:: range.notifier = &interval_sub; range.start = ...; range.end = ...; - range.pfns = ...; - range.flags = ...; - range.values = ...; - range.pfn_shift = ...; + range.hmm_pfns = ...; if (!mmget_not_zero(interval_sub->notifier.mm)) return -EFAULT; again: range.notifier_seq = mmu_interval_read_begin(&interval_sub); - down_read(&mm->mmap_sem); + mmap_read_lock(mm); ret = hmm_range_fault(&range); if (ret) { - up_read(&mm->mmap_sem); + mmap_read_unlock(mm); if (ret == -EBUSY) goto again; return ret; } - up_read(&mm->mmap_sem); + mmap_read_unlock(mm); take_lock(driver->update); if (mmu_interval_read_retry(&ni, range.notifier_seq) { @@ -229,15 +226,10 @@ The hmm_range struct has 2 fields, default_flags and pfn_flags_mask, that specif fault or snapshot policy for the whole range instead of having to set them for each entry in the pfns array. -For instance, if the device flags for range.flags are:: +For instance if the device driver wants pages for a range with at least read +permission, it sets:: - range.flags[HMM_PFN_VALID] = (1 << 63); - range.flags[HMM_PFN_WRITE] = (1 << 62); - -and the device driver wants pages for a range with at least read permission, -it sets:: - - range->default_flags = (1 << 63); + range->default_flags = HMM_PFN_REQ_FAULT; range->pfn_flags_mask = 0; and calls hmm_range_fault() as described above. This will fill fault all pages @@ -246,18 +238,18 @@ in the range with at least read permission. Now let's say the driver wants to do the same except for one page in the range for which it wants to have write permission. Now driver set:: - range->default_flags = (1 << 63); - range->pfn_flags_mask = (1 << 62); - range->pfns[index_of_write] = (1 << 62); + range->default_flags = HMM_PFN_REQ_FAULT; + range->pfn_flags_mask = HMM_PFN_REQ_WRITE; + range->pfns[index_of_write] = HMM_PFN_REQ_WRITE; With this, HMM will fault in all pages with at least read (i.e., valid) and for the address == range->start + (index_of_write << PAGE_SHIFT) it will fault with write permission i.e., if the CPU pte does not have write permission set then HMM will call handle_mm_fault(). -Note that HMM will populate the pfns array with write permission for any page -that is mapped with CPU write permission no matter what values are set -in default_flags or pfn_flags_mask. +After hmm_range_fault completes the flag bits are set to the current state of +the page tables, ie HMM_PFN_VALID | HMM_PFN_WRITE will be set if the page is +writable. Represent and manage device memory from core kernel point of view diff --git a/Documentation/vm/memory-model.rst b/Documentation/vm/memory-model.rst index 58a12376b7df..91228044ed16 100644 --- a/Documentation/vm/memory-model.rst +++ b/Documentation/vm/memory-model.rst @@ -46,11 +46,10 @@ maps the entire physical memory. For most architectures, the holes have entries in the `mem_map` array. The `struct page` objects corresponding to the holes are never fully initialized. -To allocate the `mem_map` array, architecture specific setup code -should call :c:func:`free_area_init_node` function or its convenience -wrapper :c:func:`free_area_init`. Yet, the mappings array is not -usable until the call to :c:func:`memblock_free_all` that hands all -the memory to the page allocator. +To allocate the `mem_map` array, architecture specific setup code should +call :c:func:`free_area_init` function. Yet, the mappings array is not +usable until the call to :c:func:`memblock_free_all` that hands all the +memory to the page allocator. If an architecture enables `CONFIG_ARCH_HAS_HOLES_MEMORYMODEL` option, it may free parts of the `mem_map` array that do not cover the diff --git a/Documentation/vm/page_owner.rst b/Documentation/vm/page_owner.rst index 0ed5ab8c7ab4..079f3f8c4784 100644 --- a/Documentation/vm/page_owner.rst +++ b/Documentation/vm/page_owner.rst @@ -83,8 +83,7 @@ Usage 4) Analyze information from page owner:: cat /sys/kernel/debug/page_owner > page_owner_full.txt - grep -v ^PFN page_owner_full.txt > page_owner.txt - ./page_owner_sort page_owner.txt sorted_page_owner.txt + ./page_owner_sort page_owner_full.txt sorted_page_owner.txt See the result about who allocated each page in the ``sorted_page_owner.txt``. diff --git a/Documentation/vm/slub.rst b/Documentation/vm/slub.rst index 933ada4368ff..4eee598555c9 100644 --- a/Documentation/vm/slub.rst +++ b/Documentation/vm/slub.rst @@ -49,7 +49,7 @@ Possible debug options are:: P Poisoning (object and padding) U User tracking (free and alloc) T Trace (please only use on single slabs) - A Toggle failslab filter mark for the cache + A Enable failslab filter mark for the cache O Switch debugging off for caches that would have caused higher minimum slab orders - Switch all debugging off (useful if the kernel is diff --git a/Documentation/vm/transhuge.rst b/Documentation/vm/transhuge.rst index 37c57ca32629..0ed23e59abe5 100644 --- a/Documentation/vm/transhuge.rst +++ b/Documentation/vm/transhuge.rst @@ -98,9 +98,9 @@ split_huge_page() or split_huge_pmd() has a cost. To make pagetable walks huge pmd aware, all you need to do is to call pmd_trans_huge() on the pmd returned by pmd_offset. You must hold the -mmap_sem in read (or write) mode to be sure a huge pmd cannot be +mmap_lock in read (or write) mode to be sure a huge pmd cannot be created from under you by khugepaged (khugepaged collapse_huge_page -takes the mmap_sem in write mode in addition to the anon_vma lock). If +takes the mmap_lock in write mode in addition to the anon_vma lock). If pmd_trans_huge returns false, you just fallback in the old code paths. If instead pmd_trans_huge returns true, you have to take the page table lock (pmd_lock()) and re-run pmd_trans_huge. Taking the |