summaryrefslogtreecommitdiff
path: root/kernel/sched/stop_task.c
diff options
context:
space:
mode:
authorCarlos Llamas <cmllamas@google.com>2023-12-01 20:21:57 +0300
committerGreg Kroah-Hartman <gregkh@linuxfoundation.org>2023-12-05 03:23:41 +0300
commit7710e2cca32e7f3958480e8bd44f50e29d0c2509 (patch)
treedd750a11cdc5c1fd884f653c7c7437561c65a16a /kernel/sched/stop_task.c
parente50f4e6cc9bfaca655d3b6a3506d27cf2caa1d40 (diff)
downloadlinux-7710e2cca32e7f3958480e8bd44f50e29d0c2509.tar.xz
binder: switch alloc->mutex to spinlock_t
The alloc->mutex is a highly contended lock that causes performance issues on Android devices. When a low-priority task is given this lock and it sleeps, it becomes difficult for the task to wake up and complete its work. This delays other tasks that are also waiting on the mutex. The problem gets worse when there is memory pressure in the system, because this increases the contention on the alloc->mutex while the shrinker reclaims binder pages. Switching to a spinlock helps to keep the waiters running and avoids the overhead of waking up tasks. This significantly improves the transaction latency when the problematic scenario occurs. The performance impact of this patchset was measured by stress-testing the binder alloc contention. In this test, several clients of different priorities send thousands of transactions of different sizes to a single server. In parallel, pages get reclaimed using the shinker's debugfs. The test was run on a Pixel 8, Pixel 6 and qemu machine. The results were similar on all three devices: after: | sched | prio | average | max | min | |--------+------+---------+-----------+---------| | fifo | 99 | 0.135ms | 1.197ms | 0.022ms | | fifo | 01 | 0.136ms | 5.232ms | 0.018ms | | other | -20 | 0.180ms | 7.403ms | 0.019ms | | other | 19 | 0.241ms | 58.094ms | 0.018ms | before: | sched | prio | average | max | min | |--------+------+---------+-----------+---------| | fifo | 99 | 0.350ms | 248.730ms | 0.020ms | | fifo | 01 | 0.357ms | 248.817ms | 0.024ms | | other | -20 | 0.399ms | 249.906ms | 0.020ms | | other | 19 | 0.477ms | 297.756ms | 0.022ms | The key metrics above are the average and max latencies (wall time). These improvements should roughly translate to p95-p99 latencies on real workloads. The response time is up to 200x faster in these scenarios and there is no penalty in the regular path. Note that it is only possible to convert this lock after a series of changes made by previous patches. These mainly include refactoring the sections that might_sleep() and changing the locking order with the mmap_lock amongst others. Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://lore.kernel.org/r/20231201172212.1813387-29-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Diffstat (limited to 'kernel/sched/stop_task.c')
0 files changed, 0 insertions, 0 deletions