Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/RCU/checklist.rst | 55
-rw-r--r-- Documentation/bpf/index.rst | 13
-rw-r--r-- Documentation/bpf/libbpf/libbpf.rst | 14
-rw-r--r-- Documentation/bpf/libbpf/libbpf_api.rst | 27
-rw-r--r-- Documentation/bpf/libbpf/libbpf_build.rst | 37
-rw-r--r-- Documentation/bpf/libbpf/libbpf_naming_convention.rst | 162
-rw-r--r-- Documentation/networking/af_xdp.rst | 32
7 files changed, 303 insertions, 37 deletions
diff --git a/Documentation/RCU/checklist.rst b/Documentation/RCU/checklist.rst
index 1030119294d0..01cc21f17f7b 100644
--- a/Documentation/RCU/checklist.rst
+++ b/Documentation/RCU/checklist.rst
@@ -211,27 +211,40 @@ over a rather long period of time, but improvements are always welcome!
of the system, especially to real-time workloads running on
the rest of the system.
-7. As of v4.20, a given kernel implements only one RCU flavor,
- which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
- If the updater uses call_rcu() or synchronize_rcu(),
- then the corresponding readers may use rcu_read_lock() and
- rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
- or any pair of primitives that disables and re-enables preemption,
- for example, rcu_read_lock_sched() and rcu_read_unlock_sched().
- If the updater uses synchronize_srcu() or call_srcu(),
- then the corresponding readers must use srcu_read_lock() and
- srcu_read_unlock(), and with the same srcu_struct. The rules for
- the expedited primitives are the same as for their non-expedited
- counterparts. Mixing things up will result in confusion and
- broken kernels, and has even resulted in an exploitable security
- issue.
-
- One exception to this rule: rcu_read_lock() and rcu_read_unlock()
- may be substituted for rcu_read_lock_bh() and rcu_read_unlock_bh()
- in cases where local bottom halves are already known to be
- disabled, for example, in irq or softirq context. Commenting
- such cases is a must, of course! And the jury is still out on
- whether the increased speed is worth it.
+7. As of v4.20, a given kernel implements only one RCU flavor, which
+ is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
+ If the updater uses call_rcu() or synchronize_rcu(), then
+ the corresponding readers may use: (1) rcu_read_lock() and
+ rcu_read_unlock(), (2) any pair of primitives that disables
+ and re-enables softirq, for example, rcu_read_lock_bh() and
+ rcu_read_unlock_bh(), or (3) any pair of primitives that disables
+ and re-enables preemption, for example, rcu_read_lock_sched() and
+ rcu_read_unlock_sched(). If the updater uses synchronize_srcu()
+ or call_srcu(), then the corresponding readers must use
+ srcu_read_lock() and srcu_read_unlock(), and with the same
+ srcu_struct. The rules for the expedited RCU grace-period-wait
+ primitives are the same as for their non-expedited counterparts.
+
+ If the updater uses call_rcu_tasks() or synchronize_rcu_tasks(),
+ then the readers must refrain from executing voluntary
+ context switches, that is, from blocking. If the updater uses
+ call_rcu_tasks_trace() or synchronize_rcu_tasks_trace(), then
+ the corresponding readers must use rcu_read_lock_trace() and
+ rcu_read_unlock_trace(). If an updater uses call_rcu_tasks_rude()
+ or synchronize_rcu_tasks_rude(), then the corresponding readers
+ must use anything that disables interrupts.
+
+ Mixing things up will result in confusion and broken kernels, and
+ has even resulted in an exploitable security issue. Therefore,
+ when using non-obvious pairs of primitives, commenting is
+ of course a must. One example of non-obvious pairing is
+ the XDP feature in networking, which calls BPF programs from
+ network-driver NAPI (softirq) context. BPF relies heavily on RCU
+ protection for its data structures, but because the BPF program
+ invocation happens entirely within a single local_bh_disable()
+ section in a NAPI poll cycle, this usage is safe. The reason
+ that this usage is safe is that readers can use anything that
+ disables BH when updaters use call_rcu() or synchronize_rcu().
8. Although synchronize_rcu() is slower than is call_rcu(), it
usually results in simpler code. So, unless update performance is
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index 93e8cf12a6d4..baea6c2abba5 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -12,6 +12,19 @@ BPF instruction-set.
The Cilium project also maintains a `BPF and XDP Reference Guide`_
that goes into great technical depth about the BPF Architecture.
+libbpf
+======
+
+Libbpf is a userspace library for loading and interacting with bpf programs.
+
+.. toctree::
+ :maxdepth: 1
+
+ libbpf/libbpf
+ libbpf/libbpf_api
+ libbpf/libbpf_build
+ libbpf/libbpf_naming_convention
+
BPF Type Format (BTF)
=====================
diff --git a/Documentation/bpf/libbpf/libbpf.rst b/Documentation/bpf/libbpf/libbpf.rst
new file mode 100644
index 000000000000..1b1e61d5ead1
--- /dev/null
+++ b/Documentation/bpf/libbpf/libbpf.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+libbpf
+======
+
+This is documentation for libbpf, a userspace library for loading and
+interacting with bpf programs.
+
+All general BPF questions, including kernel functionality, libbpf APIs and
+their application, should be sent to bpf@vger.kernel.org mailing list.
+You can `subscribe <http://vger.kernel.org/vger-lists.html#bpf>`_ to the
+mailing list and search its `archive <https://lore.kernel.org/bpf/>`_.
+Please search the archive before asking new questions. It very well might
+be that this was already addressed or answered before.
diff --git a/Documentation/bpf/libbpf/libbpf_api.rst b/Documentation/bpf/libbpf/libbpf_api.rst
new file mode 100644
index 000000000000..f07eecd054da
--- /dev/null
+++ b/Documentation/bpf/libbpf/libbpf_api.rst
@@ -0,0 +1,27 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+API
+===
+
+This documentation is autogenerated from header files in libbpf, tools/lib/bpf
+
+.. kernel-doc:: tools/lib/bpf/libbpf.h
+ :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf.h
+ :internal:
+
+.. kernel-doc:: tools/lib/bpf/btf.h
+ :internal:
+
+.. kernel-doc:: tools/lib/bpf/xsk.h
+ :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf_tracing.h
+ :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf_core_read.h
+ :internal:
+
+.. kernel-doc:: tools/lib/bpf/bpf_endian.h
+ :internal:
\ No newline at end of file
diff --git a/Documentation/bpf/libbpf/libbpf_build.rst b/Documentation/bpf/libbpf/libbpf_build.rst
new file mode 100644
index 000000000000..8e8c23e8093d
--- /dev/null
+++ b/Documentation/bpf/libbpf/libbpf_build.rst
@@ -0,0 +1,37 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+Building libbpf
+===============
+
+libelf and zlib are internal dependencies of libbpf and thus are required to link
+against and must be installed on the system for applications to work.
+pkg-config is used by default to find libelf, and the program called
+can be overridden with PKG_CONFIG.
+
+If using pkg-config at build time is not desired, it can be disabled by
+setting NO_PKG_CONFIG=1 when calling make.
+
+To build both static libbpf.a and shared libbpf.so:
+
+.. code-block:: bash
+
+ $ cd src
+ $ make
+
+To build only the static libbpf.a library in directory build/ and install it
+together with libbpf headers in a staging directory root/:
+
+.. code-block:: bash
+
+ $ cd src
+ $ mkdir build root
+ $ BUILD_STATIC_ONLY=y OBJDIR=build DESTDIR=root make install
+
+To build both static libbpf.a and shared libbpf.so against a custom libelf
+dependency installed in /build/root/ and install them together with libbpf
+headers in a build directory /build/root/:
+
+.. code-block:: bash
+
+ $ cd src
+ $ PKG_CONFIG_PATH=/build/root/lib64/pkgconfig DESTDIR=/build/root make
\ No newline at end of file
diff --git a/Documentation/bpf/libbpf/libbpf_naming_convention.rst b/Documentation/bpf/libbpf/libbpf_naming_convention.rst
new file mode 100644
index 000000000000..3de1d51e41da
--- /dev/null
+++ b/Documentation/bpf/libbpf/libbpf_naming_convention.rst
@@ -0,0 +1,162 @@
+.. SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+
+API naming convention
+=====================
+
+libbpf API provides access to a few logically separated groups of
+functions and types. Every group has its own naming convention
+described here. It's recommended to follow these conventions whenever a
+new function or type is added to keep libbpf API clean and consistent.
+
+All types and functions provided by libbpf API should have one of the
+following prefixes: ``bpf_``, ``btf_``, ``libbpf_``, ``xsk_``,
+``btf_dump_``, ``ring_buffer_``, ``perf_buffer_``.
+
+System call wrappers
+--------------------
+
+System call wrappers are simple wrappers for commands supported by
+sys_bpf system call. These wrappers should go to ``bpf.h`` header file
+and map one to one to corresponding commands.
+
+For example ``bpf_map_lookup_elem`` wraps ``BPF_MAP_LOOKUP_ELEM``
+command of sys_bpf, ``bpf_prog_attach`` wraps ``BPF_PROG_ATTACH``, etc.
+
+Objects
+-------
+
+Another class of types and functions provided by libbpf API is "objects"
+and functions to work with them. Objects are high-level abstractions
+such as BPF program or BPF map. They're represented by corresponding
+structures such as ``struct bpf_object``, ``struct bpf_program``,
+``struct bpf_map``, etc.
+
+Structures are forward declared and access to their fields should be
+provided via corresponding getters and setters rather than directly.
+
+These objects are associated with corresponding parts of ELF object that
+contains compiled BPF programs.
+
+For example ``struct bpf_object`` represents ELF object itself created
+from an ELF file or from a buffer, ``struct bpf_program`` represents a
+program in ELF object and ``struct bpf_map`` is a map.
+
+Functions that work with an object have names built from object name,
+double underscore and part that describes function purpose.
+
+For example ``bpf_object__open`` consists of the name of corresponding
+object, ``bpf_object``, double underscore and ``open`` that defines the
+purpose of the function to open ELF file and create ``bpf_object`` from
+it.
+
+All objects and corresponding functions other than BTF related should go
+to ``libbpf.h``. BTF types and functions should go to ``btf.h``.
+
+Auxiliary functions
+-------------------
+
+Auxiliary functions and types that don't fit well in any of categories
+described above should have ``libbpf_`` prefix, e.g.
+``libbpf_get_error`` or ``libbpf_prog_type_by_name``.
+
+AF_XDP functions
+-------------------
+
+AF_XDP functions should have an ``xsk_`` prefix, e.g.
+``xsk_umem__get_data`` or ``xsk_umem__create``. The interface consists
+of both low-level ring access functions and high-level configuration
+functions. These can be mixed and matched. Note that these functions
+are not reentrant for performance reasons.
+
+ABI
+==========
+
+libbpf can be either linked statically or used as a DSO. To avoid possible
+conflicts with other libraries an application is linked with, all
+non-static libbpf symbols should have one of the prefixes mentioned in
+API documentation above. See API naming convention to choose the right
+name for a new symbol.
+
+Symbol visibility
+-----------------
+
+libbpf follows a model in which all global symbols have visibility "hidden"
+by default; to make a symbol visible, it has to be explicitly
+annotated with the ``LIBBPF_API`` macro. For example:
+
+.. code-block:: c
+
+ LIBBPF_API int bpf_prog_get_fd_by_id(__u32 id);
+
+This prevents accidentally exporting a symbol that is not supposed
+to be part of the ABI, which in turn improves both the libbpf developer
+and user experience.
+
+ABI versioning
+---------------
+
+To make future ABI extensions possible, the libbpf ABI is versioned.
+Versioning is implemented by the ``libbpf.map`` version script that is
+passed to the linker.
+
+Version name is ``LIBBPF_`` prefix + three-component numeric version,
+starting from ``0.0.1``.
+
+Every time the ABI changes, e.g. because a new symbol is added or
+the semantics of an existing symbol change, the ABI version should be bumped.
+The ABI version is bumped at most once per kernel development cycle.
+
+For example, if current state of ``libbpf.map`` is:
+
+.. code-block:: c
+
+ LIBBPF_0.0.1 {
+ global:
+ bpf_func_a;
+ bpf_func_b;
+ local:
+ \*;
+ };
+
+, and a new symbol ``bpf_func_c`` is being introduced, then
+``libbpf.map`` should be changed like this:
+
+.. code-block:: c
+
+ LIBBPF_0.0.1 {
+ global:
+ bpf_func_a;
+ bpf_func_b;
+ local:
+ \*;
+ };
+ LIBBPF_0.0.2 {
+ global:
+ bpf_func_c;
+ } LIBBPF_0.0.1;
+
+, where new version ``LIBBPF_0.0.2`` depends on the previous
+``LIBBPF_0.0.1``.
+
+The format of the version script and ways to handle ABI changes, including
+incompatible ones, are described in detail in [1].
+
+Stand-alone build
+-------------------
+
+Under https://github.com/libbpf/libbpf there is a (semi-)automated
+mirror of the mainline's version of libbpf for a stand-alone build.
+
+However, all changes to libbpf's code base must be upstreamed through
+the mainline kernel tree.
+
+License
+-------------------
+
+libbpf is dual-licensed under LGPL 2.1 and BSD 2-Clause.
+
+Links
+-------------------
+
+[1] https://www.akkadia.org/drepper/dsohowto.pdf
+ (Chapter 3. Maintaining APIs and ABIs).
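Note on the naming convention added in libbpf_naming_convention.rst: the prefix rule and the ``object__purpose`` rule can be sketched as a small self-contained C check. This is only an illustration under stated assumptions: ``has_libbpf_prefix`` and ``split_object_name`` are hypothetical helpers written for this note, they are not part of libbpf, and the prefix table is copied from the list in the patch text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Prefixes copied from the naming-convention text in this patch. */
static const char *const libbpf_prefixes[] = {
	"bpf_", "btf_", "libbpf_", "xsk_",
	"btf_dump_", "ring_buffer_", "perf_buffer_",
};

/* Hypothetical helper: true if name starts with one of the listed prefixes. */
static bool has_libbpf_prefix(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(libbpf_prefixes) / sizeof(libbpf_prefixes[0]); i++) {
		const char *p = libbpf_prefixes[i];

		if (strncmp(name, p, strlen(p)) == 0)
			return true;
	}
	return false;
}

/*
 * Hypothetical helper: split "object__purpose" at the first double
 * underscore. The object part is copied into obj (at most obj_sz - 1
 * characters plus a terminating NUL). Returns a pointer to the purpose
 * part inside name, or NULL if name has no "__" separator, starts with
 * it, or the object part does not fit into obj.
 */
static const char *split_object_name(const char *name, char *obj, size_t obj_sz)
{
	const char *sep;
	size_t len;

	assert(name != NULL && obj != NULL && obj_sz > 0);
	sep = strstr(name, "__");
	if (sep == NULL || sep == name)
		return NULL;
	len = (size_t)(sep - name);
	if (len >= obj_sz)
		return NULL;
	memcpy(obj, name, len);
	obj[len] = '\0';
	return sep + 2;
}
```

With these helpers, ``bpf_object__open`` splits into the object ``bpf_object`` and the purpose ``open``, while a system call wrapper such as ``bpf_map_lookup_elem`` has no double underscore and is reported as not following the object form.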
diff --git a/Documentation/networking/af_xdp.rst b/Documentation/networking/af_xdp.rst
index 2ccc5644cc98..42576880aa4a 100644
--- a/Documentation/networking/af_xdp.rst
+++ b/Documentation/networking/af_xdp.rst
@@ -290,19 +290,19 @@ round-robin example of distributing packets is shown below:
#define MAX_SOCKS 16
struct {
- __uint(type, BPF_MAP_TYPE_XSKMAP);
- __uint(max_entries, MAX_SOCKS);
- __uint(key_size, sizeof(int));
- __uint(value_size, sizeof(int));
+ __uint(type, BPF_MAP_TYPE_XSKMAP);
+ __uint(max_entries, MAX_SOCKS);
+ __uint(key_size, sizeof(int));
+ __uint(value_size, sizeof(int));
} xsks_map SEC(".maps");
static unsigned int rr;
SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx)
{
- rr = (rr + 1) & (MAX_SOCKS - 1);
+ rr = (rr + 1) & (MAX_SOCKS - 1);
- return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
+ return bpf_redirect_map(&xsks_map, rr, XDP_DROP);
}
Note, that since there is only a single set of FILL and COMPLETION
@@ -379,7 +379,7 @@ would look like this for the TX path:
.. code-block:: c
if (xsk_ring_prod__needs_wakeup(&my_tx_ring))
- sendto(xsk_socket__fd(xsk_handle), NULL, 0, MSG_DONTWAIT, NULL, 0);
+ sendto(xsk_socket__fd(xsk_handle), NULL, 0, MSG_DONTWAIT, NULL, 0);
I.e., only use the syscall if the flag is set.
@@ -442,9 +442,9 @@ purposes. The supported statistics are shown below:
.. code-block:: c
struct xdp_statistics {
- __u64 rx_dropped; /* Dropped for reasons other than invalid desc */
- __u64 rx_invalid_descs; /* Dropped due to invalid descriptor */
- __u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
+ __u64 rx_dropped; /* Dropped for reasons other than invalid desc */
+ __u64 rx_invalid_descs; /* Dropped due to invalid descriptor */
+ __u64 tx_invalid_descs; /* Dropped due to invalid descriptor */
};
XDP_OPTIONS getsockopt
@@ -483,15 +483,15 @@ like this:
.. code-block:: c
// struct xdp_rxtx_ring {
- // __u32 *producer;
- // __u32 *consumer;
- // struct xdp_desc *desc;
+ // __u32 *producer;
+ // __u32 *consumer;
+ // struct xdp_desc *desc;
// };
// struct xdp_umem_ring {
- // __u32 *producer;
- // __u32 *consumer;
- // __u64 *desc;
+ // __u32 *producer;
+ // __u32 *consumer;
+ // __u64 *desc;
// };
// typedef struct xdp_rxtx_ring RING;