author: Jakub Kicinski <kuba@kernel.org> 2025-07-17 18:07:37 -0700
committer: Jakub Kicinski <kuba@kernel.org> 2025-07-17 18:07:37 -0700
commit: ffe5aedc439cd59c0fb267c845a733fbb41532de
tree: 43e157e96169e6eee505acfdebd4ed74ed3493a6 /Documentation/networking
parent: 797f080c463d9866ca8a4bcc8cf0f512dec634e6
parent: ef57dc6f52e4949527f82a456cb9a637a55209ea

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2025-07-17

We've added 13 non-merge commits during the last 20 day(s) which contain
a total of 4 files changed, 712 insertions(+), 84 deletions(-).

The main changes are:

1) Avoid skipping or repeating a sk when using a TCP bpf_iter,
   from Jordan Rife.

2) Clarify the driver requirement on using the XDP metadata,
   from Song Yoong Siang

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  doc: xdp: Clarify driver implementation for XDP Rx metadata
  selftests/bpf: Add tests for bucket resume logic in established sockets
  selftests/bpf: Create iter_tcp_destroy test program
  selftests/bpf: Create established sockets in socket iterator tests
  selftests/bpf: Make ehash buckets configurable in socket iterator tests
  selftests/bpf: Allow for iteration over multiple states
  selftests/bpf: Allow for iteration over multiple ports
  selftests/bpf: Add tests for bucket resume logic in listening sockets
  bpf: tcp: Avoid socket skips and repeats during iteration
  bpf: tcp: Use bpf_tcp_iter_batch_item for bpf_tcp_iter_state batch items
  bpf: tcp: Get rid of st_bucket_done
  bpf: tcp: Make sure iter->batch always contains a full bucket snapshot
  bpf: tcp: Make mem flags configurable through bpf_iter_tcp_realloc_batch
====================

Link: https://patch.msgid.link/20250717191731.4142326-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Diffstat (limited to 'Documentation/networking')
-rw-r--r-- Documentation/networking/xdp-rx-metadata.rst | 33
1 files changed, 33 insertions, 0 deletions

diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index a6e0ece18be5..ce96f4c99505 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -120,6 +120,39 @@ It is possible to query which kfunc the particular netdev implements via
netlink. See ``xdp-rx-metadata-features`` attribute set in
``Documentation/netlink/specs/netdev.yaml``.
+Driver Implementation
+=====================
+
+Certain devices may prepend metadata to received packets. However, as of now,
+``AF_XDP`` lacks the ability to communicate the size of the ``data_meta`` area
+to the consumer. Therefore, it is the responsibility of the driver to copy any
+device-reserved metadata out from the metadata area and ensure that
+``xdp_buff->data_meta`` is pointing to ``xdp_buff->data`` before presenting the
+frame to the XDP program. This is necessary so that, after the XDP program
+adjusts the metadata area, the consumer can reliably retrieve the metadata
+address using ``METADATA_SIZE`` offset.
+
+The following diagram shows how custom metadata is positioned relative to the
+packet data and how pointers are adjusted for metadata access::
+
+              |<-- bpf_xdp_adjust_meta(xdp_buff, -METADATA_SIZE) --|
+   new xdp_buff->data_meta                              old xdp_buff->data_meta
+              |                                                    |
+              |                                             xdp_buff->data
+              |                                                    |
+   +----------+----------------------------------------------------+------+
+   | headroom |                  custom metadata                   | data |
+   +----------+----------------------------------------------------+------+
+              |                                                    |
+              |                                             xdp_desc->addr
+              |<------ xsk_umem__get_data() - METADATA_SIZE -------|
+
+``bpf_xdp_adjust_meta`` ensures that ``METADATA_SIZE`` is aligned to 4 bytes,
+does not exceed 252 bytes, and leaves sufficient space for building the
+xdp_frame. If these conditions are not met, it returns a negative error. In this
+case, the BPF program should not proceed to populate data into the ``data_meta``
+area.
+
Example
=======
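
As a reading aid for the hunk above, the pointer arithmetic it describes can be sketched as a small user-space model. This is not kernel or libxdp code: ``struct sim_xdp_buff``, ``sim_driver_reset_meta``, ``sim_adjust_meta``, ``sim_meta_offset`` and the ``SIM_FRAME_RESERVE`` value are all hypothetical names and numbers chosen for illustration. Only the 4-byte alignment and the 252-byte limit come from the documentation text. Offsets are measured from the start of the buffer.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_META_ALIGN    4u   /* alignment stated in the documentation */
#define SIM_META_MAX      252u /* upper bound stated in the documentation */
#define SIM_FRAME_RESERVE 40u  /* hypothetical space kept for the xdp_frame */

/* Hypothetical stand-in for the two xdp_buff pointers discussed above,
 * expressed as byte offsets from the start of the buffer. */
struct sim_xdp_buff {
    size_t data_meta;
    size_t data;
};

/* Driver side: after copying out any device-reserved metadata, make
 * data_meta point at data so the metadata area starts out empty. */
static void sim_driver_reset_meta(struct sim_xdp_buff *xdp)
{
    xdp->data_meta = xdp->data;
}

/* XDP program side: simplified model of the checks the text attributes to
 * bpf_xdp_adjust_meta(). A negative delta grows the metadata area.
 * Returns 0 on success and -1 on failure, leaving xdp unchanged. */
static int sim_adjust_meta(struct sim_xdp_buff *xdp, long delta)
{
    long meta = (long)xdp->data_meta + delta;
    size_t len;

    if (meta < (long)SIM_FRAME_RESERVE || meta > (long)xdp->data)
        return -1;

    len = xdp->data - (size_t)meta;
    if (len % SIM_META_ALIGN != 0 || len > SIM_META_MAX)
        return -1;

    xdp->data_meta = (size_t)meta;
    return 0;
}

/* Consumer side: the only inputs are the descriptor address (equal to the
 * data offset) and the agreed METADATA_SIZE. */
static size_t sim_meta_offset(size_t desc_addr, size_t metadata_size)
{
    assert(metadata_size <= desc_addr);
    return desc_addr - metadata_size;
}
```

In this model the consumer's ``desc_addr - METADATA_SIZE`` matches the program's ``data_meta`` only when the driver has reset ``data_meta`` to ``data`` first. If device-reserved bytes are left in place, the two offsets diverge, which is the failure the new documentation section warns about.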