From e7096c131e5161fa3b8e52a650d7719d2857adfd Mon Sep 17 00:00:00 2001 From: "Jason A. Donenfeld" Date: Mon, 9 Dec 2019 00:27:34 +0100 Subject: net: WireGuard secure network tunnel WireGuard is a layer 3 secure networking tunnel made specifically for the kernel, that aims to be much simpler and easier to audit than IPsec. Extensive documentation and description of the protocol and considerations, along with formal proofs of the cryptography, are available at: * https://www.wireguard.com/ * https://www.wireguard.com/papers/wireguard.pdf This commit implements WireGuard as a simple network device driver, accessible in the usual RTNL way used by virtual network drivers. It makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of networking subsystem APIs. It has a somewhat novel multicore queueing system designed for maximum throughput and minimal latency of encryption operations, but it is implemented modestly using workqueues and NAPI. Configuration is done via generic Netlink, and following a review from the Netlink maintainer a year ago, several high profile userspace tools have already implemented the API. This commit also comes with several different tests, both in-kernel tests and out-of-kernel tests based on network namespaces, taking profit of the fact that sockets used by WireGuard intentionally stay in the namespace the WireGuard interface was originally created, exactly like the semantics of userspace tun devices. See wireguard.com/netns/ for pictures and examples. The source code is fairly short, but rather than combining everything into a single file, WireGuard is developed as cleanly separable files, making auditing and comprehension easier. Things are laid out as follows: * noise.[ch], cookie.[ch], messages.h: These implement the bulk of the cryptographic aspects of the protocol, and are mostly data-only in nature, taking in buffers of bytes and spitting out buffers of bytes. They also handle reference counting for their various shared pieces of data, like keys and key lists. * ratelimiter.[ch]: Used as an integral part of cookie.[ch] for ratelimiting certain types of cryptographic operations in accordance with particular WireGuard semantics. * allowedips.[ch], peerlookup.[ch]: The main lookup structures of WireGuard, the former being trie-like with particular semantics, an integral part of the design of the protocol, and the latter just being nice helper functions around the various hashtables we use. * device.[ch]: Implementation of functions for the netdevice and for rtnl, responsible for maintaining the life of a given interface and wiring it up to the rest of WireGuard. * peer.[ch]: Each interface has a list of peers, with helper functions available here for creation, destruction, and reference counting. * socket.[ch]: Implementation of functions related to udp_socket and the general set of kernel socket APIs, for sending and receiving ciphertext UDP packets, and taking care of WireGuard-specific sticky socket routing semantics for the automatic roaming. * netlink.[ch]: Userspace API entry point for configuring WireGuard peers and devices. The API has been implemented by several userspace tools and network management utility, and the WireGuard project distributes the basic wg(8) tool. * queueing.[ch]: Shared function on the rx and tx path for handling the various queues used in the multicore algorithms. * send.c: Handles encrypting outgoing packets in parallel on multiple cores, before sending them in order on a single core, via workqueues and ring buffers. Also handles sending handshake and cookie messages as part of the protocol, in parallel. * receive.c: Handles decrypting incoming packets in parallel on multiple cores, before passing them off in order to be ingested via the rest of the networking subsystem with GRO via the typical NAPI poll function. Also handles receiving handshake and cookie messages as part of the protocol, in parallel. * timers.[ch]: Uses the timer wheel to implement protocol particular event timeouts, and gives a set of very simple event-driven entry point functions for callers. * main.c, version.h: Initialization and deinitialization of the module. * selftest/*.h: Runtime unit tests for some of the most security sensitive functions. * tools/testing/selftests/wireguard/netns.sh: Aforementioned testing script using network namespaces. This commit aims to be as self-contained as possible, implementing WireGuard as a standalone module not needing much special handling or coordination from the network subsystem. I expect for future optimizations to the network stack to positively improve WireGuard, and vice-versa, but for the time being, this exists as intentionally standalone. We introduce a menu option for CONFIG_WIREGUARD, as well as providing a verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG. Signed-off-by: Jason A. Donenfeld Cc: David Miller Cc: Greg KH Cc: Linus Torvalds Cc: Herbert Xu Cc: linux-crypto@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller --- include/uapi/linux/wireguard.h | 196 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 196 insertions(+) create mode 100644 include/uapi/linux/wireguard.h (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h new file mode 100644 index 000000000000..dd8a47c4ad11 --- /dev/null +++ b/include/uapi/linux/wireguard.h @@ -0,0 +1,196 @@ +/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */ +/* + * Copyright (C) 2015-2019 Jason A. Donenfeld . All Rights Reserved. + * + * Documentation + * ============= + * + * The below enums and macros are for interfacing with WireGuard, using generic + * netlink, with family WG_GENL_NAME and version WG_GENL_VERSION. It defines two + * methods: get and set. Note that while they share many common attributes, + * these two functions actually accept a slightly different set of inputs and + * outputs. + * + * WG_CMD_GET_DEVICE + * ----------------- + * + * May only be called via NLM_F_REQUEST | NLM_F_DUMP. The command should contain + * one but not both of: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * + * The kernel will then return several messages (NLM_F_MULTI) containing the + * following tree of nested items: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGDEVICE_A_LISTEN_PORT: NLA_U16 + * WGDEVICE_A_FWMARK: NLA_U32 + * WGDEVICE_A_PEERS: NLA_NESTED + * 0: NLA_NESTED + * WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGPEER_A_PRESHARED_KEY: NLA_EXACT_LEN, len WG_KEY_LEN + * WGPEER_A_ENDPOINT: NLA_MIN_LEN(struct sockaddr), struct sockaddr_in or struct sockaddr_in6 + * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16 + * WGPEER_A_LAST_HANDSHAKE_TIME: NLA_EXACT_LEN, struct __kernel_timespec + * WGPEER_A_RX_BYTES: NLA_U64 + * WGPEER_A_TX_BYTES: NLA_U64 + * WGPEER_A_ALLOWEDIPS: NLA_NESTED + * 0: NLA_NESTED + * WGALLOWEDIP_A_FAMILY: NLA_U16 + * WGALLOWEDIP_A_IPADDR: NLA_MIN_LEN(struct in_addr), struct in_addr or struct in6_addr + * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * 0: NLA_NESTED + * ... + * 0: NLA_NESTED + * ... + * ... + * WGPEER_A_PROTOCOL_VERSION: NLA_U32 + * 0: NLA_NESTED + * ... + * ... + * + * It is possible that all of the allowed IPs of a single peer will not + * fit within a single netlink message. In that case, the same peer will + * be written in the following message, except it will only contain + * WGPEER_A_PUBLIC_KEY and WGPEER_A_ALLOWEDIPS. This may occur several + * times in a row for the same peer. It is then up to the receiver to + * coalesce adjacent peers. Likewise, it is possible that all peers will + * not fit within a single message. So, subsequent peers will be sent + * in following messages, except those will only contain WGDEVICE_A_IFNAME + * and WGDEVICE_A_PEERS. It is then up to the receiver to coalesce these + * messages to form the complete list of peers. + * + * Since this is an NLA_F_DUMP command, the final message will always be + * NLMSG_DONE, even if an error occurs. However, this NLMSG_DONE message + * contains an integer error code. It is either zero or a negative error + * code corresponding to the errno. + * + * WG_CMD_SET_DEVICE + * ----------------- + * + * May only be called via NLM_F_REQUEST. The command should contain the + * following tree of nested items, containing one but not both of + * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME: + * + * WGDEVICE_A_IFINDEX: NLA_U32 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current + * peers should be removed prior to adding the list below. + * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove + * WGDEVICE_A_LISTEN_PORT: NLA_U16, 0 to choose randomly + * WGDEVICE_A_FWMARK: NLA_U32, 0 to disable + * WGDEVICE_A_PEERS: NLA_NESTED + * 0: NLA_NESTED + * WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN + * WGPEER_A_FLAGS: NLA_U32, 0 and/or WGPEER_F_REMOVE_ME if the + * specified peer should not exist at the end of the + * operation, rather than added/updated and/or + * WGPEER_F_REPLACE_ALLOWEDIPS if all current allowed + * IPs of this peer should be removed prior to adding + * the list below and/or WGPEER_F_UPDATE_ONLY if the + * peer should only be set if it already exists. + * WGPEER_A_PRESHARED_KEY: len WG_KEY_LEN, all zeros to remove + * WGPEER_A_ENDPOINT: struct sockaddr_in or struct sockaddr_in6 + * WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16, 0 to disable + * WGPEER_A_ALLOWEDIPS: NLA_NESTED + * 0: NLA_NESTED + * WGALLOWEDIP_A_FAMILY: NLA_U16 + * WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr + * WGALLOWEDIP_A_CIDR_MASK: NLA_U8 + * 0: NLA_NESTED + * ... + * 0: NLA_NESTED + * ... + * ... + * WGPEER_A_PROTOCOL_VERSION: NLA_U32, should not be set or used at + * all by most users of this API, as the + * most recent protocol will be used when + * this is unset. Otherwise, must be set + * to 1. + * 0: NLA_NESTED + * ... + * ... + * + * It is possible that the amount of configuration data exceeds that of + * the maximum message length accepted by the kernel. In that case, several + * messages should be sent one after another, with each successive one + * filling in information not contained in the prior. Note that if + * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably + * should not be specified in fragments that come after, so that the list + * of peers is only cleared the first time but appened after. Likewise for + * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message + * of a peer, it likely should not be specified in subsequent fragments. + * + * If an error occurs, NLMSG_ERROR will reply containing an errno. + */ + +#ifndef _WG_UAPI_WIREGUARD_H +#define _WG_UAPI_WIREGUARD_H + +#define WG_GENL_NAME "wireguard" +#define WG_GENL_VERSION 1 + +#define WG_KEY_LEN 32 + +enum wg_cmd { + WG_CMD_GET_DEVICE, + WG_CMD_SET_DEVICE, + __WG_CMD_MAX +}; +#define WG_CMD_MAX (__WG_CMD_MAX - 1) + +enum wgdevice_flag { + WGDEVICE_F_REPLACE_PEERS = 1U << 0, + __WGDEVICE_F_ALL = WGDEVICE_F_REPLACE_PEERS +}; +enum wgdevice_attribute { + WGDEVICE_A_UNSPEC, + WGDEVICE_A_IFINDEX, + WGDEVICE_A_IFNAME, + WGDEVICE_A_PRIVATE_KEY, + WGDEVICE_A_PUBLIC_KEY, + WGDEVICE_A_FLAGS, + WGDEVICE_A_LISTEN_PORT, + WGDEVICE_A_FWMARK, + WGDEVICE_A_PEERS, + __WGDEVICE_A_LAST +}; +#define WGDEVICE_A_MAX (__WGDEVICE_A_LAST - 1) + +enum wgpeer_flag { + WGPEER_F_REMOVE_ME = 1U << 0, + WGPEER_F_REPLACE_ALLOWEDIPS = 1U << 1, + WGPEER_F_UPDATE_ONLY = 1U << 2, + __WGPEER_F_ALL = WGPEER_F_REMOVE_ME | WGPEER_F_REPLACE_ALLOWEDIPS | + WGPEER_F_UPDATE_ONLY +}; +enum wgpeer_attribute { + WGPEER_A_UNSPEC, + WGPEER_A_PUBLIC_KEY, + WGPEER_A_PRESHARED_KEY, + WGPEER_A_FLAGS, + WGPEER_A_ENDPOINT, + WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL, + WGPEER_A_LAST_HANDSHAKE_TIME, + WGPEER_A_RX_BYTES, + WGPEER_A_TX_BYTES, + WGPEER_A_ALLOWEDIPS, + WGPEER_A_PROTOCOL_VERSION, + __WGPEER_A_LAST +}; +#define WGPEER_A_MAX (__WGPEER_A_LAST - 1) + +enum wgallowedip_attribute { + WGALLOWEDIP_A_UNSPEC, + WGALLOWEDIP_A_FAMILY, + WGALLOWEDIP_A_IPADDR, + WGALLOWEDIP_A_CIDR_MASK, + __WGALLOWEDIP_A_LAST +}; +#define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1) + +#endif /* _WG_UAPI_WIREGUARD_H */ -- cgit v1.2.3 From e27cca96cd68fa2c6814c90f9a1cfd36bb68c593 Mon Sep 17 00:00:00 2001 From: Sabrina Dubroca Date: Mon, 25 Nov 2019 14:49:02 +0100 Subject: xfrm: add espintcp (RFC 8229) TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented as a TCP ULP, overriding in particular the sendmsg and recvmsg operations. A Stream Parser is used to extract messages out of the TCP stream using the first 2 bytes as length marker. Received IKE messages are put on "ike_queue", waiting to be dequeued by the custom recvmsg implementation. Received ESP messages are sent to XFRM, like with UDP encapsulation. Some of this code is taken from the original submission by Herbert Xu. Currently, only IPv4 is supported, like for UDP encapsulation. Co-developed-by: Herbert Xu Signed-off-by: Herbert Xu Signed-off-by: Sabrina Dubroca Acked-by: David S. Miller Signed-off-by: Steffen Klassert --- include/uapi/linux/udp.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h index 30baccb6c9c4..4828794efcf8 100644 --- a/include/uapi/linux/udp.h +++ b/include/uapi/linux/udp.h @@ -42,5 +42,6 @@ struct udphdr { #define UDP_ENCAP_GTP0 4 /* GSM TS 09.60 */ #define UDP_ENCAP_GTP1U 5 /* 3GPP TS 29.060 */ #define UDP_ENCAP_RXRPC 6 +#define TCP_ENCAP_ESPINTCP 7 /* Yikes, this is really xfrm encap types. */ #endif /* _UAPI_LINUX_UDP_H */ -- cgit v1.2.3 From c02a81fba74fe3488ad6b08bfb5a1329005418f8 Mon Sep 17 00:00:00 2001 From: "Andrew F. Davis" Date: Tue, 3 Dec 2019 17:26:37 +0000 Subject: dma-buf: Add dma-buf heaps framework This framework allows a unified userspace interface for dma-buf exporters, allowing userland to allocate specific types of memory for use in dma-buf sharing. Each heap is given its own device node, which a user can allocate a dma-buf fd from using the DMA_HEAP_IOC_ALLOC. This code is an evoluiton of the Android ION implementation, and a big thanks is due to its authors/maintainers over time for their effort: Rebecca Schultz Zavin, Colin Cross, Benjamin Gaignard, Laura Abbott, and many other contributors! Cc: Laura Abbott Cc: Benjamin Gaignard Cc: Sumit Semwal Cc: Liam Mark Cc: Pratik Patel Cc: Brian Starkey Cc: Vincent Donnefort Cc: Sudipto Paul Cc: Andrew F. Davis Cc: Christoph Hellwig Cc: Chenbo Feng Cc: Alistair Strachan Cc: Hridya Valsaraju Cc: Sandeep Patil Cc: Hillf Danton Cc: Dave Airlie Cc: dri-devel@lists.freedesktop.org Reviewed-by: Brian Starkey Acked-by: Sandeep Patil Signed-off-by: Andrew F. Davis Signed-off-by: John Stultz Signed-off-by: Sumit Semwal Link: https://patchwork.freedesktop.org/patch/msgid/20191203172641.66642-2-john.stultz@linaro.org --- include/uapi/linux/dma-heap.h | 53 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 53 insertions(+) create mode 100644 include/uapi/linux/dma-heap.h (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/dma-heap.h b/include/uapi/linux/dma-heap.h new file mode 100644 index 000000000000..73e7f66c1cae --- /dev/null +++ b/include/uapi/linux/dma-heap.h @@ -0,0 +1,53 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * DMABUF Heaps Userspace API + * + * Copyright (C) 2011 Google, Inc. + * Copyright (C) 2019 Linaro Ltd. + */ +#ifndef _UAPI_LINUX_DMABUF_POOL_H +#define _UAPI_LINUX_DMABUF_POOL_H + +#include +#include + +/** + * DOC: DMABUF Heaps Userspace API + */ + +/* Valid FD_FLAGS are O_CLOEXEC, O_RDONLY, O_WRONLY, O_RDWR */ +#define DMA_HEAP_VALID_FD_FLAGS (O_CLOEXEC | O_ACCMODE) + +/* Currently no heap flags */ +#define DMA_HEAP_VALID_HEAP_FLAGS (0) + +/** + * struct dma_heap_allocation_data - metadata passed from userspace for + * allocations + * @len: size of the allocation + * @fd: will be populated with a fd which provides the + * handle to the allocated dma-buf + * @fd_flags: file descriptor flags used when allocating + * @heap_flags: flags passed to heap + * + * Provided by userspace as an argument to the ioctl + */ +struct dma_heap_allocation_data { + __u64 len; + __u32 fd; + __u32 fd_flags; + __u64 heap_flags; +}; + +#define DMA_HEAP_IOC_MAGIC 'H' + +/** + * DOC: DMA_HEAP_IOC_ALLOC - allocate memory from pool + * + * Takes a dma_heap_allocation_data struct and returns it with the fd field + * populated with the dmabuf handle of the allocation. + */ +#define DMA_HEAP_IOC_ALLOC _IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\ + struct dma_heap_allocation_data) + +#endif /* _UAPI_LINUX_DMABUF_POOL_H */ -- cgit v1.2.3 From f10870b05d5edc0652701c6a92eafcab5044795f Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 10 Dec 2019 21:59:15 +0100 Subject: staging: remove isdn capi drivers As described in drivers/staging/isdn/TODO, the drivers are all assumed to be unmaintained and unused now, with gigaset being the last one to stop being maintained after Paul Bolle lost access to an ISDN network. The CAPI subsystem remains for now, as it is still required by bluetooth/cmtp. Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20191210210455.3475361-1-arnd@arndb.de Signed-off-by: Greg Kroah-Hartman --- include/uapi/linux/gigaset_dev.h | 39 --------------------------------------- include/uapi/linux/hysdn_if.h | 34 ---------------------------------- 2 files changed, 73 deletions(-) delete mode 100644 include/uapi/linux/gigaset_dev.h delete mode 100644 include/uapi/linux/hysdn_if.h (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/gigaset_dev.h b/include/uapi/linux/gigaset_dev.h deleted file mode 100644 index 279551af8ebf..000000000000 --- a/include/uapi/linux/gigaset_dev.h +++ /dev/null @@ -1,39 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */ -/* - * interface to user space for the gigaset driver - * - * Copyright (c) 2004 by Hansjoerg Lipp - * - * ===================================================================== - * This program is free software; you can redistribute it and/or - * modify it under the terms of the GNU General Public License as - * published by the Free Software Foundation; either version 2 of - * the License, or (at your option) any later version. - * ===================================================================== - */ - -#ifndef GIGASET_INTERFACE_H -#define GIGASET_INTERFACE_H - -#include - -/* The magic IOCTL value for this interface. */ -#define GIGASET_IOCTL 0x47 - -/* enable/disable device control via character device (lock out ISDN subsys) */ -#define GIGASET_REDIR _IOWR(GIGASET_IOCTL, 0, int) - -/* enable adapter configuration mode (M10x only) */ -#define GIGASET_CONFIG _IOWR(GIGASET_IOCTL, 1, int) - -/* set break characters (M105 only) */ -#define GIGASET_BRKCHARS _IOW(GIGASET_IOCTL, 2, unsigned char[6]) - -/* get version information selected by arg[0] */ -#define GIGASET_VERSION _IOWR(GIGASET_IOCTL, 3, unsigned[4]) -/* values for GIGASET_VERSION arg[0] */ -#define GIGVER_DRIVER 0 /* get driver version */ -#define GIGVER_COMPAT 1 /* get interface compatibility version */ -#define GIGVER_FWBASE 2 /* get base station firmware version */ - -#endif diff --git a/include/uapi/linux/hysdn_if.h b/include/uapi/linux/hysdn_if.h deleted file mode 100644 index 99f77c517e6d..000000000000 --- a/include/uapi/linux/hysdn_if.h +++ /dev/null @@ -1,34 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -/* $Id: hysdn_if.h,v 1.1.8.3 2001/09/23 22:25:05 kai Exp $ - * - * Linux driver for HYSDN cards - * ioctl definitions shared by hynetmgr and driver. - * - * Author Werner Cornelius (werner@titro.de) for Hypercope GmbH - * Copyright 1999 by Werner Cornelius (werner@titro.de) - * - * This software may be used and distributed according to the terms - * of the GNU General Public License, incorporated herein by reference. - * - */ - -/****************/ -/* error values */ -/****************/ -#define ERR_NONE 0 /* no error occurred */ -#define ERR_ALREADY_BOOT 1000 /* we are already booting */ -#define EPOF_BAD_MAGIC 1001 /* bad magic in POF header */ -#define ERR_BOARD_DPRAM 1002 /* board DPRAM failed */ -#define EPOF_INTERNAL 1003 /* internal POF handler error */ -#define EPOF_BAD_IMG_SIZE 1004 /* POF boot image size invalid */ -#define ERR_BOOTIMG_FAIL 1005 /* 1. stage boot image did not start */ -#define ERR_BOOTSEQ_FAIL 1006 /* 2. stage boot seq handshake timeout */ -#define ERR_POF_TIMEOUT 1007 /* timeout waiting for card pof ready */ -#define ERR_NOT_BOOTED 1008 /* operation only allowed when booted */ -#define ERR_CONF_LONG 1009 /* conf line is too long */ -#define ERR_INV_CHAN 1010 /* invalid channel number */ -#define ERR_ASYNC_TIME 1011 /* timeout sending async data */ - - - - -- cgit v1.2.3 From f59aba2f75795e5b6a4f1aa31f3e20d7b71ca804 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Tue, 10 Dec 2019 21:59:16 +0100 Subject: isdn: capi: dead code removal The staging isdn drivers are gone, and CONFIG_BT_CMTP is now the only user. This means a lot of the code in the subsystem has no remaining callers and can be removed. Change the capi user space front-end to be part of kernelcapi, and the combined module to only be compiled if BT_CMTP is also enabled, then remove the interfaces that have no remaining callers. As the notifier list and the capi_drivers list have no callers outside of kcapi.c, the implementation gets much simpler. Some definitions from the include/linux/*.h headers are only needed internally and are moved to kcapi.h. Acked-by: David Miller Signed-off-by: Arnd Bergmann Link: https://lore.kernel.org/r/20191210210455.3475361-2-arnd@arndb.de Signed-off-by: Greg Kroah-Hartman --- include/uapi/linux/b1lli.h | 74 ---------------------------------------------- 1 file changed, 74 deletions(-) delete mode 100644 include/uapi/linux/b1lli.h (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/b1lli.h b/include/uapi/linux/b1lli.h deleted file mode 100644 index 4ae6ac9950df..000000000000 --- a/include/uapi/linux/b1lli.h +++ /dev/null @@ -1,74 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ -/* $Id: b1lli.h,v 1.8.8.3 2001/09/23 22:25:05 kai Exp $ - * - * ISDN lowlevel-module for AVM B1-card. - * - * Copyright 1996 by Carsten Paeth (calle@calle.in-berlin.de) - * - * This software may be used and distributed according to the terms - * of the GNU General Public License, incorporated herein by reference. - * - */ - -#ifndef _B1LLI_H_ -#define _B1LLI_H_ -/* - * struct for loading t4 file - */ -typedef struct avmb1_t4file { - int len; - unsigned char *data; -} avmb1_t4file; - -typedef struct avmb1_loaddef { - int contr; - avmb1_t4file t4file; -} avmb1_loaddef; - -typedef struct avmb1_loadandconfigdef { - int contr; - avmb1_t4file t4file; - avmb1_t4file t4config; -} avmb1_loadandconfigdef; - -typedef struct avmb1_resetdef { - int contr; -} avmb1_resetdef; - -typedef struct avmb1_getdef { - int contr; - int cardtype; - int cardstate; -} avmb1_getdef; - -/* - * struct for adding new cards - */ -typedef struct avmb1_carddef { - int port; - int irq; -} avmb1_carddef; - -#define AVM_CARDTYPE_B1 0 -#define AVM_CARDTYPE_T1 1 -#define AVM_CARDTYPE_M1 2 -#define AVM_CARDTYPE_M2 3 - -typedef struct avmb1_extcarddef { - int port; - int irq; - int cardtype; - int cardnr; /* for HEMA/T1 */ -} avmb1_extcarddef; - -#define AVMB1_LOAD 0 /* load image to card */ -#define AVMB1_ADDCARD 1 /* add a new card - OBSOLETE */ -#define AVMB1_RESETCARD 2 /* reset a card */ -#define AVMB1_LOAD_AND_CONFIG 3 /* load image and config to card */ -#define AVMB1_ADDCARD_WITH_TYPE 4 /* add a new card, with cardtype */ -#define AVMB1_GET_CARDINFO 5 /* get cardtype */ -#define AVMB1_REMOVECARD 6 /* remove a card - OBSOLETE */ - -#define AVMB1_REGISTERCARD_IS_OBSOLETE - -#endif /* _B1LLI_H_ */ -- cgit v1.2.3 From 2f48865db3322f8e74f51ff93b91e5c2916dd259 Mon Sep 17 00:00:00 2001 From: Marcel Holtmann Date: Wed, 4 Dec 2019 04:41:09 +0100 Subject: HID: hidraw: add support uniq ioctl Add support for reading out the uniq information from the underlying HID device. This might be the iSerialNumber in case of USB or the BD_ADDR in case of Bluetooth. Signed-off-by: Marcel Holtmann Signed-off-by: Jiri Kosina --- include/uapi/linux/hidraw.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/hidraw.h b/include/uapi/linux/hidraw.h index 98e2c493de85..4913539e5bcc 100644 --- a/include/uapi/linux/hidraw.h +++ b/include/uapi/linux/hidraw.h @@ -39,6 +39,7 @@ struct hidraw_devinfo { /* The first byte of SFEATURE and GFEATURE is the report number */ #define HIDIOCSFEATURE(len) _IOC(_IOC_WRITE|_IOC_READ, 'H', 0x06, len) #define HIDIOCGFEATURE(len) _IOC(_IOC_WRITE|_IOC_READ, 'H', 0x07, len) +#define HIDIOCGRAWUNIQ(len) _IOC(_IOC_READ, 'H', 0x08, len) #define HIDRAW_FIRST_MINOR 0 #define HIDRAW_MAX_DEVICES 64 -- cgit v1.2.3 From bae141f54be83b06652c1d47e50e4e75ed4e9c7e Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Fri, 6 Dec 2019 22:49:34 +0100 Subject: bpf: Emit audit messages upon successful prog load and unload Allow for audit messages to be emitted upon BPF program load and unload for having a timeline of events. The load itself is in syscall context, so additional info about the process initiating the BPF prog creation can be logged and later directly correlated to the unload event. The only info really needed from BPF side is the globally unique prog ID where then audit user space tooling can query / dump all info needed about the specific BPF program right upon load event and enrich the record, thus these changes needed here can be kept small and non-intrusive to the core. Raw example output: # auditctl -D # auditctl -a always,exit -F arch=x86_64 -S bpf # ausearch --start recent -m 1334 ... ---- time->Wed Nov 27 16:04:13 2019 type=PROCTITLE msg=audit(1574867053.120:84664): proctitle="./bpf" type=SYSCALL msg=audit(1574867053.120:84664): arch=c000003e syscall=321 \ success=yes exit=3 a0=5 a1=7ffea484fbe0 a2=70 a3=0 items=0 ppid=7477 \ pid=12698 auid=1001 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001 \ egid=1001 sgid=1001 fsgid=1001 tty=pts2 ses=4 comm="bpf" \ exe="/home/jolsa/auditd/audit-testsuite/tests/bpf/bpf" \ subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null) type=UNKNOWN[1334] msg=audit(1574867053.120:84664): prog-id=76 op=LOAD ---- time->Wed Nov 27 16:04:13 2019 type=UNKNOWN[1334] msg=audit(1574867053.120:84665): prog-id=76 op=UNLOAD ... Signed-off-by: Daniel Borkmann Co-developed-by: Jiri Olsa Signed-off-by: Jiri Olsa Acked-by: Paul Moore Link: https://lore.kernel.org/bpf/20191206214934.11319-1-jolsa@kernel.org --- include/uapi/linux/audit.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h index 3ad935527177..a534d71e689a 100644 --- a/include/uapi/linux/audit.h +++ b/include/uapi/linux/audit.h @@ -116,6 +116,7 @@ #define AUDIT_FANOTIFY 1331 /* Fanotify access decision */ #define AUDIT_TIME_INJOFFSET 1332 /* Timekeeping offset injected */ #define AUDIT_TIME_ADJNTPVAL 1333 /* NTP value adjustment */ +#define AUDIT_BPF 1334 /* BPF subsystem */ #define AUDIT_AVC 1400 /* SE Linux avc denial or grant */ #define AUDIT_SELINUX_ERR 1401 /* Internal SE Linux Errors */ -- cgit v1.2.3 From ef343b35d46667668a099655fca4a5b2e43a5dfe Mon Sep 17 00:00:00 2001 From: Stefano Garzarella Date: Tue, 10 Dec 2019 11:43:03 +0100 Subject: vsock: add VMADDR_CID_LOCAL definition The VMADDR_CID_RESERVED (1) was used by VMCI, but now it is not used anymore, so we can reuse it for local communication (loopback) adding the new well-know CID: VMADDR_CID_LOCAL. Cc: Jorgen Hansen Reviewed-by: Stefan Hajnoczi Reviewed-by: Jorgen Hansen Signed-off-by: Stefano Garzarella Signed-off-by: David S. Miller --- include/uapi/linux/vm_sockets.h | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h index 68d57c5e99bc..fd0ed7221645 100644 --- a/include/uapi/linux/vm_sockets.h +++ b/include/uapi/linux/vm_sockets.h @@ -99,11 +99,13 @@ #define VMADDR_CID_HYPERVISOR 0 -/* This CID is specific to VMCI and can be considered reserved (even VMCI - * doesn't use it anymore, it's a legacy value from an older release). +/* Use this as the destination CID in an address when referring to the + * local communication (loopback). + * (This was VMADDR_CID_RESERVED, but even VMCI doesn't use it anymore, + * it was a legacy value from an older release). */ -#define VMADDR_CID_RESERVED 1 +#define VMADDR_CID_LOCAL 1 /* Use this as the destination CID in an address when referring to the host * (any process other than the hypervisor). VMCI relies on it being 2, but -- cgit v1.2.3 From f74877a5457d34d604dba6dbbb13c4c05bac8b93 Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Wed, 11 Dec 2019 10:58:14 +0100 Subject: rtnetlink: provide permanent hardware address in RTM_NEWLINK Permanent hardware address of a network device was traditionally provided via ethtool ioctl interface but as Jiri Pirko pointed out in a review of ethtool netlink interface, rtnetlink is much more suitable for it so let's add it to the RTM_NEWLINK message. Add IFLA_PERM_ADDRESS attribute to RTM_NEWLINK messages unless the permanent address is all zeros (i.e. device driver did not fill it). As permanent address is not modifiable, reject userspace requests containing IFLA_PERM_ADDRESS attribute. Note: we already provide permanent hardware address for bond slaves; unfortunately we cannot drop that attribute for backward compatibility reasons. v5 -> v6: only add the attribute if permanent address is not zero Signed-off-by: Michal Kubecek Acked-by: Jiri Pirko Acked-by: Stephen Hemminger Signed-off-by: David S. Miller --- include/uapi/linux/if_link.h | 1 + 1 file changed, 1 insertion(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 8aec8769d944..1d69f637c5d6 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -169,6 +169,7 @@ enum { IFLA_MAX_MTU, IFLA_PROP_LIST, IFLA_ALT_IFNAME, /* Alternative ifname */ + IFLA_PERM_ADDRESS, __IFLA_MAX }; -- cgit v1.2.3 From 428c122f5f6b54f40bd51c47495104b534b5a57c Mon Sep 17 00:00:00 2001 From: Michal Kubecek Date: Wed, 11 Dec 2019 10:58:34 +0100 Subject: ethtool: provide link mode names as a string set Unlike e.g. netdev features, the ethtool ioctl interface requires link mode table to be in sync between kernel and userspace for userspace to be able to display and set all link modes supported by kernel. The way arbitrary length bitsets are implemented in netlink interface, this will be no longer needed. To allow userspace to access all link modes running kernel supports, add table of ethernet link mode names and make it available as a string set to userspace GET_STRSET requests. Add build time check to make sure names are defined for all modes declared in enum ethtool_link_mode_bit_indices. Once the string set is available, make it also accessible via ioctl. Signed-off-by: Michal Kubecek Reviewed-by: Andrew Lunn Reviewed-by: Jiri Pirko Reviewed-by: Florian Fainelli Signed-off-by: David S. Miller --- include/uapi/linux/ethtool.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h index d4591792f0b4..f44155840b07 100644 --- a/include/uapi/linux/ethtool.h +++ b/include/uapi/linux/ethtool.h @@ -593,6 +593,7 @@ struct ethtool_pauseparam { * @ETH_SS_RSS_HASH_FUNCS: RSS hush function names * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS * @ETH_SS_PHY_TUNABLES: PHY tunable names + * @ETH_SS_LINK_MODES: link mode names */ enum ethtool_stringset { ETH_SS_TEST = 0, @@ -604,6 +605,7 @@ enum ethtool_stringset { ETH_SS_TUNABLES, ETH_SS_PHY_STATS, ETH_SS_PHY_TUNABLES, + ETH_SS_LINK_MODES, }; /** -- cgit v1.2.3 From 826f66b30c2e3d4df533e4224cb8c734afa0a17c Mon Sep 17 00:00:00 2001 From: Andy Roulin Date: Wed, 11 Dec 2019 14:30:58 -0800 Subject: bonding: move 802.3ad port state flags to uapi The bond slave actor/partner operating state is exported as bitfield to userspace, which lacks a way to interpret it, e.g., iproute2 only prints the state as a number: ad_actor_oper_port_state 15 For userspace to interpret the bitfield, the bitfield definitions should be part of the uapi. The bitfield itself is defined in the 802.3ad standard. This commit moves the 802.3ad bitfield definitions to uapi. Related iproute2 patches, soon to be posted upstream, use the new uapi headers to pretty-print bond slave state, e.g., with ip -d link show ad_actor_oper_port_state_str Signed-off-by: Andy Roulin Acked-by: Roopa Prabhu Acked-by: Jay Vosburgh Signed-off-by: Jakub Kicinski --- include/uapi/linux/if_bonding.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/if_bonding.h b/include/uapi/linux/if_bonding.h index 790585f0e61b..6829213a54c5 100644 --- a/include/uapi/linux/if_bonding.h +++ b/include/uapi/linux/if_bonding.h @@ -95,6 +95,16 @@ #define BOND_XMIT_POLICY_ENCAP23 3 /* encapsulated layer 2+3 */ #define BOND_XMIT_POLICY_ENCAP34 4 /* encapsulated layer 3+4 */ +/* 802.3ad port state definitions (43.4.2.2 in the 802.3ad standard) */ +#define AD_STATE_LACP_ACTIVITY 0x1 +#define AD_STATE_LACP_TIMEOUT 0x2 +#define AD_STATE_AGGREGATION 0x4 +#define AD_STATE_SYNCHRONIZATION 0x8 +#define AD_STATE_COLLECTING 0x10 +#define AD_STATE_DISTRIBUTING 0x20 +#define AD_STATE_DEFAULTED 0x40 +#define AD_STATE_EXPIRED 0x80 + typedef struct ifbond { __s32 bond_mode; __s32 num_slaves; -- cgit v1.2.3 From de1799667b002331609e8ee6b924ccfd51968d99 Mon Sep 17 00:00:00 2001 From: Vivien Didelot Date: Wed, 11 Dec 2019 20:07:10 -0500 Subject: net: bridge: add STP xstats This adds rx_bpdu, tx_bpdu, rx_tcn, tx_tcn, transition_blk, transition_fwd xstats counters to the bridge ports copied over via netlink, providing useful information for STP. Signed-off-by: Vivien Didelot Acked-by: Nikolay Aleksandrov Signed-off-by: Jakub Kicinski --- include/uapi/linux/if_bridge.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h index 1b3c2b643a02..4a58e3d7de46 100644 --- a/include/uapi/linux/if_bridge.h +++ b/include/uapi/linux/if_bridge.h @@ -156,6 +156,15 @@ struct bridge_vlan_xstats { __u32 pad2; }; +struct bridge_stp_xstats { + __u64 transition_blk; + __u64 transition_fwd; + __u64 rx_bpdu; + __u64 tx_bpdu; + __u64 rx_tcn; + __u64 tx_tcn; +}; + /* Bridge multicast database attributes * [MDBA_MDB] = { * [MDBA_MDB_ENTRY] = { @@ -262,6 +271,7 @@ enum { BRIDGE_XSTATS_VLAN, BRIDGE_XSTATS_MCAST, BRIDGE_XSTATS_PAD, + BRIDGE_XSTATS_STP, __BRIDGE_XSTATS_MAX }; #define BRIDGE_XSTATS_MAX (__BRIDGE_XSTATS_MAX - 1) -- cgit v1.2.3 From 166750bc1dd256b2184b22588fb9fe6d3fbb93ae Mon Sep 17 00:00:00 2001 From: Andrii Nakryiko Date: Fri, 13 Dec 2019 17:47:08 -0800 Subject: libbpf: Support libbpf-provided extern variables Add support for extern variables, provided to BPF program by libbpf. Currently the following extern variables are supported: - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is executing, follows KERNEL_VERSION() macro convention, can be 4- and 8-byte long; - CONFIG_xxx values; a set of values of actual kernel config. Tristate, boolean, strings, and integer values are supported. Set of possible values is determined by declared type of extern variable. Supported types of variables are: - Tristate values. Are represented as `enum libbpf_tristate`. Accepted values are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO, or TRI_MODULE, respectively. - Boolean values. Are represented as bool (_Bool) types. Accepted values are 'y' and 'n' only, turning into true/false values, respectively. - Single-character values. Can be used both as a substritute for bool/tristate, or as a small-range integer: - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm'; - integers in a range [-128, 127] or [0, 255] (depending on signedness of char in target architecture) are recognized and represented with respective values of char type. - Strings. String values are declared as fixed-length char arrays. String of up to that length will be accepted and put in first N bytes of char array, with the rest of bytes zeroed out. If config string value is longer than space alloted, it will be truncated and warning message emitted. Char array is always zero terminated. String literals in config have to be enclosed in double quotes, just like C-style string literals. - Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and unsigned variants. Libbpf enforces parsed config value to be in the supported range of corresponding integer type. Integers values in config can be: - decimal integers, with optional + and - signs; - hexadecimal integers, prefixed with 0x or 0X; - octal integers, starting with 0. Config file itself is searched in /boot/config-$(uname -r) location with fallback to /proc/config.gz, unless config path is specified explicitly through bpf_object_open_opts' kernel_config_path option. Both gzipped and plain text formats are supported. Libbpf adds explicit dependency on zlib because of this, but this shouldn't be a problem, given libelf already depends on zlib. All detected extern variables, are put into a separate .extern internal map. It, similarly to .rodata map, is marked as read-only from BPF program side, as well as is frozen on load. This allows BPF verifier to track extern values as constants and perform enhanced branch prediction and dead code elimination. This can be relied upon for doing kernel version/feature detection and using potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF program, while still having a single version of BPF program running on old and new kernels. Selftests are validating this explicitly for unexisting BPF helper. Signed-off-by: Andrii Nakryiko Signed-off-by: Alexei Starovoitov Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com --- include/uapi/linux/btf.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h index c02dec97e1ce..1a2898c482ee 100644 --- a/include/uapi/linux/btf.h +++ b/include/uapi/linux/btf.h @@ -142,7 +142,8 @@ struct btf_param { enum { BTF_VAR_STATIC = 0, - BTF_VAR_GLOBAL_ALLOCATED, + BTF_VAR_GLOBAL_ALLOCATED = 1, + BTF_VAR_GLOBAL_EXTERN = 2, }; /* BTF_KIND_VAR is followed by a single "struct btf_var" to describe -- cgit v1.2.3 From a2ec8b5706944d228181c8b91d815f41d6dd8e7b Mon Sep 17 00:00:00 2001 From: Josh Soref Date: Sun, 15 Dec 2019 22:08:02 +0100 Subject: wireguard: global: fix spelling mistakes in comments This fixes two spelling errors in source code comments. Signed-off-by: Josh Soref [Jason: rewrote commit message] Signed-off-by: Jason A. Donenfeld Signed-off-by: David S. Miller --- include/uapi/linux/wireguard.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h index dd8a47c4ad11..ae88be14c947 100644 --- a/include/uapi/linux/wireguard.h +++ b/include/uapi/linux/wireguard.h @@ -18,13 +18,13 @@ * one but not both of: * * WGDEVICE_A_IFINDEX: NLA_U32 - * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 * * The kernel will then return several messages (NLM_F_MULTI) containing the * following tree of nested items: * * WGDEVICE_A_IFINDEX: NLA_U32 - * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 * WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN * WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN * WGDEVICE_A_LISTEN_PORT: NLA_U16 @@ -77,7 +77,7 @@ * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME: * * WGDEVICE_A_IFINDEX: NLA_U32 - * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1 + * WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1 * WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current * peers should be removed prior to adding the list below. * WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove @@ -121,7 +121,7 @@ * filling in information not contained in the prior. Note that if * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably * should not be specified in fragments that come after, so that the list - * of peers is only cleared the first time but appened after. Likewise for + * of peers is only cleared the first time but appended after. Likewise for * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message * of a peer, it likely should not be specified in subsequent fragments. * -- cgit v1.2.3 From b3b4346544b571c96d46be615b9db69a601ce4c8 Mon Sep 17 00:00:00 2001 From: "Andrew F. Davis" Date: Mon, 16 Dec 2019 08:34:04 -0500 Subject: dma-buf: heaps: Use _IOCTL_ for userspace IOCTL identifier This is more consistent with the DMA and DRM frameworks convention. This patch is only a name change, no logic is changed. Signed-off-by: Andrew F. Davis Acked-by: John Stultz Signed-off-by: Sumit Semwal Link: https://patchwork.freedesktop.org/patch/msgid/20191216133405.1001-2-afd@ti.com --- include/uapi/linux/dma-heap.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/dma-heap.h b/include/uapi/linux/dma-heap.h index 73e7f66c1cae..6f84fa08e074 100644 --- a/include/uapi/linux/dma-heap.h +++ b/include/uapi/linux/dma-heap.h @@ -42,12 +42,12 @@ struct dma_heap_allocation_data { #define DMA_HEAP_IOC_MAGIC 'H' /** - * DOC: DMA_HEAP_IOC_ALLOC - allocate memory from pool + * DOC: DMA_HEAP_IOCTL_ALLOC - allocate memory from pool * * Takes a dma_heap_allocation_data struct and returns it with the fd field * populated with the dmabuf handle of the allocation. */ -#define DMA_HEAP_IOC_ALLOC _IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\ +#define DMA_HEAP_IOCTL_ALLOC _IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\ struct dma_heap_allocation_data) #endif /* _UAPI_LINUX_DMABUF_POOL_H */ -- cgit v1.2.3 From 3431ca4837bf7f2b816e4c439db5bb66e7586746 Mon Sep 17 00:00:00 2001 From: Alexandre Belloni Date: Sat, 14 Dec 2019 23:02:43 +0100 Subject: rtc: define RTC_VL_READ values Currently, the meaning of the value returned by RTC_VL_READ is undocumented and left to the driver implementation. In order to get more meaningful values, define a set of values to use as to make clear to userspace what is the status of the various voltages feeding the RTC. Link: https://lore.kernel.org/r/20191214220259.621996-2-alexandre.belloni@bootlin.com Signed-off-by: Alexandre Belloni --- include/uapi/linux/rtc.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/rtc.h b/include/uapi/linux/rtc.h index 2ad1788968d0..095af360326a 100644 --- a/include/uapi/linux/rtc.h +++ b/include/uapi/linux/rtc.h @@ -92,7 +92,12 @@ struct rtc_pll_info { #define RTC_PLL_GET _IOR('p', 0x11, struct rtc_pll_info) /* Get PLL correction */ #define RTC_PLL_SET _IOW('p', 0x12, struct rtc_pll_info) /* Set PLL correction */ -#define RTC_VL_READ _IOR('p', 0x13, int) /* Voltage low detector */ +#define RTC_VL_DATA_INVALID BIT(0) /* Voltage too low, RTC data is invalid */ +#define RTC_VL_BACKUP_LOW BIT(1) /* Backup voltage is low */ +#define RTC_VL_BACKUP_EMPTY BIT(2) /* Backup empty or not present */ +#define RTC_VL_ACCURACY_LOW BIT(3) /* Voltage is low, RTC accuracy is reduced */ + +#define RTC_VL_READ _IOR('p', 0x13, unsigned int) /* Voltage low detection */ #define RTC_VL_CLR _IO('p', 0x14) /* Clear voltage low information */ /* interrupt flags */ -- cgit v1.2.3 From 2d602bf28316e2f61a553f13d279f3d74c2e5189 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 24 Oct 2019 16:34:25 +0200 Subject: acct: stop using get_seconds() In 'struct acct', 'struct acct_v3', and 'struct taskstats' we have a 32-bit 'ac_btime' field containing an absolute time value, which will overflow in year 2106. There are two possible ways to deal with it: a) let it overflow and have user space code deal with reconstructing the data based on the current time, or b) truncate the times based on the range of the u32 type. Neither of them solves the actual problem. Pick the second one to best document what the issue is, and have someone fix it in a future version. Signed-off-by: Arnd Bergmann --- include/uapi/linux/acct.h | 2 ++ include/uapi/linux/taskstats.h | 1 + 2 files changed, 3 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/acct.h b/include/uapi/linux/acct.h index 0e72172cd23a..985b89068591 100644 --- a/include/uapi/linux/acct.h +++ b/include/uapi/linux/acct.h @@ -49,6 +49,7 @@ struct acct __u16 ac_uid16; /* LSB of Real User ID */ __u16 ac_gid16; /* LSB of Real Group ID */ __u16 ac_tty; /* Control Terminal */ + /* __u32 range means times from 1970 to 2106 */ __u32 ac_btime; /* Process Creation Time */ comp_t ac_utime; /* User Time */ comp_t ac_stime; /* System Time */ @@ -81,6 +82,7 @@ struct acct_v3 __u32 ac_gid; /* Real Group ID */ __u32 ac_pid; /* Process ID */ __u32 ac_ppid; /* Parent Process ID */ + /* __u32 range means times from 1970 to 2106 */ __u32 ac_btime; /* Process Creation Time */ #ifdef __KERNEL__ __u32 ac_etime; /* Elapsed Time */ diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h index 5e8ca16a9079..7d3ea366e93b 100644 --- a/include/uapi/linux/taskstats.h +++ b/include/uapi/linux/taskstats.h @@ -112,6 +112,7 @@ struct taskstats { __u32 ac_gid; /* Group ID */ __u32 ac_pid; /* Process ID */ __u32 ac_ppid; /* Parent process ID */ + /* __u32 range means times from 1970 to 2106 */ __u32 ac_btime; /* Begin time [sec since 1970] */ __u64 ac_etime __attribute__((aligned(8))); /* Elapsed time [usec] */ -- cgit v1.2.3 From 352c912b0a525977a8e6fa1f87c15d9f71943642 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Thu, 24 Oct 2019 16:47:56 +0200 Subject: tsacct: add 64-bit btime field As there is only a 32-bit ac_btime field in taskstat and we should handle dates after the overflow, add a new field with the same information but 64-bit width that can hold a full time64_t. Signed-off-by: Arnd Bergmann --- include/uapi/linux/taskstats.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h index 7d3ea366e93b..ccbd08709321 100644 --- a/include/uapi/linux/taskstats.h +++ b/include/uapi/linux/taskstats.h @@ -34,7 +34,7 @@ */ -#define TASKSTATS_VERSION 9 +#define TASKSTATS_VERSION 10 #define TS_COMM_LEN 32 /* should be >= TASK_COMM_LEN * in linux/sched.h */ @@ -169,6 +169,9 @@ struct taskstats { /* Delay waiting for thrashing page */ __u64 thrashing_count; __u64 thrashing_delay_total; + + /* v10: 64-bit btime to avoid overflow */ + __u64 ac_btime64; /* 64-bit begin time */ }; -- cgit v1.2.3 From 4f9fbd893fe83a1193adceca41c8f7aa6c7382a1 Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Fri, 15 Nov 2019 15:53:29 +0100 Subject: y2038: rename itimerval to __kernel_old_itimerval Take the renaming of timeval and timespec one level further, also renaming itimerval to __kernel_old_itimerval, to avoid namespace conflicts with the user-space structure that may use 64-bit time_t members. Signed-off-by: Arnd Bergmann --- include/uapi/linux/time_types.h | 5 +++++ 1 file changed, 5 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/time_types.h b/include/uapi/linux/time_types.h index 074e391d73a1..bcc0002115d3 100644 --- a/include/uapi/linux/time_types.h +++ b/include/uapi/linux/time_types.h @@ -33,6 +33,11 @@ struct __kernel_old_timespec { long tv_nsec; /* nanoseconds */ }; +struct __kernel_old_itimerval { + struct __kernel_old_timeval it_interval;/* timer interval */ + struct __kernel_old_timeval it_value; /* current value */ +}; + struct __kernel_sock_timeval { __s64 tv_sec; __s64 tv_usec; -- cgit v1.2.3 From 251ec1c159e4874fbede0c3c586e317e177c0c9b Mon Sep 17 00:00:00 2001 From: Arnd Bergmann Date: Wed, 11 Dec 2019 21:07:23 +0100 Subject: y2038: sparc: remove use of struct timex 'struct timex' is one of the last users of 'struct timeval' and is only referenced in one place in the kernel any more, to convert the user space timex into the kernel-internal version on sparc64, with a different tv_usec member type. As a preparation for hiding the time_t definition and everything using that in the kernel, change the implementation once more to only convert the timeval member, and then enclose the struct definition in an #ifdef. Signed-off-by: Arnd Bergmann --- include/uapi/linux/timex.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/timex.h b/include/uapi/linux/timex.h index 9f517f9010bb..bd627c368d09 100644 --- a/include/uapi/linux/timex.h +++ b/include/uapi/linux/timex.h @@ -57,6 +57,7 @@ #define NTP_API 4 /* NTP API version */ +#ifndef __KERNEL__ /* * syscall interface - used (mainly by NTP daemon) * to discipline kernel clock oscillator @@ -91,6 +92,7 @@ struct timex { int :32; int :32; int :32; int :32; int :32; int :32; int :32; }; +#endif struct __kernel_timex_timeval { __kernel_time64_t tv_sec; -- cgit v1.2.3 From dcc68b4d8084e1ac9af0d4022d6b1aff6a139a33 Mon Sep 17 00:00:00 2001 From: Petr Machata Date: Wed, 18 Dec 2019 14:55:13 +0000 Subject: net: sch_ets: Add a new Qdisc Introduces a new Qdisc, which is based on 802.1Q-2014 wording. It is PRIO-like in how it is configured, meaning one needs to specify how many bands there are, how many are strict and how many are dwrr, quanta for the latter, and priomap. The new Qdisc operates like the PRIO / DRR combo would when configured as per the standard. The strict classes, if any, are tried for traffic first. When there's no traffic in any of the strict queues, the ETS ones (if any) are treated in the same way as in DRR. Signed-off-by: Petr Machata Acked-by: Jiri Pirko Signed-off-by: David S. Miller --- include/uapi/linux/pkt_sched.h | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h index 9f1a72876212..bf5a5b1dfb0b 100644 --- a/include/uapi/linux/pkt_sched.h +++ b/include/uapi/linux/pkt_sched.h @@ -1187,4 +1187,21 @@ enum { #define TCA_TAPRIO_ATTR_MAX (__TCA_TAPRIO_ATTR_MAX - 1) +/* ETS */ + +#define TCQ_ETS_MAX_BANDS 16 + +enum { + TCA_ETS_UNSPEC, + TCA_ETS_NBANDS, /* u8 */ + TCA_ETS_NSTRICT, /* u8 */ + TCA_ETS_QUANTA, /* nested TCA_ETS_QUANTA_BAND */ + TCA_ETS_QUANTA_BAND, /* u32 */ + TCA_ETS_PRIOMAP, /* nested TCA_ETS_PRIOMAP_BAND */ + TCA_ETS_PRIOMAP_BAND, /* u8 */ + __TCA_ETS_MAX, +}; + +#define TCA_ETS_MAX (__TCA_ETS_MAX - 1) + #endif -- cgit v1.2.3 From 7dd68b3279f1792103d12e69933db3128c6d416e Mon Sep 17 00:00:00 2001 From: Andrey Ignatov Date: Wed, 18 Dec 2019 23:44:35 -0800 Subject: bpf: Support replacing cgroup-bpf program in MULTI mode The common use-case in production is to have multiple cgroup-bpf programs per attach type that cover multiple use-cases. Such programs are attached with BPF_F_ALLOW_MULTI and can be maintained by different people. Order of programs usually matters, for example imagine two egress programs: the first one drops packets and the second one counts packets. If they're swapped the result of counting program will be different. It brings operational challenges with updating cgroup-bpf program(s) attached with BPF_F_ALLOW_MULTI since there is no way to replace a program: * One way to update is to detach all programs first and then attach the new version(s) again in the right order. This introduces an interruption in the work a program is doing and may not be acceptable (e.g. if it's egress firewall); * Another way is attach the new version of a program first and only then detach the old version. This introduces the time interval when two versions of same program are working, what may not be acceptable if a program is not idempotent. It also imposes additional burden on program developers to make sure that two versions of their program can co-exist. Solve the problem by introducing a "replace" mode in BPF_PROG_ATTACH command for cgroup-bpf programs being attached with BPF_F_ALLOW_MULTI flag. This mode is enabled by newly introduced BPF_F_REPLACE attach flag and bpf_attr.replace_bpf_fd attribute to pass fd of the old program to replace That way user can replace any program among those attached with BPF_F_ALLOW_MULTI flag without the problems described above. Details of the new API: * If BPF_F_REPLACE is set but replace_bpf_fd doesn't have valid descriptor of BPF program, BPF_PROG_ATTACH will return corresponding error (EINVAL or EBADF). * If replace_bpf_fd has valid descriptor of BPF program but such a program is not attached to specified cgroup, BPF_PROG_ATTACH will return ENOENT. BPF_F_REPLACE is introduced to make the user intent clear, since replace_bpf_fd alone can't be used for this (its default value, 0, is a valid fd). BPF_F_REPLACE also makes it possible to extend the API in the future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed). Signed-off-by: Andrey Ignatov Signed-off-by: Alexei Starovoitov Acked-by: Martin KaFai Lau Acked-by: Andrii Narkyiko Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com --- include/uapi/linux/bpf.h | 10 ++++++++++ 1 file changed, 10 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index dbbcf0b02970..7df436da542d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -231,6 +231,11 @@ enum bpf_attach_type { * When children program makes decision (like picking TCP CA or sock bind) * parent program has a chance to override it. * + * With BPF_F_ALLOW_MULTI a new program is added to the end of the list of + * programs for a cgroup. Though it's possible to replace an old program at + * any position by also specifying BPF_F_REPLACE flag and position itself in + * replace_bpf_fd attribute. Old program at this position will be released. + * * A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups. * A cgroup with NONE doesn't allow any programs in sub-cgroups. * Ex1: @@ -249,6 +254,7 @@ enum bpf_attach_type { */ #define BPF_F_ALLOW_OVERRIDE (1U << 0) #define BPF_F_ALLOW_MULTI (1U << 1) +#define BPF_F_REPLACE (1U << 2) /* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the * verifier will perform strict alignment checking as if the kernel @@ -442,6 +448,10 @@ union bpf_attr { __u32 attach_bpf_fd; /* eBPF program to attach */ __u32 attach_type; __u32 attach_flags; + __u32 replace_bpf_fd; /* previously attached eBPF + * program to replace if + * BPF_F_REPLACE is used + */ }; struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */ -- cgit v1.2.3 From e1b5e598e5a51b453328879682b178b4acc15105 Mon Sep 17 00:00:00 2001 From: John Rutherford Date: Thu, 19 Dec 2019 16:03:57 +1100 Subject: tipc: make legacy address flag readable over netlink To enable iproute2/tipc to generate backwards compatible printouts and validate command parameters for nodes using a node address, it needs to be able to read the legacy address flag from the kernel. The legacy address flag records the way in which the node identity was originally specified. The legacy address flag is requested by the netlink message TIPC_NL_ADDR_LEGACY_GET. If the flag is set the attribute TIPC_NLA_NET_ADDR_LEGACY is set in the return message. Signed-off-by: John Rutherford Acked-by: Jon Maloy Signed-off-by: David S. Miller --- include/uapi/linux/tipc_netlink.h | 2 ++ 1 file changed, 2 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h index 6c2194ab745b..dc0d23a50e69 100644 --- a/include/uapi/linux/tipc_netlink.h +++ b/include/uapi/linux/tipc_netlink.h @@ -65,6 +65,7 @@ enum { TIPC_NL_UDP_GET_REMOTEIP, TIPC_NL_KEY_SET, TIPC_NL_KEY_FLUSH, + TIPC_NL_ADDR_LEGACY_GET, __TIPC_NL_CMD_MAX, TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1 @@ -176,6 +177,7 @@ enum { TIPC_NLA_NET_ADDR, /* u32 */ TIPC_NLA_NET_NODEID, /* u64 */ TIPC_NLA_NET_NODEID_W1, /* u64 */ + TIPC_NLA_NET_ADDR_LEGACY, /* flag */ __TIPC_NLA_NET_MAX, TIPC_NLA_NET_MAX = __TIPC_NLA_NET_MAX - 1 -- cgit v1.2.3 From f66b53fdbb22ced1a323b22b9de84a61aacd8d18 Mon Sep 17 00:00:00 2001 From: Martin Varghese Date: Sat, 21 Dec 2019 08:50:46 +0530 Subject: openvswitch: New MPLS actions for layer 2 tunnelling The existing PUSH MPLS action inserts MPLS header between ethernet header and the IP header. Though this behaviour is fine for L3 VPN where an IP packet is encapsulated inside a MPLS tunnel, it does not suffice the L2 VPN (l2 tunnelling) requirements. In L2 VPN the MPLS header should encapsulate the ethernet packet. The new mpls action ADD_MPLS inserts MPLS header at the start of the packet or at the start of the l3 header depending on the value of l3 tunnel flag in the ADD_MPLS arguments. POP_MPLS action is extended to support ethertype 0x6558. Signed-off-by: Martin Varghese Acked-by: Pravin B Shelar Signed-off-by: David S. Miller --- include/uapi/linux/openvswitch.h | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) (limited to 'include/uapi/linux') diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h index a87b44cd5590..ae2bff14e7e1 100644 --- a/include/uapi/linux/openvswitch.h +++ b/include/uapi/linux/openvswitch.h @@ -672,6 +672,32 @@ struct ovs_action_push_mpls { __be16 mpls_ethertype; /* Either %ETH_P_MPLS_UC or %ETH_P_MPLS_MC */ }; +/** + * struct ovs_action_add_mpls - %OVS_ACTION_ATTR_ADD_MPLS action + * argument. + * @mpls_lse: MPLS label stack entry to push. + * @mpls_ethertype: Ethertype to set in the encapsulating ethernet frame. + * @tun_flags: MPLS tunnel attributes. + * + * The only values @mpls_ethertype should ever be given are %ETH_P_MPLS_UC and + * %ETH_P_MPLS_MC, indicating MPLS unicast or multicast. Other are rejected. + */ +struct ovs_action_add_mpls { + __be32 mpls_lse; + __be16 mpls_ethertype; /* Either %ETH_P_MPLS_UC or %ETH_P_MPLS_MC */ + __u16 tun_flags; +}; + +#define OVS_MPLS_L3_TUNNEL_FLAG_MASK (1 << 0) /* Flag to specify the place of + * insertion of MPLS header. + * When false, the MPLS header + * will be inserted at the start + * of the packet. + * When true, the MPLS header + * will be inserted at the start + * of the l3 header. + */ + /** * struct ovs_action_push_vlan - %OVS_ACTION_ATTR_PUSH_VLAN action argument. * @vlan_tpid: Tag protocol identifier (TPID) to push. @@ -892,6 +918,10 @@ struct check_pkt_len_arg { * @OVS_ACTION_ATTR_CHECK_PKT_LEN: Check the packet length and execute a set * of actions if greater than the specified packet length, else execute * another set of actions. + * @OVS_ACTION_ATTR_ADD_MPLS: Push a new MPLS label stack entry at the + * start of the packet or at the start of the l3 header depending on the value + * of l3 tunnel flag in the tun_flags field of OVS_ACTION_ATTR_ADD_MPLS + * argument. * * Only a single header can be set with a single %OVS_ACTION_ATTR_SET. Not all * fields within a header are modifiable, e.g. the IPv4 protocol and fragment @@ -927,6 +957,7 @@ enum ovs_action_attr { OVS_ACTION_ATTR_METER, /* u32 meter ID. */ OVS_ACTION_ATTR_CLONE, /* Nested OVS_CLONE_ATTR_*. */ OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */ + OVS_ACTION_ATTR_ADD_MPLS, /* struct ovs_action_add_mpls. */ __OVS_ACTION_ATTR_MAX, /* Nothing past this will be a