Age | Commit message | Author | Files | Lines
2026-02-06 | landlock: Document LANDLOCK_RESTRICT_SELF_TSYNC | Günther Noack | 1 | -1/+9
Add documentation for LANDLOCK_RESTRICT_SELF_TSYNC. It does not need to go into the main example, but it has a section in the ABI compatibility notes. In the HTML rendering, the main reference is the system call documentation, which is included from the landlock.h header file. Cc: Andrew G. Morgan <morgan@kernel.org> Cc: John Johansen <john.johansen@canonical.com> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20251127115136.3064948-4-gnoack@google.com [mic: Update date] Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-02-06 | selftests/landlock: Add LANDLOCK_RESTRICT_SELF_TSYNC tests | Günther Noack | 2 | -2/+163
Exercise various scenarios where Landlock domains are enforced across all of a process's threads. Test coverage for security/landlock is 91.6% of 2130 lines according to LLVM 21. Cc: Andrew G. Morgan <morgan@kernel.org> Cc: John Johansen <john.johansen@canonical.com> Cc: Paul Moore <paul@paul-moore.com> Signed-off-by: Günther Noack <gnoack@google.com> Link: https://lore.kernel.org/r/20251127115136.3064948-3-gnoack@google.com [mic: Fix subject, use EXPECT_EQ(close()), make helpers static, add test coverage] Signed-off-by: Mickaël Salaün <mic@digikod.net>
2026-02-06 | landlock: Multithreading support for landlock_restrict_self() | Günther Noack | 8 | -30/+650
Introduce the LANDLOCK_RESTRICT_SELF_TSYNC flag. With this flag, a given Landlock ruleset is applied to all threads of the calling process, instead of only the current one.

Without this flag, multithreaded userspace programs currently resort to using the nptl(7)/libpsx hack for multithreaded policy enforcement, which is also used by libcap and for setuid(2).

Using this userspace-based scheme, the threads of a process enforce the same Landlock policy, but the resulting Landlock domains are still separate. The domains being separate causes multiple problems:

* When using Landlock's "scoped" access rights, the domain identity is used to determine whether an operation is permitted. As a result, when using LANDLOCK_SCOPE_SIGNAL, signaling between sibling threads stops working. This is a problem for programming languages and frameworks which are inherently multithreaded (e.g. Go).
* In audit logging, the domains of separate threads in a process will get logged with different domain IDs, even when they are based on the same ruleset FD, which might confuse users.

Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20251127115136.3064948-2-gnoack@google.com
[mic: Fix restrict_self_flags test, clean up Makefile, align comments, reduce local variable scope, add missing includes]
Closes: https://github.com/landlock-lsm/linux/issues/2
Signed-off-by: Mickaël Salaün <mic@digikod.net>
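A minimal userspace sketch of how a caller might pass the new flag to landlock_restrict_self(). The helper name sketch_restrict_all_threads() is hypothetical, the flag value (1U << 3) is an assumption that should be checked against the landlock.h header of the target kernel, and the fallback syscall number 446 is the one used on most architectures. Creating the ruleset FD is not shown.

```c
#define _GNU_SOURCE
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: syscall number on architectures using the common table. */
#ifndef SYS_landlock_restrict_self
#define SYS_landlock_restrict_self 446
#endif

/* Assumption: flag bit value; verify against the kernel's landlock.h. */
#ifndef LANDLOCK_RESTRICT_SELF_TSYNC
#define LANDLOCK_RESTRICT_SELF_TSYNC (1U << 3)
#endif

/*
 * Hypothetical helper: enforce an existing ruleset FD on every thread of
 * the calling process. Returns 0 on success, -1 with errno set otherwise.
 */
static int sketch_restrict_all_threads(int ruleset_fd)
{
	/* Unprivileged callers must set no_new_privs before restricting. */
	if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0))
		return -1;

	return (int)syscall(SYS_landlock_restrict_self, ruleset_fd,
			    LANDLOCK_RESTRICT_SELF_TSYNC);
}
```

On a kernel without the flag the call is expected to fail, so a caller would typically fall back to per-thread enforcement or report the error.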
2026-02-06 | perf regs: Remove __weak attributive arch_sdt_arg_parse_op() function | Dapeng Mi | 12 | -473/+441
In line with the previous patch, the __weak arch_sdt_arg_parse_op() function is removed. Architecture-specific implementations in the arch/ directory are now converted into sub-functions within the util/perf-regs-arch/ directory. The perf_sdt_arg_parse_op() function will call these sub-functions based on the EM_HOST. This change enables cross-architecture calls to arch_sdt_arg_parse_op(). No functional changes are intended. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> [ Fixed up some fuzz with powerpc and x86 Build files wrt removing perf_regs.o ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | perf regs: Remove __weak attributive arch__xxx_reg_mask() functions | Dapeng Mi | 30 | -236/+332
Currently, some architecture-specific perf-regs functions, such as arch__intr_reg_mask() and arch__user_reg_mask(), are defined with the __weak attribute. This approach ensures that only functions matching the architecture of the build/run host are compiled and executed, reducing build time and binary size. However, this __weak attribute restricts these functions to be called only on the same architecture, preventing cross-architecture functionality. For example, a perf.data file captured on x86 cannot be parsed on an ARM platform. To address this limitation, this patch removes the __weak attribute from these perf-regs functions. The architecture-specific code is moved from the arch/ directory to the util/perf-regs-arch/ directory. The appropriate architectural functions are then called based on the EM_HOST. No functional changes are intended. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> [ Fixed up some fuzz with s390 and riscv Build files wrt removing perf_regs.o ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
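The move from link-time selection via __weak to run-time selection by ELF machine type can be sketched as below. This is only an illustration of the dispatch pattern: every sketch_* name is hypothetical, the mask values are placeholders rather than perf's real register masks, and only the EM_* constants come from <elf.h>.

```c
#include <elf.h>
#include <stdint.h>

/* Hypothetical per-architecture helpers; the masks are placeholders. */
static uint64_t sketch_user_reg_mask_x86(void)
{
	return (1ULL << 24) - 1;
}

static uint64_t sketch_user_reg_mask_arm64(void)
{
	return (1ULL << 33) - 1;
}

/*
 * All architecture helpers are compiled in. The matching one is picked at
 * run time from the ELF machine type instead of at link time via __weak.
 */
static uint64_t sketch_user_reg_mask(uint16_t e_machine)
{
	switch (e_machine) {
	case EM_386:
	case EM_X86_64:
		return sketch_user_reg_mask_x86();
	case EM_AARCH64:
		return sketch_user_reg_mask_arm64();
	default:
		return 0; /* unknown architecture: no registers */
	}
}
```

With this shape, a tool built on one architecture can still answer questions about data recorded on another, at the cost of carrying every architecture's helpers in the binary.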
2026-02-06 | perf arch: Update arch headers to use relative UAPI paths | Dapeng Mi | 9 | -9/+9
The architectural specific headers perf_regs.h currently rely on the host architecture's 'asm/perf_regs.h'. This can lead to compilation inconsistencies or failures when including and building perf for a target architecture that differs from the host's architecture. Explicitly point to the UAPI headers within the tools source tree using relative paths. This ensures that perf is always built against the intended architecture. No functional changes are intended. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | perf regs: Fix abort for "-I" or "--user-regs" options | Dapeng Mi | 1 | -7/+6
Fix an issue where the `perf` tool aborts unexpectedly when running the following command:

```
perf record -e cycles -I -- true

 Usage: perf record [<options>] [<command>]
    or: perf record [<options>] -- <command> [<options>]

    -I, --intr-regs[=<any register>]
                          sample selected machine registers on interrupt, use '-I?' to list register names
```

The usage of the `-I` or `--user-regs` options without specifying any registers should default to sampling all general-purpose registers. However, this currently causes an abnormal termination. The issue was introduced by commit 3d06db9bad1a ("perf regs: Refactor use of arch__sample_reg_masks() to perf_reg_name()"). This patch resolves the problem, ensuring that the `-I` or `--user-regs` options work as intended without causing an abort.

Fixes: 3d06db9bad1ad8e6 ("perf regs: Refactor use of arch__sample_reg_masks() to perf_reg_name()") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Guo Ren <guoren@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: John Garry <john.g.garry@oracle.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Xudong Hao <xudong.hao@intel.com> Cc: Zide Chen <zide.chen@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | Revert "revocable: Revocable resource management" | Johan Hovold | 5 | -471/+0
This reverts commit 62eb557580eb2177cf16c3fd2b6efadff297b29a.

The revocable implementation uses two separate abstractions, struct revocable_provider and struct revocable, in order to store the SRCU read lock index which must be passed unaltered to srcu_read_unlock() in the same context when a resource is no longer needed.

With the merged revocable API, multiple threads could however share the same struct revocable and therefore potentially overwrite the SRCU index of another thread which can cause the SRCU synchronisation in revocable_provider_revoke() to never complete. [1]

An example revocable conversion of the gpiolib code also turned out to be fundamentally flawed and could lead to use-after-free. [2]

An attempt to address both issues was quickly put together and merged, but revocable is still fundamentally broken. [3]

Specifically, the latest design relies on RCU for storing a pointer to the revocable provider, but since the resource can be shared by value (e.g. as in the now reverted selftests) this does not work at all and can also lead to use-after-free:

	static void revocable_provider_release(struct kref *kref)
	{
		struct revocable_provider *rp = container_of(kref,
				struct revocable_provider, kref);

		cleanup_srcu_struct(&rp->srcu);
		kfree_rcu(rp, rcu);
	}

	void revocable_provider_revoke(struct revocable_provider __rcu **rp_ptr)
	{
		struct revocable_provider *rp;

		rp = rcu_replace_pointer(*rp_ptr, NULL, 1);
		...
		kref_put(&rp->kref, revocable_provider_release);
	}

	int revocable_init(struct revocable_provider __rcu *_rp,
			   struct revocable *rev)
	{
		struct revocable_provider *rp;

		...
		scoped_guard(rcu) {
			rp = rcu_dereference(_rp);
			if (!rp)
				return -ENODEV;

			if (!kref_get_unless_zero(&rp->kref))
				return -ENODEV;
		}
		...
	}

producer:

	priv->rp = revocable_provider_alloc(&priv->res);
	// pass priv->rp by value to consumer
	revocable_provider_revoke(&priv->rp);

consumer:

	struct revocable_provider __rcu *rp = filp->private_data;
	struct revocable *rev;

	revocable_init(rp, &rev);

as _rp would still be non-NULL in revocable_init() regardless of whether the producer has revoked the resource and set its pointer to NULL.

Essentially revocable still relies on having a pointer to reference counted driver data which holds the revocable provider, which makes all the RCU protection unnecessary along with most of the current revocable design and implementation.

As the above shows, and as has been pointed out repeatedly elsewhere, these kinds of issues are not something that should be addressed incrementally. [4]

Revert the revocable implementation until a redesign has been proposed and evaluated properly.

Link: https://lore.kernel.org/all/20260124170535.11756-4-johan@kernel.org/ [1]
Link: https://lore.kernel.org/all/aXT45B6vLf9R3Pbf@hovoldconsulting.com/ [2]
Link: https://lore.kernel.org/all/20260129143733.45618-1-tzungbi@kernel.org/ [3]
Link: https://lore.kernel.org/all/aXobzoeooJqxMkEj@hovoldconsulting.com/ [4]
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://patch.msgid.link/20260204142849.22055-4-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
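The by-value problem described in this revert can be reproduced in plain userspace C without RCU or reference counting. The sketch below is a simplified model with hypothetical sketch_* names: the producer clears its own pointer on revoke, while a consumer that received a copy of the pointer value never observes the change. Nothing is freed here, so the sketch only demonstrates the stale pointer, not the resulting use-after-free.

```c
#include <stddef.h>

struct sketch_provider {
	int res;
};

struct sketch_producer {
	struct sketch_provider *rp;
};

/* Models the producer clearing its own pointer on revoke (no freeing). */
static void sketch_revoke(struct sketch_provider **rp_ptr)
{
	*rp_ptr = NULL;
}

/* Returns 1 if a consumer holding a by-value copy still sees a provider. */
static int sketch_consumer_sees_provider(void)
{
	static struct sketch_provider provider = { .res = 42 };
	struct sketch_producer priv = { .rp = &provider };

	/* The consumer receives the pointer by value, e.g. via private_data. */
	struct sketch_provider *consumer_copy = priv.rp;

	sketch_revoke(&priv.rp);

	/* priv.rp is NULL now, but the consumer's copy was never updated. */
	return priv.rp == NULL && consumer_copy != NULL;
}
```

A NULL check on the consumer's copy therefore cannot detect revocation; only a check made through the producer's own storage could.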
2026-02-06 | Revert "revocable: Add Kunit test cases" | Johan Hovold | 4 | -296/+0
This reverts commit cd7693419bb5abd91ad2f407dab69c480e417a61. The new revocable functionality is fundamentally broken and at a minimum needs to be redesigned. Drop the revocable Kunit tests to allow the implementation to be reverted. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260204142849.22055-3-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-06 | Revert "selftests: revocable: Add kselftest cases" | Johan Hovold | 6 | -380/+0
This reverts commit 9d4502fef00fa7a798d3c0806d4da4466a7ffc6f. The new revocable functionality is fundamentally broken and at a minimum needs to be redesigned. Drop the revocable selftests to allow the implementation to be reverted. Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://patch.msgid.link/20260204142849.22055-2-johan@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-06 | perf metricgroup: Don't early exit if no CPUID table exists | Ian Rogers | 1 | -13/+5
The failure to find a table of metrics with a CPUID shouldn't early exit as the metric code will now also consider the default table. When searching for a metric or metric group, pmu_metrics_table__for_each_metric() considers all tables and so the caller doesn't need to switch the table to do this. Fixes: c7adeb0974f18da4 ("perf jevents: Add set of common metrics based on default ones") Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | perf tests: build-test coverage for NO_JEVENTS=1 | Ian Rogers | 1 | -0/+2
Leo reported 'perf stat' being broken and this highlighted that the 'make NO_JEVENTS=1' variant is missing from 'make -C tools/perf build-test', add it. Closes: https://lore.kernel.org/linux-perf-users/20260205175250.GC3529712@e132581.arm.com/ Reported-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | perf tests: Additional 'perf stat' tests | Ian Rogers | 1 | -0/+242
Recently 'perf stat' regressed in per CPU mode [1]. Let's expand test coverage to catch the same breakage again as well as to test the repeat, pid, detailed and no aggregation options. [1] https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andres Freund <andres@anarazel.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | perf record: Make logs more readable for event open failures | Leo Yan | 1 | -0/+1
Since commit ee27476fa3004f83 ("perf record: Skip don't fail for events that don't open"), if a user does not have permission to access a PMU event, perf reports:

  perf record -e cs_etm// -C 3 -- ls

  Error:
  Failure to open event 'cs_etm//u' on PMU 'cs_etm' which will be removed.
  No fallback found for 'cs_etm//u' for error 13
  Error:
  Failure to open event 'dummy:u' on PMU 'software' which will be removed.
  No fallback found for 'dummy:u' for error 13
  Error:
  Failure to open any events for recording.

The log is not very helpful, as it gives no clear indication of what "error 13" means or how to address the issue.

This commit restores evsel__open_strerror() to generate a readable error message and print it out:

  perf record -e cs_etm// -C 3 -- ls

  Error:
  Failure to open event 'cs_etm//' on PMU 'cs_etm' which will be removed.
  Access to performance monitoring and observability operations is limited.
  Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
  access to performance monitoring and observability operations for processes
  without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
  More information can be found at 'Perf events and tool security' document:
  https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
  perf_event_paranoid setting is 1:
    -1: Allow use of (almost) all events by all users
        Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
  >= 0: Disallow raw and ftrace function tracepoint access
  >= 1: Disallow CPU event access
  >= 2: Disallow kernel profiling
  To make the adjusted perf_event_paranoid setting permanent preserve it
  in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
  Error:
  Failure to open event 'dummy:u' on PMU 'software' which will be removed.
  Access to performance monitoring and observability operations is limited.
  Consider adjusting /proc/sys/kernel/perf_event_paranoid setting to open
  access to performance monitoring and observability operations for processes
  without CAP_PERFMON, CAP_SYS_PTRACE or CAP_SYS_ADMIN Linux capability.
  More information can be found at 'Perf events and tool security' document:
  https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
  perf_event_paranoid setting is 1:
    -1: Allow use of (almost) all events by all users
        Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
  >= 0: Disallow raw and ftrace function tracepoint access
  >= 1: Disallow CPU event access
  >= 2: Disallow kernel profiling
  To make the adjusted perf_event_paranoid setting permanent preserve it
  in /etc/sysctl.conf (e.g. kernel.perf_event_paranoid = <setting>)
  Error:
  Failure to open any events for recording.

Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-06 | usb: typec: ucsi: Add Thunderbolt alternate mode support | Andrei Kuchynski | 4 | -5/+249
Introduce support for Thunderbolt (TBT) alternate mode to the UCSI driver. This allows the driver to manage the entry and exit of TBT altmode by providing the necessary typec_altmode_ops. ucsi_altmode_update_active() is invoked when the Connector Partner Changed bit is set in the GET_CONNECTOR_STATUS data. This ensures that the alternate mode's active state is synchronized with the current mode the connector is operating in. Signed-off-by: Andrei Kuchynski <akuchynski@chromium.org> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://patch.msgid.link/20260206115754.828954-1-akuchynski@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-06 | io_uring: allow registration of per-task restrictions | Jens Axboe | 8 | -1/+231
Currently io_uring supports restricting operations on a per-ring basis. To use those, the ring must be set up in a disabled state by setting IORING_SETUP_R_DISABLED. Then restrictions can be set for the ring, and the ring can then be enabled. This commit adds support for IORING_REGISTER_RESTRICTIONS with ring_fd == -1, like the other "blind" register opcodes which work on the task rather than a specific ring. This allows registration of the same kind of restrictions as can be done on a specific ring, but with the task itself. Once done, any ring created will inherit these restrictions. If a restriction filter is registered with a task, then it's inherited on fork for its children. Children may only further restrict operations, not extend them. Inherited restrictions include both the classic IORING_REGISTER_RESTRICTIONS based restrictions, as well as the BPF filters that have been registered with the task via IORING_REGISTER_BPF_FILTER. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-06 | io_uring: add task fork hook | Jens Axboe | 4 | -10/+36
Called when copy_process() is called to copy state to a new child. Right now this is just a stub, but will be used shortly to properly handle fork'ing of task based io_uring restrictions. Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-06 | mtd: spi-nor: hisi-sfc: fix refcounting bug in hisi_spi_nor_register_all() | Dan Carpenter | 1 | -1/+0
This was converted to a _scoped() loop but this of_node_put() was accidentally left behind which is a double free. Fixes: aa8cb72c2018 ("mtd: spi-nor: hisi-sfc: Simplify with scoped for each OF child loop") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
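The bug class can be modelled in userspace with the GCC/Clang cleanup attribute, which is the mechanism scoped helpers build on. The sketch below uses hypothetical sketch_* names and a plain integer in place of a real device-tree node refcount. It shows that once a scope drops the reference automatically, a leftover manual put drops it a second time.

```c
struct sketch_node {
	int refcount;
};

static void sketch_node_put(struct sketch_node *n)
{
	n->refcount--;
}

/* Cleanup handler: receives a pointer to the annotated variable. */
static void sketch_auto_put(struct sketch_node **np)
{
	if (*np)
		sketch_node_put(*np);
}

/* Returns the refcount after one scope that takes a single reference. */
static int sketch_refcount_after_scope(int extra_manual_put)
{
	struct sketch_node node = { .refcount = 1 };

	node.refcount++; /* reference taken for the scope */
	{
		struct sketch_node *child
			__attribute__((cleanup(sketch_auto_put))) = &node;

		if (extra_manual_put)
			sketch_node_put(child); /* the leftover call */
	} /* sketch_auto_put() runs here and drops the reference */

	return node.refcount;
}
```

Without the leftover call the count returns to its initial value of 1. With it, the count reaches 0 while the original owner still believes it holds a reference.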
2026-02-06 | sparc: remove unused variable strtab | Alex Shi | 1 | -2/+0
The commit 1b35a57b1c178 ("sparc32: Kill off software 32-bit multiply/divide routines") removed the last usage of strtab in function module_frob_arch_sections. Therefore, it can be removed now. Reported-by: kernel test robot <lkp@intel.com> Cc: sparclinux@vger.kernel.org Cc: David S. Miller <davem@davemloft.net> Cc: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Alex Shi <alexs@kernel.org> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2026-02-06 | sparc64: fix unused variable warning | Alex Shi | 1 | -6/+0
arch/sparc/mm/init_64.c: In function 'arch_hugetlb_valid_size':
arch/sparc/mm/init_64.c:361:24: warning: variable 'hv_pgsz_idx' set but not used [-Wunused-but-set-variable]
  361 |         unsigned short hv_pgsz_idx;
      |                        ^~~~~~~~~~~

Reported-by: kernel test robot <lkp@intel.com> Cc: sparclinux@vger.kernel.org CC: Nitin Gupta <nitin.m.gupta@oracle.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Alex Shi <alexs@kernel.org> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2026-02-06 | sparc: don't reference obsolete termio struct for TC* constants | Sam James | 1 | -4/+4
Similar in nature to commit ab107276607a ("powerpc: Fix struct termio related ioctl macros"). glibc-2.42 drops the legacy termio struct, but the ioctls.h header still defines some TC* constants in terms of termio (via sizeof). Hardcode the values instead. This fixes building Python for example, which falls over like: ./Modules/termios.c:1119:16: error: invalid application of 'sizeof' to incomplete type 'struct termio' Link: https://bugs.gentoo.org/961769 Link: https://bugs.gentoo.org/962600 Signed-off-by: Sam James <sam@gentoo.org> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com>
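Why hardcoding works can be seen from how an ioctl number embeds the argument size. The sketch below is illustrative only: the SKETCH_IOC() bit layout is a simplified stand-in and not the real SPARC encoding, direction bits are omitted, and struct sketch_termio is a hypothetical mirror of the legacy struct used to compute a size.

```c
#include <stdint.h>

/* Hypothetical stand-in for the legacy struct; layout is illustrative. */
struct sketch_termio {
	unsigned short c_iflag, c_oflag, c_cflag, c_lflag;
	unsigned char c_line;
	unsigned char c_cc[8];
};

/* Illustrative encoding: size above bit 16, type in bits 8-15, nr in bits 0-7. */
#define SKETCH_IOC(type, nr, size) \
	(((uint32_t)(size) << 16) | ((uint32_t)(type) << 8) | (uint32_t)(nr))

/* Old style: needs the complete struct type at every point of use. */
#define SKETCH_TCGETA_SIZEOF \
	SKETCH_IOC('T', 1, sizeof(struct sketch_termio))

/* New style: same value with the size written out, no struct required. */
#define SKETCH_TCGETA_HARDCODED \
	SKETCH_IOC('T', 1, 18)
```

The sizeof-based form fails to compile wherever the struct is only forward-declared, which matches the "incomplete type" error quoted above. The hardcoded form yields the same constant and has no such dependency.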
2026-02-06 | sparc: vio: Replace snprintf with strscpy in vio_create_one | Thorsten Blum | 1 | -2/+2
Replace snprintf("%s", ...) with the faster and more direct strscpy(). Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2026-02-06 | sparc: Add architecture support for clone3 | Ludwig Rydberg | 9 | -15/+78
Add support for the clone3 system call to the SPARC architectures. The implementation follows the pattern of the original clone syscall. However, instead of explicitly calling kernel_clone, the clone3 handler calls the generic sys_clone3 handler in kernel/fork. In case no stack is provided, the parents stack is reused. The return value convention for clone3 follows the regular kernel return value convention (in contrast to the original clone/fork on SPARC). Closes: https://github.com/sparclinux/issues/issues/10 Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Andreas Larsson <andreas@gaisler.com> Tested-by: Andreas Larsson <andreas@gaisler.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20260119144753.27945-3-ludwig.rydberg@gaisler.com Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2026-02-06 | sparc: Synchronize user stack on fork and clone | Andreas Larsson | 1 | -14/+24
Flush all uncommitted user windows before calling the generic syscall handlers for clone, fork, and vfork. Prior to entering the arch common handlers sparc_{clone|fork|vfork}, the arch-specific syscall wrappers for these syscalls will attempt to flush all windows (including user windows). In the window overflow trap handlers on both SPARC{32|64}, if the window can't be stored (i.e. due to MMU related faults) the routine backs up the user window and increments a thread counter (wsaved). By adding a synchronization point after the flush attempt, when fault handling is enabled, any uncommitted user windows will be flushed. Link: https://sourceware.org/bugzilla/show_bug.cgi?id=31394 Closes: https://lore.kernel.org/sparclinux/fe5cc47167430007560501aabb28ba154985b661.camel@physik.fu-berlin.de/ Signed-off-by: Andreas Larsson <andreas@gaisler.com> Signed-off-by: Ludwig Rydberg <ludwig.rydberg@gaisler.com> Tested-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Link: https://lore.kernel.org/r/20260119144753.27945-2-ludwig.rydberg@gaisler.com Signed-off-by: Andreas Larsson <andreas@gaisler.com>
2026-02-06 | ALSA: oss: delete self assignment | Dan Carpenter | 1 | -1/+1
No need to assign "uctl" to itself. Delete it. Fixes: 55f98ece9939 ("ALSA: oss: Relax __free() variable declarations") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/aYXvm2YoV2yRimhk@stanley.mountain Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-02-06 | irqchip/riscv-imsic: Adjust the number of available guest irq files | Xu Lu | 3 | -2/+15
Currently, KVM assumes the minimum of implemented HGEIE bits and "BIT(gc->guest_index_bits) - 1" as the number of guest files available across all CPUs. This will not work when CPUs have different numbers of guest files because KVM may incorrectly allocate a guest file on a CPU with fewer guest files. To address this, during initialization, calculate the number of available guest interrupt files according to MMIO resources and constrain the number of guest interrupt files that can be allocated by KVM. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Acked-by: Thomas Gleixner <tglx@kernel.org> Link: https://lore.kernel.org/r/20260104133457.57742-1-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org>
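The arithmetic behind the problem can be illustrated with a small sketch. The per-CPU formula is the one quoted in the message. Taking the minimum over all CPUs is only one conservative way to get a count that is valid everywhere and is an assumption made here for illustration. The commit itself derives the limit from MMIO resources, which this sketch does not model. All sketch_* names are hypothetical.

```c
#include <stdint.h>

static uint32_t sketch_min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Per-CPU limit: implemented HGEIE bits capped by BIT(guest_index_bits) - 1. */
static uint32_t sketch_cpu_guest_files(uint32_t hgeie_bits,
				       uint32_t guest_index_bits)
{
	return sketch_min_u32(hgeie_bits, (1U << guest_index_bits) - 1);
}

/* A count that is valid on every CPU is the minimum over all CPUs. */
static uint32_t sketch_usable_guest_files(const uint32_t *hgeie_bits,
					  int nr_cpus,
					  uint32_t guest_index_bits)
{
	uint32_t usable = UINT32_MAX;
	int i;

	for (i = 0; i < nr_cpus; i++)
		usable = sketch_min_u32(usable,
			sketch_cpu_guest_files(hgeie_bits[i], guest_index_bits));

	return nr_cpus > 0 ? usable : 0;
}
```

With three CPUs exposing 7, 3 and 7 guest files and guest_index_bits of 3, a value computed on the first CPU alone would be 7, which the second CPU cannot honour.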
2026-02-06 | RISC-V: KVM: Transparent huge page support | Jessica Liu | 2 | -0/+142
Use block mapping if backed by a THP, as implemented in architectures like ARM and x86_64. Signed-off-by: Jessica Liu <liu.xuemei1@zte.com.cn> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251127165137780QbUOVPKPAfWSGAFl5qtRy@zte.com.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 | RISC-V: KVM: selftests: Add Zalasr extensions to get-reg-list test | Xu Lu | 1 | -0/+4
The KVM RISC-V allows Zalasr extensions for Guest/VM so add this extension to get-reg-list test. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251020042904.32096-1-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 | RISC-V: KVM: Allow Zalasr extensions for Guest/VM | Xu Lu | 2 | -0/+3
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zalasr extensions for Guest/VM. Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251020042457.30915-5-luxu.kernel@bytedance.com Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 | KVM: riscv: selftests: Add riscv vm satp modes | Wu Fei | 5 | -14/+142
Current vm modes cannot represent riscv guest modes precisely, so add all 9 combinations of P(56,40,41) x V(57,48,39). Also, the default vm mode is now detected at runtime instead of being hardcoded, since a hardcoded mode might not be supported on a specific machine. Signed-off-by: Wu Fei <wu.fei9@sanechips.com.cn> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251105151442.28767-1-wu.fei9@sanechips.com.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 | KVM: riscv: selftests: add Zilsd and Zclsd extension to get-reg-list test | Pincheng Wang | 1 | -0/+8
KVM RISC-V allows the Zilsd and Zclsd extensions for Guest/VM, so add these extensions to the get-reg-list test. Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250826162939.1494021-6-pincheng.plct@isrc.iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 | riscv: KVM: allow Zilsd and Zclsd extensions for Guest/VM | Pincheng Wang | 2 | -0/+4
Extend the KVM ISA extension ONE_REG interface to allow KVM user space to detect and enable Zilsd and Zclsd extensions for Guest/VM. Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20250826162939.1494021-5-pincheng.plct@isrc.iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06 | RISC-V: KVM: Skip IMSIC update if vCPU IMSIC state is not initialized | Jiakai Xu | 1 | -0/+4
kvm_riscv_vcpu_aia_imsic_update() assumes that the vCPU IMSIC state has already been initialized and unconditionally accesses imsic->vsfile_lock. However, in fuzzed ioctl sequences, the AIA device may be initialized at the VM level while the per-vCPU IMSIC state is still NULL. This leads to invalid access when entering the vCPU run loop before IMSIC initialization has completed. The crash manifests as: Unable to handle kernel paging request at virtual address dfffffff00000006 ... kvm_riscv_vcpu_aia_imsic_update arch/riscv/kvm/aia_imsic.c:801 kvm_riscv_vcpu_aia_update arch/riscv/kvm/aia_device.c:493 kvm_arch_vcpu_ioctl_run arch/riscv/kvm/vcpu.c:927 ... Add a guard to skip the IMSIC update path when imsic_state is NULL. This allows the vCPU run loop to continue safely. This issue was discovered during fuzzing of RISC-V KVM code. Fixes: db8b7e97d6137a ("RISC-V: KVM: Add in-kernel virtualization of AIA IMSIC") Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260127084313.3496485-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_rw_attr()Jiakai Xu1-1/+3
Add a null pointer check for imsic_state before dereferencing it in kvm_riscv_aia_imsic_rw_attr(). While the function checks that the vcpu exists, it doesn't verify that the vcpu's imsic_state has been initialized, leading to a null pointer dereference when accessed. The crash manifests as: Unable to handle kernel paging request at virtual address dfffffff00000006 ... kvm_riscv_aia_imsic_rw_attr+0x2d8/0x854 arch/riscv/kvm/aia_imsic.c:958 aia_set_attr+0x2ee/0x1726 arch/riscv/kvm/aia_device.c:354 kvm_device_ioctl_attr virt/kvm/kvm_main.c:4744 [inline] kvm_device_ioctl+0x296/0x374 virt/kvm/kvm_main.c:4761 vfs_ioctl fs/ioctl.c:51 [inline] ... The fix adds a check to return -ENODEV if imsic_state is NULL and moves isel assignment after imsic_state NULL check. Fixes: 5463091a51cfaa ("RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip") Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260127072219.3366607-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
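The guard described above can be sketched in isolation. This is a minimal illustration, not the kernel code: `struct vcpu_sketch`, `struct imsic_sketch`, the `nr_eix` field and `imsic_rw_attr_sketch()` are hypothetical stand-ins, and returning -ENODEV for a missing vCPU is an assumption made only to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct imsic_sketch {
	unsigned long nr_eix;
};

struct vcpu_sketch {
	struct imsic_sketch *imsic_state; /* NULL until IMSIC init has run */
};

/* Returns 0 on success, -ENODEV if the per-vCPU state is not set up yet. */
static int imsic_rw_attr_sketch(struct vcpu_sketch *vcpu, unsigned long *out)
{
	struct imsic_sketch *imsic;

	assert(out != NULL);

	if (!vcpu)
		return -ENODEV; /* assumption: same error code as below */

	imsic = vcpu->imsic_state;
	if (!imsic)
		return -ENODEV;

	/* Dereference only after the NULL check has passed. */
	*out = imsic->nr_eix;
	return 0;
}
```

The point of the ordering is that nothing derived from `imsic_state` is computed before the check, which mirrors moving the isel assignment after the NULL check.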
2026-02-06RISC-V: KVM: Fix null pointer dereference in kvm_riscv_aia_imsic_has_attr()Jiakai Xu1-1/+4
Add a null pointer check for imsic_state before dereferencing it in kvm_riscv_aia_imsic_has_attr(). While the function checks that the vcpu exists, it doesn't verify that the vcpu's imsic_state has been initialized, leading to a null pointer dereference when accessed. This issue was discovered during fuzzing of RISC-V KVM code. The crash occurs when userspace calls KVM_HAS_DEVICE_ATTR ioctl on an AIA IMSIC device before the IMSIC state has been fully initialized for a vcpu. The crash manifests as: Unable to handle kernel paging request at virtual address dfffffff00000001 ... epc : kvm_riscv_aia_imsic_has_attr+0x464/0x50e arch/riscv/kvm/aia_imsic.c:998 ... kvm_riscv_aia_imsic_has_attr+0x464/0x50e arch/riscv/kvm/aia_imsic.c:998 aia_has_attr+0x128/0x2bc arch/riscv/kvm/aia_device.c:471 kvm_device_ioctl_attr virt/kvm/kvm_main.c:4722 [inline] kvm_device_ioctl+0x296/0x374 virt/kvm/kvm_main.c:4739 ... The fix adds a check to return -ENODEV if imsic_state is NULL, which is consistent with other error handling in the function and prevents the null pointer dereference. Fixes: 5463091a51cf ("RISC-V: KVM: Expose IMSIC registers as attributes of AIA irqchip") Signed-off-by: Jiakai Xu <xujiakai2025@iscas.ac.cn> Signed-off-by: Jiakai Xu <jiakaiPeanut@gmail.com> Reviewed-by: Nutty Liu <nutty.liu@hotmail.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20260125143344.2515451-1-xujiakai2025@iscas.ac.cn Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06RISC-V: KVM: Remove unnecessary 'ret' assignmentQiang Ma1-4/+1
If execution reaches "ret = 0" assignment in kvm_riscv_vcpu_pmu_event_info() then it means kvm_vcpu_write_guest() returned 0 hence ret is already zero and does not need to be assigned 0. Fixes: e309fd113b9f ("RISC-V: KVM: Implement get event info function") Signed-off-by: Qiang Ma <maqianga@uniontech.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20251229072530.3075496-1-maqianga@uniontech.com Signed-off-by: Anup Patel <anup@brainfault.org>
2026-02-06ARM: 9469/1: Implement ARCH_HAS_CC_CAN_LINKThomas Weissschuh1-0/+11
The generic CC_CAN_LINK detection does not handle different byte orders. This may lead to userprogs which are not actually runnable on the target kernel. Use architecture-specific logic supporting byte orders instead. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2026-02-06ovl: relax requirement for uuid=off,index=onAmir Goldstein4-20/+24
uuid=off,index=on required that all upper/lower directories are on the same filesystem. Relax the requirement so that only all the lower directories need to be on the same filesystem. Reported-by: André Almeida <andrealmeid@igalia.com> Link: https://lore.kernel.org/r/20260114-tonyk-get_disk_uuid-v1-3-e6a319e25d57@igalia.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com>
2026-02-06netfilter: nft_set_rbtree: validate open interval overlapPablo Neira Ayuso3-14/+82
Open intervals do not have an end element. In particular, an open interval at the end of the set is hard to validate because it lacks the end element, and interval validation relies on such an end element to perform the checks. This patch adds a new flag field to struct nft_set_elem; this is not an issue because this is a temporary object that is allocated on the stack from the insert/deactivate path. This flag field is used to specify that this is the last element in this add/delete command. The last flag is used, in combination with the start element cookie, to check if there is a partial overlap, eg. Already exists: 255.255.255.0-255.255.255.254 Add interval: 255.255.255.0-255.255.255.255 ~~~~~~~~~~~~~ start element overlap Basically, the idea is to check for an existing end element in the set if there is an overlap with an existing start element. However, the last open interval can come in any position in the add command, so the corner case can get a bit more complicated: Already exists: 255.255.255.0-255.255.255.254 Add intervals: 255.255.255.0-255.255.255.255,255.255.255.0-255.255.255.254 ~~~~~~~~~~~~~ start element overlap To catch this overlap, annotate that the new start element is a possible overlap, then report the overlap if the next element is another start element, which confirms that the previous element is an open interval at the end of the set. For deletions, do not update the start cookie when deleting an open interval, otherwise this can trigger spurious EEXIST when adding new elements. Unfortunately, there is no NFT_SET_ELEM_INTERVAL_OPEN flag which would make it easier to detect open interval overlaps. Fixes: 7c84d41416d8 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-02-06netfilter: nft_set_rbtree: validate element belonging to intervalPablo Neira Ayuso1-4/+143
The existing partial overlap detection does not check if the elements belong to the interval, eg. add element inet x y { 1.1.1.1-2.2.2.2, 4.4.4.4-5.5.5.5 } add element inet x y { 1.1.1.1-5.5.5.5 } => this should fail: ENOENT Similar situation occurs with deletions: add element inet x y { 1.1.1.1-2.2.2.2, 4.4.4.4-5.5.5.5} delete element inet x y { 1.1.1.1-5.5.5.5 } => this should fail: ENOENT This currently works via mitigation by nft in userspace, which performs the overlap detection before sending the elements to the kernel. This requires a previous netlink dump of the set content, which slows down incremental updates on interval sets. This patch extends the existing overlap detection to track the most recent start element that already exists. The pointer to the existing start element is stored as a cookie (no pointer dereference is ever possible). If the end element is added and it already exists, then check that the existing end element is adjacent to the already existing start element. Similar logic applies to element deactivation. This patch also annotates the timestamp to identify if the start cookie comes from an older batch; in such a case, reset it. Otherwise, a failing create element command leaves the start cookie in place, resulting in bogus error reporting. There are still a few more corner cases of overlap detection related to the open interval that are addressed in follow up patches. This addresses an early design mistake where an interval is expressed as two elements, using the NFT_SET_ELEM_INTERVAL_END flag, instead of the more recent NFTA_SET_ELEM_KEY_END attribute that pipapo already uses. Fixes: 7c84d41416d8 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
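The adjacency check described above can be modelled on a plain sorted array. This is only a sketch under simplifying assumptions: the kernel walks an rbtree and tracks a start cookie, whereas here `struct elem_sketch`, `find_elem()` and `interval_matches_existing()` are hypothetical names, and keys are small integers instead of addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An interval is stored as two elements: a start and a flagged end. */
struct elem_sketch {
	uint32_t key;
	bool is_end;
};

/* Index of the element with this key and flag, or -1 if absent. */
static int find_elem(const struct elem_sketch *e, size_t n,
		     uint32_t key, bool is_end)
{
	for (size_t i = 0; i < n; i++) {
		if (e[i].key == key && e[i].is_end == is_end)
			return (int)i;
	}
	return -1;
}

/*
 * True only if start and end both exist AND the end element directly
 * follows the start element, i.e. they belong to the same interval.
 * The array must be sorted by key.
 */
static bool interval_matches_existing(const struct elem_sketch *e, size_t n,
				      uint32_t start, uint32_t end)
{
	int s, t;

	assert(e != NULL || n == 0);

	s = find_elem(e, n, start, false);
	t = find_elem(e, n, end, true);
	if (s < 0 || t < 0)
		return false;

	return t == s + 1;
}
```

With the set { 1-2, 4-5 }, the pair (1, 5) finds both elements but they are not adjacent, so it is rejected, which corresponds to the 1.1.1.1-5.5.5.5 case in the message.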
2026-02-06netfilter: nft_set_rbtree: check for partial overlaps in anonymous setsPablo Neira Ayuso1-5/+25
Userspace provides an optimized representation in case intervals are adjacent, where the end element is omitted. The existing partial overlap detection logic skips anonymous set checks on start elements for this reason. However, it is possible to add overlapping intervals to an anonymous set where two start elements are the same, eg. A-B, A-C where C < B. start end A B start end A C Restore the check on overlapping start elements to report an overlap. Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
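The A-B, A-C case above can be illustrated with a simplified model. This sketch does not reproduce the rbtree walk; `struct interval_sketch` and `start_overlaps()` are hypothetical, and intervals are held as whole (start, end) pairs purely for readability.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct interval_sketch {
	uint32_t start;
	uint32_t end;
};

/*
 * True if the new interval shares its start with an existing interval
 * but has a different end, e.g. A-B already present and A-C added.
 */
static bool start_overlaps(const struct interval_sketch *set, size_t n,
			   struct interval_sketch new_iv)
{
	assert(set != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (set[i].start == new_iv.start && set[i].end != new_iv.end)
			return true;
	}
	return false;
}
```

An exact duplicate (same start and same end) is deliberately not reported here; only the partial overlap on the start element is.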
2026-02-06netfilter: nft_set_rbtree: fix bogus EEXIST with NLM_F_CREATE with null intervalPablo Neira Ayuso2-0/+18
Userspace adds a non-matching null element to the kernel for historical reasons. This null element is added when the set is populated with elements. Inclusion of this element is conditional, therefore, userspace needs to dump the set content to check for its presence. If the NLM_F_CREATE flag is turned on, this becomes an issue because the kernel bogusly reports EEXIST. Add a special case to ignore NLM_F_CREATE in this case, so that re-adding the null element never fails. Fixes: c016c7e45ddf ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-02-06netfilter: nft_counter: fix reset of counters on 32bit archsAnders Grahn2-2/+12
nft_counter_reset() calls u64_stats_add() with a negative value to reset the counter. This works on 64bit archs, because the negative value added wraps as a 64bit value, which in turn wraps the stat counter as well. On 32bit archs, the added negative value wraps as a 32bit value and does _not_ wrap the stat counter properly. In most cases, this would just lead to a very large 32bit value being added to the stat counter. Fix by introducing u64_stats_sub(). Fixes: 4a1d3acd6ea8 ("netfilter: nft_counter: Use u64_stats_t for statistic.") Signed-off-by: Anders Grahn <anders.grahn@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de>
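The arithmetic behind this can be reproduced in userspace. The sketch below is a model, not the kernel helpers: it assumes the add helper takes a 32-bit wide argument on 32bit archs and represents that with an explicit `uint32_t`; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models an add helper whose argument is only 32 bits wide. */
static uint64_t add_via_32bit_arg(uint64_t counter, uint32_t val)
{
	return counter + val;
}

/* Models a dedicated subtract helper operating on full 64-bit values. */
static uint64_t sub_u64(uint64_t counter, uint64_t val)
{
	assert(val <= counter);
	return counter - val;
}

/* Attempted reset: add the negated value through the 32-bit argument. */
static uint64_t reset_by_negative_add(uint64_t counter)
{
	/* The negation is truncated to 32 bits before the 64-bit add. */
	uint32_t neg = (uint32_t)(0 - counter);

	return add_via_32bit_arg(counter, neg);
}
```

For a counter of 5, the truncated negation is 0xFFFFFFFB, so the add yields 0x100000000 rather than 0, while the 64-bit subtraction yields 0 as intended.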
2026-02-06netfilter: nft_set_hash: fix get operation on big endianFlorian Westphal1-2/+7
tests/shell/testcases/packetpath/set_match_nomatch_hash_fast fails on big endian with: Error: Could not process rule: No such file or directory reset element ip test s { 244.147.90.126 } ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Fatal: Cannot fetch element "244.147.90.126" ... because the wrong bucket is searched: jhash() and jhash_1word() are not interchangeable on big endian. Fixes: 3b02b0adc242 ("netfilter: nft_set_hash: fix lookups with fixed size hash on big endian") Signed-off-by: Florian Westphal <fw@strlen.de>
2026-02-06selftests: netfilter: add IPV6_TUNNEL to configFlorian Westphal2-6/+14
The script now requires IPV6 tunnel support, so enable it. This should have been caught by CI, but as the config option is missin