aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2026-02-03cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possibleYaxiong Tian1-1/+1
According to the description in the intel_pstate.rst documentation, Capacity-Aware Scheduling and Energy-Aware Scheduling are only supported on a hybrid processor without SMT. Previously, the system used sched_smt_active() for judgment, which is not a strict condition because users can switch it on or off via /sys at any time. This could lead to incorrect driver settings in certain scenarios. For example, on a CPU that supports SMT, a user can disable SMT via the nosmt parameter to enable asym capacity, and then re-enable SMT via /sys. In such cases, some settings in the driver would no longer be correct. To address this issue, replace sched_smt_active() with cpu_smt_possible(), and only enable asym capacity when CPU SMT is not possible. Fixes: 929ebc93ccaa ("cpufreq: intel_pstate: Set asymmetric CPU capacity on hybrid systems") Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn> [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20260203024852.301066-1-tianyaxiong@kylinos.cn Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-02-03spi: xilinx: use device property accessors.Abdurrahman Hussain1-5/+5
Switch to device property accessors. Signed-off-by: Abdurrahman Hussain <abdurrahman@nexthop.ai> Link: https://patch.msgid.link/20260203-spi-xilinx-v4-1-42f7c326061b@nexthop.ai Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03perf thread: Don't require machine to compute the e_machineIan Rogers3-7/+9
The machine can be calculated from a thread via its maps. Don't require the machine argument to simplify callers and also to delay computing the machine until a little later. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03perf header: Add e_machine/e_flags to the headerIan Rogers4-6/+64
Add 64-bits of feature data to record the ELF machine and flags. This allows readers to initialize based on the data. For example, `perf kvm stat` wants to initialize based on the kind of data to be read, but at initialization time there are no threads to base this data upon and using the host means cross platform support won't work. The values in the perf_env also act as a cache for these within the session. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03perf session: Add e_flags to the e_machine helperIan Rogers8-26/+55
Allow e_flags as well as e_machine to be computed using the e_machine helper. This isn't currently used, the argument is always NULL, but it will be used for a new header feature. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03perf kvm: Wire up e_machineIan Rogers8-68/+80
Pass the e_machine to the kvm functions so that they aren't just wired to EM_HOST. In the case of a session move some setup until the session is created. As the session isn't fully running the default EM_HOST is returned as no e_machine can be found in a running machine. This is, however, some marginal progress to cross platform support. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03perf kvm stat: Remove use of the arch directoryIan Rogers28-177/+451
`perf kvm stat` supports record and report options. By using the arch directory a report for a different machine type cannot be supported. Move the kvm-stat code out of the arch directory and into util/kvm-stat-arch following the pattern of perf-regs and dwarf-regs. Avoid duplicate symbols by renaming functions to have the architecture name within them. For global variables, wrap them in an architecture specific function. Selecting the architecture to use with `perf kvm stat` is selected by EM_HOST, ie no different than before the change. Later the ELF machine can be determined from the session or a header feature (ie EM_HOST at the time of the record). The build and #define HAVE_KVM_STAT_SUPPORT is now redundant so remove across Makefiles and in the build. Opportunistically constify architectural structs and arrays. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03libperf build: Always place libperf includes firstIan Rogers1-1/+1
When building tools/perf the CFLAGS can contain a directory for the installed headers. As the headers may be being installed while building libperf.a this can cause headers to be partially installed and found in the include path while building an object file for libperf.a. The installed header may reference other installed headers that are missing given the partial nature of the install and then the build fails with a missing header file. Avoid this by ensuring the libperf source headers are always first in the CFLAGS. Fixes: 3143504918105156 ("libperf: Make libperf.a part of the perf build") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03perf test kvm: Add stat live testingIan Rogers1-1/+29
Ensure the `perf kvm stat live -p ..` has some basic functionality. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Anubhav Shelat <ashelat@redhat.com> Cc: Anup Patel <anup@brainfault.org> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Blake Jones <blakejones@google.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quan Zhou <zhouquan@iscas.ac.cn> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Swapnil Sapkal <swapnil.sapkal@amd.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Yunseong Kim <ysk@kzalloc.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-02-03Merge tag 'i2c-host-6.20' of ↵Wolfram Sang22-452/+423
git://git.kernel.org/pub/scm/linux/kernel/git/andi.shyti/linux into i2c/for-mergewindow i2c-host for v6.20 - amd-mp2, designware, mlxbf, rtl9300, spacemit, tegra: cleanups - designware: use a dedicated algorithm for AMD Navi - designware: replace magic numbers with named constants - designware: replace min_t() with min() to avoid u8 truncation - designware: refactor core to enable mode switching - imx-lpi2c: add runtime PM support for IRQ and clock handling - lan9691-i2c: add new driver - rtl9300: use OF helpers directly and avoid fwnode handling - spacemit: add bus reset support - units: add HZ_PER_GHZ and use it in several i2c drivers
2026-02-03i2c: imx: preserve error state in block data length handlerLI Qingwu1-1/+2
When a block read returns an invalid length, zero or >I2C_SMBUS_BLOCK_MAX, the length handler sets the state to IMX_I2C_STATE_FAILED. However, i2c_imx_master_isr() unconditionally overwrites this with IMX_I2C_STATE_READ_CONTINUE, causing an endless read loop that overruns buffers and crashes the system. Guard the state transition to preserve error states set by the length handler. Fixes: 5f5c2d4579ca ("i2c: imx: prevent rescheduling in non dma mode") Signed-off-by: LI Qingwu <Qing-wu.Li@leica-geosystems.com.cn> Cc: <stable@vger.kernel.org> # v6.13+ Reviewed-by: Stefan Eichenberger <eichest@gmail.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20260116111906.3413346-2-Qing-wu.Li@leica-geosystems.com.cn Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2026-02-03ceph: fix oops due to invalid pointer for kfree() in parse_longname()Daniel Vogelbacher1-4/+5
This fixes a kernel oops when reading ceph snapshot directories (.snap), for example by simply running `ls /mnt/my_ceph/.snap`. The variable str is guarded by __free(kfree), but advanced by one for skipping the initial '_' in snapshot names. Thus, kfree() is called with an invalid pointer. This patch removes the need for advancing the pointer so kfree() is called with correct memory pointer. Steps to reproduce: 1. Create snapshots on a cephfs volume (I've 63 snaps in my testcase) 2. Add cephfs mount to fstab $ echo "samba-fileserver@.files=/volumes/datapool/stuff/3461082b-ecc9-4e82-8549-3fd2590d3fb6 /mnt/test/stuff ceph acl,noatime,_netdev 0 0" >> /etc/fstab 3. Reboot the system $ systemctl reboot 4. Check if it's really mounted $ mount | grep stuff 5. List snapshots (expected 63 snapshots on my system) $ ls /mnt/test/stuff/.snap Now ls hangs forever and the kernel log shows the oops. Cc: stable@vger.kernel.org Fixes: 101841c38346 ("[ceph] parse_longname(): strrchr() expects NUL-terminated string") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220807 Suggested-by: Helge Deller <deller@gmx.de> Signed-off-by: Daniel Vogelbacher <daniel@chaospixel.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-02-03rbd: check for EOD after exclusive lock is ensured to be heldIlya Dryomov1-12/+21
Similar to commit 870611e4877e ("rbd: get snapshot context after exclusive lock is ensured to be held"), move the "beyond EOD" check into the image request state machine so that it's performed after exclusive lock is ensured to be held. This avoids various race conditions which can arise when the image is shrunk under I/O (in practice, mostly readahead). In one such scenario rbd_assert(objno < rbd_dev->object_map_size); can be triggered if a close-to-EOD read gets queued right before the shrink is initiated and the EOD check is performed against an outdated mapping_size. After the resize is done on the server side and exclusive lock is (re)acquired bringing along the new (now shrunk) object map, the read starts going through the state machine and rbd_obj_may_exist() gets invoked on an object that is out of bounds of rbd_dev->object_map array. Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@linux.dev>
2026-02-03perf/arm-cmn: Reject unsupported hardware configurationsRobin Murphy1-1/+14
So far we've been fairly lax about accepting both unknown CMN models (at least with a warning), and unknown revisions of those which we do know, as although things do frequently change between releases, typically enough remains the same to be somewhat useful for at least some basic bringup checks. However, we also make assumptions of the maximum supported sizes and numbers of things in various places, and there's no guarantee that something new might not be bigger and lead to nasty array overflows. Make sure we only try to run on things that actually match our assumptions and so will not risk memory corruption. We have at least always failed on completely unknown node types, so update that error message for clarity and consistency too. Cc: stable@vger.kernel.org Fixes: 7819e05a0dce ("perf/arm-cmn: Revamp model detection") Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-02-03perf: arm_spe: Properly set hw.state on failuresLeo Yan1-6/+12
When arm_spe_pmu_next_off() fails to calculate a valid limit, it returns zero to indicate that tracing should not start. However, the caller arm_spe_perf_aux_output_begin() does not propagate this failure by updating hwc->state, cause the error to be silently ignored by upper layers. Because hwc->state remains zero after a failure, arm_spe_pmu_start() continues to programs filter registers unnecessarily. The driver still reports success to the perf core, so the core assumes the SPE event was enabled and proceeds to enable other events. This breaks event group semantics: SPE is already stopped while other events in the same group are enabled. Fix this by updating arm_spe_perf_aux_output_begin() to return a status code indicating success (0) or failure (-EIO). Both the interrupt handler and arm_spe_pmu_start() check the return value and call arm_spe_pmu_stop() to set PERF_HES_STOPPED in hwc->state. In the interrupt handler, the period (e.g., period_left) needs to be updated, so PERF_EF_UPDATE is passed to arm_spe_pmu_stop(). When the error occurs during event start, the trace unit is not yet enabled, so a flag '0' is used to drain buffer and update state only. Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") Signed-off-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2026-02-03workqueue: add CONFIG_BOOTPARAM_WQ_STALL_PANIC optionBreno Leitao3-2/+26
Add a kernel config option to set the default value of workqueue.panic_on_stall, similar to CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, CONFIG_BOOTPARAM_HARDLOCKUP_PANIC and CONFIG_BOOTPARAM_HUNG_TASK_PANIC. This allows setting the number of workqueue stalls before triggering a kernel panic at build time, which is useful for high-availability systems that need consistent panic-on-stall, in other words, those servers which run with CONFIG_BOOTPARAM_*_PANIC=y already. The default remains 0 (disabled). Setting it to 1 will panic on the first stall, and higher values will panic after that many stall warnings. The value can still be overridden at runtime via the workqueue.panic_on_stall boot parameter or sysfs. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-03accel/amdxdna: Fix incorrect error code returned for failed chain commandLizhi Hou1-1/+1
The driver currently returns an incorrect error code when a chain command fails. In this case, ERT_CMD_STATE_ERROR is expected to be reported for failed chain commands. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260203184037.2751889-1-lizhi.hou@amd.com
2026-02-03Merge branch 'bpf-add-bpf_stream_print_stack-kfunc'Alexei Starovoitov4-1/+85
Emil Tsalapatis says: ==================== bpf: Add bpf_stream_print_stack kfunc Add a new bpf_stream_print_stack kfunc for printing a BPF program stack into a BPF stream. Update the verifier to allow the new kfunc to be called with BPF spinlocks held, along with bpf_stream_vprintk. Patchset spun out of the larger libarena + ASAN patchset. (https://lore.kernel.org/bpf/20260127181610.86376-1-emil@etsalapatis.com/) Changeset: - Update bpf_stream_print_stack to take stream_id arg (Kumar) - Added selftest for the bpf_stream_print_stack - Add selftest for calling the streams kfuncs under lock v2->v1: (https://lore.kernel.org/bpf/20260202193311.446717-1-emil@etsalapatis.com/) - Updated Signed-off-by to be consistent with past submissions - Updated From email to be consistent with Signed-off-by Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> ==================== Link: https://patch.msgid.link/20260203180424.14057-1-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03selftests/bpf: Add selftests for stream functions under lockEmil Tsalapatis1-0/+32
Add a selftest to ensure BPF stream functions can now be called while holding a lock. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260203180424.14057-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03bpf: Allow BPF stream kfuncs while holding a lockEmil Tsalapatis1-1/+12
The BPF stream kfuncs bpf_stream_vprintk and bpf_stream_print_stack do not sleep and so are safe to call while holding a lock. Amend the verifier to allow that. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260203180424.14057-4-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03selftests/bpf: Add selftests for bpf_stream_print_stackEmil Tsalapatis1-0/+21
Add selftests for the new bpf_stream_print_stack kfunc. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260203180424.14057-3-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03bpf: Add bpf_stream_print_stack stack dumping kfuncEmil Tsalapatis2-0/+20
Add a new kfunc called bpf_stream_print_stack to be used by programs that need to print out their current BPF stack. The kfunc is essentially a wrapper around the existing bpf_stream_dump_stack functionality used to generate stack traces for error events like may_goto violations and BPF-side arena page faults. Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20260203180424.14057-2-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03Merge branch 'bpf-verifier-improve-state-pruning-for-scalar-registers'Alexei Starovoitov3-25/+199
Puranjay Mohan says: ==================== bpf: Improve state pruning for scalar registers V2: https://lore.kernel.org/all/20260203022229.1630849-1-puranjay@kernel.org/ Changes in V3: - Fix spelling mistakes in commit logs (AI) - Fix an incorrect comment in the selftest added in patch 5 (AI) - Improve the title of patch 5 V1: https://lore.kernel.org/all/20260202104414.3103323-1-puranjay@kernel.org/ Changes in V2: - Collected acked by Eduard - Removed some unnecessary comments - Added a selftest for id=0 equivalence in Patch 5 This series improves BPF verifier state pruning by relaxing scalar ID equivalence requirements. Scalar register IDs are used to track relationships between registers for bounds propagation. However, once an ID becomes "singular" (only one register/stack slot carries it), it can no longer participate in bounds propagation and becomes stale. These stale IDs can prevent pruning of otherwise equivalent states. The series addresses this in four patches: Patch 1: Assign IDs on stack fills to ensure stack slots have IDs before being read into registers, preparing for the singular ID clearing in patch 2. Patch 2: Clear IDs that appear only once before caching, as they cannot contribute to bounds propagation. Patch 3: Relax maybe_widen_reg() to only compare value-tracking fields (bounds, tnum, var_off) rather than also requiring ID matches. Two scalars with identical value constraints but different IDs represent the same abstract value and don't need widening. Patch 4: Relax scalar ID equivalence in state comparison by treating rold->id == 0 as "independent". If the old state didn't rely on ID relationships for a register, any linking in the current state only adds constraints and is safe to accept for pruning. Patch 5: Add a selftest to show the exact case being handled by Patch 4 I ran veristat on BPF programs from sched_ext, meta's internal programs, and on selftest programs, showing programs with insn diff > 5%: Scx Progs File Program States (A) States (B) States (DIFF) Insns (A) Insns (B) Insns (DIFF) ------------------ ------------------- ---------- ---------- ------------- --------- --------- --------------- scx_rusty.bpf.o rusty_set_cpumask 320 230 -90 (-28.12%) 4478 3259 -1219 (-27.22%) scx_bpfland.bpf.o bpfland_select_cpu 55 49 -6 (-10.91%) 691 618 -73 (-10.56%) scx_beerland.bpf.o beerland_select_cpu 27 25 -2 (-7.41%) 320 295 -25 (-7.81%) scx_p2dq.bpf.o p2dq_init 265 250 -15 (-5.66%) 3423 3233 -190 (-5.55%) scx_layered.bpf.o layered_enqueue 1461 1386 -75 (-5.13%) 14541 13792 -749 (-5.15%) FB Progs File Program States (A) States (B) States (DIFF) Insns (A) Insns (B) Insns (DIFF) ------------ ------------------- ---------- ---------- -------------- --------- --------- --------------- bpf007.bpf.o bpfj_free 1726 1342 -384 (-22.25%) 25671 19096 -6575 (-25.61%) bpf041.bpf.o armr_net_block_init 22373 20411 -1962 (-8.77%) 651697 602873 -48824 (-7.49%) bpf227.bpf.o layered_quiescent 28 26 -2 (-7.14%) 365 340 -25 (-6.85%) bpf248.bpf.o p2dq_init 263 248 -15 (-5.70%) 3370 3159 -211 (-6.26%) bpf254.bpf.o p2dq_init 263 248 -15 (-5.70%) 3388 3177 -211 (-6.23%) bpf241.bpf.o p2dq_init 264 249 -15 (-5.68%) 3428 3240 -188 (-5.48%) bpf230.bpf.o p2dq_init 287 271 -16 (-5.57%) 3666 3431 -235 (-6.41%) bpf251.bpf.o lavd_cpu_offline 321 316 -5 (-1.56%) 6221 5891 -330 (-5.30%) bpf251.bpf.o lavd_cpu_online 321 316 -5 (-1.56%) 6219 5889 -330 (-5.31%) Selftest Progs File Program States (A) States (B) States (DIFF) Insns (A) Insns (B) Insns (DIFF) ---------------------------------- ----------------- ---------- ---------- ------------- --------- --------- --------------- verifier_iterating_callbacks.bpf.o test2 4 2 -2 (-50.00%) 29 18 -11 (-37.93%) verifier_iterating_callbacks.bpf.o test3 4 2 -2 (-50.00%) 31 19 -12 (-38.71%) strobemeta_bpf_loop.bpf.o on_event 318 221 -97 (-30.50%) 3938 2755 -1183 (-30.04%) bpf_qdisc_fq.bpf.o bpf_fq_dequeue 133 105 -28 (-21.05%) 1686 1385 -301 (-17.85%) iters.bpf.o delayed_read_mark 6 5 -1 (-16.67%) 60 46 -14 (-23.33%) arena_strsearch.bpf.o arena_strsearch 107 106 -1 (-0.93%) 1394 1258 -136 (-9.76%) ==================== Link: https://patch.msgid.link/20260203165102.2302462-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03selftests/bpf: Add a test for ids=0 to verifier_scalar_ids testPuranjay Mohan1-0/+45
Test that two registers with their id=0 (unlinked) in the cached state can be mapped to a single id (linked) in the current state. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-6-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03bpf: Relax scalar id equivalence for state pruningPuranjay Mohan2-15/+56
Scalar register IDs are used by the verifier to track relationships between registers and enable bounds propagation across those relationships. Once an ID becomes singular (i.e. only a single register/stack slot carries it), it can no longer contribute to bounds propagation and effectively becomes stale. The previous commit makes the verifier clear such ids before caching the state. When comparing the current and cached states for pruning, these stale IDs can cause technically equivalent states to be considered different and thus prevent pruning. For example, in the selftest added in the next commit, two registers - r6 and r7 are not linked to any other registers and get cached with id=0, in the current state, they are both linked to each other with id=A. Before this commit, check_scalar_ids would give temporary ids to r6 and r7 (say tid1 and tid2) and then check_ids() would map tid1->A, and when it would see tid2->A, it would not consider these state equivalent. Relax scalar ID equivalence by treating rold->id == 0 as "independent": if the old state did not rely on any ID relationships for a register, then any ID/linking present in the current state only adds constraints and is always safe to accept for pruning. Implement this by returning true immediately in check_scalar_ids() when old_id == 0. Maintain correctness for the opposite direction (old_id != 0 && cur_id == 0) by still allocating a temporary ID for cur_id == 0. This avoids incorrectly allowing multiple independent current registers (id==0) to satisfy a single linked old ID during mapping. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03bpf: Relax maybe_widen_reg() constraintsPuranjay Mohan1-8/+14
The maybe_widen_reg() function widens imprecise scalar registers to unknown when their values differ between the cached and current states. Previously, it used regs_exact() which also compared register IDs via check_ids(), requiring registers to have matching IDs (or mapped IDs) to be considered exact. For scalar widening purposes, what matters is whether the value tracking (bounds, tnum, var_off) is the same, not whether the IDs match. Two scalars with identical value constraints but different IDs represent the same abstract value and don't need to be widened. Introduce scalars_exact_for_widen() that only compares the value-tracking portion of bpf_reg_state (fields before 'id'). This allows the verifier to preserve more scalar value information during state merging when IDs differ but actual tracked values are identical, reducing unnecessary widening and potentially improving verification precision. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03bpf: Clear singular ids for scalars in is_state_visited()Puranjay Mohan2-2/+73
The verifier assigns ids to scalar registers/stack slots when they are linked through a mov or stack spill/fill instruction. These ids are later used to propagate newly found bounds from one register to all registers that share the same id. The verifier also compares the ids of these registers in current state and cached state when making pruning decisions. When an ID becomes singular (i.e., only a single register or stack slot has that ID), it can no longer participate in bounds propagation. During comparisons between current and cached states for pruning decisions, however, such stale IDs can prevent pruning of otherwise equivalent states. Find and clear all singular ids before caching a state in is_state_visited(). struct bpf_idset which is currently unused has been repurposed for this use case. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03bpf: Let the verifier assign ids on stack fillsPuranjay Mohan1-0/+11
The next commit will allow clearing of scalar ids if no other register/stack slot has that id. This is because if only one register has a unique id, it can't participate in bounds propagation and is equivalent to having no id. But if the id of a stack slot is cleared by clear_singular_ids() in the next commit, reading that stack slot into a register will not establish a link because the stack slot's id is cleared. This can happen in a situation where a register is spilled and later loses its id due to a multiply operation (for example) and then the stack slot's id becomes singular and can be cleared. Make sure that scalar stack slots have an id before we read them into a register. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03Merge tag 'for-6.19-rc8-tag' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "A regression fix for a memory leak when raid56 is used" * tag 'for-6.19-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap
2026-02-03Merge tag 'platform-drivers-x86-v6.19-4' of ↵Linus Torvalds10-7/+64
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver fixes from Ilpo Järvinen: - amd/pmc: Add quirk for MECHREVO Wujie 15X Pro - classmate-laptop: Add missing NULL pointer checks - hp-bioscfg: Skip empty attribute names - intel_telemetry: - Fix PSS event register mask - Fix swapped arrays in PSS output - intel/tpmi/plr: Make the file domain<n>/status writeable - intel/vsec: Add Nova Lake PUNIT support - lg-laptop: Recognize 2022-2025 models - panasonic-laptop: Fix sysfs group leak in error path - toshiba_haps: Fix memory leaks in add/remove routines * tag 'platform-drivers-x86-v6.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86/intel/tpmi/plr: Make the file domain<n>/status writeable platform/x86: hp-bioscfg: Skip empty attribute names platform/x86: classmate-laptop: Add missing NULL pointer checks platform/x86: lg-laptop: Recognize 2022-2025 models platform/x86/amd/pmc: Add quirk for MECHREVO Wujie 15X Pro platform/x86: intel_telemetry: Fix PSS event register mask platform/x86: intel_telemetry: Fix swapped arrays in PSS output platform/x86/intel/vsec: Add Nova Lake PUNIT support platform/x86: toshiba_haps: Fix memory leaks in add/remove routines platform/x86: panasonic-laptop: Fix sysfs group leak in error path
2026-02-03io_uring/fdinfo: be a bit nicer when looping a lot of SQEs/CQEsJens Axboe1-3/+8
Add cond_resched() in those dump loops, just in case a lot of entries are being dumped. And detect invalid CQ ring head/tail entries, to avoid iterating more than what is necessary. Generally not an issue, but can be if things like KASAN or other debugging metrics are enabled. Reported-by: 是参差 <shicenci@gmail.com> Link: https://lore.kernel.org/all/PS1PPF7E1D7501FE5631002D242DD89403FAB9BA@PS1PPF7E1D7501F.apcprd02.prod.outlook.com/ Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03cxl/acpi: Prepare use of EFI runtime servicesRobert Richter1-2/+6
In order to use EFI runtime services, esp. ACPI PRM which uses the efi_rts_wq workqueue, initialize EFI before CXL ACPI. There is a subsys_initcall order dependency if driver is builtin: subsys_initcall(cxl_acpi_init); subsys_initcall(efisubsys_init); Prevent the efi_rts_wq workqueue being used by cxl_acpi_init() before its allocation. Use subsys_initcall_sync(cxl_acpi_init) to always run efisubsys_init() first. Reported-by: Gregory Price <gourry@gourry.net> Tested-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com>> --- Link: https://patch.msgid.link/20260114164837.1076338-10-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl: Introduce callback for HPA address ranges translationRobert Richter2-0/+25
Introduce a callback to translate an endpoint's HPA range to the address range of the root port which is the System Physical Address (SPA) range used by a region. The callback can be set if a platform needs to handle address translation. The callback is attached to the root port. An endpoint's root port can easily be determined in the PCI hierarchy without any CXL specific knowledge. This allows the early use of address translation for CXL enumeration. Address translation is esp. needed for the detection of the root decoders. Thus, the callback is embedded in struct cxl_root_ops instead of struct cxl_rd_ops. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-9-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl/region: Use region data to get the root decoderRobert Richter1-26/+24
To find a region's root decoder, the endpoint's HPA range is used to search the matching decoder by its range. With address translation the endpoint decoder's range is in a different address space and thus cannot be used to determine the root decoder. The region parameters are encapsulated within struct cxl_region_context and may include the translated Host Physical Address (HPA) range. Use this context to identify the root decoder rather than relying on the endpoint. Modify cxl_find_root_decoder() and add the region context as parameter. Rename this function to get_cxl_root_decoder() as a counterpart to put_cxl_root_decoder(). Simplify the implementation by removing function cxl_port_find_switch_decode(). The function is unnecessary because it is not referenced or utilized elsewhere in the code. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-8-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()Robert Richter1-6/+8
cxl_calc_interleave_pos() uses the endpoint decoder's HPA range to determine its interleaving position. This requires the endpoint decoders to be an SPA, which is not the case for systems that need address translation. Add a separate @hpa_range argument to function cxl_calc_interleave_pos() to specify the address range. Now it is possible to pass the SPA translated address range of an endpoint decoder to function cxl_calc_interleave_pos(). Refactor only, no functional changes. Patch is a prerequisite to implement address translation. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-7-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl/region: Separate region parameter setup and region constructionRobert Richter2-9/+26
To construct a region, the region parameters such as address range and interleaving config need to be determined. This is done while constructing the region by inspecting the endpoint decoder configuration. The endpoint decoder is passed as a function argument. With address translation the endpoint decoder data is no longer sufficient to extract the region parameters as some of the information is obtained using other methods such as using firmware calls. In a first step, separate code to determine the region parameters from the region construction. Temporarily store all the data to create the region in the new struct cxl_region_context. Once the region data is determined and struct cxl_region_context is filled, construct the region. Patch is a prerequisite to implement address translation. The code separation helps to later extend it to determine region parameters using other methods as needed, esp. to support address translation. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-6-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl: Simplify cxl_root_ops allocation and handlingRobert Richter4-24/+18
A root port's callback handlers are collected in struct cxl_root_ops. The structure is dynamically allocated, though it contains only a single pointer in it. This also requires to check two pointers to check for the existance of a callback. Simplify the allocation, release and handler check by embedding the ops statically in struct cxl_root. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-5-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl/region: Store HPA range in struct cxl_regionRobert Richter2-0/+9
Each region has a known host physical address (HPA) range it is assigned to. Endpoint decoders assigned to a region share the same HPA range. The region's address range is the system's physical address (SPA) range. Endpoint decoders in systems that need address translation use HPAs which are not SPAs. To make the SPA range accessible to the endpoint decoders, store and track the region's SPA range in struct cxl_region. Introduce the @hpa_range member to the struct. Now, the SPA range of an endpoint decoder can be determined based on its assigned region. Patch is a prerequisite to implement address translation which uses struct cxl_region to store all relevant region and interleaving parameters. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-4-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03cxl/region: Store root decoder in struct cxl_regionRobert Richter2-18/+21
A region is always bound to a root decoder. The region's associated root decoder is often needed. Add it to struct cxl_region. This simplifies the code by removing dynamic lookups and the root decoder argument from the function argument list where possible. Patch is a prerequisite to implement address translation which uses struct cxl_region to store all relevant region and interleaving parameters. It changes the argument list of __construct_region() in preparation of adding a