Age | Commit message | Author | Files | Lines
2026-02-03 | Merge branch 'bpf-verifier-improve-state-pruning-for-scalar-registers' | Alexei Starovoitov | 3 | -25/+199
Puranjay Mohan says:

====================
bpf: Improve state pruning for scalar registers

V2: https://lore.kernel.org/all/20260203022229.1630849-1-puranjay@kernel.org/
Changes in V3:
- Fix spelling mistakes in commit logs (AI)
- Fix an incorrect comment in the selftest added in patch 5 (AI)
- Improve the title of patch 5

V1: https://lore.kernel.org/all/20260202104414.3103323-1-puranjay@kernel.org/
Changes in V2:
- Collected acked by Eduard
- Removed some unnecessary comments
- Added a selftest for id=0 equivalence in Patch 5

This series improves BPF verifier state pruning by relaxing scalar ID equivalence requirements. Scalar register IDs are used to track relationships between registers for bounds propagation. However, once an ID becomes "singular" (only one register/stack slot carries it), it can no longer participate in bounds propagation and becomes stale. These stale IDs can prevent pruning of otherwise equivalent states.

The series addresses this in four patches:

Patch 1: Assign IDs on stack fills to ensure stack slots have IDs before being read into registers, preparing for the singular ID clearing in patch 2.

Patch 2: Clear IDs that appear only once before caching, as they cannot contribute to bounds propagation.

Patch 3: Relax maybe_widen_reg() to only compare value-tracking fields (bounds, tnum, var_off) rather than also requiring ID matches. Two scalars with identical value constraints but different IDs represent the same abstract value and don't need widening.

Patch 4: Relax scalar ID equivalence in state comparison by treating rold->id == 0 as "independent". If the old state didn't rely on ID relationships for a register, any linking in the current state only adds constraints and is safe to accept for pruning.

Patch 5: Add a selftest to show the exact case being handled by Patch 4

I ran veristat on BPF programs from sched_ext, meta's internal programs, and on selftest programs, showing programs with insn diff > 5%:

Scx Progs
File                Program              States (A)  States (B)  States (DIFF)  Insns (A)  Insns (B)  Insns (DIFF)
------------------  -------------------  ----------  ----------  -------------  ---------  ---------  ---------------
scx_rusty.bpf.o     rusty_set_cpumask           320         230  -90 (-28.12%)       4478       3259  -1219 (-27.22%)
scx_bpfland.bpf.o   bpfland_select_cpu           55          49   -6 (-10.91%)        691        618    -73 (-10.56%)
scx_beerland.bpf.o  beerland_select_cpu          27          25    -2 (-7.41%)        320        295     -25 (-7.81%)
scx_p2dq.bpf.o      p2dq_init                   265         250   -15 (-5.66%)       3423       3233    -190 (-5.55%)
scx_layered.bpf.o   layered_enqueue            1461        1386   -75 (-5.13%)      14541      13792    -749 (-5.15%)

FB Progs
File          Program              States (A)  States (B)  States (DIFF)   Insns (A)  Insns (B)  Insns (DIFF)
------------  -------------------  ----------  ----------  --------------  ---------  ---------  ---------------
bpf007.bpf.o  bpfj_free                  1726        1342  -384 (-22.25%)      25671      19096  -6575 (-25.61%)
bpf041.bpf.o  armr_net_block_init       22373       20411  -1962 (-8.77%)     651697     602873  -48824 (-7.49%)
bpf227.bpf.o  layered_quiescent            28          26     -2 (-7.14%)        365        340     -25 (-6.85%)
bpf248.bpf.o  p2dq_init                   263         248    -15 (-5.70%)       3370       3159    -211 (-6.26%)
bpf254.bpf.o  p2dq_init                   263         248    -15 (-5.70%)       3388       3177    -211 (-6.23%)
bpf241.bpf.o  p2dq_init                   264         249    -15 (-5.68%)       3428       3240    -188 (-5.48%)
bpf230.bpf.o  p2dq_init                   287         271    -16 (-5.57%)       3666       3431    -235 (-6.41%)
bpf251.bpf.o  lavd_cpu_offline            321         316     -5 (-1.56%)       6221       5891    -330 (-5.30%)
bpf251.bpf.o  lavd_cpu_online             321         316     -5 (-1.56%)       6219       5889    -330 (-5.31%)

Selftest Progs
File                                Program            States (A)  States (B)  States (DIFF)  Insns (A)  Insns (B)  Insns (DIFF)
----------------------------------  -----------------  ----------  ----------  -------------  ---------  ---------  ---------------
verifier_iterating_callbacks.bpf.o  test2                       4           2   -2 (-50.00%)         29         18    -11 (-37.93%)
verifier_iterating_callbacks.bpf.o  test3                       4           2   -2 (-50.00%)         31         19    -12 (-38.71%)
strobemeta_bpf_loop.bpf.o           on_event                  318         221  -97 (-30.50%)       3938       2755  -1183 (-30.04%)
bpf_qdisc_fq.bpf.o                  bpf_fq_dequeue            133         105  -28 (-21.05%)       1686       1385   -301 (-17.85%)
iters.bpf.o                         delayed_read_mark           6           5   -1 (-16.67%)         60         46    -14 (-23.33%)
arena_strsearch.bpf.o               arena_strsearch           107         106    -1 (-0.93%)       1394       1258    -136 (-9.76%)
====================

Link: https://patch.msgid.link/20260203165102.2302462-1-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03 | selftests/bpf: Add a test for ids=0 to verifier_scalar_ids test | Puranjay Mohan | 1 | -0/+45
Test that two registers with their id=0 (unlinked) in the cached state can be mapped to a single id (linked) in the current state. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-6-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03 | bpf: Relax scalar id equivalence for state pruning | Puranjay Mohan | 2 | -15/+56
Scalar register IDs are used by the verifier to track relationships between registers and enable bounds propagation across those relationships. Once an ID becomes singular (i.e. only a single register/stack slot carries it), it can no longer contribute to bounds propagation and effectively becomes stale. The previous commit makes the verifier clear such ids before caching the state. When comparing the current and cached states for pruning, these stale IDs can cause technically equivalent states to be considered different and thus prevent pruning. For example, in the selftest added in the next commit, two registers, r6 and r7, are not linked to any other registers and get cached with id=0; in the current state, they are linked to each other with id=A. Before this commit, check_scalar_ids() would give temporary ids to r6 and r7 (say tid1 and tid2), and check_ids() would then map tid1->A; when it later saw tid2->A, it would not consider these states equivalent. Relax scalar ID equivalence by treating rold->id == 0 as "independent": if the old state did not rely on any ID relationships for a register, then any ID/linking present in the current state only adds constraints and is always safe to accept for pruning. Implement this by returning true immediately in check_scalar_ids() when old_id == 0. Maintain correctness for the opposite direction (old_id != 0 && cur_id == 0) by still allocating a temporary ID for cur_id == 0. This avoids incorrectly allowing multiple independent current registers (id==0) to satisfy a single linked old ID during mapping. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-5-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
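The rule described above can be sketched in user-space C. This is a simplified model, not the kernel code: struct idmap_sketch, its fixed size, and the tmp_id_gen counter are assumptions made for illustration, and only the two function names are taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define IDMAP_SIZE 16

/* Hypothetical, reduced id map: pairs of (old id, current id). */
struct idmap_sketch {
	uint32_t old_id[IDMAP_SIZE];
	uint32_t cur_id[IDMAP_SIZE];
	uint32_t tmp_id_gen;	/* source of temporary ids; must start above all real ids */
};

/* Record or verify a one-to-one mapping between old_id and cur_id. */
static bool check_ids(uint32_t old_id, uint32_t cur_id, struct idmap_sketch *m)
{
	for (int i = 0; i < IDMAP_SIZE; i++) {
		if (m->old_id[i] == 0) {
			/* Free slot: remember the new pair. */
			m->old_id[i] = old_id;
			m->cur_id[i] = cur_id;
			return true;
		}
		if (m->old_id[i] == old_id)
			return m->cur_id[i] == cur_id;
		if (m->cur_id[i] == cur_id)
			return false;	/* cur_id is already paired with another old id */
	}
	return false;	/* map exhausted: be conservative */
}

static bool check_scalar_ids(uint32_t old_id, uint32_t cur_id,
			     struct idmap_sketch *m)
{
	/* Old register was independent: any linking in cur only adds constraints. */
	if (old_id == 0)
		return true;
	/* Old register was linked but cur is not: give cur a fresh temporary id
	 * so two unlinked current registers cannot both satisfy one old id. */
	if (cur_id == 0)
		cur_id = ++m->tmp_id_gen;
	return check_ids(old_id, cur_id, m);
}
```

With this model, the r6/r7 case from the message (old ids 0 and 0, current ids A and A) is accepted, while an old state that links two registers is still rejected when the current registers are unlinked.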
2026-02-03 | bpf: Relax maybe_widen_reg() constraints | Puranjay Mohan | 1 | -8/+14
The maybe_widen_reg() function widens imprecise scalar registers to unknown when their values differ between the cached and current states. Previously, it used regs_exact() which also compared register IDs via check_ids(), requiring registers to have matching IDs (or mapped IDs) to be considered exact. For scalar widening purposes, what matters is whether the value tracking (bounds, tnum, var_off) is the same, not whether the IDs match. Two scalars with identical value constraints but different IDs represent the same abstract value and don't need to be widened. Introduce scalars_exact_for_widen() that only compares the value-tracking portion of bpf_reg_state (fields before 'id'). This allows the verifier to preserve more scalar value information during state merging when IDs differ but actual tracked values are identical, reducing unnecessary widening and potentially improving verification precision. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
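Comparing only the fields that precede 'id' can be sketched with offsetof() and memcmp(). The struct below is a hypothetical, reduced stand-in for bpf_reg_state (the real layout has more fields); it only assumes, as the message states, that the value-tracking fields come before 'id'.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced register state: value tracking first, id afterwards. */
struct reg_state_sketch {
	uint64_t var_off_value;	/* tnum: known bits */
	uint64_t var_off_mask;	/* tnum: unknown bits */
	int64_t smin, smax;	/* signed bounds */
	uint64_t umin, umax;	/* unsigned bounds */
	uint32_t id;		/* link id, ignored by the comparison */
};

/* True when both registers track the same value set, whatever their ids. */
static bool scalars_exact_for_widen(const struct reg_state_sketch *a,
				    const struct reg_state_sketch *b)
{
	return memcmp(a, b, offsetof(struct reg_state_sketch, id)) == 0;
}
```

All fields before 'id' in this sketch are 8 bytes wide, so there is no padding inside the compared prefix; a layout with padding would need zero-initialized states for memcmp() to be meaningful.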
2026-02-03 | bpf: Clear singular ids for scalars in is_state_visited() | Puranjay Mohan | 2 | -2/+73
The verifier assigns ids to scalar registers/stack slots when they are linked through a mov or stack spill/fill instruction. These ids are later used to propagate newly found bounds from one register to all registers that share the same id. The verifier also compares the ids of these registers in current state and cached state when making pruning decisions. When an ID becomes singular (i.e., only a single register or stack slot has that ID), it can no longer participate in bounds propagation. During comparisons between current and cached states for pruning decisions, however, such stale IDs can prevent pruning of otherwise equivalent states. Find and clear all singular ids before caching a state in is_state_visited(). struct bpf_idset which is currently unused has been repurposed for this use case. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
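A minimal sketch of "find and clear all singular ids", assuming the ids of all registers and stack slots have been gathered into a flat array. The array representation, MAX_IDS, and the two-pass counting are assumptions for illustration; the commit itself reuses struct bpf_idset, whose layout is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_IDS 64

/* Zero every nonzero id that is carried by exactly one entry of ids[0..n). */
static void clear_singular_ids(uint32_t *ids, size_t n)
{
	uint32_t seen[MAX_IDS];
	uint32_t cnt[MAX_IDS];
	size_t nseen = 0;

	/* Pass 1: count how many entries carry each id. */
	for (size_t i = 0; i < n; i++) {
		size_t j;

		if (ids[i] == 0)
			continue;
		for (j = 0; j < nseen; j++)
			if (seen[j] == ids[i])
				break;
		if (j == nseen) {
			if (nseen == MAX_IDS)
				continue;	/* table full: leave this id untouched */
			seen[nseen] = ids[i];
			cnt[nseen] = 0;
			nseen++;
		}
		cnt[j]++;
	}

	/* Pass 2: an id seen once cannot propagate bounds, so drop it. */
	for (size_t i = 0; i < n; i++) {
		if (ids[i] == 0)
			continue;
		for (size_t j = 0; j < nseen; j++) {
			if (seen[j] == ids[i]) {
				if (cnt[j] == 1)
					ids[i] = 0;
				break;
			}
		}
	}
}
```

For example, the id list {3, 0, 5, 3, 9} becomes {3, 0, 0, 3, 0}: only id 3 is shared, so only it survives.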
2026-02-03 | bpf: Let the verifier assign ids on stack fills | Puranjay Mohan | 1 | -0/+11
The next commit will allow clearing of scalar ids if no other register/stack slot has that id. This is because if only one register has a unique id, it can't participate in bounds propagation and is equivalent to having no id. But if the id of a stack slot is cleared by clear_singular_ids() in the next commit, reading that stack slot into a register will not establish a link because the stack slot's id is cleared. This can happen in a situation where a register is spilled and later loses its id due to a multiply operation (for example) and then the stack slot's id becomes singular and can be cleared. Make sure that scalar stack slots have an id before we read them into a register. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260203165102.2302462-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-02-03 | Merge tag 'for-6.19-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux | Linus Torvalds | 1 | -0/+1
Pull btrfs fix from David Sterba: "A regression fix for a memory leak when raid56 is used" * tag 'for-6.19-rc8-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap
2026-02-03 | Merge tag 'platform-drivers-x86-v6.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 | Linus Torvalds | 10 | -7/+64
Pull x86 platform driver fixes from Ilpo Järvinen:
 - amd/pmc: Add quirk for MECHREVO Wujie 15X Pro
 - classmate-laptop: Add missing NULL pointer checks
 - hp-bioscfg: Skip empty attribute names
 - intel_telemetry:
     - Fix PSS event register mask
     - Fix swapped arrays in PSS output
 - intel/tpmi/plr: Make the file domain<n>/status writeable
 - intel/vsec: Add Nova Lake PUNIT support
 - lg-laptop: Recognize 2022-2025 models
 - panasonic-laptop: Fix sysfs group leak in error path
 - toshiba_haps: Fix memory leaks in add/remove routines

* tag 'platform-drivers-x86-v6.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel/tpmi/plr: Make the file domain<n>/status writeable
  platform/x86: hp-bioscfg: Skip empty attribute names
  platform/x86: classmate-laptop: Add missing NULL pointer checks
  platform/x86: lg-laptop: Recognize 2022-2025 models
  platform/x86/amd/pmc: Add quirk for MECHREVO Wujie 15X Pro
  platform/x86: intel_telemetry: Fix PSS event register mask
  platform/x86: intel_telemetry: Fix swapped arrays in PSS output
  platform/x86/intel/vsec: Add Nova Lake PUNIT support
  platform/x86: toshiba_haps: Fix memory leaks in add/remove routines
  platform/x86: panasonic-laptop: Fix sysfs group leak in error path
2026-02-03 | io_uring/fdinfo: be a bit nicer when looping a lot of SQEs/CQEs | Jens Axboe | 1 | -3/+8
Add cond_resched() in those dump loops, just in case a lot of entries are being dumped. And detect invalid CQ ring head/tail entries, to avoid iterating more than what is necessary. Generally not an issue, but can be if things like KASAN or other debugging metrics are enabled. Reported-by: 是参差 <shicenci@gmail.com> Link: https://lore.kernel.org/all/PS1PPF7E1D7501FE5631002D242DD89403FAB9BA@PS1PPF7E1D7501F.apcprd02.prod.outlook.com/ Reviewed-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
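One way to bound such a dump loop is sketched below. This is an assumption about the general technique, not the actual fdinfo code: dump_cqes(), the clamp to the ring size, and cond_resched_stub() (a user-space placeholder for the kernel's cond_resched()) are all hypothetical.

```c
#include <stdint.h>

/* User-space placeholder for cond_resched(); counts how often we would yield. */
static unsigned int resched_calls;
static void cond_resched_stub(void)
{
	resched_calls++;
}

/* Walk at most ring_entries completions between head and tail. */
static uint32_t dump_cqes(uint32_t head, uint32_t tail, uint32_t ring_entries)
{
	uint32_t pending = tail - head;	/* unsigned wraparound is intended */
	uint32_t dumped = 0;

	/* A corrupt head/tail pair could claim more entries than the ring holds. */
	if (pending > ring_entries)
		pending = ring_entries;

	for (uint32_t i = 0; i < pending; i++) {
		/* A real dump would print entry (head + i) & (ring_entries - 1). */
		dumped++;
		cond_resched_stub();	/* give other tasks a chance on long dumps */
	}
	return dumped;
}
```

The clamp keeps the iteration count independent of whatever values end up in the head and tail fields, which matters most when debugging options make each iteration slow.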
2026-02-03 | cxl/acpi: Prepare use of EFI runtime services | Robert Richter | 1 | -2/+6
In order to use EFI runtime services, esp. ACPI PRM which uses the efi_rts_wq workqueue, initialize EFI before CXL ACPI. There is a subsys_initcall order dependency if the driver is builtin: subsys_initcall(cxl_acpi_init); subsys_initcall(efisubsys_init); Prevent the efi_rts_wq workqueue from being used by cxl_acpi_init() before its allocation. Use subsys_initcall_sync(cxl_acpi_init) to always run efisubsys_init() first. Reported-by: Gregory Price <gourry@gourry.net> Tested-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: Joshua Hahn <joshua.hahnjy@gmail.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://patch.msgid.link/20260114164837.1076338-10-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
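The ordering argument can be modelled in user space: all calls of one initcall level run before any call of the following "sync" level, while the order inside a level depends on link order. The table, the level constants, and the *_sketch functions below are hypothetical stand-ins, not kernel code.

```c
#include <stddef.h>

/* Two of the kernel's initcall levels, modelled as plain integers. */
#define LEVEL_SUBSYS       0	/* stands in for subsys_initcall() */
#define LEVEL_SUBSYS_SYNC  1	/* stands in for subsys_initcall_sync() */
#define LEVEL_COUNT        2

struct initcall_sketch {
	int level;
	void (*fn)(void);
};

static int efi_wq_ready;	/* set once the workqueue "exists" */
static int cxl_saw_wq;		/* what cxl init observed when it ran */

static void efisubsys_init_sketch(void)
{
	efi_wq_ready = 1;
}

static void cxl_acpi_init_sketch(void)
{
	cxl_saw_wq = efi_wq_ready;
}

/* Run every call of a level before moving on to the next level. */
static void run_initcalls(const struct initcall_sketch *calls, size_t n)
{
	for (int lvl = 0; lvl < LEVEL_COUNT; lvl++)
		for (size_t i = 0; i < n; i++)
			if (calls[i].level == lvl)
				calls[i].fn();
}
```

Even if cxl_acpi_init_sketch is listed first in the table, registering it at the sync level means it observes efi_wq_ready == 1.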
2026-02-03 | cxl: Introduce callback for HPA address ranges translation | Robert Richter | 2 | -0/+25
Introduce a callback to translate an endpoint's HPA range to the address range of the root port which is the System Physical Address (SPA) range used by a region. The callback can be set if a platform needs to handle address translation. The callback is attached to the root port. An endpoint's root port can easily be determined in the PCI hierarchy without any CXL specific knowledge. This allows the early use of address translation for CXL enumeration. Address translation is esp. needed for the detection of the root decoders. Thus, the callback is embedded in struct cxl_root_ops instead of struct cxl_rd_ops. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-9-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl/region: Use region data to get the root decoder | Robert Richter | 1 | -26/+24
To find a region's root decoder, the endpoint's HPA range is used to search the matching decoder by its range. With address translation the endpoint decoder's range is in a different address space and thus cannot be used to determine the root decoder. The region parameters are encapsulated within struct cxl_region_context and may include the translated Host Physical Address (HPA) range. Use this context to identify the root decoder rather than relying on the endpoint. Modify cxl_find_root_decoder() and add the region context as parameter. Rename this function to get_cxl_root_decoder() as a counterpart to put_cxl_root_decoder(). Simplify the implementation by removing function cxl_port_find_switch_decode(). The function is unnecessary because it is not referenced or utilized elsewhere in the code. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-8-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos() | Robert Richter | 1 | -6/+8
cxl_calc_interleave_pos() uses the endpoint decoder's HPA range to determine its interleaving position. This requires the endpoint decoders to be an SPA, which is not the case for systems that need address translation. Add a separate @hpa_range argument to function cxl_calc_interleave_pos() to specify the address range. Now it is possible to pass the SPA translated address range of an endpoint decoder to function cxl_calc_interleave_pos(). Refactor only, no functional changes. Patch is a prerequisite to implement address translation. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-7-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl/region: Separate region parameter setup and region construction | Robert Richter | 2 | -9/+26
To construct a region, the region parameters such as address range and interleaving config need to be determined. This is done while constructing the region by inspecting the endpoint decoder configuration. The endpoint decoder is passed as a function argument. With address translation the endpoint decoder data is no longer sufficient to extract the region parameters as some of the information is obtained using other methods such as using firmware calls. In a first step, separate code to determine the region parameters from the region construction. Temporarily store all the data to create the region in the new struct cxl_region_context. Once the region data is determined and struct cxl_region_context is filled, construct the region. Patch is a prerequisite to implement address translation. The code separation helps to later extend it to determine region parameters using other methods as needed, esp. to support address translation. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Gregory Price <gourry@gourry.net> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-6-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl: Simplify cxl_root_ops allocation and handling | Robert Richter | 4 | -24/+18
A root port's callback handlers are collected in struct cxl_root_ops. The structure is dynamically allocated, though it contains only a single pointer. This also requires checking two pointers to test for the existence of a callback. Simplify the allocation, release and handler check by embedding the ops statically in struct cxl_root. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-5-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
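The difference between the two layouts can be sketched as follows. All type and member names here are hypothetical (the real struct cxl_root_ops and struct cxl_root differ); the point is only that a pointer to an ops structure needs two checks, while an embedded ops structure needs one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops structure holding a single callback. */
struct root_ops_sketch {
	int (*handler)(int arg);
};

/* Before: ops allocated separately and referenced through a pointer. */
struct root_dyn_sketch {
	const struct root_ops_sketch *ops;
};

/* After: ops embedded directly in the owning structure. */
struct root_emb_sketch {
	struct root_ops_sketch ops;
};

static int sample_handler(int arg)
{
	return arg + 1;
}

/* Two pointers must be valid before the callback may be called. */
static bool dyn_has_handler(const struct root_dyn_sketch *r)
{
	return r->ops && r->ops->handler;
}

/* Only the callback pointer itself needs checking. */
static bool emb_has_handler(const struct root_emb_sketch *r)
{
	return r->ops.handler != NULL;
}
```

Embedding also removes the separate allocation and its matching release path, at the cost of one pointer-sized member in every instance.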
2026-02-03 | cxl/region: Store HPA range in struct cxl_region | Robert Richter | 2 | -0/+9
Each region has a known host physical address (HPA) range it is assigned to. Endpoint decoders assigned to a region share the same HPA range. The region's address range is the system's physical address (SPA) range. Endpoint decoders in systems that need address translation use HPAs which are not SPAs. To make the SPA range accessible to the endpoint decoders, store and track the region's SPA range in struct cxl_region. Introduce the @hpa_range member to the struct. Now, the SPA range of an endpoint decoder can be determined based on its assigned region. Patch is a prerequisite to implement address translation which uses struct cxl_region to store all relevant region and interleaving parameters. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-4-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl/region: Store root decoder in struct cxl_region | Robert Richter | 2 | -18/+21
A region is always bound to a root decoder. The region's associated root decoder is often needed. Add it to struct cxl_region. This simplifies the code by removing dynamic lookups and the root decoder argument from the function argument list where possible. Patch is a prerequisite to implement address translation which uses struct cxl_region to store all relevant region and interleaving parameters. It changes the argument list of __construct_region() in preparation of adding a context argument. Additionally the arg list of cxl_region_attach_position() is simplified and the use of to_cxl_root_decoder() removed, which always reconstructs and checks the pointer. The pointer never changes and is frequently used. Code becomes more readable as this emphasizes the binding between both objects. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-3-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl/region: Rename misleading variable name @hpa to @hpa_range | Robert Richter | 1 | -13/+15
@hpa is actually a @hpa_range, rename variables accordingly. Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260114164837.1076338-2-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement | Robert Richter | 2 | -0/+305
This adds a convention document for the following patch series: cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement Version 7 and later: https://lore.kernel.org/linux-cxl/20251114213931.30754-1-rrichter@amd.com/ Link: https://lore.kernel.org/linux-cxl/20251114213931.30754-1-rrichter@amd.com/ Reviewed-by: Gregory Price <gourry@gourry.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260203173604.1440334-3-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl, doc: Moving conventions in separate files | Robert Richter | 3 | -170/+178
Moving conventions in separate files. Cc: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260203173604.1440334-2-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | cxl, doc: Remove isonum.txt inclusion | Robert Richter | 1 | -1/+0
This patch removes the line to include:: <isonum.txt>. From Jon: "This include has been cargo-culted around the docs...the only real use of it is to write |copy| rather than ©, but these docs don't even do that. It can be taken out." Cc: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Link: https://patch.msgid.link/20260203173604.1440334-1-rrichter@amd.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-02-03 | accel/amdxdna: Remove hardware context status | Lizhi Hou | 3 | -28/+5
One newly supported command does not require hardware context configuration to be performed upfront. As a result, checking hardware context status causes this command to fail incorrectly. Remove hardware context status handling entirely. For other commands, if userspace submits a request without configuring the hardware context first, the firmware will report an error or time out as appropriate. Fixes: aac243092b70 ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260202212450.2681273-1-lizhi.hou@amd.com
2026-02-03 | ALSA: usb-audio: fix broken logic in snd_audigy2nx_led_update() | Sergey Shtylyov | 1 | -7/+2
When the support for the Sound Blaster X-Fi Surround 5.1 Pro was added, the existing logic for the X-Fi Surround 5.1 in snd_audigy2nx_led_put() was broken due to missing *else* before the added *if*: snd_usb_ctl_msg() became incorrectly called twice and an error from first snd_usb_ctl_msg() call ignored. As the added snd_usb_ctl_msg() call was totally identical to the existing one for the "plain" X-Fi Surround 5.1, just merge those two *if* statements while fixing the broken logic... Found by Linux Verification Center (linuxtesting.org) with the Svace static analysis tool. Fixes: 7cdd8d73139e ("ALSA: usb-audio - Add support for USB X-Fi S51 Pro") Signed-off-by: Sergey Shtylyov <s.shtylyov@auroraos.dev> Link: https://patch.msgid.link/20260203161558.18680-1-s.shtylyov@auroraos.dev Signed-off-by: Takashi Iwai <tiwai@suse.de>
2026-02-03 | io_uring/fdinfo: kill unnecessary newline feed in CQE32 printing | Jens Axboe | 1 | -1/+1
There's an unconditional newline feed anyway after dumping both normal and big CQE contents, remove the \n from the CQE32 extra1/extra2 printing. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03 | remoteproc: imx_rproc: Fix invalid loaded resource table detection | Peng Fan | 1 | -0/+4
imx_rproc_elf_find_loaded_rsc_table() may incorrectly report a loaded resource table even when the current firmware does not provide one. When the device tree contains a "rsc-table" entry, priv->rsc_table is non-NULL and denotes where a resource table would be located if one is present in memory. However, when the current firmware has no resource table, rproc->table_ptr is NULL. The function still returns priv->rsc_table, and the remoteproc core interprets this as a valid loaded resource table. Fix this by returning NULL from imx_rproc_elf_find_loaded_rsc_table() when there is no resource table for the current firmware (i.e. when rproc->table_ptr is NULL). This aligns the function's semantics with the remoteproc core: a loaded resource table is only reported when a valid table_ptr exists. With this change, starting firmware without a resource table no longer triggers a crash. Fixes: e954a1bd1610 ("remoteproc: imx_rproc: Use imx specific hook for find_loaded_rsc_table") Cc: stable@vger.kernel.org Signed-off-by: Peng Fan <peng.fan@nxp.com> Acked-by: Daniel Baluta <daniel.baluta@nxp.com> Link: https://lore.kernel.org/r/20260129-imx-rproc-fix-v3-1-fc4e41e6e750@nxp.com Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
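The fix amounts to an early NULL return, sketched here with reduced, hypothetical structures. Only the member names table_ptr and rsc_table come from the message; the struct names and the function name are stand-ins.

```c
#include <stddef.h>

/* Hypothetical reduced driver-private data. */
struct imx_priv_sketch {
	void *rsc_table;	/* where a table would live, per the device tree */
};

/* Hypothetical reduced remoteproc handle. */
struct rproc_sketch {
	void *table_ptr;	/* NULL when the current firmware has no table */
	struct imx_priv_sketch *priv;
};

/* Report a loaded resource table only if the firmware actually provides one. */
static void *find_loaded_rsc_table(struct rproc_sketch *rproc)
{
	struct imx_priv_sketch *priv = rproc->priv;

	/* No table in the firmware: the reserved location must not be reported. */
	if (!rproc->table_ptr)
		return NULL;

	return priv->rsc_table;
}
```

Without the early return, a non-NULL rsc_table from the device tree would be handed to the core even for firmware that ships no resource table.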
2026-02-03 | panic: add panic_force_cpu= parameter to redirect panic to a specific CPU | Pnina Feder | 4 | -2/+186
Some platforms require panic handling to execute on a specific CPU for crash dump to work reliably. This can be due to firmware limitations, interrupt routing constraints, or platform-specific requirements where only a single CPU is able to safely enter the crash kernel. Add the panic_force_cpu= kernel command-line parameter to redirect panic execution to a designated CPU. When the parameter is provided, the CPU that initially triggers panic forwards the panic context to the target CPU via IPI, which then proceeds with the normal panic and kexec flow. The IPI delivery is implemented as a weak function (panic_smp_redirect_cpu) so architectures with NMI support can override it for more reliable delivery. If the specified CPU is invalid, offline, or a panic is already in progress on another CPU, the redirection is skipped and panic continues on the current CPU. [pnina.feder@mobileye.com: fix unused variable warning] Link: https://lkml.kernel.org/r/20260126122618.2967950-1-pnina.feder@mobileye.com Link: https://lkml.kernel.org/r/20260122102457.1154599-1-pnina.feder@mobileye.com Signed-off-by: Pnina Feder <pnina.feder@mobileye.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Cc: Baoquan He <bhe@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
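The fallback rules in the last paragraph can be written as a small decision function. This is an illustrative model only: panic_target_cpu(), its parameters, and the use of -1 for "parameter not given" are assumptions, and the IPI delivery itself is not modelled.

```c
#include <stdbool.h>

/*
 * Decide which CPU should run the panic path.
 * forced_cpu is the value of panic_force_cpu=, or -1 if it was not given.
 * Returns this_cpu whenever redirection has to be skipped.
 */
static int panic_target_cpu(int this_cpu, int forced_cpu,
			    const bool *online, int ncpus,
			    bool panic_in_progress)
{
	/* Invalid (or absent) CPU number: stay where we are. */
	if (forced_cpu < 0 || forced_cpu >= ncpus)
		return this_cpu;
	/* Target is offline: it cannot take the IPI. */
	if (!online[forced_cpu])
		return this_cpu;
	/* Another CPU is already panicking: do not redirect. */
	if (panic_in_progress)
		return this_cpu;
	return forced_cpu;
}
```

When the function returns forced_cpu and it differs from this_cpu, the real implementation would forward the panic context to that CPU; in every other case panic simply continues locally.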
2026-02-03 | netclassid: use thread_group_leader(p) in update_classid_task() | Oleg Nesterov | 1 | -1/+1
Cleanup and preparation to simplify planned future changes. Link: https://lkml.kernel.org/r/aXY_4NSP094-Cf-2@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03 | RDMA/umem: don't abuse current->group_leader | Oleg Nesterov | 2 | -2/+2
Cleanup and preparation to simplify the next changes. Use current->tgid instead of current->group_leader->pid. Link: https://lkml.kernel.org/r/aXY_2JIhCeGAYC0r@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Leon Romanovsky <leon@kernel.org> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03 | drm/pan*: don't abuse current->group_leader | Oleg Nesterov | 2 | -2/+2
Cleanup and preparation to simplify the next changes. Use current->tgid instead of current->group_leader->pid. Link: https://lkml.kernel.org/r/aXY_0MrQBZWKbbmA@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Steven Price <steven.price@arm.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03 | drm/amd: kill the outdated "Only the pthreads threading model is supported" checks | Oleg Nesterov | 2 | -13/+0
Nowadays task->group_leader->mm != task->mm is only possible if a) task is not a group leader and b) task->group_leader->mm == NULL because task->group_leader has already exited using sys_exit(). I don't think that drm/amd tries to detect/nack this case. Link: https://lkml.kernel.org/r/aXY_yLVHd63UlWtm@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Christan König <christian.koenig@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03  drm/amdgpu: don't abuse current->group_leader  Oleg Nesterov  2  -2/+2
Cleanup and preparation to simplify the next changes. - Use current->tgid instead of current->group_leader->pid - Use get_task_pid(current, PIDTYPE_TGID) instead of get_task_pid(current->group_leader, PIDTYPE_PID) Link: https://lkml.kernel.org/r/aXY_wKewzV5lCa5I@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
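The equivalence these cleanups rely on (the thread-group ID is shared by every thread, while each thread has its own ID) can be observed from userspace. The following is only an illustrative sketch, not kernel code: it assumes Linux and Python 3.8 or later, and the helper name `ids_seen_by_worker` is hypothetical.

```python
import os
import threading


def ids_seen_by_worker():
    """Return (pid, tid) as observed from inside a non-leader thread."""
    seen = {}

    def worker():
        # os.getpid() reports the thread-group ID (what the kernel calls
        # tgid); it is the same value in every thread of the process.
        seen["pid"] = os.getpid()
        # threading.get_native_id() reports the kernel's per-thread ID.
        seen["tid"] = threading.get_native_id()

    t = threading.Thread(target=worker)
    t.start()
    t.join()
    return seen["pid"], seen["tid"]


if __name__ == "__main__":
    pid, tid = ids_seen_by_worker()
    print(f"main:   pid={os.getpid()} tid={threading.get_native_id()}")
    print(f"worker: pid={pid} tid={tid}")
```

The worker reports the same process ID as the main thread but a different thread ID, which is why reading the tgid directly gives the same answer as going through the group leader.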
2026-02-03  android/binder: use same_thread_group(proc->tsk, current) in binder_mmap()  Oleg Nesterov  1  -1/+1
With or without this change the checked condition can be falsely true if proc->tsk execs, but this is fine: binder_alloc_mmap_handler() checks vma->vm_mm == alloc->mm. Link: https://lkml.kernel.org/r/aXY_uPYyUg4rwNOg@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03  android/binder: don't abuse current->group_leader  Oleg Nesterov  2  -5/+4
Patch series "don't abuse task_struct.group_leader", v2. This series removes the usage of ->group_leader when it is "obviously unnecessary". I am going to move ->group_leader from task_struct to signal_struct or at least add the new task_group_leader() helper. So I will send more tree-wide changes on top of this series. This patch (of 7): Cleanup and preparation to simplify the next changes. - Use current->tgid instead of current->group_leader->pid - Use the value returned by get_task_struct() to initialize proc->tsk Link: https://lkml.kernel.org/r/aXY_h8i78n6yD9JY@redhat.com Link: https://lkml.kernel.org/r/aXY_ryGDwdygl1Tv@redhat.com Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Christan König <christian.koenig@amd.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Leon Romanovsky <leon@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-03  mtd: spinand: winbond: Remove unneeded semicolon  Chen Ni  1  -1/+1
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the semantic patch at scripts/coccinelle/misc/semicolon.cocci. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2026-02-03  dt-bindings: mtd: cdns,hp-nfc: Add dma-coherent property  Khairul Anuar Romli  1  -0/+2
The Cadence HP NAND Flash Controller on Agilex 5 supports DMA transactions through a coherent interconnect. In previous-generation SoCs (Stratix10 and Agilex) the interconnect was non-coherent, so the dma-coherent property was not needed. Agilex 5 changes the architecture: it introduces a coherent interconnect that supports cache-coherent DMA. Signed-off-by: Khairul Anuar Romli <khairul.anuar.romli@altera.com> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Dinh Nguyen <dinguyen@kernel.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
2026-02-03  drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()  Liu Ying  1  -1/+1
In case the channel0 is unavailable and bailing out from free_child is needed when we fail to add a DRM bridge for the available channel1, pointer pc->ch[0] in the bailout path would be NULL and it would be dereferenced as pc->ch[0]->bridge.next_bridge. Fix this by checking pc->ch[0] before dereferencing it. Fixes: ae754f049ce1 ("drm/bridge: imx8qxp-pixel-combiner: get/put the next bridge") Fixes: 99764593528f ("drm/bridge: imx8qxp-pixel-combiner: convert to devm_drm_bridge_alloc() API") Signed-off-by: Liu Ying <victor.liu@nxp.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20260123-imx8qxp-drm-bridge-fixes-v1-3-8bb85ada5866@nxp.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
2026-02-03  blk-mq: add documentation for new queue attribute async_depth  Yu Kuai  1  -0/+34
Explain the attribute and its default value in the different cases. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03  block, bfq: convert to use request_queue->async_depth  Yu Kuai  1  -26/+17
The default limits are unchanged, and users can now configure async_depth. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03  mq-deadline: convert to use request_queue->async_depth  Yu Kuai  1  -34/+5
In our downstream kernel we tested mq-deadline with many fio workloads and found a performance regression after commit 39823b47bbd4 ("block/mq-deadline: Fix the tag reservation code") with the following test: [global] rw=randread direct=1 ramp_time=1 ioengine=libaio iodepth=1024 numjobs=24 bs=1024k group_reporting=1 runtime=60 [job1] filename=/dev/sda The root cause is that mq-deadline now supports configuring async_depth. Although the default value is nr_requests, the minimal value is 1, hence min_shallow_depth is set to 1, causing wake_batch to be 1. As a consequence, sbitmap_queue waiters are woken up after each IO instead of after every 8 IOs. In this test case sda is an HDD and max_sectors is 128k, hence each submitted 1M IO is split into 8 sequential 128k requests. However, because there are 24 jobs and the tags are exhausted, the 8 requests are unlikely to be dispatched sequentially, and changing wake_batch to 1 makes this much worse: measured at the blktrace D stage, the percentage of sequential IO drops from 8% to 0.8%. Fix this problem by converting to request_queue->async_depth, where min_shallow_depth is set each time async_depth is updated. Note that the elevator attribute async_depth is now removed; the queue attribute with the same name is used instead. Fixes: 39823b47bbd4 ("block/mq-deadline: Fix the tag reservation code") Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
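The arithmetic behind this regression can be sketched with a small model. This is a simplified approximation of how the sbitmap wake batch is derived, not the kernel's code: the constants, the function names `calc_wake_batch` and `split_count`, and the example queue depth of 256 with a word shift of 6 are all assumptions chosen for illustration.

```python
SBQ_WAIT_QUEUES = 8  # assumed number of sbitmap wait queues
SBQ_WAKE_BATCH = 8   # assumed upper bound on the wake batch


def calc_wake_batch(depth, shift, min_shallow_depth):
    """Approximate the wake batch for a bitmap of `depth` bits.

    `shift` is log2 of the bits per bitmap word, and `min_shallow_depth`
    is the smallest per-word depth any allocation may be limited to.
    """
    bits_per_word = 1 << shift
    shallow = min(bits_per_word, min_shallow_depth)
    # Bits that are guaranteed usable when every word is limited to `shallow`.
    usable = (depth >> shift) * shallow + min(depth & (bits_per_word - 1), shallow)
    # Clamp the result to the range [1, SBQ_WAKE_BATCH].
    return max(1, min(usable // SBQ_WAIT_QUEUES, SBQ_WAKE_BATCH))


def split_count(io_kib, max_kib):
    """Number of requests one IO is split into (ceiling division)."""
    return -(-io_kib // max_kib)


if __name__ == "__main__":
    # min_shallow_depth forced down to 1: waiters are woken after every IO.
    print(calc_wake_batch(256, 6, 1))
    # min_shallow_depth equal to the full depth: waiters are woken in batches.
    print(calc_wake_batch(256, 6, 256))
    # A 1M IO with a 128k max_sectors limit.
    print(split_count(1024, 128))
```

Under these assumptions the model yields a wake batch of 1 when min_shallow_depth is 1 and 8 when it equals the full depth, matching the two behaviours the commit message describes, and a 1M IO splits into 8 requests.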
2026-02-03  kyber: convert to use request_queue->async_depth  Yu Kuai  1  -28/+5
Use request_queue->async_depth instead of the internal async_depth, and remove kqd->async_depth and related helpers. Note that the elevator attribute async_depth is now removed; the queue attribute with the same name is used instead. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03  blk-mq: add a new queue sysfs attribute async_depth  Yu Kuai  5  -0/+51
Add a new field async_depth to request_queue, along with related APIs. It is currently unused; following patches will convert elevators to use it instead of their internal async_depth. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03  blk-mq: factor out a helper blk_mq_limit_depth()  Yu Kuai  1  -25/+37
No functional changes; this just makes the code cleaner. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03  blk-mq-sched: unify elevators checking for async requests  Yu Kuai  4  -3/+8
bfq and mq-deadline treat sync writes as async requests and use async_depth to reserve tags only for sync reads; kyber, however, currently does not treat sync writes as async requests. Consider the case where there are lots of dirty pages and the user calls fsync to flush them. In this case sched_tags can be exhausted by sync writes, and sync reads can get stuck waiting for a tag. Hence let kyber follow what mq-deadline and bfq do, and unify the async request check for all elevators. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
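The difference between the two classifications described above can be written as a pair of predicates. This is only a sketch of the rule as the commit message states it; both function names are hypothetical and do not correspond to kernel helpers.

```python
def limited_by_async_depth(is_sync: bool, is_write: bool) -> bool:
    """Unified rule: everything except a sync read counts as async,
    so only sync reads may use the tags held back by async_depth."""
    return not (is_sync and not is_write)


def limited_by_async_depth_old_kyber(is_sync: bool, is_write: bool) -> bool:
    """Previous kyber behaviour as described: any sync request,
    including a sync write, was exempt from the limit."""
    return not is_sync


if __name__ == "__main__":
    for is_sync in (True, False):
        for is_write in (True, False):
            print(
                f"sync={is_sync} write={is_write} "
                f"unified={limited_by_async_depth(is_sync, is_write)} "
                f"old_kyber={limited_by_async_depth_old_kyber(is_sync, is_write)}"
            )
```

The two predicates differ only for sync writes, which is the fsync case in the commit message: under the old kyber rule they could consume every scheduler tag, while under the unified rule they are limited like other async requests.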
2026-02-03  block: convert nr_requests to unsigned int  Yu Kuai  1  -1/+1
This value represents the number of requests for elevator tags, or for driver tags if the elevator is none. The max value for elevator tags is 2048, and drivers use at most 16 bits for the tag. Signed-off-by: Yu Kuai <yukuai@fnnas.com> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-02-03  perf capstone: Support for dlopen-ing libcapstone.so  Ian Rogers  5  -42/+179
If perf is built with LIBCAPSTONE_DLOPEN=1, support dlopen-ing libcapstone.so and then calling the necessary functions by looking them up using dlsym. The types come from capstone.h which means the libcapstone feature check needs to pass, and NO_CAPSTONE=1 hasn't been defined. This will cause the definition of HAVE_LIBCAPSTONE_SUPPORT. Earlier versions of this code tried to declare the necessary capstone.h constants and structs, but they weren't stable and caused breakages across libcapstone releases. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: