aboutsummaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2025-09-19Merge tag 'pmdomain-v6.17-rc2' of ↵Linus Torvalds1-0/+7
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm Pull pmdomain fixes from Ulf Hansson: "pmdomain core: - Restore behaviour for disabling unused PM domains and introduce the GENPD_FLAG_NO_STAY_ON configuration bit pmdomain providers: - renesas: Don't keep unused PM domains powered-on - rockchip: Fix regulator dependency with GENPD_FLAG_NO_STAY_ON" * tag 'pmdomain-v6.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/linux-pm: pmdomain: renesas: rmobile-sysc: Don't keep unused PM domains powered-on pmdomain: renesas: rcar-gen4-sysc: Don't keep unused PM domains powered-on pmdomain: renesas: rcar-sysc: Don't keep unused PM domains powered-on pmdomain: rockchip: Fix regulator dependency with GENPD_FLAG_NO_STAY_ON pmdomain: core: Restore behaviour for disabling unused PM domains pmdomain: renesas: rcar-sysc: Make rcar_sysc_onecell_np __initdata
2025-09-19bpf: table based bpf_insn_successors()Eduard Zingerman1-0/+1
Converting bpf_insn_successors() to use lookup table makes it ~1.5 times faster. Also remove unnecessary conditionals: - `idx + 1 < prog->len` is unnecessary because after check_cfg() all jump targets are guaranteed to be within a program; - `i == 0 || succ[0] != dst` is unnecessary because any client of bpf_insn_successors() can handle duplicate edges: - compute_live_registers() - compute_scc() Moving bpf_insn_successors() to liveness.c allows its inlining in liveness.c:__update_stack_liveness(). Such inlining speeds up __update_stack_liveness() by ~40%. bpf_insn_successors() is used in both verifier.c and liveness.c. perf shows such move does not negatively impact users in verifier.c, as these are executed only once before main varification pass. Unlike __update_stack_liveness() which can be triggered multiple times. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-10-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19bpf: disable and remove registers chain based livenessEduard Zingerman1-25/+0
Remove register chain based liveness tracking: - struct bpf_reg_state->{parent,live} fields are no longer needed; - REG_LIVE_WRITTEN marks are superseded by bpf_mark_stack_write() calls; - mark_reg_read() calls are superseded by bpf_mark_stack_read(); - log.c:print_liveness() is superseded by logging in liveness.c; - propagate_liveness() is superseded by bpf_update_live_stack(); - no need to establish register chains in is_state_visited() anymore; - fix a bunch of tests expecting "_w" suffixes in verifier log messages. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-9-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19bpf: signal error if old liveness is more conservative than newEduard Zingerman1-0/+1
Unlike the new algorithm, register chain based liveness tracking is fully path sensitive, and thus should be strictly more accurate. Validate the new algorithm by signaling an error whenever it considers a stack slot dead while the old algorithm considers it alive. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-8-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19bpf: callchain sensitive stack liveness tracking using CFGEduard Zingerman1-0/+14
This commit adds a flow-sensitive, context-sensitive, path-insensitive data flow analysis for live stack slots: - flow-sensitive: uses program control flow graph to compute data flow values; - context-sensitive: collects data flow values for each possible call chain in a program; - path-insensitive: does not distinguish between separate control flow graph paths reaching the same instruction. Compared to the current path-sensitive analysis, this approach trades some precision for not having to enumerate every path in the program. This gives a theoretical capability to run the analysis before main verification pass. See cover letter for motivation. The basic idea is as follows: - Data flow values indicate stack slots that might be read and stack slots that are definitely written. - Data flow values are collected for each (call chain, instruction number) combination in the program. - Within a subprogram, data flow values are propagated using control flow graph. - Data flow values are transferred from entry instructions of callee subprograms to call sites in caller subprograms. In other words, a tree of all possible call chains is constructed. Each node of this tree represents a subprogram. Read and write marks are collected for each instruction of each node. Live stack slots are first computed for lower level nodes. Then, information about outer stack slots that might be read or are definitely written by a subprogram is propagated one level up, to the corresponding call instructions of the upper nodes. Procedure repeats until root node is processed. In the absence of value range analysis, stack read/write marks are collected during main verification pass, and data flow computation is triggered each time verifier.c:states_equal() needs to query the information. Implementation details are documented in kernel/bpf/liveness.c. Quantitative data about verification performance changes and memory consumption is in the cover letter. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-6-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19bpf: compute instructions postorder per subprogramEduard Zingerman1-1/+5
The next patch would require doing postorder traversal of individual subprograms. Facilitate this by moving env->cfg.insn_postorder computation from check_cfg() to a separate pass, as check_cfg() descends into called subprograms (and it needs to, because of merge_callee_effects() logic). env->cfg.insn_postorder is used only by compute_live_registers(), this function does not track cross subprogram dependencies, thus the change does not affect it's operation. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-5-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19bpf: declare a few utility functions as internal apiEduard Zingerman1-0/+5
Namely, rename the following functions and add prototypes to bpf_verifier.h: - find_containing_subprog -> bpf_find_containing_subprog - insn_successors -> bpf_insn_successors - calls_callback -> bpf_calls_callback - fmt_stack_mask -> bpf_fmt_stack_mask Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-4-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19bpf: bpf_verifier_state->cleaned flag instead of REG_LIVE_DONEEduard Zingerman1-1/+1
Prepare for bpf_reg_state->live field removal by introducing a separate flag to track if clean_verifier_state() had been applied to the state. No functional changes. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-1-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-19ns: rename to __ns_refChristian Brauner1-6/+6
Make it easier to grep and rename to ns_count. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19uts: port to ns_ref_*() helpersChristian Brauner1-2/+2
Stop accessing ns.count directly. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19user: port to ns_ref_*() helpersChristian Brauner1-2/+2
Stop accessing ns.count directly. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19time: port to ns_ref_*() helpersChristian Brauner1-2/+2
Stop accessing ns.count directly. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19pid: port to ns_ref_*() helpersChristian Brauner1-1/+1
Stop accessing ns.count directly. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ipc: port to ns_ref_*() helpersChristian Brauner1-2/+2
Stop accessing ns.count directly. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19cgroup: port to ns_ref_*() helpersChristian Brauner1-2/+2
Stop accessing ns.count directly. Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: add reference count helpersChristian Brauner1-10/+35
Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: add ns_common_free()Christian Brauner2-2/+3
And drop ns_free_inum(). Anything common that can be wasted centrally should be wasted in the new common helper. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19Add QSPI support for sam9x7 and sama7d65 SoCsMark Brown6-11/+42
Merge series from Dharma Balasubiramani <dharma.b@microchip.com>: This patch series adds support for SAM9X7 and sama7d65 QSPI controller along with the SoC-specific capabilities.
2025-09-19nscommon: simplify initializationChristian Brauner1-2/+37
There's a lot of information that namespace implementers don't need to know about at all. Encapsulate this all in the initialization helper. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19cgroup: split namespace into separate headerChristian Brauner2-50/+57
We have dedicated headers for all namespace types. Add one for the cgroup namespace as well. Now it's consistent for all namespace types and easy to figure out what to include. Acked-by: Tejun Heo <tj@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19nscommon: move to separate fileChristian Brauner2-19/+3
It's really awkward spilling the ns common infrastructure into multiple headers. Move it to a separate file. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19mnt: expose pointer to init_mnt_nsChristian Brauner1-0/+2
There's various scenarios where we need to know whether we are in the initial set of namespaces or not to e.g., shortcut permission checking. All namespaces expose that information. Let's do that too. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19uts: split namespace into separate headerChristian Brauner2-57/+66
We have dedicated headers for all namespace types. Add one for the uts namespace as well. Now it's consistent for all namespace types. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19nsfs: support file handlesChristian Brauner1-0/+6
A while ago we added support for file handles to pidfs so pidfds can be encoded and decoded as file handles. Userspace has adopted this quickly and it's proven very useful. Implement file handles for namespaces as well. A process is not always able to open /proc/self/ns/. That requires procfs to be mounted and for /proc/self/ or /proc/self/ns/ to not be overmounted. However, userspace can always derive a namespace fd from a pidfd. And that always works for a task's own namespace. There's no need to introduce unnecessary behavioral differences between /proc/self/ns/ fds, pidfd-derived namespace fds, and file-handle-derived namespace fds. So namespace file handles are always decodable if the caller is located in the namespace the file handle refers to. This also allows a task to e.g., store a set of file handles to its namespaces in a file on-disk so it can verify when it gets rexeced that they're still valid and so on. This is akin to the pidfd use-case. Or just plainly for namespace comparison reasons where a file handle to the task's own namespace can be easily compared against others. Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19nsfs: add current_in_namespace()Christian Brauner1-1/+15
Add a helper to easily check whether a given namespace is the caller's current namespace. This is currently open-coded in a lot of places. Simply switch on the type and compare the results. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: add to_<type>_ns() to respective headersChristian Brauner6-0/+29
Every namespace type has a container_of(ns, <ns_type>, ns) static inline function that is currently not exposed in the header. So we have a bunch of places that open-code it via container_of(). Move it to the headers so we can use it directly. Reviewed-by: Aleksa Sarai <cyphar@cyphar.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19time: support ns lookupChristian Brauner1-0/+5
Support the generic ns lookup infrastructure to support file handles for namespaces. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19Merge branch 'no-rebase-mnt_ns_tree_remove'Christian Brauner19-79/+66
Bring in the fix for removing a mount namespace from the mount namespace rbtree and list. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19nstree: make iterator genericChristian Brauner3-0/+103
Move the namespace iteration infrastructure originally introduced for mount namespaces into a generic library usable by all namespace types. Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: remove ns_alloc_inum()Christian Brauner1-6/+0
It's now unused. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: uniformly initialize ns_commonChristian Brauner1-0/+16
No point in cargo-culting the same code across all the different types. Use one common initializer. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19nsfs: add nsfs.h headerChristian Brauner2-12/+27
And move the stuff out from proc_ns.h where it really doesn't belong. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19ns: move to_ns_common() to ns_common.hChristian Brauner2-11/+20
Move the helper to ns_common.h where it belongs. Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-19writeback: Avoid contention on wb->list_lock when switching inodesJan Kara2-0/+6
There can be multiple inode switch works that are trying to switch inodes to / from the same wb. This can happen in particular if some cgroup exits which owns many (thousands) inodes and we need to switch them all. In this case several inode_switch_wbs_work_fn() instances will be just spinning on the same wb->list_lock while only one of them makes forward progress. This wastes CPU cycles and quickly leads to softlockup reports and unusable system. Instead of running several inode_switch_wbs_work_fn() instances in parallel switching to the same wb and contending on wb->list_lock, run just one work item per wb and manage a queue of isw items switching to this wb. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jan Kara <jack@suse.cz>
2025-09-19wifi: mac80211: correctly initialise S1G chandef for STALachlan Hodges1-1/+17
When moving to the APs channel, ensure we correctly initialise the chandef and perform the required validation. Additionally, if the AP is beaconing on a 2MHz primary, calculate the 2MHz primary center frequency by extracting the sibling 1MHz primary and averaging the frequencies to find the 2MHz primary center frequency. Signed-off-by: Lachlan Hodges <lachlan.hodges@morsemicro.com> Link: https://patch.msgid.link/20250918051913.500781-3-lachlan.hodges@morsemicro.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19wifi: cfg80211: Advertise supported NAN capabilitiesIlan Peer1-0/+17
Allow drivers to specify the supported NAN capabilities and support advertising the NAN capabilities to user space. Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20250908140015.2976966556f5.Ic6e43b10049573180c909dad806f279cfb31143e@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-09-19Merge tag 'drm-intel-next-2025-09-12' of ↵Dave Airlie1-0/+70
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next Cross-subsystem Changes: - Overflow: add range_overflows and range_end_overflows (Jani) Core Changes: - Get rid of dev->struct_mutex (Luiz) Non-display related: - GVT: Remove redundant ternary operators (Liao) - Various i915_utils clean-ups (Jani) Display related: - Wait PSR idle before on dsb commit (Jouni) - Fix size for for_each_set_bit() in abox iteration (Jani) - Abstract figuring out encoder name (Jani) - Remove FBC modulo 4 restriction for ADL-P+ (Uma) - Panic: refactor framebuffer allocation (Jani) - Backlight luminance control improvements (Suraj, Aaron) - Add intel_display_device_present (Jani) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aMxX_lBxm7wd5wmi@intel.com
2025-09-18bpf: Move the signature kfuncs to helpers.cKP Singh1-0/+32
No functional changes, except for the addition of the headers for the kfuncs so that they can be used for signature verification. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-8-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-18bpf: Return hashes of maps in BPF_OBJ_GET_INFO_BY_FDKP Singh1-0/+3
Currently only array maps are supported, but the implementation can be extended for other maps and objects. The hash is memoized only for exclusive and frozen maps as their content is stable until the exclusive program modifies the map. This is required for BPF signing, enabling a trusted loader program to verify a map's integrity. The loader retrieves the map's runtime hash from the kernel and compares it against an expected hash computed at build time. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-7-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-18bpf: Implement exclusive map creationKP Singh1-0/+1
Exclusive maps allow maps to only be accessed by program with a program with a matching hash which is specified in the excl_prog_hash attr. For the signing use-case, this allows the trusted loader program to load the map and verify the integrity Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-3-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-18bpf: Update the bpf_prog_calc_tag to use SHA256KP Singh1-1/+5
Exclusive maps restrict map access to specific programs using a hash. The current hash used for this is SHA1, which is prone to collisions. This patch uses SHA256, which is more resilient against collisions. This new hash is stored in bpf_prog and used by the verifier to determine if a program can access a given exclusive map. The original 64-bit tags are kept, as they are used by users as a short, possibly colliding program identifier for non-security purposes. Signed-off-by: KP Singh <kpsingh@kernel.org> Link: https://lore.kernel.org/r/20250914215141.15144-2-kpsingh@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-18Merge tag 'mlx5-next-09-11' of ↵Jakub Kicinski1-8/+8
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-09-17 This series by Carolina contains cleanups significantly touching shared mlx5 net and rdma headers. * tag 'mlx5-next-09-11' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5e: Prevent WQE metadata conflicts between timestamping and offloads net/mlx5: Refactor MACsec WQE metadata shifts net/mlx5: Remove VLAN insertion fields from WQE Ether segment ==================== Link: https://patch.msgid.link/1757574619-604874-1-git-send-email-tariqt@nvidia.com Link: https://patch.msgid.link/1758104780-642426-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-19power: supply: max77705_charger: use REGMAP_IRQ_REG_LINE macroDzmitry Sankouski1-26/+16
Refactor regmap_irq declarations with REGMAP_IRQ_REG_LINE saves a few lines on definitions. Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-09-19power: supply: max77705_charger: use regfields for config registersDzmitry Sankouski1-48/+54
Using regfields allows to cleanup masks and register offset definition, allowing to access register info by it's functional name. Signed-off-by: Dzmitry Sankouski <dsankouski@gmail.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
2025-09-18Merge tag 'trace-rv-v6.17-rc5' of ↵Linus Torvalds1-4/+2
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull runtime verifier fixes from Steven Rostedt: - Fix build in some RISC-V flavours Some system calls only are available for the 64bit RISC-V machines. #ifdef out the cases of clock_nanosleep and futex in the sleep monitor if they are not supported by the architecture. - Fix wrong cast, obsolete after refactoring Use container_of() to get to the rv_monitor structure from the enable_monitors_next() 'p' pointer. The assignment worked only because the list field used happened to be the first field of the structure. - Remove redundant include files Some include files were listed twice. Remove the extra ones and sort the includes. - Fix missing unlock on failure There was an error path that exited the rv_register_monitor() function without releasing a lock. Change that to goto the lock release. - Add Gabriele Monaco to be Runtime Verifier maintainer Gabriele is doing most of the work on RV as well as collecting patches. Add him to the maintainers file for Runtime Verification. * tag 'trace-rv-v6.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Add Gabriele Monaco as maintainer for Runtime Verification rv: Fix missing mutex unlock in rv_register_monitor() include/linux/rv.h: remove redundant include file rv: Fix wrong type cast in enabled_monitors_next() rv: Support systems with time64-only syscalls
2025-09-18soundwire: bus: add sdw_slave_get_current_bank helperSrinivas Kandagatla1-0/+8
There has been 2 instances of this helper in codec drivers, it does not make sense to keep duplicating this part of code. Lets add a helper sdw_get_current_bank() for codec drivers to use it. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Acked-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://patch.msgid.link/20250909121954.225833-5-srinivas.kandagatla@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-09-18soundwire: bus: add of_sdw_find_device_by_node helperSrinivas Kandagatla1-0/+9
There has been more than 3 instances of this helper in multiple codec drivers, it does not make sense to keep duplicating this part of code. Lets add a helper of_sdw_find_device_by_node for codec drivers to use it. Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Acked-by: Vinod Koul <vkoul@kernel.org> Link: https://patch.msgid.link/20250909121954.225833-4-srinivas.kandagatla@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-09-18arm64: Enable permission change on arm64 kernel block mappingsDev Jain1-0/+3
This patch paves the path to enable huge mappings in vmalloc space and linear map space by default on arm64. For this we must ensure that we can handle any permission games on the kernel (init_mm) pagetable. Previously, __change_memory_common() used apply_to_page_range() which does not support changing permissions for block mappings. We move away from this by using the pagewalk API, similar to what riscv does right now. It is the responsibility of the caller to ensure that the range over which permissions are being changed falls on leaf mapping boundaries. For systems with BBML2, this will be handled in future patches by dyanmically splitting the mappings when required. Unlike apply_to_page_range(), the pagewalk API currently enforces the init_mm.mmap_lock to be held. To avoid the unnecessary bottleneck of the mmap_lock for our usecase, this patch extends this generic API to be used locklessly, so as to retain the existing behaviour for changing permissions. Apart from this reason, it is noted at [1] that KFENCE can manipulate kernel pgtable entries during softirqs. It does this by calling set_memory_valid() -> __change_memory_common(). This being a non-sleepable context, we cannot take the init_mm mmap lock. Add comments to highlight the conditions under which we can use the lockless variant - no underlying VMA, and the user having exclusive control over the range, thus guaranteeing no concurrent access. We require that the start and end of a given range do not partially overlap block mappings, or cont mappings. Return -EINVAL in case a partial block mapping is detected in any of the PGD/P4D/PUD/PMD levels; add a corresponding comment in update_range_prot() to warn that eliminating such a condition is the responsibility of the caller. Note that, the pte level callback may change permissions for a whole contpte block, and that will be done one pte at a time, as opposed to an atomic operation for the block mappings. This is fine as any access will decode either the old or the new permission until the TLBI. apply_to_page_range() currently performs all pte level callbacks while in lazy mmu mode. Since arm64 can optimize performance by batching barriers when modifying kernel pgtables in lazy mmu mode, we would like to continue to benefit from this optimisation. Unfortunately walk_kernel_page_table_range() does not use lazy mmu mode. However, since the pagewalk framework is not allocating any memory, we can safely bracket the whole operation inside lazy mmu mode ourselves. Therefore, wrap the call to walk_kernel_page_table_range() with the lazy MMU helpers. Link: https://lore.kernel.org/linux-arm-kernel/89d0ad18-4772-4d8f-ae8a-7c48d26a927e@arm.com/ [1] Signed-off-by: Dev Jain <dev.jain@arm.com> Signed-off-by: Yang Shi <yshi@os.amperecomputing.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-18io_uring/msg_ring: kill alloc_cache for io_kiocb allocationsJens Axboe1-3/+0
A recent commit: fc582cd26e88 ("io_uring/msg_ring: ensure io_kiocb freeing is deferred for RCU") fixed an issue with not deferring freeing of io_kiocb structs that msg_ring allocates to after the current RCU grace period. But this only covers requests that don't end up in the allocation cache. If a request goes into the alloc cache, it can get reused before it is sane to do so. A recent syzbot report would seem to indicate that there's something there, however it may very well just be because of the KASAN poisoning that the alloc_cache handles manually. Rather than attempt to make the alloc_cache sane for that use case, just drop the usage of the alloc_cache for msg_ring request payload data. Fixes: 50cf5f3842af ("io_uring/msg_ring: add an alloc cache for io_kiocb entries") Link: https://lore.kernel.org/io-uring/68cc2687.050a0220.139b6.0005.GAE@google.com/ Reported-by: syzbot+baa2e0f4e02df602583e@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-09-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-0/+13
Cross-merge networking fixes after downstream PR (net-6.17-rc7). No conflicts. Adjacent changes: drivers/net/ethernet/mellanox/mlx5/core/en/fs.h 9536fbe10c9d ("net/mlx5e: Add PSP steering in local NIC RX") 7601a0a46216 ("net/mlx5e: Add a miss level for ipsec crypto offload") Signed-off-by: Jakub Kicinski <kuba@kernel.org>