path: root/kernel/module/main.c
Age | Commit message | Author | Files | Lines
2025-12-06 | Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

- "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in lib/sys_info.c
- "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions
- "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger
- "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire
- "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations
- "rbtree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster
- "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients
- "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO
- "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter:

    This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition.

    As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot.

  Mike Rapoport merits a mention here, for his extensive review and testing work.
- "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day
- "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions

* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
  calibrate: update header inclusion
  Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
  vmcoreinfo: track and log recoverable hardware errors
  kho: fix restoring of contiguous ranges of order-0 pages
  kho: kho_restore_vmalloc: fix initialization of pages array
  MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
  init: replace simple_strtoul with kstrtoul to improve lpj_setup
  KHO: fix boot failure due to kmemleak access to non-PRESENT pages
  Documentation/ABI: new kexec and kdump sysfs interface
  Documentation/ABI: mark old kexec sysfs deprecated
  kexec: move sysfs entries to /sys/kernel/kexec
  test_kho: always print restore status
  kho: free chunks using free_page() instead of kfree()
  selftests/liveupdate: add kexec test for multiple and empty sessions
  selftests/liveupdate: add simple kexec-based selftest for LUO
  selftests/liveupdate: add userspace API selftests
  docs: add documentation for memfd preservation via LUO
  mm: memfd_luo: allow preserving memfd
  liveupdate: luo_file: add private argument to store runtime state
  mm: shmem: export some functions to internal.h
  ...
2025-11-19 | ima: Access decompressed kernel module to verify appended signature | Coiby Xu | 1 | -3/+14
Currently, when in-kernel module decompression (CONFIG_MODULE_DECOMPRESS) is enabled, IMA has no way to verify the appended module signature as it can't decompress the module. Define a new kernel_read_file_id enumerator, READING_MODULE_COMPRESSED, so IMA can calculate the compressed kernel module data hash on READING_MODULE_COMPRESSED and defer appraising/measuring it until READING_MODULE, when the module has been decompressed. Before enabling in-kernel module decompression, a kernel module in initramfs can still be loaded with ima_policy=secure_boot. So adjust the kernel module rule in the secure_boot policy to allow either an IMA signature OR an appended signature, i.e. to use "appraise func=MODULE_CHECK appraise_type=imasig|modsig". Reported-by: Karel Srot <ksrot@redhat.com> Suggested-by: Mimi Zohar <zohar@linux.ibm.com> Suggested-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Coiby Xu <coxu@redhat.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2025-11-12 | taint/module: remove unnecessary taint_flag.module field | Petr Pavlu | 1 | -1/+1
The TAINT_RANDSTRUCT and TAINT_FWCTL flags are mistakenly set in the taint_flags table as per-module flags. While this can be trivially corrected, the issue can be avoided altogether by removing the taint_flag.module field. This is possible because, since commit 7fd8329ba502 ("taint/module: Clean up global and module taint flags handling") in 2016, the handling of module taint flags has been fully generic. Specifically, module_flags_taint() can print all flags, and the required output buffer size is properly defined in terms of TAINT_FLAGS_COUNT. The actual per-module flags are always those added to module.taints by calls to add_taint_module(). Link: https://lkml.kernel.org/r/20251022082938.26670-1-petr.pavlu@suse.com Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Petr Mladek <pmladek@suse.com> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Cc: Aaron Tomlin <atomlin@atomlin.com> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-08-05 | Merge tag 'mm-stable-2025-08-03-12-35' of ↵ | Linus Torvalds | 1 | -11/+2
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more MM updates from Andrew Morton:

"Significant patch series in this pull request:

- "mseal cleanups" (Lorenzo Stoakes) Some mseal cleaning with no intended functional change.
- "Optimizations for khugepaged" (David Hildenbrand) Improve khugepaged throughput by batching PTE operations for large folios. This gain is mainly for arm64.
- "x86: enable EXECMEM_ROX_CACHE for ftrace and kprobes" (Mike Rapoport) A bugfix, additional debug code and cleanups to the execmem code.
- "mm/shmem, swap: bugfix and improvement of mTHP swap in" (Kairui Song) Bugfixes, cleanups and performance improvements to the mTHP swapin code"

* tag 'mm-stable-2025-08-03-12-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (38 commits)
  mm: mempool: fix crash in mempool_free() for zero-minimum pools
  mm: correct type for vmalloc vm_flags fields
  mm/shmem, swap: fix major fault counting
  mm/shmem, swap: rework swap entry and index calculation for large swapin
  mm/shmem, swap: simplify swapin path and result handling
  mm/shmem, swap: never use swap cache and readahead for SWP_SYNCHRONOUS_IO
  mm/shmem, swap: tidy up swap entry splitting
  mm/shmem, swap: tidy up THP swapin checks
  mm/shmem, swap: avoid redundant Xarray lookup during swapin
  x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations
  x86/kprobes: enable EXECMEM_ROX_CACHE for kprobes allocations
  execmem: drop writable parameter from execmem_fill_trapping_insns()
  execmem: add fallback for failures in vmalloc(VM_ALLOW_HUGE_VMAP)
  execmem: move execmem_force_rw() and execmem_restore_rox() before use
  execmem: rework execmem_cache_free()
  execmem: introduce execmem_alloc_rw()
  execmem: drop unused execmem_update_copy()
  mm: fix a UAF when vma->mm is freed after vma->vm_refcnt got dropped
  mm/rmap: add anon_vma lifetime debug check
  mm: remove mm/io-mapping.c
  ...
2025-08-02 | execmem: introduce execmem_alloc_rw() | Mike Rapoport (Microsoft) | 1 | -11/+2
Some callers of execmem_alloc() require the memory to be temporarily writable even when it is allocated from the ROX cache. These callers use execmem_make_temp_rw() right after the call to execmem_alloc(). Wrap this sequence in the execmem_alloc_rw() API. Link: https://lkml.kernel.org/r/20250713071730.4117334-3-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-07-31 | module: Remove unnecessary +1 from last_unloaded_module::name size | Petr Pavlu | 1 | -1/+1
The variable last_unloaded_module::name tracks the name of the last unloaded module. It is a string copy of module::name, which is MODULE_NAME_LEN bytes in size and includes the NUL terminator. Therefore, the size of last_unloaded_module::name can also be just MODULE_NAME_LEN, without the need for an extra byte. Fixes: e14af7eeb47e ("debug: track and print last unloaded module in the oops trace") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Link: https://lore.kernel.org/r/20250630143535.267745-3-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
2025-07-31 | module: Prevent silent truncation of module name in delete_module(2) | Petr Pavlu | 1 | -4/+6
Passing a module name longer than MODULE_NAME_LEN to the delete_module syscall results in its silent truncation. This really isn't much of a problem in practice, but it could theoretically lead to the removal of an incorrect module. It is more sensible to return ENAMETOOLONG or ENOENT in such a case. Update the syscall to return ENOENT, as documented in the delete_module(2) man page to mean "No module by that name exists." This is appropriate because a module with a name longer than MODULE_NAME_LEN cannot be loaded in the first place. Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Link: https://lore.kernel.org/r/20250630143535.267745-2-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com>
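The check described above can be sketched in user space as follows. This is only an illustration, not the kernel code: `SKETCH_NAME_LEN` and `sketch_copy_module_name()` are hypothetical names, the value 56 merely stands in for MODULE_NAME_LEN, and rejecting an empty name is an assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Assumption: stands in for the kernel's MODULE_NAME_LEN; 56 is illustrative. */
#define SKETCH_NAME_LEN 56

/*
 * Copy a caller-supplied name into a fixed buffer of SKETCH_NAME_LEN bytes.
 * Returns 0 on success, or -ENOENT if the name is empty or does not fit
 * together with its NUL terminator (instead of silently truncating it).
 */
int sketch_copy_module_name(char dst[SKETCH_NAME_LEN], const char *src)
{
	size_t len = 0;

	/* Scan at most SKETCH_NAME_LEN bytes, looking for the terminator. */
	while (len < SKETCH_NAME_LEN && src[len] != '\0')
		len++;

	/* No terminator within the buffer size: the name cannot be a loaded module. */
	if (len == 0 || len == SKETCH_NAME_LEN)
		return -ENOENT;

	for (size_t i = 0; i <= len; i++)	/* "<=" also copies the NUL */
		dst[i] = src[i];
	return 0;
}
```

A name of SKETCH_NAME_LEN - 1 characters is still accepted; one character more is rejected rather than shortened.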
2025-07-30 | Merge tag 'ftrace-v6.17' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace updates from Steven Rostedt: - Keep track of when fgraph_ops are registered or not Keep accounting of when fgraph_ops are registered as if a fgraph_ops is registered twice it can mess up the accounting and it will not work as expected later. Trigger a warning if something registers it twice as to catch bugs before they are found by things just not working as expected. - Make DYNAMIC_FTRACE always enabled for architectures that support it As static ftrace (where all functions are always traced) is very expensive and only exists to help architectures support ftrace, do not make it an option. As soon as an architecture supports DYNAMIC_FTRACE make it use it. This simplifies the code. - Remove redundant config HAVE_FTRACE_MCOUNT_RECORD The CONFIG_HAVE_FTRACE_MCOUNT was added to help simplify the DYNAMIC_FTRACE work, but now every architecture that implements DYNAMIC_FTRACE also has HAVE_FTRACE_MCOUNT set too, making it redundant with the HAVE_DYNAMIC_FTRACE. - Make pid_ptr string size match the comment In print_graph_proc() the pid_ptr string is of size 11, but the comment says /* sign + log10(MAX_INT) + '\0' */ which is actually 12. * tag 'ftrace-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Remove redundant config HAVE_FTRACE_MCOUNT_RECORD ftrace: Make DYNAMIC_FTRACE always enabled for architectures that support it fgraph: Keep track of when fgraph_ops are registered or not fgraph: Make pid_str size match the comment
2025-07-29 | Merge tag 'sysctl-6.17-rc1' of ↵ | Linus Torvalds | 1 | -1/+29
git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl Pull sysctl updates from Joel Granados: - Move sysctls out of the kern_table array This is the final move of ctl_tables into their respective subsystems. Only 5 (out of the original 50) will remain in kernel/sysctl.c file; these handle either sysctl or common arch variables. By decentralizing sysctl registrations, subsystem maintainers regain control over their sysctl interfaces, improving maintainability and reducing the likelihood of merge conflicts. - docs: Remove false positives from check-sysctl-docs Stopped falsely identifying sysctls as undocumented or unimplemented in the check-sysctl-docs script. This script can now be used to automatically identify if documentation is missing. * tag 'sysctl-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/sysctl/sysctl: (23 commits) docs: Downgrade arm64 & riscv from titles to comment docs: Replace spaces with tabs in check-sysctl-docs docs: Remove colon from ctltable title in vm.rst docs: Add awk section for ucount sysctl entries docs: Use skiplist when checking sysctl admin-guide docs: nixify check-sysctl-docs sysctl: rename kern_table -> sysctl_subsys_table kernel/sys.c: Move overflow{uid,gid} sysctl into kernel/sys.c uevent: mv uevent_helper into kobject_uevent.c sysctl: Removed unused variable sysctl: Nixify sysctl.sh sysctl: Remove superfluous includes from kernel/sysctl.c sysctl: Remove (very) old file changelog sysctl: Move sysctl_panic_on_stackoverflow to kernel/panic.c sysctl: move cad_pid into kernel/pid.c sysctl: Move tainted ctl_table into kernel/panic.c Input: sysrq: mv sysrq into drivers/tty/sysrq.c fork: mv threads-max into kernel/fork.c parisc/power: Move soft-power into power.c mm: move randomize_va_space into memory.c ...
2025-07-28 | Merge tag 'audit-pr-20250725' of ↵ | Linus Torvalds | 1 | -2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit update from Paul Moore: "A single audit patch that restores logging of an audit event in the module load failure case" * tag 'audit-pr-20250725' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit,module: restore audit logging in load failure case
2025-07-23 | module: Move modprobe_path and modules_disabled ctl_tables into the module subsys | Joel Granados | 1 | -1/+29
Move module sysctl (modprobe_path and modules_disabled) out of sysctl.c and into the modules subsystem. Make modules_disabled static as it no longer needs to be exported. Remove module.h from the includes in sysctl as it no longer uses any module exported variables. This is part of a greater effort to move ctl tables into their respective subsystems which will reduce the merge conflicts in kernel/sysctl.c. Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Joel Granados <joel.granados@kernel.org>
2025-07-22 | tracing: Remove redundant config HAVE_FTRACE_MCOUNT_RECORD | Steven Rostedt | 1 | -1/+1
Ftrace is tightly coupled with architecture specific code because it requires the use of trampolines written in assembly. This means that when a new feature or optimization is made, it must be done for all architectures. To simplify the approach, CONFIG_HAVE_FTRACE_* configs are added to denote which architecture has the new enhancement so that other architectures can still function until they too have been updated. CONFIG_HAVE_FTRACE_MCOUNT_RECORD was added to help simplify the DYNAMIC_FTRACE work, but now every architecture that implements DYNAMIC_FTRACE also has HAVE_FTRACE_MCOUNT_RECORD set, making it redundant with HAVE_DYNAMIC_FTRACE. Remove the HAVE_FTRACE_MCOUNT_RECORD config and use DYNAMIC_FTRACE directly where applicable. Link: https://lore.kernel.org/all/20250703154916.48e3ada7@gandalf.local.home/ Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/20250704104838.27a18690@gandalf.local.home Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-07-08 | module: Make sure relocations are applied to the per-CPU section | Sebastian Andrzej Siewior | 1 | -2/+8
The per-CPU data section is handled differently than the other sections. The memory allocation requires a special __percpu pointer and then the section is copied into the view of each CPU. Therefore the SHF_ALLOC flag is removed to ensure move_module() skips it. Later, relocations are applied and apply_relocations() skips sections without SHF_ALLOC because they have not been copied. This also skips the per-CPU data section. The missing relocations result in a NULL pointer on x86-64 and very small values on x86-32. This results in a crash because such a value is not skipped like a NULL pointer would be and can't be dereferenced. Such an assignment happens during static per-CPU lock initialisation with lockdep enabled. Allow relocation processing for the per-CPU section even if SHF_ALLOC is missing. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202506041623.e45e4f7d-lkp@intel.com Fixes: 1a6100caae425 ("Don't relocate non-allocated regions in modules.") #v2.6.1-rc3 Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Link: https://lore.kernel.org/r/20250610163328.URcsSUC1@linutronix.de Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Message-ID: <20250610163328.URcsSUC1@linutronix.de>
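The decision this commit describes can be sketched as a small predicate. It is a user-space illustration under stated assumptions, not the kernel's apply_relocations(): `struct sketch_section`, `sketch_wants_relocation()` and the convention that a per-CPU index of 0 means "no per-CPU section" are hypothetical; only the SHF_ALLOC bit value (0x2) comes from the ELF specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* ELF section flag SHF_ALLOC, redefined locally to keep the sketch portable. */
#define SKETCH_SHF_ALLOC 0x2u

/* Hypothetical, reduced stand-in for an ELF section header. */
struct sketch_section {
	uint64_t flags;
};

/*
 * Decide whether relocations targeting section `idx` should be processed.
 * Sections without SHF_ALLOC are normally skipped because they were never
 * copied into module memory; the per-CPU section is the exception, since
 * its flag was cleared on purpose although its data is in use.
 * Assumption: pcpu_idx == 0 means the module has no per-CPU section.
 */
bool sketch_wants_relocation(const struct sketch_section *secs,
			     unsigned int idx, unsigned int pcpu_idx)
{
	if (secs[idx].flags & SKETCH_SHF_ALLOC)
		return true;

	return pcpu_idx != 0 && idx == pcpu_idx;
}
```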
2025-07-08 | module: Avoid unnecessary return value initialization in move_module() | Petr Pavlu | 1 | -2/+1
All error conditions in move_module() set the return value by updating the ret variable. Therefore, it is not necessary to initialize the variable when declaring it. Remove the unnecessary initialization. Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Link: https://lore.kernel.org/r/20250618122730.51324-3-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Message-ID: <20250618122730.51324-3-petr.pavlu@suse.com>
2025-07-08 | module: Fix memory deallocation on error path in move_module() | Petr Pavlu | 1 | -2/+2
The function move_module() uses the variable t to track how many memory types it has allocated and consequently how many should be freed if an error occurs. The variable is initially set to 0 and is updated when a call to module_memory_alloc() fails. However, move_module() can fail for other reasons as well, in which case t remains set to 0 and no memory is freed. Fix the problem by initializing t to MOD_MEM_NUM_TYPES. Additionally, make the deallocation loop more robust by not relying on the mod_mem_type_t enum having a signed integer as its underlying type. Fixes: c7ee8aebf6c0 ("module: add stop-grap sanity check on module memcpy()") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Daniel Gomez <da.gomez@samsung.com> Link: https://lore.kernel.org/r/20250618122730.51324-2-petr.pavlu@suse.com Signed-off-by: Daniel Gomez <da.gomez@samsung.com> Message-ID: <20250618122730.51324-2-petr.pavlu@suse.com>
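The cleanup pattern described above can be sketched in user space. This is an illustration only: `sketch_load()`, `struct sketch_image` and the `SKETCH_MEM_*` enumerators are hypothetical, malloc()/free() stand in for the module memory allocator, and the two failure switches exist purely to exercise the error path.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's mod_mem_type enum. */
enum sketch_mem_type {
	SKETCH_MEM_TEXT,
	SKETCH_MEM_DATA,
	SKETCH_MEM_RODATA,
	SKETCH_MEM_NUM_TYPES
};

struct sketch_image {
	void *mem[SKETCH_MEM_NUM_TYPES];
};

/*
 * fail_alloc_at: type index whose allocation is simulated to fail (-1: none).
 * late_error:    nonzero simulates a failure after all allocations succeeded.
 * freed:         receives the number of regions released on the error path.
 */
int sketch_load(struct sketch_image *img, int fail_alloc_at, int late_error,
		unsigned int *freed)
{
	/* Default: every type is allocated, so every type must be freed. */
	unsigned int t = SKETCH_MEM_NUM_TYPES;
	int ret;

	*freed = 0;
	for (unsigned int type = 0; type < SKETCH_MEM_NUM_TYPES; type++) {
		img->mem[type] = ((int)type == fail_alloc_at) ? NULL : malloc(16);
		if (!img->mem[type]) {
			t = type;	/* only the types below this one exist */
			ret = -ENOMEM;
			goto out_err;
		}
	}

	if (late_error) {
		ret = -EINVAL;		/* t is still SKETCH_MEM_NUM_TYPES here */
		goto out_err;
	}
	return 0;

out_err:
	/* Counts down with an unsigned index; no "t >= 0" comparison needed. */
	while (t--) {
		free(img->mem[t]);
		img->mem[t] = NULL;
		(*freed)++;
	}
	return ret;
}
```

Had `t` started at 0, the late-error path would release nothing, which is the leak the commit fixes.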
2025-06-16 | audit,module: restore audit logging in load failure case | Richard Guy Briggs | 1 | -2/+4
The move of the module sanity check to earlier skipped the audit logging call in the case of failure and to a place where the previously used context is unavailable. Add an audit logging call for the module loading failure case and get the module name when possible. Link: https://issues.redhat.com/browse/RHEL-52839 Fixes: 02da2cbab452 ("module: move check_modinfo() early to early_mod_check()") Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-06-07 | Merge tag 'kbuild-v6.16' of ↵ | Linus Torvalds | 1 | -2/+87
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add support for the EXPORT_SYMBOL_GPL_FOR_MODULES() macro, which exports a symbol only to specified modules - Improve ABI handling in gendwarfksyms - Forcibly link lib-y objects to vmlinux even if CONFIG_MODULES=n - Add checkers for redundant or missing <linux/export.h> inclusion - Deprecate the extra-y syntax - Fix a genksyms bug when including enum constants from *.symref files * tag 'kbuild-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (28 commits) genksyms: Fix enum consts from a reference affecting new values arch: use always-$(KBUILD_BUILTIN) for vmlinux.lds kbuild: set y instead of 1 to KBUILD_{BUILTIN,MODULES} efi/libstub: use 'targets' instead of extra-y in Makefile module: make __mod_device_table__* symbols static scripts/misc-check: check unnecessary #include <linux/export.h> when W=1 scripts/misc-check: check missing #include <linux/export.h> when W=1 scripts/misc-check: add double-quotes to satisfy shellcheck kbuild: move W=1 check for scripts/misc-check to top-level Makefile scripts/tags.sh: allow to use alternative ctags implementation kconfig: introduce menu type enum docs: symbol-namespaces: fix reST warning with literal block kbuild: link lib-y objects to vmlinux forcibly even when CONFIG_MODULES=n tinyconfig: enable CONFIG_LD_DEAD_CODE_DATA_ELIMINATION docs/core-api/symbol-namespaces: drop table of contents and section numbering modpost: check forbidden MODULE_IMPORT_NS("module:") at compile time kbuild: move kbuild syntax processing to scripts/Makefile.build Makefile: remove dependency on archscripts for header installation Documentation/kbuild: Add new gendwarfksyms kABI rules Documentation/kbuild: Drop section numbers ...
2025-06-05 | alloc_tag: handle module codetag load errors as module load failures | Suren Baghdasaryan | 1 | -2/+3
Failures inside codetag_load_module() are currently ignored. As a result an error there would not cause a module load failure and freeing of the associated resources. Correct this behavior by propagating the error code to the caller and handling possible errors. With this change, a failure to allocate percpu counters, which happens at this stage, will not be ignored and will cause a module load failure and freeing of resources. With this change we also do not need to disable memory allocation profiling when this error happens; instead we fail to load the module. Link: https://lkml.kernel.org/r/20250521160602.1940771-1-surenb@google.com Fixes: 10075262888b ("alloc_tag: allocate percpu counters for module tags dynamically") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Reported-by: Casey Chen <cachen@purestorage.com> Closes: https://lore.kernel.org/all/20250520231620.15259-1-cachen@purestorage.com/ Cc: Daniel Gomez <da.gomez@samsung.com> Cc: David Wang <00107082@163.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-06-02 | Merge tag 'modules-6.16-rc1' of ↵ | Linus Torvalds | 1 | -20/+7
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull module updates from Petr Pavlu: - Make .static_call_sites in modules read-only after init The .static_call_sites sections in modules have been made read-only after init to avoid any (non-)accidental modifications, similarly to how they are read-only after init in vmlinux - The rest are minor cleanups * tag 'modules-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: module: Remove outdated comment about text_size module: Make .static_call_sites read-only after init module: Add a separate function to mark sections as read-only after init module: Constify parameters of module_enforce_rwx_sections()
2025-05-25 | module: Account for the build time module name mangling | Peter Zijlstra | 1 | -1/+25
Sean noted that scripts/Makefile.lib:name-fix-token rule will mangle the module name with s/-/_/g. Since this happens late in the build, only the kernel needs to bother with this, the modpost tool still sees the original name. Reported-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Tested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
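One way to account for the s/-/_/g mangling at comparison time is a strncmp()-like helper that treats '-' and '_' as the same character. The following is a user-space sketch of that idea; the names `sketch_fix_token()` and `sketch_mod_strncmp()` are hypothetical and the kernel's actual helper may differ in detail.

```c
#include <stddef.h>

/* Apply the name-fix-token rule to a single character: '-' becomes '_'. */
static char sketch_fix_token(char c)
{
	return c == '-' ? '_' : c;
}

/*
 * Compare at most n characters of a and b as strncmp() does, except that
 * '-' and '_' compare equal. Returns 0 when the compared prefixes match,
 * and a negative or positive value otherwise.
 */
int sketch_mod_strncmp(const char *a, const char *b, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		char ca = sketch_fix_token(a[i]);
		char cb = sketch_fix_token(b[i]);

		if (ca != cb)
			return (unsigned char)ca - (unsigned char)cb;
		if (ca == '\0')
			break;		/* both strings ended together */
	}
	return 0;
}
```

With such a helper, a namespace written as "kvm-intel" in the source still matches a module whose built name is "kvm_intel".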
2025-05-25 | module: Extend the module namespace parsing | Peter Zijlstra | 1 | -2/+34
Instead of only accepting "module:${name}", extend it with a comma separated list of module names and add tail glob support. That is, something like: "module:foo-*,bar" is now possible. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
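The parsing rule described above (a "module:" prefix, a comma separated list, and an optional trailing '*' per entry) can be sketched in user space as follows. `sketch_namespace_allows()` is a hypothetical name; the sketch compares names with plain strncmp() and is not the kernel's implementation.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if namespace string `ns` (e.g. "module:foo-*,bar") names
 * `modname`, either exactly or through an entry ending in '*'.
 */
bool sketch_namespace_allows(const char *ns, const char *modname)
{
	static const char prefix[] = "module:";
	size_t modlen = strlen(modname);
	const char *entry;

	if (strncmp(ns, prefix, sizeof(prefix) - 1) != 0)
		return false;

	entry = ns + sizeof(prefix) - 1;
	while (*entry) {
		size_t len = strcspn(entry, ",");	/* length up to ',' or end */
		const char *next = entry + len;
		bool glob = false;

		/* A trailing '*' turns the entry into a prefix match. */
		if (len > 0 && entry[len - 1] == '*') {
			len--;
			glob = true;
		}

		if (strncmp(entry, modname, len) == 0 && (glob || len == modlen))
			return true;

		if (*next == ',')
			next++;				/* step over the separator */
		entry = next;
	}
	return false;
}
```

For "module:foo-*,bar", the names "foo-x" and "bar" are accepted, while "barn" and "food" are not.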
2025-05-25 | module: Add module specific symbol namespace support | Peter Zijlstra | 1 | -2/+31
Designate the "module:${modname}" symbol namespace to mean: 'only export to the named module'. Notably, explicit imports of anything in the "module:" space is forbidden. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-05-25 | module: release codetag section when module load fails | David Wang | 1 | -0/+1
When module load fails after memory for codetag section is ready, codetag section memory will not be properly released. This causes memory leak, and if next module load happens to get the same module address, codetag may pick the uninitialized section when manipulating tags during module unload, and leads to "unable to handle page fault" BUG. Link: https://lkml.kernel.org/r/20250519163823.7540-1-00107082@163.com Fixes: 0db6f8d7820a ("alloc_tag: load module tags into separate contiguous memory") Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/ Signed-off-by: David Wang <00107082@163.com> Acked-by: Suren Baghdasaryan <surenb@google.com> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-18 | module: Remove outdated comment about text_size | Valentin Schneider | 1 | -5/+4
The text_size bit referred to by the comment has been removed as of commit ac3b43283923 ("module: replace module_layout with module_memory") and is thus no longer relevant. Remove it and comment about the contents of the masks array instead. Signed-off-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20250429113242.998312-23-vschneid@redhat.com Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-05-18 | module: Add a separate function to mark sections as read-only after init | Petr Pavlu | 1 | -15/+3
Move the logic to mark special sections as read-only after module initialization into a separate function, alongside other related code in strict_rwx.c. Use a table with names of such sections to make it easier to add more. Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20250306131430.7016-3-petr.pavlu@suse.com Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-31 | Merge tag 'trace-ringbuffer-v6.15-2' of ↵ | Linus Torvalds | 1 | -0/+13
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer updates from Steven Rostedt: - Restructure the persistent memory to have a "scratch" area Instead of hard coding the KASLR offset in the persistent memory by the ring buffer, push that work up to the callers of the persistent memory as they are the ones that need this information. The offsets and such is not important to the ring buffer logic and it should not be part of that. A scratch pad is now created when the caller allocates a ring buffer from persistent memory by stating how much memory it needs to save. - Allow where modules are loaded to be saved in the new scratch pad Save the addresses of modules when they are loaded into the persistent memory scratch pad. - A new module_for_each_mod() helper function was created With the acknowledgement of the module maintainers a new module helper function was created to iterate over all the currently loaded modules. This has a callback to be called for each module. This is needed for when tracing is started in the persistent buffer and the currently loaded modules need to be saved in the scratch area. - Expose the last boot information where the kernel and modules were loaded The last_boot_info file is updated to print out the addresses of where the kernel "_text" location was loaded from a previous boot, as well as where the modules are loaded. If the buffer is recording the current boot, it only prints "# Current" so that it does not expose the KASLR offset of the currently running kernel. - Allow the persistent ring buffer to be released (freed) To have this in production environments, where the kernel command line can not be changed easily, the ring buffer needs to be freed when it is not going to be used. The memory for the buffer will always be allocated at boot up, but if the system isn't going to enable tracing, the memory needs to be freed. Allow it to be freed and added back to the kernel memory pool. 
- Allow stack traces to print the function names in the persistent buffer Now that the modules are saved in the persistent ring buffer, if the same modules are loaded, the printing of the function names will examine the saved modules. If the module is found in the scratch area and is also loaded, then it will do the offset shift and use kallsyms to display the function name. If the address is not found, it simply displays the address from the previous boot in hex. * tag 'trace-ringbuffer-v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Use _text and the kernel offset in last_boot_info tracing: Show last module text symbols in the stacktrace ring-buffer: Remove the unused variable bmeta tracing: Skip update_last_data() if cleared and remove active check for save_mod() tracing: Initialize scratch_size to zero to prevent UB tracing: Fix a compilation error without CONFIG_MODULES tracing: Freeable reserved ring buffer mm/memblock: Add reserved memory release function tracing: Update modules to persistent instances when loaded tracing: Show module names and addresses of last boot tracing: Have persistent trace instances save module addresses module: Add module_for_each_mod() function tracing: Have persistent trace instances save KASLR offset ring-buffer: Add ring_buffer_meta_scratch() ring-buffer: Add buffer meta data for persistent ring buffer ring-buffer: Use kaslr address instead of text delta ring-buffer: Fix bytes_dropped calculation issue
2025-03-30 | Merge tag 'modules-6.15-rc1' of ↵ | Linus Torvalds | 1 | -70/+39
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull modules updates from Petr Pavlu: - Use RCU instead of RCU-sched The mix of rcu_read_lock(), rcu_read_lock_sched() and preempt_disable() in the module code and its users has been replaced with just rcu_read_lock() - The rest of changes are smaller fixes and updates * tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: (32 commits) MAINTAINERS: Update the MODULE SUPPORT section module: Remove unnecessary size argument when calling strscpy() module: Replace deprecated strncpy() with strscpy() params: Annotate struct module_param_attrs with __counted_by() bug: Use RCU instead RCU-sched to protect module_bug_list. static_call: Use RCU in all users of __module_text_address(). kprobes: Use RCU in all users of __module_text_address(). bpf: Use RCU in all users of __module_text_address(). jump_label: Use RCU in all users of __module_text_address(). jump_label: Use RCU in all users of __module_address(). x86: Use RCU in all users of __module_address(). cfi: Use RCU while invoking __module_address(). powerpc/ftrace: Use RCU in all users of __module_text_address(). LoongArch: ftrace: Use RCU in all users of __module_text_address(). LoongArch/orc: Use RCU in all users of __module_address(). arm64: module: Use RCU in all users of __module_text_address(). ARM: module: Use RCU in all users of __module_text_address(). module: Use RCU in all users of __module_text_address(). module: Use RCU in all users of __module_address(). module: Use RCU in search_module_extables(). ...
2025-03-28module: Add module_for_each_mod() functionSteven Rostedt1-0/+13
The tracing system needs a way to save all the currently loaded modules and their addresses into persistent memory so that it can evaluate the addresses on a reboot from a crash. When the persistent memory trace starts, it will load the module addresses and names into the persistent memory. To do so, it will call the module_for_each_mod() function and pass it a function and data structure to get called on each loaded module. Then it can record the memory. This only implements that function. Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: linux-modules@vger.kernel.org Link: https://lore.kernel.org/20250305164608.962615966@goodmis.org Acked-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
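The callback-plus-data pattern described above can be modelled in userspace roughly as follows. This is only a sketch: struct fake_module, fake_for_each_mod() and record_module() are hypothetical stand-ins, not the kernel's definitions, and the rule that a nonzero return stops the walk is an assumption made for the illustration. The real function also has to take care of locking, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a loaded module; not the kernel's struct module. */
struct fake_module {
	const char *name;
	unsigned long base;
	int formed;                     /* 0 models an entry that is not usable yet */
	struct fake_module *next;
};

/* Callback type. Returning nonzero stops the walk (an assumption of this sketch). */
typedef int (*mod_cb)(struct fake_module *mod, void *data);

/* Call func(mod, data) for every usable entry on the list. */
static void fake_for_each_mod(struct fake_module *head, mod_cb func, void *data)
{
	assert(func != NULL);
	for (struct fake_module *mod = head; mod; mod = mod->next) {
		if (!mod->formed)
			continue;
		if (func(mod, data))
			break;
	}
}

/* Caller-owned storage standing in for the persistent memory area. */
struct mod_record {
	char name[32];
	unsigned long base;
};

struct mod_table {
	struct mod_record rec[8];
	size_t count;
};

/* Example callback: record the name and base address of one module. */
static int record_module(struct fake_module *mod, void *data)
{
	struct mod_table *tbl = data;
	struct mod_record *r;

	if (tbl->count == sizeof(tbl->rec) / sizeof(tbl->rec[0]))
		return 1;               /* table full, stop the walk */
	r = &tbl->rec[tbl->count++];
	snprintf(r->name, sizeof(r->name), "%s", mod->name);
	r->base = mod->base;
	return 0;
}
```

A caller would zero-initialise a struct mod_table and pass its address as the data argument: fake_for_each_mod(head, record_module, &tbl). The iterator itself knows nothing about what is being recorded.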
2025-03-10module: Remove unnecessary size argument when calling strscpy()Thorsten Blum1-2/+2
The size parameter is optional and strscpy() automatically determines the length of the destination buffer using sizeof() if the argument is omitted. This makes the explicit sizeof() unnecessary. Remove it to shorten and simplify the code. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250308194631.191670-2-thorsten.blum@linux.dev Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
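The idea of deriving the size from the destination with sizeof() can be sketched in userspace as below. bounded_copy() and bounded_copy_array() are hypothetical names, not the kernel's strscpy(). The sketch uses two separate names instead of one macro with an optional argument, and returning -1 on truncation is an assumption of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Bounded copy that always NUL-terminates a non-empty destination.
 * Returns the number of characters copied, or -1 if src was truncated.
 * Hypothetical helper for illustration only.
 */
static long bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -1;
	len = strlen(src);
	if (len >= size) {
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -1;
	}
	memcpy(dst, src, len + 1);      /* includes the terminating NUL */
	return (long)len;
}

/*
 * Two-argument form. Only meaningful when dst is a real array, because
 * sizeof(dst) on a pointer would yield the pointer size, not the buffer size.
 */
#define bounded_copy_array(dst, src) bounded_copy((dst), (src), sizeof(dst))
```

With a destination declared as char name[8], bounded_copy_array(name, src) behaves like bounded_copy(name, src, sizeof(name)), which is why an explicit sizeof() at the call site adds nothing.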
2025-03-10module: Replace deprecated strncpy() with strscpy()Thorsten Blum1-1/+1
strncpy() is deprecated for NUL-terminated destination buffers; use strscpy() instead. The destination buffer ownername is only used with "%s" format strings and must therefore be NUL-terminated, but not NUL-padded. No functional changes intended. Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250307113546.112237-2-thorsten.blum@linux.dev Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
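Why strncpy() is a poor fit for a buffer later printed with "%s" can be shown with a small userspace check. The helper names below are hypothetical, and the 4-byte buffer is chosen only to keep the example short.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if buf contains a NUL byte within its first size bytes. */
static int is_terminated(const char *buf, size_t size)
{
	return memchr(buf, '\0', size) != NULL;
}

/*
 * Copy src into a 4-byte buffer with strncpy() and report whether the
 * result is a valid C string. strncpy() writes no terminator when src
 * has 4 or more characters.
 */
static int strncpy_terminates(const char *src)
{
	char buf[4];

	memset(buf, 'x', sizeof(buf));  /* make a missing NUL observable */
	strncpy(buf, src, sizeof(buf));
	return is_terminated(buf, sizeof(buf));
}
```

A source shorter than the buffer is terminated (and padded with NULs to the full size), while a source of the same length or longer leaves the buffer without a terminator.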
2025-03-10module: Use RCU in all users of __module_text_address().Sebastian Andrzej Siewior1-11/+5
__module_text_address() can be invoked within an RCU section; there is no requirement to have preemption disabled. Replace the preempt_disable() section around __module_text_address() with RCU. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-16-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Use RCU in all users of __module_address().Sebastian Andrzej Siewior1-7/+2
__module_address() can be invoked within an RCU section; there is no requirement to have preemption disabled. Replace the preempt_disable() section around __module_address() with RCU. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-15-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Use RCU in search_module_extables().Sebastian Andrzej Siewior1-14/+9
search_module_extables() returns an exception_table_entry belonging to a module. The lookup via __module_address() can be performed with RCU protection. The returned exception_table_entry remains valid because the passed address usually belongs to a module that is currently executed. So the module can not be removed because "something else" holds a reference to it, ensuring that it can not be removed. Exceptions here are: - kprobe, acquires a reference on the module beforehand - MCE, invokes the function from within a timer and the RCU lifetime guarantees (of the timer) are sufficient. Therefore it is safe to return the exception_table_entry outside the RCU section which provided the module. Use RCU for the lookup in search_module_extables() and update the comment. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-14-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Allow __module_address() to be called from RCU section.Sebastian Andrzej Siewior1-3/+1
mod_find() uses either the modules list to find a module or a tree lookup (CONFIG_MODULES_TREE_LOOKUP). The list and the tree can both be iterated under RCU assumption (as well as RCU-sched). Remove module_assert_mutex_or_preempt() from __module_address(), and remove the helper entirely since __module_address() was its last user. Update comments. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-13-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Use RCU in __is_module_percpu_address().Sebastian Andrzej Siewior1-5/+1
The modules list can be accessed under RCU assumption. Use RCU protection instead of preempt_disable(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-12-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Use RCU in find_symbol().Sebastian Andrzej Siewior1-18/+12
module_assert_mutex_or_preempt() is not needed in find_symbol(). The function checks for RCU-sched or the module_mutex to be acquired. The list_for_each_entry_rcu() below does the same check. Remove module_assert_mutex_or_preempt() from try_add_tainted_module(). Use RCU protection to invoke find_symbol() and update callers. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-11-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Use RCU in find_module_all().Sebastian Andrzej Siewior1-4/+2
The modules list and module::kallsyms can be accessed under RCU assumption. Remove module_assert_mutex_or_preempt() from find_module_all() so it can be used under RCU protection without warnings. Update its callers to use RCU protection instead of preempt_disable(). Cc: Jiri Kosina <jikos@kernel.org> Cc: Joe Lawrence <joe.lawrence@redhat.com> Cc: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Miroslav Benes <mbenes@suse.cz> Cc: Petr Mladek <pmladek@suse.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: linux-trace-kernel@vger.kernel.org Cc: live-patching@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20250108090457.512198-7-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10module: Begin to move from RCU-sched to RCU.Sebastian Andrzej Siewior1-5/+4
The RCU usage in module was introduced in commit d72b37513cdfb ("Remove stop_machine during module load v2") and it claimed not to be RCU but similar. Then there was another improvement in commit e91defa26c527 ("module: don't use stop_machine on module load"). It became a mix of RCU and RCU-sched and was eventually fixed in 0be964be0d450 ("module: Sanitize RCU usage and locking"). Later RCU & RCU-sched were merged in commit cb2f55369d3a9 ("modules: Replace synchronize_sched() and call_rcu_sched()") so that was aligned. Looking at it today, there are still leftovers. The preempt_disable() was used instead of rcu_read_lock_sched(). The RCU & RCU-sched merge was not complete as there is still rcu_dereference_sched() for module::kallsyms. The RCU-list modules and unloaded_tainted_modules are always accessed under RCU protection or the module_mutex. The modules list iteration can always happen safely because the module will not disappear. Once the module is removed (free_module()) then after removing the module from the list, there is a synchronize_rcu() which waits until every RCU reader has left the section. That means iterating over the list within an RCU-read section is enough; there is no need to disable preemption. module::kallsyms is first assigned in add_kallsyms() before the module is added to the list. At this point, it points to init data. This pointer is later updated and before the init code is removed there is also synchronize_rcu() in do_free_init(). That means an RCU read lock is enough for protection and rcu_dereference() can be safely used. Convert module code and its users step by step. Update comments and convert print_modules() to use RCU. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250108090457.512198-3-bigeasy@linutronix.de Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-02-14module: don't annotate ROX memory as kmemleak_not_leak()Mike Rapoport (Microsoft)1-1/+2
The ROX memory allocations are part of a larger vmalloc allocation and annotating them with kmemleak_not_leak() confuses kmemleak. Skip kmemleak_not_leak() annotations for the ROX areas. Fixes: c287c0723329 ("module: switch to execmem API for remapping as RW and restoring ROX") Fixes: 64f6a4e10c05 ("x86: re-enable EXECMEM_ROX support") Reported-by: "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20250214084531.3299390-1-rppt@kernel.org
2025-02-03module: switch to execmem API for remapping as RW and restoring ROXMike Rapoport (Microsoft)1-57/+21
Instead of using a writable copy for module text sections, temporarily remap the memory allocated from execmem's ROX cache as writable and restore its ROX permissions after the module is formed. This will allow removing nasty games with the writable copy in alternatives patching on x86. Signed-off-by: "Mike Rapoport (Microsoft)" <rppt@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250126074733.1384926-7-rppt@kernel.org
2025-01-31Merge tag 'kbuild-v6.14' of ↵Linus Torvalds1-9/+85
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Support multiple hook locations for maint scripts of Debian package - Remove 'cpio' from the build tool requirement - Introduce gendwarfksyms tool, which computes CRCs for export symbols based on the DWARF information - Support CONFIG_MODVERSIONS for Rust - Resolve all conflicts in the genksyms parser - Fix several syntax errors in genksyms * tag 'kbuild-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (64 commits) kbuild: fix Clang LTO with CONFIG_OBJTOOL=n kbuild: Strip runtime const RELA sections correctly kconfig: fix memory leak in sym_warn_unmet_dep() kconfig: fix file name in warnings when loading KCONFIG_DEFCONFIG_LIST genksyms: fix syntax error for attribute before init-declarator genksyms: fix syntax error for builtin (u)int*x*_t types genksyms: fix syntax error for attribute after 'union' genksyms: fix syntax error for attribute after 'struct' genksyms: fix syntax error for attribute after abstact_declarator genksyms: fix syntax error for attribute before nested_declarator genksyms: fix syntax error for attribute before abstract_declarator genksyms: decouple ATTRIBUTE_PHRASE from type