| Age | Commit message (Collapse) | Author | Files | Lines |
|
reorder
Commit 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved
memory regions are processed") changed the processing order of reserved
memory regions, causing elfcorehdr to overlap with dynamically allocated
reserved memory regions during kdump kernel boot.
The issue occurs because:
1. kexec-tools allocates elfcorehdr in the last crashkernel reserved
memory region and passes it to the second kernel
2. The problematic commit moved dynamic reserved memory allocation
(like bman-fbpr) to occur during fdt_scan_reserved_mem(), before
elfcorehdr reservation in fdt_reserve_elfcorehdr()
3. bman-fbpr with 16MB alignment requirement can get allocated at
addresses that overlap with the elfcorehdr location
4. When fdt_reserve_elfcorehdr() tries to reserve elfcorehdr memory,
overlap detection identifies the conflict and skips reservation
5. kdump kernel fails with "Unable to handle kernel paging request"
because elfcorehdr memory is not properly reserved
The boot log:
Before 8a6e02d0c00e:
OF: fdt: Reserving 1 KiB of memory at 0xf4fff000 for elfcorehdr
OF: reserved mem: 0xf3000000..0xf3ffffff bman-fbpr
After 8a6e02d0c00e:
OF: reserved mem: 0xf4000000..0xf4ffffff bman-fbpr
OF: fdt: elfcorehdr is overlapped
Fix this by ensuring elfcorehdr reservation occurs before dynamic
reserved memory allocation.
Fixes: 8a6e02d0c00e ("of: reserved_mem: Restructure how the reserved memory regions are processed")
Signed-off-by: Jianpeng Chang <jianpeng.chang.cn@windriver.com>
Link: https://patch.msgid.link/20251205015934.700016-1-jianpeng.chang.cn@windriver.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Use the existing helper functions to simplify the logic of
early_init_dt_scan_memory()
Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev>
Link: https://patch.msgid.link/20251115134753.179931-6-yuntao.wang@linux.dev
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
When reading the fdt_size value, the argument passed to dt_mem_next_cell()
is dt_root_addr_cells, but it should be dt_root_size_cells.
The same issue occurs when reading the scratch_size value.
Use a helper function to simplify the code and fix these issues.
Fixes: 274cdcb1c004 ("arm64: add KHO support")
Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev>
Link: https://patch.msgid.link/20251115134753.179931-5-yuntao.wang@linux.dev
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The len value is in bytes, while `dt_root_addr_cells + dt_root_size_cells`
is in cells (4 bytes per cell). Modulo calculation between them is
incorrect, the units must be converted first.
Use helper functions to simplify the code and fix this issue.
Fixes: fb319e77a0e7 ("of: fdt: Add memory for devices by DT property "linux,usable-memory-range"")
Fixes: 2af2b50acf9b9c38 ("of: fdt: Add generic support for handling usable memory range property")
Fixes: 8f579b1c4e347b23 ("arm64: limit memory regions based on DT property, usable-memory-range")
Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev>
Link: https://patch.msgid.link/20251115134753.179931-4-yuntao.wang@linux.dev
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The len value is in bytes, while `dt_root_addr_cells + dt_root_size_cells`
is in cells (4 bytes per cell). Comparing them directly is incorrect.
Use a helper function to simplify the code and address this issue.
Fixes: f7e7ce93aac1 ("of: fdt: Add generic support for handling elf core headers property")
Fixes: e62aaeac426ab1dd ("arm64: kdump: provide /proc/vmcore file")
Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev>
Link: https://patch.msgid.link/20251115134753.179931-3-yuntao.wang@linux.dev
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Currently, there are many pieces of nearly identical code scattered across
different places. Consolidate the duplicate code into helper functions to
improve maintainability and reduce the likelihood of errors.
Signed-off-by: Yuntao Wang <yuntao.wang@linux.dev>
Link: https://patch.msgid.link/20251115134753.179931-2-yuntao.wang@linux.dev
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
We now have all bits in place to support KHO kexecs. Add awareness of KHO
in the kexec file as well as boot path for arm64 and adds the respective
kconfig option to the architecture so that it can use KHO successfully.
Changes to the "chosen" node have been sent to
https://github.com/devicetree-org/dt-schema/pull/158.
Link: https://lkml.kernel.org/r/20250509074635.3187114-10-changyuanl@google.com
Signed-off-by: Alexander Graf <graf@amazon.com>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Co-developed-by: Changyuan Lyu <changyuanl@google.com>
Signed-off-by: Changyuan Lyu <changyuanl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anthony Yznaga <anthony.yznaga@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ashish Kalra <ashish.kalra@amd.com>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Gowans <jgowans@amazon.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Marc Rutland <mark.rutland@arm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pratyush Yadav <ptyadav@amazon.de>
Cc: Rob Herring <robh@kernel.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Thomas Lendacky <thomas.lendacky@amd.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
"The various patchsets are summarized below. Plus of course many
indivudual patches which are described in their changelogs.
- "Allocate and free frozen pages" from Matthew Wilcox reorganizes
the page allocator so we end up with the ability to allocate and
free zero-refcount pages. So that callers (ie, slab) can avoid a
refcount inc & dec
- "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to
use large folios other than PMD-sized ones
- "Fix mm/rodata_test" from Petr Tesarik performs some maintenance
and fixes for this small built-in kernel selftest
- "mas_anode_descend() related cleanup" from Wei Yang tidies up part
of the mapletree code
- "mm: fix format issues and param types" from Keren Sun implements a
few minor code cleanups
- "simplify split calculation" from Wei Yang provides a few fixes and
a test for the mapletree code
- "mm/vma: make more mmap logic userland testable" from Lorenzo
Stoakes continues the work of moving vma-related code into the
(relatively) new mm/vma.c
- "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David
Hildenbrand cleans up and rationalizes handling of gfp flags in the
page allocator
- "readahead: Reintroduce fix for improper RA window sizing" from Jan
Kara is a second attempt at fixing a readahead window sizing issue.
It should reduce the amount of unnecessary reading
- "synchronously scan and reclaim empty user PTE pages" from Qi Zheng
addresses an issue where "huge" amounts of pte pagetables are
accumulated:
https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/
Qi's series addresses this windup by synchronously freeing PTE
memory within the context of madvise(MADV_DONTNEED)
- "selftest/mm: Remove warnings found by adding compiler flags" from
Muhammad Usama Anjum fixes some build warnings in the selftests
code when optional compiler warnings are enabled
- "mm: don't use __GFP_HARDWALL when migrating remote pages" from
David Hildenbrand tightens the allocator's observance of
__GFP_HARDWALL
- "pkeys kselftests improvements" from Kevin Brodsky implements
various fixes and cleanups in the MM selftests code, mainly
pertaining to the pkeys tests
- "mm/damon: add sample modules" from SeongJae Park enhances DAMON to
estimate application working set size
- "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn
provides some cleanups to memcg's hugetlb charging logic
- "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song
removes the global swap cgroup lock. A speedup of 10% for a
tmpfs-based kernel build was demonstrated
- "zram: split page type read/write handling" from Sergey Senozhatsky
has several fixes and cleaups for zram in the area of
zram_write_page(). A watchdog softlockup warning was eliminated
- "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin
Brodsky cleans up the pagetable destructor implementations. A rare
use-after-free race is fixed
- "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes
simplifies and cleans up the debugging code in the VMA merging
logic
- "Account page tables at all levels" from Kevin Brodsky cleans up
and regularizes the pagetable ctor/dtor handling. This results in
improvements in accounting accuracy
- "mm/damon: replace most damon_callback usages in sysfs with new
core functions" from SeongJae Park cleans up and generalizes
DAMON's sysfs file interface logic
- "mm/damon: enable page level properties based monitoring" from
SeongJae Park increases the amount of information which is
presented in response to DAMOS actions
- "mm/damon: remove DAMON debugfs interface" from SeongJae Park
removes DAMON's long-deprecated debugfs interfaces. Thus the
migration to sysfs is completed
- "mm/hugetlb: Refactor hugetlb allocation resv accounting" from
Peter Xu cleans up and generalizes the hugetlb reservation
accounting
- "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino
removes a never-used feature of the alloc_pages_bulk() interface
- "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park
extends DAMOS filters to support not only exclusion (rejecting),
but also inclusion (allowing) behavior
- "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi
introduces a new memory descriptor for zswap.zpool that currently
overlaps with struct page for now. This is part of the effort to
reduce the size of struct page and to enable dynamic allocation of
memory descriptors
- "mm, swap: rework of swap allocator locks" from Kairui Song redoes
and simplifies the swap allocator locking. A speedup of 400% was
demonstrated for one workload. As was a 35% reduction for kernel
build time with swap-on-zram
- "mm: update mips to use do_mmap(), make mmap_region() internal"
from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that
mmap_region() can be made MM-internal
- "mm/mglru: performance optimizations" from Yu Zhao fixes a few
MGLRU regressions and otherwise improves MGLRU performance
- "Docs/mm/damon: add tuning guide and misc updates" from SeongJae
Park updates DAMON documentation
- "Cleanup for memfd_create()" from Isaac Manjarres does that thing
- "mm: hugetlb+THP folio and migration cleanups" from David
Hildenbrand provides various cleanups in the areas of hugetlb
folios, THP folios and migration
- "Uncached buffered IO" from Jens Axboe implements the new
RWF_DONTCACHE flag which provides synchronous dropbehind for
pagecache reading and writing. To permite userspace to address
issues with massive buildup of useless pagecache when
reading/writing fast devices
- "selftests/mm: virtual_address_range: Reduce memory" from Thomas
Weißschuh fixes and optimizes some of the MM selftests"
* tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
mm/compaction: fix UBSAN shift-out-of-bounds warning
s390/mm: add missing ctor/dtor on page table upgrade
kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags()
tools: add VM_WARN_ON_VMG definition
mm/damon/core: use str_high_low() helper in damos_wmark_wait_us()
seqlock: add missing parameter documentation for raw_seqcount_try_begin()
mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh
mm/page_alloc: remove the incorrect and misleading comment
zram: remove zcomp_stream_put() from write_incompressible_page()
mm: separate move/undo parts from migrate_pages_batch()
mm/kfence: use str_write_read() helper in get_access_type()
selftests/mm/mkdirty: fix memory leak in test_uffdio_copy()
kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags()
selftests/mm: virtual_address_range: avoid reading from VM_IO mappings
selftests/mm: vm_util: split up /proc/self/smaps parsing
selftests/mm: virtual_address_range: unmap chunks after validation
selftests/mm: virtual_address_range: mmap() without PROT_WRITE
selftests/memfd/memfd_test: fix possible NULL pointer dereference
mm: add FGP_DONTCACHE folio creation flag
mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue
...
|
|
Before SLUB initialization, various subsystems used memblock_alloc to
allocate memory. In most cases, when memory allocation fails, an
immediate panic is required. To simplify this behavior and reduce
repetitive checks, introduce `memblock_alloc_or_panic`. This function
ensures that memory allocation failures result in a panic automatically,
improving code readability and consistency across subsystems that require
this behavior.
[guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic]
Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com
Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/
Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com
Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390]
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
There are cases when the bootloader provides information to the kernel
in both ACPI and DTB, not interchangeably. One such use case is virtual
machines in Android. When running on x86, the Android Virtualization
Framework (AVF) boots VMs with ACPI like it is usually done on x86 (i.e.
the virtual LAPIC, IOAPIC, HPET, PCI MMCONFIG etc are described in ACPI)
but also passes various AVF-specific boot parameters in DTB. This allows
reusing the same implementations of various AVF components on both
arm64 and x86.
Commit 7b937cc243e5 ("of: Create of_root if no dtb provided by firmware")
removed the possibility to do that, since among other things
it introduced forcing emptying the bootloader-provided DTB if ACPI is
enabled (probably assuming that if ACPI is available, a DTB can only be
useful for applying overlays to it afterwards, for testing purposes).
So restore this possibility. Instead of completely preventing using ACPI
and DT together, rely on arch-specific setup code to prevent using both
to set up the same things (see various acpi_disabled checks under arch/).
Fixes: 7b937cc243e5 ("of: Create of_root if no dtb provided by firmware")
Signed-off-by: Dmytro Maluka <dmaluka@chromium.org>
Link: https://lore.kernel.org/r/20250105172741.3476758-3-dmaluka@chromium.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
early_init_fdt_scan_reserved_mem() invoks fdt_get_mem_rsv(), and it will
use uninitialized variables @base and @size once the callee suffers error.
Fix by checking fdt_get_mem_rsv() error as other callers do.
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250109-of_core_fix-v4-13-db8a72415b8c@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
The usage of the macro allows to remove the custom handler function,
saving some memory. Additionally the code is easier to read.
While at it also mark the attribute as __ro_after_init, as the only
modification happens in the __init phase.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20241122-sysfs-const-bin_attr-of-v1-1-7052f9dcd4be@weissschuh.net
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
While OpenFirmware originally allowed walking parent nodes and default
root values for #address-cells and #size-cells, FDT has long required
explicit values. It's been a warning in dtc for the root node since the
beginning (2005) and for any parent node since 2007. Of course, not all
FDT uses dtc, but that should be the majority by far. The various
extracted OF devicetrees I have dating back to the 1990s (various
PowerMac, OLPC, PASemi Nemo) all have explicit root node properties. The
warning is disabled for Sparc as there are known systems relying on
default root node values.
Link: https://lore.kernel.org/r/20241106171028.3830266-1-robh@kernel.org
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
__pa() is only intended to be used for linear map addresses and using
it for initial_boot_params which is in fixmap for arm64 will give an
incorrect value. Hence save the physical address when it is known at
boot time when calling early_init_dt_scan for arm64 and use it at kexec
time instead of converting the virtual address using __pa().
Note that arm64 doesn't need the FDT region reserved in the DT as the
kernel explicitly reserves the passed in FDT. Therefore, only a debug
warning is fixed with this change.
Reported-by: Breno Leitao <leitao@debian.org>
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Fixes: ac10be5cdbfa ("arm64: Use common of_kexec_alloc_and_setup_fdt()")
Link: https://lore.kernel.org/r/20241023171426.452688-1-usamaarif642@gmail.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Reserved memory regions defined in the devicetree can be broken up into
two groups:
i) Statically-placed reserved memory regions
i.e. regions defined with a static start address and size using the
"reg" property.
ii) Dynamically-placed reserved memory regions.
i.e. regions defined by specifying an address range where they can be
placed in memory using the "alloc_ranges" and "size" properties.
These regions are processed and set aside at boot time.
This is done in two stages as seen below:
Stage 1:
At this stage, fdt_scan_reserved_mem() scans through the child nodes of
the reserved_memory node using the flattened devicetree and does the
following:
1) If the node represents a statically-placed reserved memory region,
i.e. if it is defined using the "reg" property:
- Call memblock_reserve() or memblock_mark_nomap() as needed.
- Add the information for that region into the reserved_mem array
using fdt_reserved_mem_save_node().
i.e. fdt_reserved_mem_save_node(node, name, base, size).
2) If the node represents a dynamically-placed reserved memory region,
i.e. if it is defined using "alloc-ranges" and "size" properties:
- Add the information for that region to the reserved_mem array with
the starting address and size set to 0.
i.e. fdt_reserved_mem_save_node(node, name, 0, 0).
Note: This region is saved to the array with a starting address of 0
because a starting address is not yet allocated for it.
Stage 2:
After iterating through all the reserved memory nodes and storing their
relevant information in the reserved_mem array,fdt_init_reserved_mem() is
called and does the following:
1) For statically-placed reserved memory regions:
- Call the region specific init function using
__reserved_mem_init_node().
2) For dynamically-placed reserved memory regions:
- Call __reserved_mem_alloc_size() which is used to allocate memory
for each of these regions, and mark them as nomap if they have the
nomap property specified in the DT.
- Call the region specific init function.
The current size of the resvered_mem array is 64 as is defined by
MAX_RESERVED_REGIONS. This means that there is a limitation of 64 for
how many reserved memory regions can be specified on a system.
As systems continue to grow more and more complex, the number of
reserved memory regions needed are also growing and are starting to hit
this 64 count limit, hence the need to make the reserved_mem array
dynamically sized (i.e. dynamically allocating memory for the
reserved_mem array using membock_alloc_*).
On architectures such as arm64, memory allocated using memblock is
writable only after the page tables have been setup. This means that if
the reserved_mem array is going to be dynamically allocated, it needs to
happen after the page tables have been setup, not before.
Since the reserved memory regions are currently being processed and
added to the array before the page tables are setup, there is a need to
change the order in which some of the processing is done to allow for
the reserved_mem array to be dynamically sized.
It is possible to process the statically-placed reserved memory regions
without needing to store them in the reserved_mem array until after the
page tables have been setup because all the information stored in the
array is readily available in the devicetree and can be referenced at
any time.
Dynamically-placed reserved memory regions on the other hand get
assigned a start address only at runtime, and hence need a place to be
stored once they are allocated since there is no other referrence to the
start address for these regions.
Hence this patch changes the processing order of the reserved memory
regions in the following ways:
Step 1:
fdt_scan_reserved_mem() scans through the child nodes of
the reserved_memory node using the flattened devicetree and does the
following:
1) If the node represents a statically-placed reserved memory region,
i.e. if it is defined using the "reg" property:
- Call memblock_reserve() or memblock_mark_nomap() as needed.
2) If the node represents a dynamically-placed reserved memory region,
i.e. if it is defined using "alloc-ranges" and "size" properties:
- Call __reserved_mem_alloc_size() which will:
i) Allocate memory for the reserved region and call
memblock_mark_nomap() as needed.
ii) Call the region specific initialization function using
fdt_init_reserved_mem_node().
iii) Save the region information in the reserved_mem array using
fdt_reserved_mem_save_node().
Step 2:
1) This stage of the reserved memory processing is now only used to add
the statically-placed reserved memory regions into the reserved_mem
array using fdt_scan_reserved_mem_reg_nodes(), as well as call their
region specific initialization functions.
2) This step has also been moved to be after the page tables are
setup. Moving this will allow us to replace the reserved_mem
array with a dynamically sized array before storing the rest of
these regions.
Signed-off-by: Oreoluwa Babatunde <quic_obabatun@quicinc.com>
Link: https://lore.kernel.org/r/20241008220624.551309-2-quic_obabatun@quicinc.com
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
|
|
scripts/Makefile.lib is included not only from scripts/Makefile.build
but also from scripts/Makefile.{modfinal,package,vmlinux,vmlinux_o},
where DT build rules are not required.
Split the DT build rules out to scripts/Makefile.dtbs, and include it
only when necessary.
While I was here, I added $(DT_TMP_SCHEMA) as a prerequisite of
$(multi-dtb-y).
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
|
|
Now that we initialize dt_root_addr_cells and dt_root_size_cells earlier,
use them and simplify of_fdt_limit_memory.
Link: https://lore.kernel.org/all/20180830190523.31474-3-robh@kernel.org/
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Scan the root node properties (#{size,address}-cells) earlier, so that
the dt_root_addr_cells and dt_root_size_cells variables are initialized
and can be used.
Link: https://lore.kernel.org/all/20180830190523.31474-2-robh@kernel.org/
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
The split of /reserved-memory handling between fdt.c and
of_reserved_mem.c makes for reading and restructuring the code
difficult. As of_reserved_mem.c is only built for
CONFIG_OF_EARLY_FLATTREE already, move all the code to one spot.
Acked-by: Saravana Kannan <saravanak@google.com>
Reviewed-by: Oreoluwa Babatunde <quic_obabatun@quicinc.com>
Link: https://lore.kernel.org/r/20240311181303.1516514-2-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
When enabling CONFIG_OF on a platform where 'of_root' is not populated
by firmware, we end up without a root node. In order to apply overlays
and create subnodes of the root node, we need one. Create this root node
by unflattening an empty builtin dtb.
If firmware provides a flattened device tree (FDT) then the FDT is
unflattened via setup_arch(). Otherwise, the call to
unflatten(_and_copy)?_device_tree() will create an empty root node.
We make of_have_populated_dt() return true only if the DTB was loaded by
firmware so that existing callers don't change behavior after this
patch. The call in the of platform code is removed because it prevents
overlays from creating platform devices when the empty root node is
used.
[sboyd@kernel.org: Update of_have_populated_dt() to treat this empty dtb
as not populated. Drop setup_of() initcall]
Signed-off-by: Frank Rowand <frowand.list@gmail.com>
Link: https://lore.kernel.org/r/20230317053415.2254616-2-frowand.list@gmail.com
Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20240217010557.2381548-3-sboyd@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
We want to populate an empty DT whenever CONFIG_OF is enabled so that
overlays can be applied and the DT unit tests can be run. Make
unflatten_and_copy_device_tree() stop printing a warning if the
'initial_boot_params' pointer is NULL. Instead, simply copy the dtb if
there is one and then unflatten it. If there isn't a DT to copy, then
the call to unflatten_device_tree() is largely a no-op, so nothing
really changes here.
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Link: https://lore.kernel.org/r/20240217010557.2381548-2-sboyd@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:
- Add Conor Dooley as a DT binding maintainer
- Swap the order of parsing /memreserve/ and /reserved-memory nodes so
that the /reserved-memory nodes which have more information are
handled first
- Fix some property dependencies in riscv,pmu binding
- Update maintainers entries on a couple of bindings
* tag 'devicetree-fixes-for-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
MAINTAINERS: add Conor as a dt-bindings maintainer
dt-bindings: perf: riscv,pmu: fix property dependencies
dt-bindings: xilinx: Remove Naga from memory and mtd bindings
of: fdt: Scan /memreserve/ last
dt-bindings: clock: r9a06g032-sysctrl: Change maintainer to Fabrizio Castro
dt-bindings: pinctrl: renesas,rzv2m: Change maintainer to Fabrizio Castro
dt-bindings: pinctrl: renesas,rzn1: Change maintainer to Fabrizio Castro
dt-bindings: i2c: renesas,rzv2m: Change maintainer to Fabrizio Castro
|
|
Change the scanning /memreserve/ and /reserved-memory node order to fix
Kernel panic on Khadas Vim3 Board.
If /memreserve/ goes first, the memory is reserved, but nomap can't be
applied to the region. So the memory won't be used by Linux, but it is
still present in the linear map as normal memory, which allows
speculation. Legitimate access to adjacent pages will cause the CPU
to end up prefetching into them leading to Kernel panic.
So /reserved-memory node should go first, as it has a more updated
description of the memory regions and can apply flags, like nomap.
Link: https://lore.kernel.org/all/CAJX_Q+1Tjc+-TjZ6JW9X0NxEdFe=82a9626yL63j7uVD4LpxEA@mail.gmail.com/
Signed-off-by: Lucas Tanure <tanure@linux.com>
Link: https://lore.kernel.org/r/20230424113846.46382-1-tanure@linux.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
During the early page table creation, we used to set the mapping for
PAGE_OFFSET to the kernel load address: but the kernel load address is
always offseted by PMD_SIZE which makes it impossible to use PUD/P4D/PGD
pages as this physical address is not aligned on PUD/P4D/PGD size (whereas
PAGE_OFFSET is).
But actually we don't have to establish this mapping (ie set va_pa_offset)
that early in the boot process because:
- first, setup_vm installs a temporary kernel mapping and among other
things, discovers the system memory,
- then, setup_vm_final creates the final kernel mapping and takes
advantage of the discovered system memory to create the linear
mapping.
During the first phase, we don't know the start of the system memory and
then until the second phase is finished, we can't use the linear mapping at
all and phys_to_virt/virt_to_phys translations must not be used because it
would result in a different translation from the 'real' one once the final
mapping is installed.
So here we simply delay the initialization of va_pa_offset to after the
system memory discovery. But to make sure noone uses the linear mapping
before, we add some guard in the DEBUG_VIRTUAL config.
Finally we can use PUD/P4D/PGD hugepages when possible, which will result
in a better TLB utilization.
Note that:
- this does not apply to rv32 as the kernel mapping lies in the linear
mapping.
- we rely on the firmware to protect itself using PMP.
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Acked-by: Rob Herring <robh@kernel.org> # DT bits
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20230324155421.271544-4-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
This reverts commit 972fa3a7c17c9d60212e32ecc0205dc585b1e769.
Kmemleak operates by periodically scanning memory regions for pointers to
allocated memory blocks to determine if they are leaked or not. However,
reserved memory regions can be used for DMA transactions between a device
and a CPU, and thus, wouldn't contain pointers to allocated memory blocks,
making them inappropriate for kmemleak to scan. Thus, revert this commit.
Link: https://lkml.kernel.org/r/20230124230254.295589-1-isaacmanjarres@google.com
Fixes: 972fa3a7c17c9 ("mm: kmemleak: alloc gray object for reserved region with direct map")
Signed-off-by: Isaac J. Manjarres <isaacmanjarres@google.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Calvin Zhang <calvinzhang.cool@gmail.com>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Saravana Kannan <saravanak@google.com>
Cc: <stable@vger.kernel.org> [5.17+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
I do not read a strict requirement on /chosen node in either ePAPR or in
Documentation/devicetree. Help text for CONFIG_CMDLINE and
CONFIG_CMDLINE_EXTEND doesn't make their behavior explicitly dependent on
the presence of /chosen or the presense of /chosen/bootargs.
However the early check for /chosen and bailing out in
early_init_dt_scan_chosen() skips CONFIG_CMDLINE handling which is not
really related to /chosen node or the particular method of passing cmdline
from bootloader.
This leads to counterintuitive combinations (assuming
CONFIG_CMDLINE_EXTEND=y):
a) bootargs="foo", CONFIG_CMDLINE="bar" => cmdline=="foo bar"
b) /chosen missing, CONFIG_CMDLINE="bar" => cmdline==""
c) bootargs="", CONFIG_CMDLINE="bar" => cmdline==" bar"
Rework early_init_dt_scan_chosen() so that the cmdline config options are
always handled.
[commit msg written by Alexander Sverdlin]
Cc: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Tested-by: Geoff Levand <geoff@infradead.org>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Link: https://lore.kernel.org/r/20230103-dt-cmdline-fix-v1-2-7038e88b18b6@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
This reverts commit a7d550f82b445cf218b47a2c1a9c56e97ecb8c7a.
Some arches (PPC at least) don't call early_init_dt_scan_nodes(), so
moving the cmdline processing there breaks them.
Reported-by: Geoff Levand <geoff@infradead.org>
Cc: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Tested-by: Geoff Levand <geoff@infradead.org>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Link: https://lore.kernel.org/r/20230103-dt-cmdline-fix-v1-1-7038e88b18b6@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
If memory has been found early_init_dt_scan_memory now returns 1. If
it hasn't found any memory it will return 0, allowing other memory
setup mechanisms to carry on.
Previously early_init_dt_scan_memory always returned 0 without
distinguishing between any kind of memory setup being done or not. Any
code path after the early_init_dt_scan memory call in the ramips
plat_mem_setup code wouldn't be executed anymore. Making
early_init_dt_scan_memory the only way to initialize the memory.
Some boards, including my mt7621 based Cudy X6 board, depend on memory
initialization being done via the soc_info.mem_detect function
pointer. Those wouldn't be able to obtain memory and panic the kernel
during early bootup with the message "early_init_dt_alloc_memory_arch:
Failed to allocate 12416 bytes align=0x40".
Fixes: 1f012283e936 ("of/fdt: Rework early_init_dt_scan_memory() to call directly")
Cc: stable@vger.kernel.org
Signed-off-by: Andreas Rammhold <andreas@rammhold.de>
Link: https://lore.kernel.org/r/20221223112748.2935235-1-andreas@rammhold.de
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
I do not read a strict requirement on /chosen node in either ePAPR or in
Documentation/devicetree. Help text for CONFIG_CMDLINE and
CONFIG_CMDLINE_EXTEND doesn't make their behavior explicitly dependent on
the presence of /chosen or the presense of /chosen/bootargs.
However the early check for /chosen and bailing out in
early_init_dt_scan_chosen() skips CONFIG_CMDLINE handling which is not
really related to /chosen node or the particular method of passing cmdline
from bootloader.
This leads to counterintuitive combinations (assuming
CONFIG_CMDLINE_EXTEND=y):
a) bootargs="foo", CONFIG_CMDLINE="bar" => cmdline=="foo bar"
b) /chosen missing, CONFIG_CMDLINE="bar" => cmdline==""
c) bootargs="", CONFIG_CMDLINE="bar" => cmdline==" bar"
Move CONFIG_CMDLINE handling outside of early_init_dt_scan_chosen() so that
cases b and c above result in the same cmdline.
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/11af73e05bad75e4ef49067515e3214f6d944b3d.camel@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree updates from Rob Herring:
"DT core:
- Fix node refcounting in of_find_last_cache_level()
- Constify device_node in of_device_compatible_match()
- Fix 'dma-ranges' handling in bus controller nodes
- Fix handling of initrd start > end
- Improve error reporting in of_irq_init()
- Taint kernel on DT unittest running
- Use strscpy instead of strlcpy
- Add a build target, dt_compatible_check, to check for compatible
strings used in kernel sources against compatible strings in DT
schemas.
- Handle DT_SCHEMA_FILES changes when rebuilding
DT bindings:
- LED bindings for MT6370 PMIC
- Convert Mediatek mtk-gce mailbox, MIPS CPU interrupt controller,
mt7621 I2C, virtio,pci-iommu, nxp,tda998x, QCom fastrpc, qcom,pdc,
and arm,versatile-sysreg to DT schema format
- Add nvmem cells to u-boot,env schema
- Add more LED_COLOR_ID definitions
- Require 'opp-table' uses to be a node
- Various schema fixes to match QEMU 'virt' DT usage
- Tree wide dropping of redundant 'Device Tree Binding' in schema
titles
- More (unevaluated|additional)Properties fixes in schema child nodes
- Drop various redundant minItems equal to maxItems"
* tag 'devicetree-for-6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (62 commits)
of: base: Shift refcount decrement in of_find_last_cache_level()
dt-bindings: leds: Add MediaTek MT6370 flashlight
dt-bindings: leds: mt6370: Add MediaTek MT6370 current sink type LED indicator
dt-bindings: mailbox: Convert mtk-gce to DT schema
of: base: make of_device_compatible_match() accept const device node
of: Fix "dma-ranges" handling for bus controllers
of: fdt: Remove unused struct fdt_scan_status
dt-bindings: display: st,stm32-dsi: Handle data-lanes in DSI port node
dt-bindings: timer: Add power-domains for TI timer-dm on K3
dt: Add a check for undocumented compatible strings in kernel
kbuild: take into account DT_SCHEMA_FILES changes while checking dtbs
dt-bindings: interrupt-controller: migrate MIPS CPU interrupt controller text bindings to YAML
dt-bindings: i2c: migrate mt7621 text bindings to YAML
dt-bindings: power: gpcv2: correct patternProperties
dt-bindings: virtio: Convert virtio,pci-iommu to DT schema
dt-bindings: timer: arm,arch_timer: Allow dual compatible string
dt-bindings: arm: cpus: Add kryo240 compatible
dt-bindings: display: bridge: nxp,tda998x: Convert to json-schema
dt-bindings: nvmem: u-boot,env: add basic NVMEM cells
dt-bindings: remoteproc: qcom,adsp: enforce smd-edge schema
...
|
|
After commit bba04d965d06("of/fdt: remove unused of_scan_flat_dt_by_path"), no
one use struct fdt_scan_status, so remove it.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Reviewed-by: Frank Rowand <frank.rowand@sony.com>
Link: https://lore.kernel.org/r/20220927133739.98493-1-yuancan@huawei.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
If the properties 'linux,initrd-start' and 'linux,initrd-end' of
the chosen node populated from the bootloader, eg. U-Boot, are so that
start > end, then the phys_initrd_size calculated from end - start is
negative that subsequently gets converted to a high positive value for
being unsigned long long. Then, the memory region with the (invalid)
size is added to the bootmem and attempted being paged in paging_init()
that results in the kernel fault.
For example, on the FVP ARM64 system I'm running, the U-Boot populates
the 'linux,initrd-start' with 8800_0000 and 'linux,initrd-end' with 0.
The phys_initrd_size calculated is then ffff_ffff_7800_0000
(= 0 - 8800_0000 = -8800_0000 + ULLONG_MAX + 1). paging_init() then
attempts to map the address 8800_0000 + ffff_ffff_7800_0000 and oops'es
as below.
It should be stressed, it is generally a fault of the bootloader's with
the kernel relying on it, however we should not allow the bootloader's
misconfiguration to lead to the kernel oops. Not only the kernel should be
bullet proof against it but also finding the root cause of the paging
fault spanning over the bootloader, DT, and kernel may happen is not so
easy.
Unable to handle kernel paging request at virtual address fffffffefe43c000
Mem abort info:
ESR = 0x96000007
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000007
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000080e3d000
[fffffffefe43c000] pgd=0000000080de9003, pud=0000000080de9003
Unable to handle kernel paging request at virtual address ffffff8000de9f90
Mem abort info:
ESR = 0x96000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000005
CM = 0, WnR = 0
swapper pgtable: 4k pages, 39-bit VAs, pgdp=0000000080e3d000
[ffffff8000de9f90] pgd=0000000000000000, pud=0000000000000000
Internal error: Oops: 96000005 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.51-yocto-standard #1
Hardware name: FVP Base (DT)
pstate: 60000085 (nZCv daIf -PAN -UAO)
pc : show_pte+0x12c/0x1b4
lr : show_pte+0x100/0x1b4
sp : ffffffc010ce3b30
x29: ffffffc010ce3b30 x28: ffffffc010ceed80
x27: fffffffefe43c000 x26: fffffffefe43a028
x25: 0000000080bf0000 x24: 0000000000000025
x23: ffffffc010b8d000 x22: ffffffc010e3d000
x23: ffffffc010b8d000 x22: ffffffc010e3d000
x21: 0000000080de9000 x20: ffffff7f80000f90
x19: fffffffefe43c000 x18: 0000000000000030
x17: 0000000000001400 x16: 0000000000001c00
x15: ffffffc010cef1b8 x14: ffffffffffffffff
x13: ffffffc010df1f40 x12: ffffffc010df1b70
x11: ffffffc010ce3b30 x10: ffffffc010ce3b30
x9 : 00000000ffffffc8 x8 : 0000000000000000
x7 : 000000000000000f x6 : ffffffc010df16e8
x5 : 0000000000000000 x4 : 0000000000000000
x3 : 00000000ffffffff x2 : 0000000000000000
x1 : 0000008080000000 x0 : ffffffc010af1d68
Call trace:
show_pte+0x12c/0x1b4
die_kernel_fault+0x54/0x78
__do_kernel_fault+0x11c/0x128
do_translation_fault+0x58/0xac
do_mem_abort+0x50/0xb0
el1_da+0x1c/0x90
__create_pgd_mapping+0x348/0x598
paging_init+0x3f0/0x70d0
setup_arch+0x2c0/0x5d4
start_kernel+0x94/0x49c
Code: 92748eb5 900052a0 9135a000 cb010294 (f8756a96)
Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
Link: https://lore.kernel.org/r/20220909023358.76881-1-marek.bykowski@gmail.com
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/20220818210054.7157-1-wsa+renesas@sang-engineering.com
|
|
Commit 78c44d910d3e ("drivers/of: Fix depth when unflattening devicetree")
forgot to fix up the depth check in the loop body in unflatten_dt_nodes()
which makes it possible to overflow the nps[] buffer...
Found by Linux Verification Center (linuxtesting.org) with the SVACE static
analysis tool.
Fixes: 78c44d910d3e ("drivers/of: Fix depth when unflattening devicetree")
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://lore.kernel.org/r/7c354554-006f-6b31-c195-cdfe4caee392@omp.ru
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty / serial driver updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.0-rc1.
It was delayed from last week as I wanted to make sure the last commit
here got some good testing in linux-next and elsewhere as it seemed to
show up only late in testing for some reason.
Nothing major here, just lots of cleanups from Jiri and Ilpo to make
the tty core cleaner (Jiri) and the rs485 code simpler to use (Ilpo).
Also included in here is the obligatory n_gsm updates from Daniel
Starke and lots of tiny driver updates and minor fixes and tweaks for
other smaller serial drivers.
All of these have been in linux-next for a while with no reported
problems"
* tag 'tty-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (186 commits)
tty: serial: qcom-geni-serial: Fix %lu -> %u in print statements
tty: amiserial: Fix comment typo
tty: serial: document uart_get_console()
tty: serial: serial_core, reformat kernel-doc for functions
Documentation: serial: link uart_ops properly
Documentation: serial: move GPIO kernel-doc to the functions
Documentation: serial: dedup kernel-doc for uart functions
Documentati |