2026-02-03Merge branch 'net-dsa-yt921x-add-dcb-qos-support'Paolo Abeni4-62/+393
David Yang says: ==================== net: dsa: yt921x: Add DCB/QoS support This series adds DCB/QoS support to the driver. v5: https://lore.kernel.org/r/20260128215202.2244266-1-mmyangfl@gmail.com v4: https://lore.kernel.org/r/20260127020847.1482724-1-mmyangfl@gmail.com v3: https://lore.kernel.org/r/20260125001328.3784006-1-mmyangfl@gmail.com v2: https://lore.kernel.org/r/20260122194233.2777550-1-mmyangfl@gmail.com v1: https://lore.kernel.org/r/20260119185935.2072685-1-mmyangfl@gmail.com ==================== Link: https://patch.msgid.link/20260131021854.3405036-1-mmyangfl@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: dsa: yt921x: Add DCB/QoS supportDavid Yang3-11/+298
Set up global DSCP/PCP priority mappings and add related DCB methods. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260131021854.3405036-6-mmyangfl@gmail.com Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: dsa: yt921x: Refactor yt921x_chip_setup()David Yang1-11/+24
yt921x_chip_setup() is already pretty long, and is going to become longer. Split it into parts. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260131021854.3405036-5-mmyangfl@gmail.com Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: dsa: yt921x: Refactor VLAN awareness settingDavid Yang1-10/+14
Create a helper function to centralize the logic for enabling and disabling VLAN awareness on a port. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260131021854.3405036-4-mmyangfl@gmail.com Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: dsa: tag_yt921x: add priority supportDavid Yang1-1/+5
Required by the DCB/QoS support in the switch driver, since the RX packets will have non-zero priorities. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260131021854.3405036-3-mmyangfl@gmail.com Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: dsa: tag_yt921x: clarify priority and code fieldsDavid Yang1-30/+53
Packet priority is part of the tag, and the priority and code fields are used on both the TX and RX paths. Revise the code and comments to reflect these facts. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260131021854.3405036-2-mmyangfl@gmail.com Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03dt-bindings: spi: Add binding for Faraday FTSSP010Linus Walleij1-0/+43
This adds a binding for the Faraday FTSSP010 SSP controller, a pretty straightforward synchronous serial port and SPI controller. The bindings are submitted separately because the one device that has this is using it in a "nonstandard way" with regard to the electronics, which makes it impossible to develop or test a proper driver. However, we want to be able to add this resource to the device trees, and it's not complex. Signed-off-by: Linus Walleij <linusw@kernel.org> Link: https://patch.msgid.link/20260203-gemini-ssp-bindings-v1-1-6d85c9c72371@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: SOF: Intel: hda: add SDCA property checkMac Chiang1-0/+6
If the SDCA property is not present in the DisCo table, assume SDCA support is present. This allows DAI links to be created from codec_info_list instead of being skipped. Signed-off-by: Mac Chiang <mac.chiang@intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20260203095923.3741674-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: Intel: soc-acpi-intel-ptl-match: drop rt721 related match tablesMac Chiang1-50/+0
Use functional topologies to support all RT721-related topology and amplifier combinations, e.g. sof-ptl-rt721.tplg, sof-ptl-rt721-l3-rt1320-l3.tplg. If these entries are not removed, they will all use sof-ptl-rt721.tplg. Signed-off-by: Mac Chiang <mac.chiang@intel.com> Reviewed-by: Liam Girdwood <liam.r.girdwood@intel.com> Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Link: https://patch.msgid.link/20260203100027.3741754-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: SOF: Intel: allow module parameter override BT link to 0Bard Liao1-2/+2
The existing code tests if (bt_link_mask_override) to overwrite the BT link mask. This doesn't allow the user to disable the BT link mask. The user may want to disable the BT link even when it is detected by the NHLT. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Link: https://patch.msgid.link/20260203111545.3742255-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
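A minimal sketch of the sentinel approach this implies, with hypothetical parameter and helper names (not the driver's actual ones): a default of -1 lets 0 pass through as a deliberate override.

    /* Hypothetical sketch: -1 means "not set", so 0 becomes a valid override */
    static int bt_link_mask_override = -1;
    module_param(bt_link_mask_override, int, 0444);
    MODULE_PARM_DESC(bt_link_mask_override, "BT link mask override (-1: use NHLT)");

    static u32 hda_get_bt_link_mask(u32 nhlt_mask)
    {
            if (bt_link_mask_override >= 0) /* 0 now disables the BT link */
                    return (u32)bt_link_mask_override;
            return nhlt_mask;
    }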
2026-02-03ASoC: SOF: Intel: hda-sdw-bpt: support simultaneous audio and BPT streamsBard Liao1-2/+3
Currently the SoundWire BPT stream uses the paired link DMA but does not reserve it. It works without any issue because we assume the SoundWire BPT will not run simultaneously with audio streams. To support simultaneous audio and BPT streams, we need to use the hda_dma_prepare/cleanup helpers to reserve the paired link and host DMAs. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev> Link: https://patch.msgid.link/20260203114027.3742558-4-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: SOF: Intel: add hda_dma_prepare/cleanup helpersBard Liao3-91/+127
SoundWire BPT stream needs to use link and host DMAs. Thus we need helpers to prepare and clean up the link and host DMAs. Currently the SoundWire BPT stream uses the hda_cl_prepare/cleanup helpers. It works fine because we assume the SoundWire BPT will not run simultaneously with audio streams. The new helpers are copied from hda_cl_prepare/cleanup and add a flag to reserve the paired host and link DMAs. The new helpers will be used by both the code loader and SoundWire BPT. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev> Link: https://patch.msgid.link/20260203114027.3742558-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: SOF: Intel: add hda_dsp_stream_pair_get/put helpersBard Liao2-3/+41
Currently, hda_dsp_stream_get/put are used to get/put the host DMA. However, we may want to use an HDA stream for which both the host and link DMAs are available. Add helpers to find such an HDA stream and reserve/release it. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev> Link: https://patch.msgid.link/20260203114027.3742558-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: rt1320: fix intermittent no-sound issueShuming Fan1-0/+2
This patch adds a setting to resolve the intermittent no-sound issue. Signed-off-by: Shuming Fan <shumingf@realtek.com> Link: https://patch.msgid.link/20260203084827.768238-1-shumingf@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03ASoC: SOF: Intel: use hdev->info.link_mask directlyBard Liao1-3/+2
The link_mask variable is not changed after being set to hdev->info.link_mask, until it is reused for another purpose: to collect the used SoundWire links and set mach->mach_params.links. Besides, the link_mask variable should be reset before any link ID is added to it. To fix the issue above and avoid confusion, use the hdev->info.link_mask variable directly to check if the SoundWire link is enabled. Fixes: 5226d19d4cae ("ASoC: SOF: Intel: use sof_sdw as default SDW machine driver") Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20260203072405.3716307-1-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2026-02-03iommupt: Always add IOVA range to iotlb_gather in gather_range_pages()Yu Zhang1-2/+1
Add current (iova, len) to the iotlb gather, regardless of the setting of PT_FEAT_FLUSH_RANGE or PT_FEAT_FLUSH_RANGE_NO_GAPS. In gather_range_pages(), the current IOVA range is only added to iotlb_gather when PT_FEAT_FLUSH_RANGE is set. Yet a virtual IOMMU with NpCache uses only PT_FEAT_FLUSH_RANGE_NO_GAPS. In that case, iotlb_gather will stay empty (start=ULONG_MAX, end=0) after initialization, and the current (iova, len) will not be added to the iotlb_gather, causing subsequent iommu_iotlb_sync() to perform IOTLB invalidation with wrong parameters (e.g., amd_iommu_iotlb_sync() computes size from gather->end - gather->start + 1, leading to an invalid range). The disjoint check and sync for PT_FEAT_FLUSH_RANGE_NO_GAPS remain unchanged: when the new range is disjoint from the existing gather, we still sync first and then add the new range, so semantics for NO_GAPS are preserved. Fixes: 7c53f4238aa8 ("iommupt: Add unmap_pages op") Cc: stable@vger.kernel.org Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Yu Zhang <zhangyu1@linux.microsoft.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
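A hedged sketch of the resulting control flow in gather_range_pages() (simplified; 'common' and 'domain' are stand-ins for the iommupt internals):

    /* Record the range unconditionally; sync first only when NO_GAPS
     * requires contiguity and the new range is disjoint from the gather.
     */
    if (pt_feature(common, PT_FEAT_FLUSH_RANGE_NO_GAPS) &&
        iommu_iotlb_gather_is_disjoint(iotlb_gather, iova, len))
            iommu_iotlb_sync(domain, iotlb_gather);

    iommu_iotlb_gather_add_range(iotlb_gather, iova, len);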
2026-02-03iommu/amd: serialize sequence allocation under concurrent TLB invalidationsAnkit Soni3-8/+14
With concurrent TLB invalidations, completion waits randomly time out because cmd_sem_val was incremented outside the IOMMU spinlock, allowing CMD_COMPL_WAIT commands to be queued out of sequence and breaking the ordering assumption in wait_on_sem(). Move the cmd_sem_val increment under iommu->lock so completion sequence allocation is serialized with command queuing. Also remove the unnecessary return. Fixes: d2a0cac10597 ("iommu/amd: move wait_on_sem() out of spinlock") Tested-by: Srikanth Aithal <sraithal@amd.com> Reported-by: Srikanth Aithal <sraithal@amd.com> Signed-off-by: Ankit Soni <Ankit.Soni@amd.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
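A minimal sketch of the serialized allocation, assuming the command-path shape described above (simplified from the AMD driver):

    raw_spin_lock_irqsave(&iommu->lock, flags);
    /* Sequence allocation is now serialized with command queuing */
    data = atomic64_inc_return(&iommu->cmd_sem_val);
    build_completion_wait(&cmd, iommu, data);
    ret = __iommu_queue_command_sync(iommu, &cmd, false);
    raw_spin_unlock_irqrestore(&iommu->lock, flags);
    if (!ret)
            ret = wait_on_sem(iommu, data);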
2026-02-03wifi: iwlwifi: mvm: pause TCM on fast resumeMiri Korenblit1-1/+5
Not pausing it means that we can have the TCM work queued into a non-freezable workqueue, which, in resume, is re-activated before the driver's resume is called. The TCM work might send commands to the FW before we resumed the device, leading to an assert. Closes: https://lore.kernel.org/linux-wireless/aTDoDiD55qlUZ0pn@debian.local/ Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com> Fixes: e8bb19c1d590 ("wifi: iwlwifi: support fast resume") Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260129212650.05621f3faedb.I44df9cf9183b5143df8078131e0d87c0fd7e1763@changeid
2026-02-03wifi: iwlwifi: mld: cancel mlo_scan_start_wkMiri Korenblit2-2/+2
mlo_scan_start_wk is not canceled on disconnection. In fact, it is not canceled anywhere except in the restart cleanup, where we don't really have to. This can cause an init-after-queue issue if, for example, the work was queued and then drv_change_interface got executed. It can also cause a use-after-free if the work is executed after the vif is freed. Fixes: 9748ad82a9d9 ("wifi: iwlwifi: defer MLO scan after link activation") Reviewed-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://patch.msgid.link/20260129212650.a36482a60719.I5bf64a108ca39dacb5ca0dcd8b7258a3ce8db74c@changeid
2026-02-03PCI: s32g: Skip Root Port removal during successVincent Guittot1-4/+4
Currently, s32g_pcie_parse_ports() exercises the 'err_port' path even during the success case. This results in ports getting deleted after successful parsing of Root Ports. Hence, skip the removal of Root Ports during success. Fixes: 5cbc7d3e316e ("PCI: s32g: Add NXP S32G PCIe controller driver (RC)") Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> [mani: reworded subject and description] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20260202151050.1446165-1-vincent.guittot@linaro.org
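A sketch of the fixed flow, with hypothetical helper names standing in for the driver's parse/removal code:

    /* Return success before the unwind label instead of falling through */
    for_each_available_child_of_node_scoped(np, child) {
            ret = s32g_pcie_parse_port(pcie, child);        /* hypothetical */
            if (ret)
                    goto err_port;
    }

    return 0;

    err_port:
            s32g_pcie_remove_ports(pcie);                   /* hypothetical */
            return ret;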
2026-02-03apparmor: fix cast in format string DEBUG statementJohn Johansen1-1/+1
If debugging is enabled, the DEBUG statement will fail due to a bad fat-fingered cast. Fixes: 102ada7ca37ed ("apparmor: fix fmt string type error in process_strs_entry") Signed-off-by: John Johansen <john.johansen@canonical.com>
2026-02-03wifi: brcmsmac: phy: Remove unreachable error handling codeIngyu Jang1-7/+2
wlc_phy_txpwr_srom_read_lcnphy() in wlc_phy_attach_lcnphy() always returns true, making the error handling code unreachable. Change the function's return type to void and remove the dead code, similar to the cleanup done for wlc_phy_txpwr_srom_read_nphy() in commit 47f0e32ffe4e ("wifi: brcmsmac: phy: Remove unreachable code"). Signed-off-by: Ingyu Jang <ingyujang25@korea.ac.kr> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Link: https://patch.msgid.link/20260131172355.3367673-1-ingyujang25@korea.ac.kr Signed-off-by: Johannes Berg <johannes.berg@intel.com>
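A sketch of the shape of this cleanup (function body elided; only the signature change and the now-removed caller check are shown):

    /* Before: "static bool wlc_phy_txpwr_srom_read_lcnphy(...)" always ended
     * in "return true", leaving the caller's error branch unreachable.
     */
    static void wlc_phy_txpwr_srom_read_lcnphy(struct brcms_phy *pi)
    {
            /* ... read SROM tx-power values into pi ... */
    }

    /* Caller in wlc_phy_attach_lcnphy() no longer checks a return value */
    wlc_phy_txpwr_srom_read_lcnphy(pi);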
2026-02-03rust_binder: return p from rust_binder_transaction_target_node()Alice Ryhl1-1/+1
Somehow this got changed to NULL when I ported this upstream. No idea how that happened. Reported-by: Carlos Llamas <cmllamas@google.com> Closes: https://lore.kernel.org/r/aXkEiC1sGOGfDuzI@google.com Fixes: c1ea31205edf ("rust_binder: add binder_transaction tracepoint") Signed-off-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Carlos Llamas <cmllamas@google.com> Link: https://patch.msgid.link/20260128-binder-fix-target-node-null-v1-1-78d198ef55a5@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03binderfs: fix ida_alloc_max() upper boundCarlos Llamas1-4/+4
The 'max' argument of ida_alloc_max() takes the maximum valid ID and not the "count". Using an ID of BINDERFS_MAX_MINOR (1 << 20) for dev->minor would exceed the limits of minor numbers (20 bits). Fix this off-by-one error by subtracting 1 from the 'max'. Cc: stable@vger.kernel.org Fixes: 3ad20fe393b3 ("binder: implement binderfs") Signed-off-by: Carlos Llamas <cmllamas@google.com> Link: https://patch.msgid.link/20260127235545.2307876-2-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
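Since ida_alloc_max() treats 'max' as an inclusive upper bound on the returned ID, the fixed allocation looks roughly like this (a sketch based on the description; binderfs also applies a capped limit in the non-reserve case):

    /* 'max' is the largest ID that may be returned, so pass limit - 1:
     * valid minors are 0 .. BINDERFS_MAX_MINOR - 1 (minors are 20 bits).
     */
    minor = ida_alloc_max(&binderfs_minors, BINDERFS_MAX_MINOR - 1,
                          GFP_KERNEL);
    if (minor < 0)
            return minor;
    dev->minor = minor;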
2026-02-03rust_binderfs: fix ida_alloc_max() upper boundCarlos Llamas1-4/+4
The 'max' argument of ida_alloc_max() takes the maximum valid ID and not the "count". Using an ID of BINDERFS_MAX_MINOR (1 << 20) for dev->minor would exceed the limits of minor numbers (20 bits). Fix this off-by-one error by subtracting 1 from the 'max'. Cc: stable@vger.kernel.org Fixes: eafedbc7c050 ("rust_binder: add Rust Binder driver") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202512181203.IOv6IChH-lkp@intel.com/ Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Link: https://patch.msgid.link/20260127235545.2307876-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03drivers: android: binder: Update ARef imports from sync::arefShankari Anand2-3/+2
Update call sites in binder files to import `ARef` from `sync::aref` instead of `types`. This aligns with the ongoing effort to move `ARef` and `AlwaysRefCounted` to sync. Suggested-by: Benno Lossin <lossin@kernel.org> Link: https://github.com/Rust-for-Linux/linux/issues/1173 Signed-off-by: Shankari Anand <shankari.ak0208@gmail.com> Acked-by: Alice Ryhl <aliceryhl@google.com> Link: https://patch.msgid.link/20260102202714.184223-2-shankari.ak0208@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03rust_binder: fix needless borrow in context.rsShivam Kalra1-1/+1
Clippy warns about a needless borrow in context.rs:

    error: this expression creates a reference which is immediately dereferenced by the compiler
       --> drivers/android/binder/context.rs:141:18
        |
    141 |         func(&proc);
        |              ^^^^^ help: change this to: `proc`

Remove the unnecessary borrow to satisfy clippy and improve code cleanliness. No functional change.
Signed-off-by: Shivam Kalra <shivamklr@cock.li>
Acked-by: Alice Ryhl <aliceryhl@google.com>
Link: https://patch.msgid.link/20260130182842.217821-1-shivamklr@cock.li
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03s390/kexec: Emit an error message when cmdline is too longVasily Gorbik1-1/+3
Currently, if the command line passed to kexec_file_load() exceeds the supported limit of the kernel being kexec'd, -EINVAL is returned to userspace, which is consistent across architectures. Since -EINVAL is not specific to this case, the kexec tool cannot provide a specific reason for the failure. Many architectures emit an error message in this case. Add a similar error message, including the effective limit, since the command line length is configurable. Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
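A minimal sketch of such a check (names hypothetical; on s390 the effective limit is taken from the target kernel image, since the command line length is configurable):

    /* Hypothetical sketch: name the reason instead of a bare -EINVAL */
    if (cmdline_len > max_cmdline_size) {
            pr_err("command line too long: %zu bytes (maximum is %zu)\n",
                   cmdline_len, max_cmdline_size);
            return -EINVAL;
    }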
2026-02-03s390/configs: Enable BLK_DEV_NULL_BLK as moduleHalil Pasic2-0/+2
Enable BLK_DEV_NULL_BLK as module in defconfig and debug_defconfig, so the Null Test Block Device Driver can be easily used for testing purposes. Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2026-02-03s390: Document s390 stackprotector supportHeiko Carstens1-1/+1
Recently [1] s390 got stackprotector support. Document this. [1] commit f5730d44e05e ("s390: Add stackprotector support") Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2026-02-03Merge branch 'net-phy-remove-modalias-based-mdio-device-bus-matching'Paolo Abeni5-17/+8
Heiner Kallweit says: ==================== net: phy: remove modalias-based MDIO device bus matching modalias-based MDIO device bus matching has only one user (dsa-loop), where we can replace modalias-based matching with a simple custom match function. This, and the first patch of the series, lay the foundation for removing modalias-based matching. ==================== Link: https://patch.msgid.link/d9543e7d-23e1-4dba-a6b3-35dcd6a35dec@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: phy: remove modalias-based mdio bus matchingHeiner Kallweit3-15/+0
Last user dsa_loop has been migrated away from modalias-based matching, so we can remove this feature now. It was the only user of MDIO_NAME_SIZE, so remove that constant as well. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/ce1c6df0-4785-4b28-8322-32dc6bceea18@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: dsa: loop: remove MDIO device modaliasHeiner Kallweit1-1/+7
This change is a prerequisite for removing the MDIO device modalias, as dsa_loop is the only user. Switch from modalias to a custom bus match function. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Tested-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/15a4318f-50b5-4df5-874e-e387ee070a9d@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03net: ethernet: adi: make name member of struct adin1110_cfg a pointerHeiner Kallweit1-1/+1
Primary reason for this change is to remove the misuse of MDIO_NAME_SIZE here, so that this constant can be removed in a follow-up patch. The use case here is simply a chip name without any relationship to an MDIO device. Also, there's no need to reserve a longer char array, so make the name a pointer. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/61bc14fa-eed3-43b6-ae40-b98063e81578@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
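Sketched, the change is simply (struct abbreviated; the initializer name is hypothetical):

    struct adin1110_cfg {
            const char *name;       /* was: char name[MDIO_NAME_SIZE] */
            /* ... other fields unchanged ... */
    };

    static const struct adin1110_cfg adin1110_cfg_data = {
            .name = "ADIN1110",     /* a chip name, not an MDIO device name */
    };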
2026-02-03revocable: Add KUnit test for concurrent accessTzung-Bi Shih1-0/+104
Add a test case to verify correct synchronization between concurrent readers and a revocation. The test setup involves:

1. Consumer 1 enters the critical section (SRCU read lock) and verifies access to the resource.
2. Provider attempts to revoke the resource. This should block until Consumer 1 releases the lock.
3. Consumer 2 attempts to enter the critical section while revocation is pending. It should see the resource as revoked (NULL).
4. Consumer 1 exits, allowing the revocation to complete.

This ensures that the SRCU mechanism correctly enforces grace periods and that new readers are properly prevented from accessing the resource once revocation has begun.

A way to run the test:

    $ ./tools/testing/kunit/kunit.py run \
        --kconfig_add CONFIG_REVOCABLE_KUNIT_TEST=y \
        --kconfig_add CONFIG_PROVE_LOCKING=y \
        --kconfig_add CONFIG_DEBUG_KERNEL=y \
        --kconfig_add CONFIG_DEBUG_INFO=y \
        --kconfig_add CONFIG_DEBUG_INFO_DWARF5=y \
        --kconfig_add CONFIG_KASAN=y \
        --kconfig_add CONFIG_DETECT_HUNG_TASK=y \
        --kconfig_add CONFIG_DEFAULT_HUNG_TASK_TIMEOUT="10" \
        --arch=x86_64 --raw_output=all \
        revocable_test

Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Link: https://patch.msgid.link/20260129143733.45618-5-tzungbi@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03revocable: fix SRCU index corruption by requiring caller-provided storageTzung-Bi Shih5-113/+102
The struct revocable handle stores the SRCU read-side index (idx) for the duration of a resource access. If multiple threads share the same struct revocable instance, they race on writing to the idx field, corrupting the SRCU state and potentially causing unsafe unlocks. Refactor the API to replace revocable_alloc()/revocable_free() with revocable_init()/revocable_deinit(). This change requires the caller to provide the storage for struct revocable. By moving storage ownership to the caller, the API ensures that concurrent users maintain their own private idx storage, eliminating the race condition. Reported-by: Johan Hovold <johan@kernel.org> Closes: https://lore.kernel.org/all/20260124170535.11756-4-johan@kernel.org/ Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org> Link: https://patch.msgid.link/20260129143733.45618-4-tzungbi@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
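A hedged sketch of the resulting usage, assuming revocable_try_access()/revocable_release() as the accessor names; the point is that each thread owns its own struct revocable, and with it a private SRCU index:

    struct revocable rev;                   /* caller-provided storage */

    revocable_init(&rev, rp);               /* was: rev = revocable_alloc(rp) */

    res = revocable_try_access(&rev);       /* assumed accessor name */
    if (res) {
            /* ... use the resource under the SRCU read lock ... */
            revocable_release(&rev);
    }

    revocable_deinit(&rev);                 /* was: revocable_free(rev) */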
2026-02-03revocable: Add KUnit test for provider lifetime racesTzung-Bi Shih1-0/+41
Add a test to verify that revocable_alloc() correctly handles race conditions where the provider is being released. The test covers three scenarios:

1. Allocating from a NULL provider.
2. Allocating from a provider that has been detached (pointer is NULL).
3. Allocating from a provider that is in the process of destruction (refcount is 0), simulating a race between revocable_alloc() and revocable_provider_release().

A way to run the test:

    $ ./tools/testing/kunit/kunit.py run \
        --kconfig_add CONFIG_REVOCABLE_KUNIT_TEST=y \
        --kconfig_add CONFIG_PROVE_LOCKING=y \
        --kconfig_add CONFIG_DEBUG_KERNEL=y \
        --kconfig_add CONFIG_DEBUG_INFO=y \
        --kconfig_add CONFIG_DEBUG_INFO_DWARF5=y \
        --kconfig_add CONFIG_KASAN=y \
        --kconfig_add CONFIG_DETECT_HUNG_TASK=y \
        --kconfig_add CONFIG_DEFAULT_HUNG_TASK_TIMEOUT="10" \
        --arch=x86_64 --raw_output=all \
        revocable_test

Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Link: https://patch.msgid.link/20260129143733.45618-3-tzungbi@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03revocable: Fix races in revocable_alloc() using RCUTzung-Bi Shih5-71/+72
There are two race conditions when allocating a revocable instance:

1. After a struct revocable_provider is revoked, the caller might still hold a dangling pointer to it. A subsequent call to revocable_alloc() can trigger a use-after-free.
2. If revocable_provider_release() runs concurrently with revocable_alloc(), the memory of struct revocable_provider can be accessed during or after kfree().

To fix these:

- Manage the lifetime of struct revocable_provider using RCU. Annotate pointers to it with __rcu and use kfree_rcu() for deallocation.
- Update revocable_alloc() to safely acquire a reference using RCU primitives.
- Update revocable_provider_revoke() to take a double pointer (`**rp`). It atomically NULLs out the caller's pointer before starting revocation. This prevents the caller from holding a dangling pointer.
- Drop devm_revocable_provider_alloc(). The devm-managed model cannot support the required double-pointer semantic for safe pointer nulling.

Reported-by: Johan Hovold <johan@kernel.org>
Closes: https://lore.kernel.org/all/aXdy-b3GOJkzGqYo@hovoldconsulting.com/
Signed-off-by: Tzung-Bi Shih <tzungbi@kernel.org>
Link: https://patch.msgid.link/20260129143733.45618-2-tzungbi@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2026-02-03Input: apbps2 - fix comment style and typosMicah Ostrow1-7/+7
Capitalize comment starts to match kernel coding style.
Fix spelling: "reciever" -> "receiver".
Fix grammar: "it's" (contraction of "it is") -> "its" (possessive).
Remove uncertainty from "Clear error bits?" comment.
Compile tested only.
Signed-off-by: Micah Ostrow <bluefox9516@gmail.com> Link: https://patch.msgid.link/20260127181735.57132-1-bluefox9516@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2026-02-03sched: Re-evaluate scheduling when migrating queued tasks out of throttled cgroupsZicheng Qu1-0/+4
Consider the following sequence on a CPU configured with nohz_full:

1) A task P runs in cgroup A, and cgroup A becomes throttled due to CFS bandwidth control. The gse (of cgroup A) to which task P is attached is dequeued and the CPU switches to idle.
2) Before cgroup A is unthrottled, task P is migrated from cgroup A to another cgroup B (not throttled). During sched_move_task(), task P is observed as queued but not running, and therefore no resched_curr() is triggered.
3) Since the CPU is nohz_full, it remains in do_idle() waiting for an explicit scheduling event, i.e., resched_curr().
4) For kernel <= 5.10: Later, cgroup A is unthrottled. However, task P has already been migrated out of cgroup A, so unthrottle_cfs_rq() may observe load_weight == 0 and return early without resched_curr() being called. For kernel >= 6.6: The unthrottling path normally triggers resched_curr() in almost all cases, even when no runnable tasks remain in the unthrottled cgroup, preventing the idle stall described above. However, if cgroup A is removed before it gets unthrottled, the unthrottling path for cgroup A is never executed. As a result, no resched_curr() is called.
5) At this point, task P is runnable in cgroup B (not throttled), but the CPU remains in do_idle() with no pending reschedule point. The system stays in this state until an unrelated event (e.g. a new task wakeup) triggers a resched_curr() that breaks the nohz_full idle state, and task P finally gets scheduled.

The root cause is that sched_move_task() may classify the task as only queued, not running, and therefore fails to trigger a resched_curr(), while the later unthrottling path no longer has visibility of the migrated task.

Preserve the existing behavior for running tasks by issuing resched_curr(), and explicitly invoke check_preempt_curr() for tasks that were queued at the time of migration. This ensures that runnable tasks are reconsidered for scheduling even when nohz_full suppresses periodic ticks.

Fixes: 29f59db3a74b ("sched: group-scheduler core")
Signed-off-by: Zicheng Qu <quzicheng@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Aaron Lu <ziqianlu@bytedance.com>
Tested-by: Aaron Lu <ziqianlu@bytedance.com>
Link: https://patch.msgid.link/20260130083438.1122457-1-quzicheng@huawei.com
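Sketched against the tail of sched_move_task() (simplified; note that recent kernels name the preemption check helper wakeup_preempt()):

    /* Re-evaluate queued-but-not-running tasks too */
    if (running)
            resched_curr(rq);               /* existing behavior */
    else if (queued)
            check_preempt_curr(rq, tsk, 0); /* new: can kick a nohz_full
                                             * CPU out of do_idle() */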
2026-02-03sched/cpufreq: Use %pe format for PTR_ERR() printingzenghongling1-1/+1
Use %pe format specifier for printing PTR_ERR() error values to make error messages more readable. Found by Coccinelle: ./cpufreq_schedutil.c:685:49-56: WARNING: Consider using %pe to print PTR_ERR() Signed-off-by: zenghongling <zenghongling@kylinos.cn> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260120083333.148385-1-zenghongling@kylinos.cn
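For illustration, %pe prints a symbolic error name straight from an ERR_PTR-encoded pointer (requires CONFIG_SYMBOLIC_ERRNAME for the symbolic form; a generic sketch, not the schedutil code itself):

    struct task_struct *thread = kthread_create(fn, NULL, "sugov:0");
    if (IS_ERR(thread)) {
            /* prints e.g. "-ENOMEM" instead of a raw number */
            pr_err("failed to create sugov thread: %pe\n", thread);
            return PTR_ERR(thread);
    }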
2026-02-03sched/rt: Skip currently executing CPU in rto_next_cpu()Chen Jinghuang1-0/+5
CPU0 becomes overloaded when hosting a CPU-bound RT task, a non-CPU-bound RT task, and a CFS task stuck in kernel space. When other CPUs switch from RT to non-RT tasks, RT load balancing (LB) is triggered; with HAVE_RT_PUSH_IPI enabled, they send IPIs to CPU0 to drive the execution of rto_push_irq_work_func.

During push_rt_task on CPU0, if next_task->prio < rq->donor->prio, resched_curr() sets NEED_RESCHED, and after the push operation completes, CPU0 calls rto_next_cpu(). Since only CPU0 is overloaded in this scenario, rto_next_cpu() should ideally return -1 (no further IPI needed). However, multiple CPUs invoking tell_cpu_to_push() during LB increment rd->rto_loop_next. Even when rd->rto_cpu is set to -1, the mismatch between rd->rto_loop and rd->rto_loop_next forces rto_next_cpu() to restart its search from -1. With CPU0 remaining overloaded (satisfying rt_nr_migratory && rt_nr_total > 1), it gets reselected, causing CPU0 to queue irq_work to itself and send self-IPIs repeatedly. As long as CPU0 stays overloaded and other CPUs run pull_rt_tasks(), it falls into an infinite self-IPI loop, which triggers a CPU hardlockup due to continuous self-interrupts.

The triggering scenario is as follows:

    cpu0                        cpu1                    cpu2
                                pull_rt_task
                                tell_cpu_to_push
    <------------ irq_work_queue_on
    rto_push_irq_work_func
      push_rt_task
        resched_curr(rq)                                pull_rt_task
      rto_next_cpu                                      tell_cpu_to_push
    <-------------------------- atomic_inc(rto_loop_next)
      rd->rto_loop != next
      rto_next_cpu
    irq_work_queue_on
    rto_push_irq_work_func

Fix redundant self-IPI by filtering the initiating CPU in rto_next_cpu(). This solution has been verified to effectively eliminate spurious self-IPIs and prevent CPU hardlockup scenarios.

Fixes: 4bdced5c9a29 ("sched/rt: Simplify the IPI based RT balancing logic")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Chen Jinghuang <chenjinghuang2@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://patch.msgid.link/20260122012533.673768-1-chenjinghuang2@huawei.com
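A minimal sketch of the filter, placed where rto_next_cpu() picks its next candidate (simplified from the description above):

    /* Inside rto_next_cpu()'s candidate search */
    cpu = cpumask_next(rd->rto_cpu, rd->rto_mask);

    /* New: skip the CPU already running the push irq_work so an
     * overloaded initiator never queues an IPI to itself.
     */
    if (cpu == smp_processor_id())
            cpu = cpumask_next(cpu, rd->rto_mask);

    if (cpu < nr_cpu_ids) {
            rd->rto_cpu = cpu;
            return cpu;
    }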
2026-02-03sched/clock: Avoid false sharing for sched_clock_irqtimeWangyang Guo4-8/+10
Read-mostly sched_clock_irqtime may share the same cacheline with the frequently updated nohz struct. Make it a static key to avoid the false sharing issue.

The only user of disable_sched_clock_irqtime() is tsc_.*mark_unstable(), which may be invoked in atomic context and would require a workqueue to disable a static key. But both of them call clear_sched_clock_stable() just before doing disable_sched_clock_irqtime(). We can reuse "sched_clock_work" to also disable sched_clock_irqtime().

One additional case that needs handling is when the TSC is marked unstable before the late_initcall() phase: sched_clock_work will not be invoked and sched_clock_irqtime will stay enabled although the clock is unstable:

    tsc_init()
      enable_sched_clock_irqtime()          # irqtime accounting is enabled here
      ...
      if (unsynchronized_tsc())             # true
        mark_tsc_unstable()
          clear_sched_clock_stable()
            __sched_clock_stable_early = 0;
            ...
            if (static_key_count(&sched_clock_running.key) == 2)
                                            # Only happens at sched_clock_init_late()
              __clear_sched_clock_stable(); # Never executed
    ...
    # late_initcall() phase
    sched_clock_init_late()
      if (__sched_clock_stable_early)       # Already false
        __set_sched_clock_stable();         # sched_clock is never marked stable
    # TSC unstable, but sched_clock_work won't run to disable irqtime

So we need to disable_sched_clock_irqtime() in sched_clock_init_late() if the clock is unstable.

Reported-by: Benjamin Lei <benjamin.lei@intel.com>
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Wangyang Guo <wangyang.guo@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: K Prateek Nayak <kprateek.nayak@amd.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Reviewed-by: Tianyou Li <tianyou.li@intel.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Link: https://patch.msgid.link/20260127072509.2627346-1-wangyang.guo@intel.com
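For reference, a minimal sketch of the static-key form: the check becomes a patched jump label rather than a load from a shared cacheline, so there is nothing left to falsely share.

    DEFINE_STATIC_KEY_FALSE(sched_clock_irqtime);

    void enable_sched_clock_irqtime(void)
    {
            static_branch_enable(&sched_clock_irqtime);
    }

    static inline bool irqtime_enabled(void)
    {
            /* Compiled as a jump label; no data read on the fast path */
            return static_branch_likely(&sched_clock_irqtime);
    }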
2026-02-03selftests/sched_ext: Add test for DL server total_bw consistencyJoel Fernandes2-0/+282
Add a new kselftest to verify that the total_bw value in /sys/kernel/debug/sched/debug remains consistent across all CPUs under different sched_ext BPF program states:

1. Before a BPF scheduler is loaded
2. While a BPF scheduler is loaded and active
3. After a BPF scheduler is unloaded

The test runs CPU stress threads to ensure DL server bandwidth values stabilize before checking consistency. This helps catch potential issues with DL server bandwidth accounting during sched_ext transitions.

Co-developed-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/20260126100050.3854740-8-arighi@nvidia.com
2026-02-03selftests/sched_ext: Add test for sched_ext dl_serverAndrea Righi3-0/+264
Add a selftest to validate the correct behavior of the deadline server for the ext_sched_class. Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-7-arighi@nvidia.com
2026-02-03sched/debug: Fix dl_server (re)start conditionsPeter Zijlstra2-21/+16
There are two problems with sched_server_write_common() that can cause the dl_server to malfunction upon attempting to change the parameters:

1) when, after having disabled the dl_server by setting runtime=0, it is enabled again while tasks are already enqueued. In this case is_active would still be 0 and dl_server_start() would not be called.
2) when dl_server_apply_params() would fail, runtime is not applied and does not reflect the new state.

Instead have dl_server_start() check its actual dl_runtime, and have sched_server_write_common() unconditionally (re)start the dl_server. It will automatically stop if there isn't anything to do, so spurious activation is harmless -- while failing to start it is a problem.

While there, move the printk out of the locked region and make it symmetric, also printing on enable.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260203103407.GK1282955@noisy.programming.kicks-ass.net
2026-02-03sched/debug: Add support to change sched_ext server paramsJoel Fernandes1-24/+133
When a sched_ext server is loaded, tasks in the fair class are automatically moved to the sched_ext class. Add support to modify the ext server parameters similar to how the fair server parameters are modified. Re-use common code between ext and fair servers as needed. Co-developed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-6-arighi@nvidia.com
2026-02-03sched_ext: Add a DL server for sched_ext tasksAndrea Righi6-23/+109
sched_ext currently suffers starvation due to RT. The same workload when converted to EXT can get zero runtime if RT is 100% running, causing EXT processes to stall. Fix it by adding a DL server for EXT.

A kselftest is also included later to confirm that both DL servers are functioning correctly:

    # ./runner -t rt_stall
    ===== START =====
    TEST: rt_stall
    DESCRIPTION: Verify that RT tasks cannot stall SCHED_EXT tasks
    OUTPUT:
    TAP version 13
    1..1
    # Runtime of FAIR task (PID 1511) is 0.250000 seconds
    # Runtime of RT task (PID 1512) is 4.750000 seconds
    # FAIR task got 5.00% of total runtime
    ok 1 PASS: FAIR task got more than 4.00% of runtime
    TAP version 13
    1..1
    # Runtime of EXT task (PID 1514) is 0.250000 seconds
    # Runtime of RT task (PID 1515) is 4.750000 seconds
    # EXT task got 5.00% of total runtime
    ok 2 PASS: EXT task got more than 4.00% of runtime
    TAP version 13
    1..1
    # Runtime of FAIR task (PID 1517) is 0.250000 seconds
    # Runtime of RT task (PID 1518) is 4.750000 seconds
    # FAIR task got 5.00% of total runtime
    ok 3 PASS: FAIR task got more than 4.00% of runtime
    TAP version 13
    1..1
    # Runtime of EXT task (PID 1521) is 0.250000 seconds
    # Runtime of RT task (PID 1522) is 4.750000 seconds
    # EXT task got 5.00% of total runtime
    ok 4 PASS: EXT task got more than 4.00% of runtime
    ok 1 rt_stall
    # ===== END =====

Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Tested-by: Christian Loehle <christian.loehle@arm.com>
Link: https://patch.msgid.link/20260126100050.3854740-5-arighi@nvidia.com
2026-02-03sched/debug: Stop and start server based on if it was activeJoel Fernandes1-3/+8
Currently the DL server interface for applying parameters checks CFS internals to identify if the server is active. This is error-prone and makes it difficult to add new servers in the future. Fix it by using dl_server_active(), which is also used by the DL server code to determine if the DL server was started. Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Acked-by: Tejun Heo <tj@kernel.org> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-4-arighi@nvidia.com
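A minimal sketch of the pattern, using the fair server as an example (rq->fair_server is a struct sched_dl_entity):

    /* Ask the server itself instead of peeking at CFS internals */
    bool was_active = dl_server_active(&rq->fair_server);

    if (was_active)
            dl_server_stop(&rq->fair_server);

    /* ... apply the new runtime/period parameters ... */

    if (was_active)
            dl_server_start(&rq->fair_server);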
2026-02-03sched/debug: Fix updating of ppos on server write opsJoel Fernandes1-3/+4
Updating "ppos" on error conditions does not make much sense. The pattern is to return the error code directly without modifying the position, or modify the position on success and return the number of bytes written. Since on success, the return value of apply is 0, there is no point in modifying ppos either. Fix it by removing all this and just returning error code or number of bytes written on success. Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Juri Lelli <juri.lelli@redhat.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Acked-by: Tejun Heo <tj@kernel.org> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-3-arighi@nvidia.com