diff options
| author | Thomas Gleixner <tglx@kernel.org> | 2026-04-08 13:53:46 +0200 |
|---|---|---|
| committer | Thomas Gleixner <tglx@kernel.org> | 2026-05-01 21:36:11 +0200 |
| commit | bd5956166d20adbde3af0f6f265dc2f0ce5f4df9 (patch) | |
| tree | bd557131cd5fd11d86272dcd1031bf49597388d9 /include | |
| parent | 254f49634ee16a731174d2ae34bc50bd5f45e731 (diff) | |
hrtimer: Provide hrtimer_start_range_ns_user()
Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space. He provided a reproducer, which set's up a timerfd based
timer and then rearms it in a loop with an absolute expiry time of 1ns.
As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in a endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.
The clockevents code already has a last resort mechanism to prevent that,
but it's sensible to catch such issues before trying to reprogram the clock
event device.
Provide a variant of hrtimer_start_range_ns(), which sanity checks the
timer after queueing it. It does not so before because the timer might be
armed and therefore needs to be dequeued. also we optimize for the latest
possible point to check, so that the clock event prevention is avoided as
much as possible.
If the timer is already expired _before_ the clock event is reprogrammed,
remove the timer from the queue and signal to the caller that the operation
failed by returning false.
That allows the caller to take immediate action without going through the
loops and hoops of the hrtimer interrupt.
The queueing code can't invoke the timer callback as the caller might hold
a lock which is taken in the callback.
Add a tracepoint which allows to analyze the expired at start situation.
Reported-by: Calvin Owens <calvin@wbinvd.org>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Tested-by: Calvin Owens <calvin@wbinvd.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20260408114951.995031895@kernel.org
Diffstat (limited to 'include')
| -rw-r--r-- | include/linux/hrtimer.h | 20 | ||||
| -rw-r--r-- | include/trace/events/timer.h | 13 |
2 files changed, 30 insertions, 3 deletions
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h index 9ced498fefaa..66cc98c773cf 100644 --- a/include/linux/hrtimer.h +++ b/include/linux/hrtimer.h @@ -206,6 +206,9 @@ static inline void destroy_hrtimer_on_stack(struct hrtimer *timer) { } extern void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim, u64 range_ns, const enum hrtimer_mode mode); +extern bool hrtimer_start_range_ns_user(struct hrtimer *timer, ktime_t tim, + u64 range_ns, const enum hrtimer_mode mode); + /** * hrtimer_start - (re)start an hrtimer * @timer: the timer to be added @@ -223,17 +226,28 @@ static inline void hrtimer_start(struct hrtimer *timer, ktime_t tim, extern int hrtimer_cancel(struct hrtimer *timer); extern int hrtimer_try_to_cancel(struct hrtimer *timer); -static inline void hrtimer_start_expires(struct hrtimer *timer, - enum hrtimer_mode mode) +static inline void hrtimer_start_expires(struct hrtimer *timer, enum hrtimer_mode mode) { - u64 delta; ktime_t soft, hard; + u64 delta; + soft = hrtimer_get_softexpires(timer); hard = hrtimer_get_expires(timer); delta = ktime_to_ns(ktime_sub(hard, soft)); hrtimer_start_range_ns(timer, soft, delta, mode); } +static inline bool hrtimer_start_expires_user(struct hrtimer *timer, enum hrtimer_mode mode) +{ + ktime_t soft, hard; + u64 delta; + + soft = hrtimer_get_softexpires(timer); + hard = hrtimer_get_expires(timer); + delta = ktime_to_ns(ktime_sub(hard, soft)); + return hrtimer_start_range_ns_user(timer, soft, delta, mode); +} + void hrtimer_sleeper_start_expires(struct hrtimer_sleeper *sl, enum hrtimer_mode mode); diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h index 07cbb9836b91..ca82fd62dc30 100644 --- a/include/trace/events/timer.h +++ b/include/trace/events/timer.h @@ -299,6 +299,19 @@ DECLARE_EVENT_CLASS(hrtimer_class, ); /** + * hrtimer_start_expired - Invoked when a expired timer was started + * @hrtimer: pointer to struct hrtimer + * + * Preceeded by a hrtimer_start tracepoint. + */ +DEFINE_EVENT(hrtimer_class, hrtimer_start_expired, + + TP_PROTO(struct hrtimer *hrtimer), + + TP_ARGS(hrtimer) +); + +/** * hrtimer_expire_exit - called immediately after the hrtimer callback returns * @hrtimer: pointer to struct hrtimer * |
