In the Linux kernel, the following vulnerability has been resolved:
block: fix race between wbt_enable_default and IO submission
When wbt_enable_default() is moved out of queue freezing in elevator_change(), it can cause the wbt inflight counter to become negative (-1), leading to hung tasks in the writeback path. Tasks get stuck in wbt_wait() because the counter is in an inconsistent state.
The issue occurs because wbt_enable_default() could race with IO submission, allowing the counter to be decremented before proper initialization. This manifests as:
rq_wait[0]: inflight: -1 has_waiters: True
rwb_enabled() checks the state, which can be updated exactly between wbt_wait() (rq_qos_throttle()) and wbt_track()(rq_qos_track()), then the inflight counter will become negative.
And results in hung task warnings like: task:kworker/u24:39 state:D stack:0 pid:14767 Call Trace: rq_qos_wait+0xb4/0x150 wbt_wait+0xa9/0x100 __rq_qos_throttle+0x24/0x40 blk_mq_submit_bio+0x672/0x7b0 ...
Fix this by:
- Splitting wbt_enable_default() into:
- __wbt_enable_default(): Returns true if wbt_init() should be called
- wbt_enable_default(): Wrapper for existing callers (no init)
- wbt_init_enable_default(): New function that checks and inits WBT
- Using wbt_init_enable_default() in blk_register_queue() to ensure
- Move wbt_init() out of wbt_enable_default() which is only for enabling
- Removing the ELEVATOR_FLAG_ENABLE_WBT_ON_EXIT flag and its handling
This ensures WBT is properly initialized before any IO can be submitted, preventing the counter from going negative.