CVE-2024-39476: md/raid5: fix deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING
In the Linux kernel, the following vulnerability has been resolved:
md/raid5: fix deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING
Xiao reported that the lvm2 test lvconvert-raid-takeover.sh can occasionally hang; the root cause is exactly the same as in commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in raid5d"").
However, Dan reported another hang after that, and Junxiao investigated the problem and found that it is caused by plugged bios that can't be issued from raid5d().
The current implementation of raid5d() has an odd circular dependency:
1) md_check_recovery() from raid5d() must hold 'reconfig_mutex' to clear MD_SB_CHANGE_PENDING;
2) raid5d() handles IO in a busy loop until all IO is issued;
3) IO from raid5d() must wait for MD_SB_CHANGE_PENDING to be cleared.
This behaviour was introduced before v2.6. As a consequence, if another context holds 'reconfig_mutex' and md_check_recovery() can't update the superblock, raid5d() will spin in that loop, consuming 100% of one CPU, until 'reconfig_mutex' is released.
Following the implementation in raid1 and raid10, fix this problem by skipping IO submission if MD_SB_CHANGE_PENDING is still set after md_check_recovery(); the daemon thread will be woken up when 'reconfig_mutex' is released. This fixes the hang as well.
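The difference between the old and the fixed behaviour can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel patch: the struct, the function names (model_mddev, model_check_recovery, model_daemon_run) and the max_spins bound are all hypothetical, and a plain boolean stands in for 'reconfig_mutex' being held by another context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_mddev {
    bool reconfig_mutex_held;  /* true while another context holds 'reconfig_mutex' */
    bool sb_change_pending;    /* stands in for MD_SB_CHANGE_PENDING */
    int  queued_io;            /* IO waiting to be issued by the daemon */
    int  issued_io;            /* IO the daemon managed to issue */
};

/* Models md_check_recovery(): the flag is only cleared if the mutex is free. */
static void model_check_recovery(struct model_mddev *m)
{
    if (m->reconfig_mutex_held)
        return;                    /* lock not available: superblock not updated */
    m->sb_change_pending = false;  /* superblock written, flag cleared */
}

/*
 * Models one wakeup of the daemon thread and returns the number of loop
 * iterations it consumed. max_spins is an assumption of this model that
 * bounds the old behaviour so the function terminates.
 */
static int model_daemon_run(struct model_mddev *m, bool skip_when_pending,
                            int max_spins)
{
    int spins = 0;

    while (m->queued_io > 0 && spins < max_spins) {
        spins++;
        model_check_recovery(m);
        if (m->sb_change_pending) {
            if (skip_when_pending)
                break;     /* fixed: stop and wait to be woken up again */
            continue;      /* old: IO cannot be issued, so loop again */
        }
        m->queued_io--;    /* flag is clear: issue one IO */
        m->issued_io++;
    }
    return spins;
}
```

In this model, calling model_daemon_run() with skip_when_pending set to false while reconfig_mutex_held is true burns every iteration up to max_spins without issuing any IO, which mirrors the busy loop described above. With skip_when_pending set to true the function returns after a single iteration, and a later call made once reconfig_mutex_held is false issues all queued IO.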
Other sources
Linux Kernel is vulnerable to a denial of service, caused by an error related to deadlock that raid5d() wait for itself to clear MD_SB_CHANGE_PENDING. A local authenticated attacker could exploit this vulnerability to cause a denial of service.
— IBM
Frequently Asked Questions
What is the severity of CVE-2024-39476?
The severity of CVE-2024-39476 is currently classified as medium due to its potential to cause system hangs under specific conditions.
How do I fix CVE-2024-39476?
To fix CVE-2024-39476, upgrade your Linux kernel to a version that has resolved this vulnerability.
Which Linux kernel versions are affected by CVE-2024-39476?
CVE-2024-39476 affects multiple Linux kernel versions between 4.19 and 6.9.5.
What types of vulnerabilities does CVE-2024-39476 address?
CVE-2024-39476 addresses a deadlock condition that specifically impacts the md/raid5 subsystem.
Is there a workaround for CVE-2024-39476?
No specific workaround is recommended for CVE-2024-39476; upgrading the kernel is the advised course of action.