CVE-2024-26958: nfs: fix UAF in direct writes
In the Linux kernel, the following vulnerability has been resolved:
nfs: fix UAF in direct writes
In production we have been hitting the following warning consistently:
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0
Workqueue: nfsiod nfs_direct_write_schedule_work [nfs]
RIP: 0010:refcount_warn_saturate+0x9c/0xe0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __warn+0x9f/0x130
 ? refcount_warn_saturate+0x9c/0xe0
 ? report_bug+0xcc/0x150
 ? handle_bug+0x3d/0x70
 ? exc_invalid_op+0x16/0x40
 ? asm_exc_invalid_op+0x16/0x20
 ? refcount_warn_saturate+0x9c/0xe0
 nfs_direct_write_schedule_work+0x237/0x250 [nfs]
 process_one_work+0x12f/0x4a0
 worker_thread+0x14e/0x3b0
 ? ZSTD_getCParams_internal+0x220/0x220
 kthread+0xdc/0x120
 ? __btf_name_valid+0xa0/0xa0
 ret_from_fork+0x1f/0x30
This is because we're completing the nfs_direct_request twice in a row.
The source of this is when we have our commit requests to submit, we process them and send them off, and then in the completion path for the commit requests we have
    if (nfs_commit_end(cinfo.mds))
        nfs_direct_write_complete(dreq);
However, since we're submitting asynchronous requests, we sometimes have one that completes before we submit the next one, so we end up calling complete on the nfs_direct_request twice.
The only other place we use nfs_generic_commit_list() is in __nfs_commit_inode, which wraps this call in a
    nfs_commit_begin();
    nfs_commit_end();
Which is a common pattern for this style of completion handling, one that is also repeated in the direct code with get_dreq()/put_dreq() calls around where we process events as well as in the completion paths.
Fix this by using the same pattern for the commit requests.
Before with my 200 node rocksdb stress running this warning would pop every 10ish minutes. With my patch the stress test has been running for several hours without popping.
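The race and the begin/end fix described above can be modeled in user space. The following is a minimal sketch in C, not the kernel code: struct commit_tracker, commit_begin(), commit_end(), commit_done() and submit_all() are hypothetical stand-ins for the kernel's bookkeeping, and every request is assumed to complete before the next one is submitted, which is the ordering that triggers the bug.

```c
#include <assert.h>

/* Hypothetical stand-in for the commit bookkeeping; not the kernel's types. */
struct commit_tracker {
    int outstanding;  /* commits in flight, plus the submitter's hold if any */
    int completions;  /* how many times the direct request was completed */
};

static void commit_begin(struct commit_tracker *t)
{
    t->outstanding++;
}

/* Returns 1 when the caller dropped the last outstanding count. */
static int commit_end(struct commit_tracker *t)
{
    assert(t->outstanding > 0);
    return --t->outstanding == 0;
}

/* Completion path of a single commit request. */
static void commit_done(struct commit_tracker *t)
{
    if (commit_end(t))
        t->completions++;  /* models completing the direct request */
}

/*
 * Submit nreq commits, each of which finishes before the next is sent.
 * With hold != 0 the submitter wraps the loop in begin/end, so the
 * count cannot reach zero until every request has been submitted.
 */
static int submit_all(int nreq, int hold)
{
    struct commit_tracker t = { 0, 0 };

    if (hold)
        commit_begin(&t);
    for (int i = 0; i < nreq; i++) {
        commit_begin(&t);  /* request i is submitted */
        commit_done(&t);   /* and completes immediately */
    }
    if (hold && commit_end(&t))
        t.completions++;
    return t.completions;
}
```

Under these assumptions, submit_all(2, 0) returns 2 (the request is completed twice, as in the bug), while submit_all(2, 1) returns 1.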
Other sources
NVD: same description as above.
In the Linux kernel, the following vulnerability has been resolved:
nfs: fix UAF in direct writes
The Linux kernel CVE team has assigned CVE-2024-26958 to this issue.
Upstream advisory: https://lore.kernel.org/linux-cve-announce/2024050129-CVE-2024-26958-6c15@gregkh/T
— Red Hat
Frequently Asked Questions
What is the severity of CVE-2024-26958?
CVE-2024-26958 is considered a moderate severity vulnerability that involves a use-after-free issue in the Linux kernel.
How do I fix CVE-2024-26958?
To fix CVE-2024-26958, update to a kernel that includes the fix: version 5.10.215, 5.15.154, 6.1.84, 6.6.24, 6.7.12, or 6.8.3, or a later release in the same stable series.
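As a rough aid, a release string such as the output of `uname -r` can be compared against the fixed versions listed above. This is only a sketch: has_fix() and the fixed[] table are hypothetical names, the table contains only the versions given in this answer, and distribution kernels often backport fixes without changing the upstream version number, so a result of 0 is a hint rather than proof of exposure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fixed stable releases as listed in the answer above. */
struct fixed_release { int major, minor, patch; };

static const struct fixed_release fixed[] = {
    {5, 10, 215}, {5, 15, 154}, {6, 1, 84},
    {6, 6, 24},   {6, 7, 12},   {6, 8, 3},
};

/*
 * Returns 1 if the release is at or past the listed fix for its series,
 * 0 if it is older than the listed fix, and -1 if the string cannot be
 * parsed or the series is not in the table (status unknown here).
 */
static int has_fix(const char *release)
{
    int major, minor, patch = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d.%d", &major, &minor, &patch) < 2)
        return -1;
    for (size_t i = 0; i < sizeof(fixed) / sizeof(fixed[0]); i++) {
        if (fixed[i].major == major && fixed[i].minor == minor)
            return patch >= fixed[i].patch;
    }
    return -1;
}
```

For example, has_fix("6.1.84") returns 1, has_fix("6.1.83") returns 0, and a series that is not in the table returns -1.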
Is there a risk of exploitation for CVE-2024-26958?
Yes, there is a potential risk for exploitation of CVE-2024-26958, especially in production environments that use affected kernel versions.
What does use-after-free mean in the context of CVE-2024-26958?
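Use-after-free is a programming error in which code continues to use memory after it has been freed. In CVE-2024-26958, the NFS direct write path can complete the same nfs_direct_request twice, so its reference count underflows and the request may be freed while it is still in use.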
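The link between a double completion and a reference-count underflow can be illustrated with a small user-space model. This is a sketch under stated assumptions, not kernel code: struct object, object_put() and double_complete() are hypothetical, and the "free" is only recorded in a flag so the example stays safe to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of a reference-counted request. */
struct object {
    int refs;      /* remaining references */
    bool freed;    /* set when the last reference is dropped */
    int warnings;  /* number of underflow attempts detected */
};

static void object_put(struct object *obj)
{
    assert(obj != NULL);
    if (obj->refs <= 0) {
        /* No references left: the object may already have been freed. */
        obj->warnings++;
        fprintf(stderr, "refcount underflow; possible use-after-free\n");
        return;
    }
    if (--obj->refs == 0)
        obj->freed = true;  /* model only: real code would free here */
}

/* Completing the same request twice drops one reference too many. */
static int double_complete(void)
{
    struct object dreq = { .refs = 1, .freed = false, .warnings = 0 };

    object_put(&dreq);  /* first completion drops the last reference */
    object_put(&dreq);  /* second completion hits the underflow check */
    return dreq.warnings;
}
```

In this model the second object_put() is the counterpart of the point at which the kernel prints the refcount_t underflow warning shown in the description.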