Executive Summary
The AI revolution is happening whether you're on board or not, which means something you're doing requires GPUs, probably something business critical. So when privilege-escalation and denial-of-service vulnerabilities surfaced in NVIDIA's GPU kernel modules this fall, most organizations learned about GPU attack surfaces the hard way: through CVE notices rather than telemetry.
Here's what actually broke, how attackers could have exploited it, and why waiting for patches isn't a strategy your organization should count on. Multiple privilege-escalation and denial-of-service vulnerabilities, collectively referred to as CUDA de Grâce, were uncovered in NVIDIA's Linux GPU kernel drivers. In this analysis, we detail the root-cause investigations and demonstrate how exploitation attempts can be detected at runtime through Stealthium's GPU observability platform, leveraging low-level telemetry and behavioral analytics.
The vulnerabilities were identified by Valentina Palmiotti and Sam Lovejoy in NVIDIA's open-source GPU kernel modules. These issues enable unprivileged local attackers to escalate privileges and cause denial of service. Although NVIDIA released fixes in its October 2025 driver update, the sophistication of these vulnerabilities underscores the importance of continuous runtime monitoring and detection, capabilities natively provided by Stealthium to protect AI and GPU-accelerated workloads in production environments.
CVE-2025-23282: Race Condition Leading to Privilege Escalation
Impact
Privilege escalation from unprivileged user space to the kernel, breaking the isolation layers provided by container-based sandboxing. CVSS score: 7.0 (AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H).
A public demonstration of exploitation inside an Azure GPU VM has confirmed that the vulnerability is reliably exploitable for local privilege escalation in real-world environments.
Root Cause Analysis
Multiple race conditions were addressed in the September 2025 driver release. The specific vulnerability under analysis was located in the handling of the NV_ESC_ATTACH_GPUS_TO_FD ioctl (command 212) in the nvidia_ioctl function in kernel-open/nvidia/nv.c.
In the vulnerable code path, a buffer is allocated via NV_KMALLOC and its pointer is stored in the file-handle context nvlfp->attached_gpus:
File: kernel-open/nvidia/nv.c
Commit: 87c0b1247370e42bd22bb487a683ec513a177b3b
2528 case NV_ESC_ATTACH_GPUS_TO_FD:
2529 {
....
2546
2547 NV_KMALLOC(nvlfp->attached_gpus, arg_size);
User-provided data is then copied into the newly allocated buffer:
2553 memcpy(nvlfp->attached_gpus, arg_copy, arg_size);
2554 nvlfp->num_attached_gpus = num_arg_gpus;
Each GPU ID in the buffer is mapped to a device reference. On failure (line 2563), the allocated buffer is freed (line 2571) and nvlfp->num_attached_gpus is cleared before returning:
2556 for (i = 0; i < nvlfp->num_attached_gpus; i++)
2557 {
2558 if (nvlfp->attached_gpus[i] == 0)
2559 {
2560 continue;
2561 }
2562
2563 if (nvidia_dev_get(nvlfp->attached_gpus[i], sp))
2564 {
2565 while (i--)
2566 {
2567 if (nvlfp->attached_gpus[i] != 0)
2568 nvidia_dev_put(nvlfp->attached_gpus[i], sp);
2569 }
2570
2571 NV_KFREE(nvlfp->attached_gpus, arg_size);
2572 nvlfp->num_attached_gpus = 0;
2573
2574 status = -EINVAL;
2575 break;
2576 }
2577 }
No synchronization primitives protect this code path. Because the ioctl may be invoked concurrently by multiple threads, nvlfp (and its members attached_gpus and num_attached_gpus) can be read and written concurrently. This allows interleavings in which one thread's allocation and pointer write are overwritten by another thread before deallocation occurs, producing memory leaks, use-after-free conditions, and reliable double-free primitives.
Attack vectors
- Memory-leak vector: When multiple threads concurrently invoke NV_ESC_ATTACH_GPUS_TO_FD, earlier allocations are overwritten by subsequent assignments to nvlfp->attached_gpus without being freed, which can exhaust kernel memory.
- Use-after-free and double-free vector (race): A plausible interleaving:
  - Thread A allocates a buffer at address A and copies the user data into it.
  - Thread B allocates a buffer at address B and overwrites nvlfp->attached_gpus with B.
  - Thread A enters the error path and executes NV_KFREE(nvlfp->attached_gpus), attempting to free A but actually freeing B.
  - Thread B then hits its own error path and frees nvlfp->attached_gpus again, freeing B twice.
An attacker who can control the userspace data and force predictable reuse of the freed memory regions can manipulate kernel memory and potentially achieve privilege escalation.
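To make the interleaving concrete, here is a toy userspace C program that mimics the same unsynchronized publish-then-free pattern on a shared pointer. This is deliberately not the NVIDIA code path or an exploit, just the bug class in miniature; under the right interleaving, the allocator aborts on the double free:

/*
 * Toy illustration of the unsynchronized pointer handoff described above.
 * Two threads race on a shared pointer field with no lock, so one thread's
 * free() can operate on the other thread's allocation (and the overwritten
 * allocation is leaked, matching the memory-leak vector).
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct file_ctx {
    char *attached;  /* analogous to nvlfp->attached_gpus */
};

static struct file_ctx ctx;  /* shared, deliberately unprotected */

static void *racer(void *arg)
{
    (void)arg;
    /* Mirrors the ioctl body: allocate, publish the pointer, then take
     * the error path and free whatever the shared field points at. */
    char *buf = malloc(64);
    memset(buf, 0, 64);
    ctx.attached = buf;    /* may overwrite the other thread's buffer (leak) */
    /* ... window in which the other thread can republish ctx.attached ... */
    free(ctx.attached);    /* may free the OTHER thread's buffer, which that
                              thread then frees again: a double free */
    ctx.attached = NULL;
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, racer, NULL);
    pthread_create(&b, NULL, racer, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    puts("survived this run; the losing interleaving aborts on double free");
    return 0;
}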
NVIDIA's Fix
NVIDIA addressed the issue in driver version 580.95.05 (released September 30, 2025; commit 2b43605; October 2025 Security Bulletin) by introducing a semaphore around the attached_gpus state, serializing concurrent allocations and frees:
Commit: 2b436058a616676ec888ef3814d1db6b2220f2eb
@@ -2538,8 +2544,12 @@ nvidia_ioctl(
goto done;
}
+ /* atomically check and alloc attached_gpus */
+ down(&nvl->ldata_lock);
+
if (nvlfp->num_attached_gpus != 0)
{
+ up(&nvl->ldata_lock);
status = -EINVAL;
goto done;
}
@@ -2547,12 +2557,15 @@ nvidia_ioctl(
NV_KMALLOC(nvlfp->attached_gpus, arg_size);
if (nvlfp->attached_gpus == NULL)
{
+ up(&nvl->ldata_lock);
status = -ENOMEM;
goto done;
}
memcpy(nvlfp->attached_gpus, arg_copy, arg_size);
nvlfp->num_attached_gpus = num_arg_gpus;
+ up(&nvl->ldata_lock);
+
for (i = 0; i < nvlfp->num_attached_gpus; i++)
{
if (nvlfp->attached_gpus[i] == 0)
@@ -2568,9 +2581,14 @@ nvidia_ioctl(
nvidia_dev_put(nvlfp->attached_gpus[i], sp);
}
+ /* atomically free attached_gpus */
+ down(&nvl->ldata_lock);
+
NV_KFREE(nvlfp->attached_gpus, arg_size);
nvlfp->num_attached_gpus = 0;
+ up(&nvl->ldata_lock);
+
status = -EINVAL;
break;
}
Stealthium Detection Strategy
Stealthium's GPU runtime observability platform detects exploitation attempts against CVE-2025-23282 at the lowest levels of the stack using layered telemetry and behavioral analytics. Here's how we detect exploitation of this vulnerability:
Ioctl Call Monitoring
Stealthium introspects ioctl calls on NVIDIA device files (for example, /dev/nvidiactl) by attaching eBPF probes to the kernel entry and exit points of the ioctl handler. For CVE-2025-23282, the following heuristic was implemented:
Detection heuristic: We monitor for rapid, overlapping invocations of NV_ESC_ATTACH_GPUS_TO_FD (ioctl 212) on the same underlying file object, rather than just the integer file descriptor. Since file descriptors can be inherited or passed between processes (fork, dup, pidfd_getfd, etc.), tracking the kernel file object provides a reliable way to detect this race, even when attackers coordinate across processes.
Typical NVIDIA GPU workloads do not exhibit this ioctl pattern, and the concurrency required to exploit this vulnerability is strongly indicative of malicious activity.
Telemetry Captured:
- Kernel file object
- Ioctl command and argument size
- Timing/overlap information
- Optional: PID/TID (for context, not for detection)
By correlating activity on the same kernel file object with overlapping timestamps, a high-confidence signal can be raised when multiple threads or processes attempt NV_ESC_ATTACH_GPUS_TO_FD concurrently. Even a single misbehaving program that legitimately races this ioctl may indicate corruption; if subsequent anomalous behavior is observed, it can be traced back to this event.
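As a rough illustration, the kernel side of such a probe might look like the following libbpf CO-RE sketch. The map layout, the kprobe-ability of the nvidia_ioctl symbol across driver builds, and the entry/exit pairing are simplifying assumptions here; Stealthium's production probes track considerably more state:

/* Minimal libbpf CO-RE sketch of the heuristic above. Assumptions: the
 * nvidia_ioctl symbol is visible to kprobes in the running driver build,
 * __TARGET_ARCH_* is defined at compile time, and the driver encodes
 * NV_ESC_ATTACH_GPUS_TO_FD as ioctl number 212 as in the analyzed source. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

#define NV_ESC_ATTACH_GPUS_TO_FD 212

struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 4096);
    __type(key, u64);   /* kernel struct file pointer, not the fd */
    __type(value, u32); /* in-flight invocation count */
} inflight SEC(".maps");

/* nvidia_ioctl(struct inode *inode, struct file *file,
 *              unsigned int cmd, unsigned long i_arg) */
SEC("kprobe/nvidia_ioctl")
int BPF_KPROBE(attach_entry, struct inode *inode, struct file *file,
               unsigned int cmd)
{
    u64 key = (u64)file;
    u32 one = 1, *cnt;

    /* _IOC_NR(cmd): the command number lives in the low 8 bits */
    if ((cmd & 0xff) != NV_ESC_ATTACH_GPUS_TO_FD)
        return 0;

    cnt = bpf_map_lookup_elem(&inflight, &key);
    if (!cnt) {
        bpf_map_update_elem(&inflight, &key, &one, BPF_ANY);
        return 0;
    }
    if (__sync_fetch_and_add(cnt, 1) >= 1)
        /* Overlapping NV_ESC_ATTACH_GPUS_TO_FD calls on the same
         * struct file: exactly the racy pattern this CVE requires. */
        bpf_printk("overlapping attach on file %llx", key);
    return 0;
}

SEC("kretprobe/nvidia_ioctl")
int BPF_KRETPROBE(attach_exit)
{
    /* Sketch only: a production probe pairs this with the entry above
     * (e.g., via a per-task map) and decrements the per-file count. */
    return 0;
}

char LICENSE[] SEC("license") = "GPL";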
CVE-2025-23332: Incorrect ZERO_SIZE_PTR Handling
Impact
This vulnerability allows an unprivileged user to trigger a denial of service in the NVIDIA kernel driver, leading to GPU driver crashes that can disrupt other applications, including workloads running inside containers. The issue carries a CVSS score of 5.0 (AV:L/AC:L/PR:L/UI:R/S:U/C:N/I:N/A:H).
Root Cause Analysis
The bug originates in the nvidia_ioctl function within kernel-open/nvidia/nv.c, during the handling of the ioctl command NV_ESC_WAIT_OPEN_COMPLETE (218).
The function allocates memory to hold user-provided data (line 2438):
File: kernel-open/nvidia/nv.c
Commit: 87c0b1247370e42bd22bb487a683ec513a177b3b
2376 int
2377 nvidia_ioctl(
2378 struct inode *inode,
2379 struct file *file,
2380 unsigned int cmd,
2381 unsigned long i_arg)
2382 {
...
2405 arg_size = _IOC_SIZE(cmd);
2406 arg_cmd = _IOC_NR(cmd);
...
2438 NV_KMALLOC(arg_copy, arg_size);
2439 if (arg_copy == NULL)
2440 {
2441 nv_printf(NV_DBG_ERRORS, "NVRM: failed to allocate ioctl memory\n");
2442 status = -ENOMEM;
2443 goto done_early;
2444 }
It then copies the user-provided data into the allocated buffer:
2446 if (NV_COPY_FROM_USER(arg_copy, arg_ptr, arg_size))
2447 {
2448 nv_printf(NV_DBG_ERRORS, "NVRM: failed to copy in ioctl data!\n");
2449 status = -EFAULT;
2450 goto done_early;
2451 }
Finally, when processing NV_ESC_WAIT_OPEN_COMPLETE, it writes the member variables open_rc and adapter_status into the freshly allocated memory:
2457 if (arg_cmd == NV_ESC_WAIT_OPEN_COMPLETE)
2458 {
2459 nv_ioctl_wait_open_complete_t *params = arg_copy;
2460
2461 params->rc = nvlfp->open_rc;
2462 params->adapterStatus = nvlfp->adapter_status;
2463 goto done_early;
2464 }
At first glance, nothing appears wrong in the code above. However, there's a subtle kernel behavior at play:
When kmalloc() is called with a zero-size allocation, it does not return NULL; it instead returns a special pointer called ZERO_SIZE_PTR (defined as address 0x10). Because the driver only checks for NULL (if (arg_copy == NULL)), it fails to detect this invalid allocation and continues execution. The subsequent NV_COPY_FROM_USER does nothing since the size is zero, but the real issue appears at line 2461, where the code writes through params.
At that point, params is ZERO_SIZE_PTR (0x10), so the write to params->rc triggers a page fault, crashing the NVIDIA kernel driver.
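As a minimal kernel-style sketch of the pitfall and of the generic guard the kernel provides for it (the ZERO_OR_NULL_PTR() macro from <linux/slab.h>), consider the illustrative function below; the function itself is hypothetical, and NVIDIA's actual fix, shown in the next section, validates the request size instead:

/* Illustrative only: shows the buggy NULL-only check next to the
 * idiomatic guard. ZERO_SIZE_PTR and ZERO_OR_NULL_PTR() are real
 * definitions from <linux/slab.h>. */
#include <linux/errno.h>
#include <linux/slab.h>

static int demo_alloc(size_t arg_size)
{
    void *arg_copy = kmalloc(arg_size, GFP_KERNEL);

    /* Buggy pattern: for arg_size == 0, kmalloc() returns ZERO_SIZE_PTR
     * ((void *)16), which is not NULL, so this check passes silently. */
    if (arg_copy == NULL)
        return -ENOMEM;

    /* Generic kernel idiom that also rejects ZERO_SIZE_PTR; any later
     * dereference of arg_copy would otherwise fault at address 0x10. */
    if (ZERO_OR_NULL_PTR(arg_copy))
        return -EINVAL;

    /* ... use arg_copy ... */
    kfree(arg_copy);  /* kfree() itself handles ZERO_SIZE_PTR safely */
    return 0;
}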
NVIDIA's fix
NVIDIA addressed the issue in driver version 580.95.05 (released September 30, 2025; commit 2b43605; October 2025 Security Bulletin) by adding explicit size validation for the NV_ESC_WAIT_OPEN_COMPLETE ioctl. The new guard rejects improperly sized requests, including the zero-size requests that previously produced a ZERO_SIZE_PTR dereference:
Commit: 2b436058a616676ec888ef3814d1db6b2220f2eb
@@ -2458,6 +2458,12 @@ nvidia_ioctl(
{
nv_ioctl_wait_open_complete_t *params = arg_copy;
+ if (arg_size != sizeof(nv_ioctl_wait_open_complete_t))
+ {
+ status = -EINVAL;
+ goto done_early;
+ }
+
This simple check prevents the code path from dereferencing ZERO_SIZE_PTR when kmalloc(0) is used and ensures that only properly formed ioctl requests are processed.
Stealthium's Detection Strategy
Stealthium's observability stack was designed to catch both the exploit attempts that target this class of bug and the resulting impact when they succeed. Below are the pragmatic detection layers we apply for CVE-2025-23332.
Ioctl Parameter Validation
We monitor ioctl invocations for NV_ESC_WAIT_OPEN_COMPLETE (ioctl 218) and flag any call whose arg_size does not match the expected payload size, i.e., sizeof(nv_ioctl_wait_open_complete_t). Zero-size requests are trivial to spot and highly anomalous for this ioctl.
Telemetry Captured:
- Ioctl command number and argument size
- Calling process details (PID, UID, executable path, command line)
- NVIDIA driver version
A lightweight eBPF probe on ioctl entry is sufficient to collect these fields with minimal overhead. When we observe an incorrectly sized call to ioctl NV_ESC_WAIT_OPEN_COMPLETE from an untrusted binary or from a process on an unpatched host, we can raise a high-confidence alert.
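The core predicate is only a few lines of C, sketched below. The struct layout here is an assumption for illustration; the authoritative definition lives in NVIDIA's driver headers:

/* Sketch of the size check applied to observed ioctl commands. The
 * struct fields mirror the rc/adapterStatus members discussed above,
 * but the exact layout is an assumption for illustration. */
#include <linux/ioctl.h>
#include <stdbool.h>
#include <stdint.h>

#define NV_ESC_WAIT_OPEN_COMPLETE 218

typedef struct {
    int      rc;             /* open_rc copied back to the caller */
    uint32_t adapterStatus;  /* adapter_status copied back */
} nv_ioctl_wait_open_complete_t;

/* Flag any NV_ESC_WAIT_OPEN_COMPLETE whose encoded size does not match
 * the expected payload; arg_size == 0 is the CVE-2025-23332 trigger. */
static bool anomalous_wait_open_complete(unsigned int cmd)
{
    return _IOC_NR(cmd) == NV_ESC_WAIT_OPEN_COMPLETE &&
           _IOC_SIZE(cmd) != sizeof(nv_ioctl_wait_open_complete_t);
}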
Conclusion
The vulnerabilities discovered by security researchers in NVIDIA's GPU kernel drivers demonstrate the expanding attack surface of GPU-accelerated computing infrastructure. As AI workloads become increasingly critical to business operations, the security implications of GPU driver vulnerabilities grow correspondingly severe.
Traditional security approaches focused solely on patching are insufficient given:
- The lag between vulnerability discovery and patch deployment
- The complexity of GPU driver ecosystems across multiple branches
- The sophistication of modern exploitation techniques
- The multi-tenant nature of cloud GPU environments
Stealthium's comprehensive GPU observability platform addresses these challenges by providing:
- Real-time detection of exploitation attempts against known vulnerabilities
- Behavioral anomaly detection capable of identifying zero-day attacks
- Deep visibility across the entire NVIDIA software stack (driver, CUDA, frameworks)
- Production-safe deployment with minimal performance impact
By correlating low-level GPU telemetry with high-level workload context, Stealthium transforms raw GPU metrics into actionable security intelligence, enabling organizations to defend their AI infrastructure against both known and emerging threats.
GPU security may be new territory for your organization, but your most innovative digital assets cannot wait until a compromise forces the issue. Stealthium gives you the confidence you need to run workloads on GPUs. Get in touch today.
