Your AI Stack Is Now an Attack Surface: NVIDIAScape, ShadowRay 2.0, and How to Get Alerted First
In 2025, attackers turned AI infrastructure into a target — a container escape in the NVIDIA toolkit that ~37% of clouds run (CVE-2025-23266), a self-propagating botnet hijacking exposed Ray clusters (CVE-2023-48022), an RCE in Ollama model servers (CVE-2024-37032), and a public DeepSeek database leaking chat history and API keys. The thread isn't exotic AI exploits — it's classic cloud-security failure at AI scale. Here are the real CVEs, with authoritative advisories, and exactly how EchelonGraph surfaces and alerts on each.
EchelonGraph
Founder
TL;DR. 2025 was the year attackers stopped *theorizing* about AI risk and started *exploiting* it at scale. A three-line container escape in the NVIDIA Container Toolkit that roughly 37% of cloud environments run (CVE-2025-23266, CVSS 9.0). A two-year-old unauthenticated RCE in the Ray AI framework, now powering a self-propagating botnet, with 200,000+ Ray servers found exposed to the internet (CVE-2023-48022, CVSS 9.8). A path-traversal RCE in Ollama model servers (CVE-2024-37032, CVSS 8.8). And one of the most talked-about AI labs in the world, DeepSeek, leaving a production database open to the internet with plaintext chat history and API keys inside. None of these were exotic "AI exploits." They were classic cloud-security failures (exposed assets, unpatched known CVEs, and zero runtime detection) at AI scale. That is exactly the failure mode EchelonGraph is built to surface, correlate, and *alert* on.
Why AI infrastructure became a top target in 2025
Every team raced to ship AI in 2024-2025. GPUs were scarce, deadlines were short, and a lot of inference servers, training clusters, vector databases, and model registries got stood up *fast* — often on public IPs, often with default settings, often outside the security team's inventory.
Attackers noticed. AI infrastructure is uniquely attractive: it sits on expensive GPUs (great for cryptomining), it holds proprietary models and training data (great for theft), and it is frequently *internet-reachable with weak or no authentication* (great for everything else). The throughline across the incidents below is not some novel class of "AI vulnerability." It is the oldest story in cloud security: something got exposed, something was left unpatched, and nobody was watching the runtime.
Incident 1 — DeepSeek's open database (January 2025)
In January 2025, Wiz Research discovered a publicly accessible ClickHouse database belonging to DeepSeek — one of the most-downloaded AI apps in the world at the time. No authentication. Two open ports. Over a million log lines including plaintext chat history, API keys, and backend details — with full database control available to anyone who found it.
There was no zero-day here. The database was simply reachable from the internet with no auth. This is the single most common way cloud data leaks, and AI workloads — spun up quickly, often by data scientists rather than platform teams — are especially prone to it.
> The lesson isn't "AI is uniquely dangerous." It's that an AI gold-rush produces a lot of fast-moving infrastructure, and fast-moving infrastructure gets exposed.
Incident 2 — NVIDIAScape: a three-line container escape (CVE-2025-23266)
In July 2025, Wiz disclosed NVIDIAScape — CVE-2025-23266, CVSS 9.0. A flaw in how the NVIDIA Container Toolkit handles an OCI createContainer hook let a malicious container set the LD_PRELOAD environment variable (literally three lines in a Dockerfile) and load a library that executes as root on the host.
Why it matters so much: the NVIDIA Container Toolkit is the near-universal plumbing for running GPU workloads in containers. Wiz estimated roughly 37% of cloud environments were affected — including managed AI clouds where many tenants share a GPU host. On a shared host, one escaped container can mean access to *other customers'* AI workloads. Affected: NVIDIA Container Toolkit up to and including v1.17.7, and GPU Operator up to 25.3.1. (Track it on EchelonGraph Pulse →)
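A quick way to triage this on your own GPU nodes is to compare the installed toolkit version against the last affected release named above. The following is a minimal sketch, not an official check: the function names are hypothetical, and it covers only the NVIDIA Container Toolkit bound (v1.17.7, inclusive). The GPU Operator bound should be confirmed against the NVIDIA advisory before you encode it.

```python
import re

# Last affected NVIDIA Container Toolkit release, as stated in the text above.
# Assumption: verify this bound against the vendor advisory before relying on it.
LAST_AFFECTED_TOOLKIT = (1, 17, 7)


def parse_version(text):
    """Turn 'v1.17.7' or '1.17.7-1' into a tuple of ints such as (1, 17, 7)."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def is_toolkit_affected(installed_version):
    """True if the installed version is at or below the last affected release."""
    return parse_version(installed_version) <= LAST_AFFECTED_TOOLKIT
```

Feeding it the version string reported by each node's package manager gives a simple affected/not-affected split. Tuple comparison is used so that 1.17.10 sorts after 1.17.7, which a plain string comparison would get wrong.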
Incident 3 — ShadowRay 2.0: your AI cluster, now a botnet node (CVE-2023-48022)
ShadowRay is the one that should worry every ML platform team. CVE-2023-48022 (CVSS 9.8) is an unauthenticated remote code execution in the Jobs API of Ray, the most widely used framework for distributed Python and ML workloads. Because the missing authentication is treated as a deployment responsibility, the flaw has lingered for years — and Ray dashboards keep ending up on public IPs.
In late 2025, Oligo Security reported ShadowRay 2.0: an active, global campaign turning exploited Ray clusters into a self-propagating botnet, with established malware families (RondoDox, MooBot, KmsdBot) folding the CVE into their kits for cryptojacking, DDoS, and data theft. Their scans found 200,000+ Ray servers exposed to the internet. (Track it on EchelonGraph Pulse →)
Incident 4 — Probllama: RCE in Ollama model servers (CVE-2024-37032)
Probllama — CVE-2024-37032, CVSS 8.8 — is a path-traversal-to-RCE in Ollama, the popular run-LLMs-locally server. Insufficient validation of the model digest in the /api/pull endpoint let an attacker overwrite arbitrary files and achieve remote code execution — especially dangerous in the many Docker deployments that run Ollama as root. Wiz found 1,000+ exposed Ollama instances serving models with no protection at all. (Track it on EchelonGraph Pulse →)
The pattern: three failures, every time
| Incident | CVE | CVSS | Root cause (not "AI magic") |
|---|---|---|---|
| DeepSeek DB leak | — (misconfiguration) | — | Asset exposed to the internet, no auth |
| NVIDIAScape | CVE-2025-23266 | 9.0 | Unpatched known CVE → container escape → host root |
| ShadowRay 2.0 | CVE-2023-48022 | 9.8 | Exposed service + unpatched CVE → RCE → botnet |
| Probllama | CVE-2024-37032 | 8.8 | Exposed service + unpatched CVE → RCE |
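Strip away the AI branding and the same three failures repeat: an asset exposed to the internet, a known CVE left unpatched, and no runtime detection to catch what follows.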
Fix those three and you defuse almost every incident above. That is precisely what EchelonGraph is built to do.
How EchelonGraph turns each failure into an alert
EchelonGraph is a cloud security graph: it maps your assets and their relationships, correlates the live CVE feed against *your* inventory, and (with the Tier 3 agent) watches the runtime — then alerts you the moment something matches.
1. It maps your AI attack surface — and flags what's exposed
EchelonGraph's agentless cloud scanner inventories your AWS, GCP, and Azure resources and flags the exact conditions behind Incidents 1, 3, and 4: assets reachable from 0.0.0.0/0, public IPs, permissive security groups, and open management ports. The Shadow AI Radar adds Certificate-Transparency-log monitoring to surface AI endpoints stood up *outside* your security team's knowledge — the shadow inference servers and dashboards that become Incident 1.
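The exposure conditions described above can be illustrated with a small rule check. This is a hedged sketch of the general idea, not EchelonGraph's scanner: `IngressRule`, `SENSITIVE_PORTS`, and `exposed_services` are hypothetical names, and the ports listed are the common defaults for ClickHouse, the Ray dashboard, and Ollama, which individual deployments may change.

```python
import ipaddress
from dataclasses import dataclass

# Common default ports (assumption: your deployment may use different ones).
SENSITIVE_PORTS = {
    8123: "ClickHouse HTTP",
    9000: "ClickHouse native",
    8265: "Ray dashboard",
    11434: "Ollama API",
}


@dataclass(frozen=True)
class IngressRule:
    """Hypothetical, simplified view of one firewall or security-group rule."""
    asset: str
    cidr: str
    from_port: int
    to_port: int


def is_world_open(cidr):
    """A /0 network (0.0.0.0/0 or ::/0) admits every source address."""
    return ipaddress.ip_network(cidr, strict=False).prefixlen == 0


def exposed_services(rules):
    """List (asset, port, service) for sensitive ports open to the whole internet."""
    findings = []
    for rule in rules:
        if not is_world_open(rule.cidr):
            continue
        for port, service in SENSITIVE_PORTS.items():
            if rule.from_port <= port <= rule.to_port:
                findings.append((rule.asset, port, service))
    return findings
```

A rule such as `IngressRule("ray-head", "0.0.0.0/0", 8265, 8265)` would be reported as an internet-reachable Ray dashboard, while the same port restricted to `10.0.0.0/8` would not.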
2. It correlates new CVEs to *your* workloads — and alerts you
This is the heart of it. EchelonGraph maintains a live CVE Pulse feed (NVD-sourced, enriched with an EchelonGraph priority score, EPSS exploit-probability, and CISA KEV status). When a CVE like NVIDIAScape or ShadowRay lands, EchelonGraph matches it against your software inventory — by product and version — and tells you *which of your assets are affected*, ranked by real-world exploitability. You learn that CVE-2025-23266 hits your GPU nodes before an attacker does.
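To make the matching step concrete, here is a minimal sketch of correlating advisories against an inventory by product and version, then ranking by KEV status and EPSS. It is an illustration under stated assumptions, not EchelonGraph's implementation: the `Advisory` and `Package` types are hypothetical, affected ranges are simplified to a single inclusive upper bound, and the ranking rule (KEV-listed first, then descending EPSS) is one plausible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Hypothetical advisory record; real feeds carry richer version ranges."""
    cve_id: str
    product: str
    last_affected: tuple  # highest affected version, inclusive
    epss: float           # exploit probability between 0.0 and 1.0
    in_kev: bool          # listed in the CISA KEV catalog


@dataclass(frozen=True)
class Package:
    """Hypothetical inventory record: one product installed on one asset."""
    asset: str
    product: str
    version: tuple


def match_inventory(advisories, inventory):
    """Return (cve_id, asset) pairs for affected packages, most urgent first."""
    hits = []
    for adv in advisories:
        for pkg in inventory:
            if pkg.product == adv.product and pkg.version <= adv.last_affected:
                hits.append((adv, pkg))
    # KEV-listed advisories sort first, then higher EPSS before lower.
    hits.sort(key=lambda hit: (not hit[0].in_kev, -hit[0].epss))
    return [(adv.cve_id, pkg.asset) for adv, pkg in hits]
```

Because Python's sort is stable, assets hit by the same advisory keep their inventory order, which keeps the output predictable when several nodes share a finding.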
3. It watches the runtime for post-exploitation (Tier 3, eBPF)
Exposure mapping and patching are prevention. Tier 3 (EcheDeep) is detection: an eBPF agent that runs inside your own cluster and flags the behaviors these exploits produce — container escapes (NVIDIAScape), cryptominers and lateral movement (ShadowRay), and anomalous process execution (Probllama). It is zero-knowledge by design: telemetry is redacted at the kernel boundary and envelope-encrypted under a key only you hold, so EchelonGraph never sees your plaintext.
4. It shows the blast radius before the attacker walks it
Because EchelonGraph is graph-based, it answers the question every responder asks: *if this exposed Ray node is compromised, what else is reachable?* The blast-radius view turns a single finding into the full picture — the GPU host, the model store, the credentials, the neighboring tenants — so you remediate the path, not just the symptom.
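The blast-radius question is, at its core, a reachability query over a graph. The sketch below shows that idea with a plain breadth-first search. It assumes a hypothetical adjacency-list graph in which an edge means "can reach or access", and it is not a description of how EchelonGraph stores or traverses its own graph.

```python
from collections import deque


def blast_radius(graph, start):
    """Return every node reachable from start (excluding start) in BFS order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical asset graph: each key lists the assets it can reach directly.
example_graph = {
    "ray-node": ["gpu-host", "model-store"],
    "gpu-host": ["tenant-b-workload"],
    "model-store": ["registry-credentials"],
}
```

Calling `blast_radius(example_graph, "ray-node")` walks outward from the exposed node, so nearer assets appear before more distant ones. That ordering is useful when deciding which link in the path to cut first.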
How you actually get alerted (step by step)
The honest part
No security tool catches everything, and we will never tell you otherwise. EchelonGraph does not replace your security program — it makes it faster and harder to surprise. What it *does* do, reliably, is shrink the three gaps these incidents exploited: it shows you what's exposed, it tells you which published CVEs actually touch your stack, and (with Tier 3) it watches the runtime for the behaviors that follow a breach. In every incident above, that is the difference between "we got an alert in seconds" and "we found out from the cryptomining bill — or from the news."
If you're running AI workloads in the cloud, the attack surface is already there. The only real choice is whether you see it first.
— *Founder, EchelonGraph*