CloudCodeTree LogoCloudCodeTree
HomeResumeAI NewsContactSchedule
CloudCodeTree Logo
CloudCodeTree

CloudCodeTree · Journal

AI News

Daily field notes on AI-assisted engineering.

BadHost (CVE-2026-48710): one malformed Host header bypasses auth on 325M-download Starlette — and every MCP server built on it

Jun 11, 2026 · 16:00 UTC · 2 min read

BadHost (CVE-2026-48710): one malformed Host header bypasses auth on 325M-download Starlette — and every MCP server built on it

A single slash character in an HTTP Host header is enough to bypass authentication on any path-based middleware written against Starlette. That's CVE-2026-48710, dubbed BadHost — disclosed May 27, 2026, after coordinated disclosure with the Open Source Technology Improvement Fund (OSTIF) and a fix shipped in Starlette 1.0.1 the day before disclosure hit.

How it works. Starlette reconstructs request.url by concatenating the raw Host header with the request path, without sanitizing /, ?, or # characters. Send GET /admin with Host: example.com/health?x=, and request.url.path reports /health while the ASGI router still routed to /admin. Any middleware that gates access by checking request.url.path is bypassed. Access controls built on request.url are effectively path-blind.

The blast radius is large. Starlette pulls roughly 325 million downloads per week and underpins 400,000+ dependent GitHub projects. Every tool that inherits it is affected without any code of its own doing wrong: FastAPI, vLLM, LiteLLM, MCP servers, OpenAI-compatible API proxies, Ray Serve, BentoML, Google ADK-Python, and a long tail of agent harnesses and model-serving dashboards. MCP servers are especially exposed because the MCP specification mandates an unauthenticated OAuth discovery endpoint — giving an attacker a predictable bypass path. The CVSS scores split (6.5 official, 7.0 by X41 D-Sec who found it) but the practical impact is "unauthenticated read/write on any protected API endpoint."

The fix is mechanical but requires an audit. Update Starlette to 1.0.1+ and replace any request.url.path references in your security middleware with request.scope["path"], which reads the raw ASGI routing value the server actually matched against — never the reconstructed URL. Validated Host headers are rejected outright by 1.0.1+ per RFC 9112 §3.2. Also use FastAPI's Depends() or Starlette's requires() decorator rather than raw middleware path checks; they operate on scope values throughout.

Action items: pip install starlette --upgrade (or via FastAPI), audit any custom auth middleware for request.url.path, scan with the detection tool at badhost.org, and check your MCP server framework for an available release.

Sources: CSO Online: FastAPI-based AI tools exposed, X41 D-Sec advisory (X41-2026-002), badhost.org, CCB Belgium advisory

Sysdig documents the first confirmed autonomous LLM attack: zero to database exfiltration in 60 minutes, no human between steps

Jun 11, 2026 · 15:30 UTC · 2 min read

Sysdig documents the first confirmed autonomous LLM attack: zero to database exfiltration in 60 minutes, no human between steps

On May 10, 2026, a server running Marimo — an open-source Python notebook platform — was compromised through CVE-2026-39987, a pre-authentication remote code execution flaw. What made Sysdig's Threat Research Team sit up is what came next: rather than a human operator running post-exploitation commands, the attacker handed control to an LLM agent that autonomously ran the entire intrusion chain from initial access to a fully exfiltrated PostgreSQL database in under an hour. CVE-2026-39987 is now on CISA's Known Exploited Vulnerabilities list.

The four-pivot chain:

  1. Initial access — single WebSocket request exploiting the Marimo RCE for an interactive shell
  2. Credential harvest — agent read environment variables, config files, and metadata endpoints, extracting two cloud credentials
  3. AWS Secrets Manager — 12 API calls across 11 distinct IPs routed through Cloudflare Workers retrieved an SSH private key in 22 seconds, evading IP-based alerts
  4. Database exfiltration — eight parallel SSH sessions opened through the bastion host; full PostgreSQL contents dumped in under two minutes

How Sysdig knew it was an LLM. Four behavioral signatures set it apart from a human or script: real-time schema exploration (no dump script — the agent interrogated the DB to understand structure live), a Chinese-language planning comment ("see what else we can do") visible in the command stream, machine-readable structured delimiters indicating self-parsing output, and adaptive command generation where each step's output fed directly into the next one without any human in the loop.

The defender framing: signature-based detection is inadequate for adaptive agents — each command generated fresh, each step unique. Sub-minute lateral movement phases eliminate the response window that traditional alerting assumes. The 12-IP, Cloudflare-routed exfiltration demonstrates evasion patterns that most human attackers spend years developing, now embedded in the agent's behavior.

Immediate mitigations: patch Marimo to 0.23.0+, audit AWS Secrets Manager access controls and MFA, monitor for anomalous Cloudflare Worker egress, and audit any publicly-accessible notebook instances. More broadly: assume breach timelines of under an hour and design for that, not the old "days to lateral movement" model.

Sources: Sysdig TRT blog, The Hacker News, The Agent Report, CyberSecurityNews

OpenCode hits 160K GitHub stars and tops the June power rankings — LSP feedback and 75+ model providers in a free MIT agent

Jun 11, 2026 · 15:00 UTC · 2 min read

OpenCode hits 160K GitHub stars and tops the June power rankings — LSP feedback and 75+ model providers in a free MIT agent

OpenCode topped LogRocket's June 2026 AI dev tool power rankings, displacing Cursor from the top spot in what the writeup called "the first major disruption to the tools category since Cursor 3's rebuild." At 160,000 GitHub stars (with 900+ contributors and 13,000+ commits), v1.16.0 shipped June 5, and 7.5 million developers use it monthly.

The short version of why: it solves two real problems with existing agents. First, model lock-in — OpenCode works with 75+ providers (Claude, GPT, Gemini, Grok, Mistral, local models via Ollama) through a single config file. Existing Claude Code or Copilot subscriptions plug in directly; so does LM Studio for fully offline runs. Second, context blindness — most terminal agents only see text. OpenCode integrates Language Server Protocol for TypeScript, Python, Rust, Go, C/C++, Java, and 18+ other languages, giving the model actual type information, function signatures, import paths, and live compiler diagnostics. LSP diagnostics feed back to the model mid-task, enabling self-correction before the agent declares done. DataCamp testing found OpenCode generated 21 more tests on average than Claude Code on the same underlying model, with the gap tracing directly to the LSP feedback loop.

The practical comparison against the two dominant proprietary agents: Claude Code is Anthropic-only at $20/month, Cursor is multi-model but proprietary at $20/month. OpenCode is MIT-licensed, free with your own keys, and runs locally — no telemetry, no model or vendor tie-in. The tradeoff is terminal-only (no IDE fork, no inline completion sidebar), a CLI skill curve, and you manage provider billing yourself.

If you run a privacy-sensitive codebase or want to compare model performance on real tasks without a subscription cage, v1.16.0 is the version to try. Available as CLI, desktop app (macOS/Windows, Linux beta), and IDE extension for VS Code and Cursor.

Sources: OpenCode.ai, LogRocket: AI dev tool power rankings June 2026, ChatForest builder guide, byteiota: OpenCode guide 2026

OpenAI and Anthropic both filed for IPO within 10 days — the S-1s will be the first public look at frontier AI economics

Jun 11, 2026 · 14:30 UTC · 2 min read

OpenAI and Anthropic both filed for IPO within 10 days — the S-1s will be the first public look at frontier AI economics

Within 10 days of each other, both frontier AI labs filed confidential S-1s with the SEC. Anthropic filed June 1 at a private-market valuation of $965 billion — slightly above OpenAI's $852 billion post-money from March. OpenAI filed June 8. Neither has disclosed timing, pricing, or fundraising targets.

The significance isn't the IPOs themselves — it's what public filings require. For the first time, Wall Street will see actual revenue, margins, capital requirements, and risk factors from both labs. The industry has operated on secondary-market estimates for years; the S-1s will either confirm or deflate a lot of conventional wisdom about frontier AI unit economics.

The financials that have leaked. Anthropic is reportedly on track for its first profitable quarter in Q2 2026, with annualized revenue hitting ~$47 billion as of May — extraordinary if accurate. OpenAI is at $2 billion in monthly revenue but reportedly burning $1.22 per dollar earned in its most recent quarter, with projections showing ~$85 billion in burn in 2028 despite doubling sales and no cash-flow positivity until at least 2030. OpenAI noted it "may be a while" before going public — suggesting the filing could be positioning more than imminent action.

Why this matters to engineers building on these platforms. The financial disclosures will clarify how dependent both companies remain on outside capital to sustain inference capacity and R&D, and what that means for pricing stability. If Anthropic is actually near profitability at scale, its pricing trajectory is different from a company that's burning capital to grow. The S-1 risk factors will also be the most complete disclosure of competitive threats, regulatory exposure (EU AI Act, US executive orders), and safety commitment costs either company has ever published.

Context: this comes two weeks after Anthropic published "When AI builds itself" (June 4), its call for coordinated international oversight at the frontier — a notable strategy for a company also filing to go public. The combined pipeline with SpaceX represents roughly $3.6 trillion in private-market valuations entering the public markets.

Sources: TechCrunch: OpenAI files for IPO, CNBC: Anthropic IPO S-1, Yahoo Finance: Anthropic files confidential S-1, CNN Business: OpenAI IPO analysis

Miasma worm backdoors Claude Code, Cursor, and Gemini configs — 57 npm packages compromised

Jun 11, 2026 · 14:00 UTC · 2 min read

Miasma worm backdoors Claude Code, Cursor, and Gemini configs — 57 npm packages compromised

A supply-chain attack security teams are calling Miasma compromised 57 npm packages — including @vapi-ai/server-sdk (408K+ monthly downloads) and ai-sdk-ollama (120K+) — in under two hours on June 3, 2026. The attack is notable for two things that go beyond a standard credential-harvesting worm: its evasion technique and its deliberate targeting of AI coding-assistant configurations.

The evasion — "Phantom Gyp". Rather than the preinstall or postinstall lifecycle hooks that npm security scanners typically watch, Miasma used a 157-byte binding.gyp file — the config format for native C++ add-ons — to trigger code execution during npm install. Most install-script auditing tools don't monitor binding.gyp. Defense: npm install --ignore-scripts blocks it; pinning dependency integrity hashes in lockfiles catches tampered packages before they run.

The AI assistant targeting. The payload deliberately injected persistent backdoor files into six environments:

  • .claude/setup.mjs and .claude/settings.json (Claude Code)
  • .cursor/rules/setup.mdc (Cursor)
  • .gemini/settings.json (Google Gemini)
  • .vscode/tasks.json and .vscode/setup.mjs (VS Code)
  • .github/setup.js (GitHub Actions)

Each file claimed legitimacy as "required for proper IDE integration." Any future project-open in those tools runs attacker-controlled code silently.

CI/CD credential exfiltration. On GitHub Actions runners, the worm scraped AWS IMDSv2 tokens, Azure IMDS credentials, GCP service accounts, GitHub Actions OIDC tokens, and 1Password/gopass stores from process memory. It then republished its own reinfected packages with forged Sigstore provenance attestations to continue propagating downstream.

Mitigation checklist: update or remove affected packages; audit repositories for injected files in .claude/, .cursor/, .gemini/, .vscode/, and .github/; rotate all CI/CD secrets as if exposed; add --ignore-scripts to default npm install invocations.

Sources: Microsoft Security Blog, StepSecurity (Phantom Gyp), Wiz Blog, Snyk

Microsoft's first in-house coding model lands in GitHub Copilot — built on real workflows, not benchmarks

Jun 11, 2026 · 13:30 UTC · 2 min read

Microsoft's first in-house coding model lands in GitHub Copilot — built on real workflows, not benchmarks

Microsoft shipped MAI-Code-1-Flash on June 2 at Build 2026 — its first coding AI trained entirely in-house, without distillation from OpenAI or any third-party model. That provenance distinction is deliberate: Microsoft is signaling strategic independence from its own AI partner, and this is the first deployable product of that effort.

What makes the training approach different. Rather than training on a general coding benchmark corpus, the model was trained directly on GitHub Copilot's production harnesses — the actual file-editing tools, terminal integrations, and multi-step agentic loops that Copilot already runs in developers' IDEs. The result is a model optimized for the surrounding tool-call context of real development work, not just the isolated code problem. Microsoft describes this as "adaptive thinking" — the model adjusts its reasoning depth to task complexity, spending less on simple completions and more on harder multi-step tasks.

The benchmark profile is honest about where it sits. ~51% on SWE-Bench Pro (above Claude Haiku 4.5, below Claude Opus 4.6), 60% fewer tokens than comparable approaches for equivalent coding results, and 85.8% adjusted accuracy on Microsoft's internal 186-question adversarial benchmark across 34 categories. These numbers put it in the capable mid-tier with unusually good token efficiency — meaningful for teams where Copilot token credits are becoming a budget line.

Availability. MAI-Code-1-Flash is live in the GitHub Copilot model picker in VS Code on all paid Copilot tiers. It's also accessible via GitHub Models, Fireworks AI, Baseten, and OpenRouter. The broader MAI family (including MAI-Thinking-1 for reasoning) is available through Azure AI Foundry.

Sources: Microsoft AI announcement, Enterprise DNA, ChatForest analysis, GitHub Community Discussion

OpenAI frontier models are now billable through your Oracle Cloud commitment — no separate procurement

Jun 11, 2026 · 13:00 UTC · 2 min read

OpenAI frontier models are now billable through your Oracle Cloud commitment — no separate procurement

As of June 11, OpenAI's frontier models and Codex are accessible through Oracle Cloud Infrastructure (OCI) Marketplace, billable against existing Oracle Universal Credits — eliminating the separate procurement and billing pipeline that made OpenAI models a parallel budget track for Oracle-committed enterprises.

The announcement is the commercial layer built on the Stargate infrastructure partnership that's been scaling for months. GPT-5.5 was trained on the flagship Stargate site in Abilene, Texas, running on Oracle Cloud with NVIDIA GB200 systems; Oracle was already delivering the first GB200 racks last month as OpenAI began running early training and inference workloads there. What's new on June 11 is the billing integration: Oracle customers can now consume OpenAI models through existing purchase agreements the same way they consume any other OCI service.

Practical implications. If your enterprise is Oracle-committed — common for organizations running Fusion ERP, NetSuite, or large Oracle Database workloads — you may now be able to route AI workloads through existing credits rather than opening a separate OpenAI API billing relationship. For teams building on OCI, OpenAI's models appear alongside Oracle's own AI services in the Marketplace. GPT-5.5 and Codex are confirmed available; the model list is managed through OCI.

Competitive context. This makes Oracle the third major hyperscaler — alongside AWS Bedrock and Azure OpenAI Service — offering OpenAI models as a native cloud service. For enterprises choosing a cloud AI strategy, the practical question shifts from "which cloud supports OpenAI?" (all three do) to "which cloud's pricing, governance, and data-residency terms fit our existing commitments?"

Sources: OpenAI announcement, IT Brief NZ, Data Center Frontier: Stargate-Oracle infrastructure

WebMCP lands in Chrome Canary: websites become callable tools, and your MCP server may be optional

Jun 11, 2026 · 12:00 UTC · 2 min read

WebMCP lands in Chrome Canary: websites become callable tools, and your MCP server may be optional

The Chrome team shipped WebMCP this week as an early preview in Chrome 146 Canary (behind the "WebMCP for testing" flag at chrome://flags). Co-developed by Google and Microsoft engineers and incubated in the W3C's Web Machine Learning community group, it's a proposed standard that lets any website expose structured, callable tools to AI agents through a new browser API: navigator.modelContext.

Why this matters to you as a builder: today's browser agents interact with your site by screenshotting it or parsing raw DOM — slow, fragile, and token-expensive. WebMCP inverts that. Your site declares the actions it supports, and the agent calls them as functions. There are two surfaces: a Declarative API that makes existing well-structured HTML forms agent-callable by adding tool names and descriptions to the markup, and an Imperative API (registerTool()) where you define tool schemas — conceptually the same shape as OpenAI/Anthropic tool definitions — that run entirely client-side. A single searchProducts(query, filters) call replaces dozens of click-scroll-screenshot rounds.

Two design points worth noting. First, this is not a replacement for MCP — it's client-side and form-factor-different from Anthropic's JSON-RPC server protocol; the two are complementary (back-end MCP server for service-to-service, WebMCP for in-browser sessions where the user is present). Second, the spec explicitly treats headless autonomy as a non-goal — it's built around human-in-the-loop, cooperative browsing.

Practical move: if you own a consumer-facing web app, audit which of your flows are clean HTML forms — that's reportedly "80% of the way there" for the declarative path. DataCamp already has a hands-on tutorial if you want to try it in Canary this week.

Sources: VentureBeat, WebMCP spec site, DataCamp tutorial

Claude Code /fast economics: enable it on turn one or pay a full-context toll

Jun 11, 2026 · 11:30 UTC · 2 min read

Claude Code /fast economics: enable it on turn one or pay a full-context toll

A same-day-verified breakdown of Claude Code's fast mode is worth internalizing before you build a /fast habit. The headline trade is simple — same Opus model, up to 2.5x faster at 2x the token price ($10/$50 per MTok on Opus 4.8 vs $5/$25 standard) — but three billing wrinkles change how you should use it.

The first-enable toll. Fast mode adds a request header that's part of the prompt cache key. The first /fast in a conversation invalidates your cache and re-reads the entire history as uncached input at fast rates. 400K tokens deep, that toggle costs ~$4 before it does anything; on turn one, it's a cent. Enable at session start or wait for the next session. (The charge is once per conversation on v2.1.86+ — later toggles keep the cache.)

Separate money. On Pro/Max/Team plans, fast mode bills usage credits from the very first token and never touches included plan usage. Surprise line items on your credits are likely this. It also has its own rate-limit pool; hitting it degrades gracefully to standard speed.

Where it's wrong. Skip it for agents, overnight runs, and CI — nobody's watching, output tokens dominate, and fast mode isn't even available in the Batch API (the actual 50%-off lever). And if you're still pointed at Opus 4.7/4.6, fast mode there is a 6x premium ($30/$150) — switch models before switching speeds. Not available on Bedrock, Vertex, or Azure Foundry; CLI only, not the VS Code extension.

Rule of thumb: /fast for interactive debugging from turn one; standard mode for everything autonomous.

Sources: Developers Digest, Claude Code fast mode docs, Claude API pricing

The €0.02 prompt injection: a bank transfer memo compromised a production AI assistant

Jun 11, 2026 · 11:00 UTC · 2 min read

The €0.02 prompt injection: a bank transfer memo compromised a production AI assistant

Security firm Blue41 published a case study — 145 points and 120 comments on Hacker News — showing how they compromised the AI assistant at Bunq (Europe's second-largest digital bank, 20M+ customers) with a single €0.02 SEPA transfer. The free-form transfer description carried an injection payload; the next time the victim asked the assistant any question that fetched recent transactions, the attacker-controlled text entered the LLM context and steered the assistant into rendering a realistic reauthentication phish — inside the bank's own UI, referencing the user's real account data. As one researcher put it: "It was never about the prompt, it is about the prompt delivery."

This generalizes to anything you're building. The attack surface is any string an attacker controls that your agent later reads: email subjects, calendar invite titles, support tickets, webhook payloads, GitHub issue titles, product reviews, CRM notes. Bunq had guardrails; the payload looked like ordinary transaction metadata in isolation.

The layered defenses, with honest limits: minimize context to fields the task needs; structurally tag retrieved data as untrusted (a probability shift, not a boundary — the prepared-statement equivalent for LLMs doesn't exist yet); allowlist outputs and actions (the agent that can't emit external URLs can't exfiltrate users); human confirmation for side effects, displaying system-derived values rather than LLM summaries; and runtime behavioral monitoring to catch what prevention misses. The HN framing to remember: "We're not even at the 'ASLR' level of protection for LLMs yet." Design for the assumption that injection sometimes succeeds, and make sure it has nowhere useful to go.

Sources: Blue41 case study, Hacker News discussion, Developers Digest analysis, Simon Willison: prompt injection design patterns