Attack analysis, benchmark deep-dives, and technical writing on AI agent security.
VentureBeat just compared prompt-injection numbers across the four frontier labs. Anthropic published 31.5% raw on browser, 0.5% safeguarded. OpenAI published 0.963. Google published nothing. Here is why no two of those numbers belong on the same scoreboard, and what to do instead.
VentureBeat published a piece this week on tool poisoning. They identified the threat (injection payloads in tool descriptions, behavioral drift, no detection category) and named the fix: a verification proxy between agent and tool. Here is the technical response, plus what an actual verification proxy looks like in production.
Single-agent chatbots were the warm-up. Multi-agent systems (CrewAI, AutoGen, LangGraph) multiply every injection vector by the number of trust boundaries. Here is the threat model and the defense architecture.
Step-by-step guide with Python SDK and curl examples. Three architecture patterns: guard user messages, RAG documents, and MCP tool outputs. Free API, 100 req/day.
The UK NCSC warns of a “cyber perfect storm”: AI-powered zero-day discovery meets nation-state aggression. 204 nationally significant incidents last year. But nobody is talking about the real gap: AI agents themselves as the next attack surface.
Anthropic's restricted Mythos model was accessed by unauthorized users through a vendor breach. The real lesson isn't about access control; it's about what happens when powerful AI agents process untrusted input. We break down the 4-layer defense model.
Johns Hopkins researchers stole API keys from all three AI agents via prompt injection. No CVEs were published. We walk through each attack and show which AgentShield layers would have caught them.
We ran AgentShield against every public prompt-injection dataset we could get. Five datasets, 5,972 prompts, one decision threshold. This post covers the wins, the two failure modes we care about, and the datasets we couldn't run.