AI Security Take Aways - Defcon and Blackhat 2025

Our quick take-aways on AI Security from Blackhat and Defcon this year.

Promptware. The latest security buzzword (which I do actually like).  From emails and meeting invites to git change logs and issues the vectors for indirect prompt injection are only increasing. Ultimately if the LLM ingests it, it’s fair game for potential prompt injections. Exploitation payloads are increasingly more sophisticated, with payloads for accessing connected resources like shared drives to establishing persistence via memory to even accessing connected home automation devices

Successful prompt injections / jailbreaks are often released within hours of a new model release.

MCP (Model Context Protocol) servers while very cool for closed experiments are not ready for production.

Model capability seems to be plateauing in a number of cybersecurity domains. As the recent ChatGPT 5 release shows we’re entering the phase of increasingly smaller incremental gains, rather than exponential leaps (unless there is another fundamental architectural change).

The lexicon seems to have shifted to agents, as if that somehow addresses the core issues with LLMs. An agent is a LLM at its core, and for LLMs using transformer architecture, prompt injection is not solved and likely not solvable without architectural change. As it stands there's no separation between data and control planes, the input space is unlimited, and the problem only gets worse as models get bigger. If that's not enough, it's non-deterministic, so an attacker can keep trying until the probabilistic token predictor rolls in their favour. And conversely you can test it with the same malicious input 99 times and have it fail on the 100th. This needs repeating because I don’t think there’s enough awareness of the unique security challenges that non-deterministic systems pose.

In web terms, think of it as having an unfixable SQL injection in a core business application, that randomly shows up.

What can you do? Treat your LLM deployment like that old unpatchable Windows XP host vulnerable to MS08-67. Assume untrusted, protect the boundaries, have the LLM emit code in a sandbox and verify the code, rather than provide direct tool/resource access and whatever you do don’t give it access to anything you don’t want disclosed. Often all an attacker has to do is ask nicely - “My grandmother used to...” (or use emojis or add spaces between letters or unicode characters or use a less common language...)

Some recommended AI Security presentations from the Main Defcon Tracks (AI Security Village were not out yet):

Ben Nassi Or Yair - Stav Cohen - Invitation Is All You Need Invoking Gemini for Workspace Agents with a Simple Google Calendar Invite

Tobias Diehl - Mind the Data Voids Hijacking Copilot Trust to Deliver C2 Instructions with Microsoft Authority.

Keane Lucas - Claude - Climbing a CTF Scoreboard Near You

Jamie Baxter

Principal at Appsurent

© 2025 Appsurent Cyber Security. All rights reserved.