Security Guide for the AI Coding Era

Developers who use AI coding assistants write code faster — and ship more security vulnerabilities while feeling more confident their code is safe. This is the AI security paradox, and it is the starting point for everything in this guide.

AI tools like Claude Code can now explore your full codebase — reading files, grepping for symbols, and tracing dependencies across modules. But capability isn't the same as guarantee. They may not proactively audit every security implication unless prompted, their suggestions still reflect patterns common in training data (which skew permissive), and they can occasionally hallucinate package names that attackers have pre-registered with malicious payloads. The gap isn't visibility anymore — it's judgment about what to verify and when. Understanding where that judgment falls short is what lets you catch issues before they ship.

This guide covers three distinct threat surfaces:

  • AI coding agents (Copilot, Cursor, Claude Code) — the security vulnerabilities these tools consistently introduce into your codebase
  • AI features (chatbots, document Q&A, RAG pipelines) — attacks that are unique to applications running LLMs at runtime
  • AI agents (tools connected to an LLM that take autonomous actions) — the risks that arise when AI can read files, send emails, or modify infrastructure

General web security (SQL injection, XSS, CORS, authentication) is covered in a companion OWASP guide. This guide focuses on what is new and different in the AI era.

How to Read This Guide#

  • Complete beginners: Read all chapters in order. Chapter 1 builds the mental model that the rest of the guide depends on.
  • Developers who already use AI coding tools: Skim Chapter 1 for key term definitions, then go directly to Chapters 2–4.
  • Building an AI-powered app: Chapters 3 and 4 are your priority, with Chapter 5 as your practical takeaway.

🎬 Video Course#

We also have a YouTube channel! There is a dedicated playlist covering AI security topics — if you prefer learning by watching, you can quickly get up to speed on the key concepts from this guide through these videos:

▶️

AI Security — YouTube Playlist

Watch the video series to quickly learn the essential AI security concepts covered in this guide.

Learning Path#

Rendering diagram...
Rendering diagram...
Rendering diagram...
Rendering diagram...
Rendering diagram...

Three Threat Surfaces at a Glance#

The cards below introduce one representative vulnerability from each of the guide's three domains. Each chapter goes deeper with more examples, root-cause analysis, and mitigations.

Hardcoded Credentials

High

AI coding agents embed live secrets directly into source code when asked to connect to external services.

Ch 2 · AI Coding AgentsCWE-798

The AI generated a database connection with the password written directly into the source file.

Prompt Injection

Critical

Malicious text in user input or retrieved documents hijacks the AI's instructions, overriding its system prompt.

Ch 3 · LLM-Specific ThreatsOWASP LLM01:2025

User input is concatenated directly into the prompt. An attacker can inject new instructions after the question.

Excessive Agency

High

AI agents granted more permissions than their task requires can cause catastrophic damage if compromised or hallucinating.

Ch 4 · AI Agent SecurityOWASP LLM06:2025
How It Works

An AI agent is an LLM connected to tools that take real-world actions: a file editor, an email sender, a database client, a code runner. Every tool the agent has access to is a potential attack surface. If the agent is compromised via prompt injection — or simply hallucinates a destructive action — the damage it can cause is exactly equal to the permissions it holds. A customer support agent that also has the ability to delete user records can, through a single injected instruction, wipe the entire database.

Potential Consequences
A prompt-injected agent with write access to a data store can delete, corrupt, or exfiltrate records
An agent with email-sending capability can be used to send phishing messages from a trusted domain
An agent with infrastructure access (IAM, Terraform) can spin up resources or expose services to the internet
In multi-agent systems, a compromised sub-agent can issue malicious instructions to the orchestrator

OWASP LLM Top 10: 2025#

The OWASP Top 10 for LLM Applications is the threat taxonomy specifically for AI-powered applications. Chapter 3 covers the threats that target your deployed LLM features; Chapter 4 covers the agent-specific entries.

OWASP Top 10 for LLM Applications 2025

IDThreatWhat It Means for Developers
LLM01:2025Prompt InjectionUser input or retrieved content overrides system instructions. The most common LLM attack vector. Covered in Chapter 3.
LLM02:2025Sensitive Information DisclosureThe model leaks PII, credentials, or training data in its output. Jumped from #6 in 2023. Covered in Chapter 3.
LLM03:2025Supply ChainCompromised training data, pre-trained model weights, plugins, or third-party integrations introduce malicious behavior.
LLM04:2025Data PoisoningAttackers manipulate training or embedding data to degrade model quality, introduce backdoors, or bias outputs.
LLM05:2025Improper Output HandlingLLM output passed unsanitized to browsers, databases, or shells enables XSS, SQL injection, or code execution. Covered in Chapter 3.
LLM06:2025Excessive AgencyAgents with more permissions than needed take destructive actions when compromised or hallucinating. Covered in Chapter 4.
LLM07:2025System Prompt LeakageNew in 2025. Sensitive business logic or credentials embedded in system prompts are exposed to users. Covered in Chapter 3.
LLM08:2025Vector and Embedding WeaknessesRAG-specific: retrieval poisoning and embedding inversion risks. Covered in Chapter 3.
LLM09:2025MisinformationHallucinated but confident-sounding output leads to decisions based on false information — a reliability and safety risk.
LLM10:2025Unbounded ConsumptionUncontrolled resource usage (tokens, API calls) causes outages or runaway costs. Includes Denial of Wallet attacks. Covered in Chapter 3.

Source: OWASP Top 10 for Large Language Model Applications 2025 (genai.owasp.org)

Chapters#

Chapter 1 · The AI Security Paradox#

Why Faster Code Isn't Always Safer Code

The counterintuitive finding that AI-assisted developers ship more vulnerabilities while feeling more confident. Builds the mental model for the rest of the guide: AI output is untrusted third-party code.

Start here →

Topics covered:

  • The Trust Trap — why code that compiles cleanly is not the same as code that is secure
  • AI coding agents vs. AI features vs. AI agents — three things that sound similar but carry different risks

Chapter 2 · Security Mistakes Made by AI Coding Agents#

Topics covered:

  • Common vulnerability patterns: missing input validation, SQL injection via string concatenation, hardcoded credentials, insecure cryptography, XSS
  • Context blindness and unprotected routes — AI generates endpoints without checking for your existing authentication middleware unless prompted
  • Over-permissive configurations — why AI defaults to * CORS, wildcard IAM policies, and 0.0.0.0 bindings
  • Deprecated libraries and slopsquatting — how AI-hallucinated package names become an attack vector
  • Secret leakage from context windows — how .env files end up committed to version control
  • Prompt injection targeting your IDE — malicious instructions embedded in repository files
  • Insecure test code — disabled authentication, mocked cryptography, and silently removed assertions

Chapter 3 · AI and LLM-Specific Threats#

Topics covered:

  • Prompt injection (direct and indirect) — and why "don't follow user instructions" is not a reliable defense
  • Improper output handling — when AI output flows into HTML, SQL, or code execution sinks
  • System prompt leakage — protecting your AI's hidden instructions (and why secrets must never be stored there)
  • Sensitive information disclosure — context window leakage and training data memorization
  • Denial of Wallet — how attackers trigger massive AI API costs without taking your service offline
  • Vector and embedding weaknesses — retrieval poisoning and embedding inversion in RAG systems

Chapter 4 · AI Agent Security#

Topics covered:

  • Excessive agency and the Principle of Least Privilege — auditing an agent's tool list
  • Human-in-the-loop controls — a risk-based framework for deciding which actions require human confirmation
  • Multi-agent trust — authenticating inter-agent communication in AI pipelines
  • Context isolation — keeping users' data separate in multi-user agentic systems
  • Monitoring and observability — what to log, what not to log, and early warning signals for abuse

Chapter 5 · The AI-Safe Developer Checklist#

The AI-Safe Developer Checklist

A consolidated, actionable checklist organized by workflow stage: before coding, while prompting, during code review, before committing, when building AI features, and when deploying agents.

Read →