Prompt Injection
Prompt injection is a class of attacks where an attacker inserts instructions into
LLM input that override or subvert the application's intended behavior, named by analogy to SQL injection. Direct prompt injection occurs when the attacker is the user — submitting prompts designed to circumvent the system prompt or extract restricted information.
Indirect prompt injection is more dangerous: malicious instructions are embedded in third-party content (web pages, emails, documents) that the
LLM processes, attacking via content the developer did not anticipate. The OWASP LLM Top 10 lists prompt injection as the #1 security risk for
LLM applications. Defenses include input sanitization, instruction-following classifiers, structured prompt formats with explicit boundary markers, output filtering, and architectural separation between trusted and untrusted content. Simon Willison popularized awareness of indirect prompt injection in 2022-2023, and the threat has only grown as
LLMs ingest more third-party content via
RAG, browser agents, and email integrations. AI governance teams treat prompt injection as a primary threat in their AI compliance threat models.
Prompt-injection defenses through Centralpoint: Centralpoint enforces system prompt isolation, content boundary markers, and input filtering as defenses against prompt injection across any LLM. Tokens are metered per skill, prompts stay local, supports generative and embedded models, and deploys hardened chatbots through one line of JavaScript on any portal.
Related Keywords:
Prompt Injection,
,