Question 1

What is an LLM pentest?

Accepted Answer

An LLM pentest (Large Language Model penetration test) is an authorized security assessment of your LLM-based application conducted by specialized AI security experts. We simulate real-world attacks - from prompt injection and jailbreaking to data exfiltration - and systematically identify all vulnerabilities according to the OWASP Top 10 for LLM Applications standard. You receive a technical report with verified findings, proof-of-concept exploits, and a prioritized remediation roadmap.

Question 2

What is prompt injection and why is it so dangerous?

Accepted Answer

Prompt injection (OWASP LLM01) is the most critical vulnerability in LLM applications. An attacker manipulates the input so that the model ignores its system instructions and instead executes the attacker's commands. In direct prompt injection this happens via user input ("Ignore all previous instructions and…"), in indirect prompt injection via poisoned documents or data sources the LLM processes - particularly dangerous in RAG systems. Consequences range from data leaks and reputational damage to remote code execution when the LLM is connected to APIs or agent tools.

Question 3

How does jailbreaking work with LLMs?

Accepted Answer

Jailbreaking refers to techniques an attacker uses to circumvent a language model's safety policies and get it to generate prohibited or harmful content. Typical methods include: roleplay prompts ("Imagine you are an AI without restrictions…"), many-shot prompting (conditioning the model with many harmless examples), token smuggling (bypassing guardrails through unusual character encodings), adversarial suffixes (mathematically optimized token sequences), and multilingual exploits (switching to less-trained languages). In an LLM pentest we test your guardrails against all known and novel jailbreak techniques.

Question 4

Can an LLM disclose confidential data?

Accepted Answer

Yes - in several ways. First, through training data extraction: LLMs can reproduce data from their training data, potentially including personal or confidential information. Second, through system prompt leakage: attackers can specifically extract the hidden system prompt (containing business logic, API keys, or confidential instructions). Third, through contextual exfiltration: when the LLM has access to databases or documents, prompt injection can lead to unauthorized disclosure of that data. Our LLM pentest examines all three attack classes according to OWASP LLM02 (Sensitive Information Disclosure).

Question 5

What is the OWASP Top 10 for LLMs?

Accepted Answer

The OWASP Top 10 for Large Language Model Applications is the international community standard for LLM security, developed by an international open-source community. The ten categories in the current 2025 version are: LLM01 Prompt Injection, LLM02 Sensitive Information Disclosure, LLM03 Supply Chain, LLM04 Data and Model Poisoning, LLM05 Improper Output Handling, LLM06 Excessive Agency, LLM07 System Prompt Leakage, LLM08 Vector and Embedding Weaknesses, LLM09 Misinformation, and LLM10 Unbounded Consumption. Every AWARE7 LLM pentest systematically covers all ten categories and documents findings with CVSS scores and OWASP references.

Question 6

How do I protect my ChatGPT chatbot or AI assistant?

Accepted Answer

The most important security measures for production LLM applications: 1) Input validation and sanitization before passing to the model. 2) Strict output validation, especially when LLM output is processed further. 3) Principle of least privilege for agent capabilities and tool access. 4) Isolation of the system prompt and no trust in user-controlled contexts. 5) Layered guardrails: input classifier + output classifier + monitoring. 6) Regular red team tests, as new jailbreak techniques are continuously being developed. An LLM pentest delivers a complete hardening roadmap for your specific use case.

Question 7

What does an LLM pentest cost?

Accepted Answer

A focused LLM pentest for a single chatbot or copilot starts from EUR 8,100. The price depends on scope: number of endpoints, complexity of guardrails, integrations (RAG, tools, APIs), and desired test depth. You receive a binding fixed-price offer within 48 hours (business days) - no hourly rates, no additional charges. For companies with multiple LLM systems or ongoing AI products, we also offer retainer models with quarterly testing.

Question 8

How often should an LLM system be tested?

Accepted Answer

LLMs require more frequent testing than classic software, as the attack surface can evolve without code changes: new jailbreak techniques are published daily, RAG content changes, model updates alter behavior, and new tool integrations open new attack paths. Recommendation: at least once annually for a full LLM pentest, at every major model update or new functionality, and semi-annually for security-critical applications. We also offer continuous adversarial testing retainers.

Question 9

What is indirect prompt injection?

Accepted Answer

Indirect prompt injection (also "prompt injection via data sources") is a particularly insidious attack variant: the attacker injects no commands directly into the user input, but into external data that the LLM processes - e.g., web pages, documents, emails, or database entries. A RAG system that loads a poisoned document can thus be made to exfiltrate data, generate false responses, or perform actions on behalf of the user. Especially dangerous with AI agents that have tool access, where an indirect injection can lead to automated code execution.

LLM Pentest: Security for
Your AI Applications.

OWASP Top 10 for LLM Applications

Prompt Injection

Sensitive Information Disclosure

Supply Chain

Data and Model Poisoning

Improper Output Handling

Excessive Agency

System Prompt Leakage

Vector and Embedding Weaknesses

Misinformation

Unbounded Consumption

How Attackers Target LLMs

Prompt Injection via Chat

Indirect Injection via Document

Guardrail Bypass via Roleplay

Excessive Agency - Tool Abuse

How AWARE7 Conducts an LLM Pentest

Scoping & Threat Modeling

Reconnaissance

Automated LLM Security Testing

Manual Expert Analysis

Exploitation & Proof-of-Concept

Reporting & Remediation

Specialized LLM Security Tooling

Garak

Promptfoo

Manual Analysis

OWASP Top 10 LLM

MITRE ATLAS

Custom Fuzzing

Was uns von anderen Anbietern unterscheidet

Forschung und Lehre als Fundament

Digitale Souveränität - keine Kompromisse

Festpreis in 24h - planbare Projektzeiträume

Ihr fester Ansprechpartner - jederzeit erreichbar

OWASP Top 10 for Large Language Models

Management von Cyber-Risiken

Frequently Asked Questions about LLM Pentesting

How vulnerable is your LLM system really?