How to disable XXE processing?

by Eric Therond

In my last post I talked about XXE vulnerabilities found in popular open-source projects and, more generally, how to assess this type of issue. Today, I’ll talk about the different strategies to disable XXE processing.

External (XXE) and internal entities are useful for building concise XML documents. The appropriate solution to prevent XXE vulnerabilities depends on your project's needs. It can be as easy as completely disabling external entities, or slightly more involved: carefully resolving only the ones that you need and trust.

Since the Java language, and especially the JAXP API, offers more options than any other language we investigated, our code examples and solutions will be mainly in Java, but we show equivalent strategies for other languages as well.

Disabling DOCTYPE

As we discussed earlier, entities are declared in the DOCTYPE of an XML document and so when DOCTYPE declarations are not required in a project, an easy and safe solution is to disable them completely. 

The disallow-doctype-decl feature, when set to true, instructs the XML processor to throw an exception when a DOCTYPE declaration is encountered:

factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
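To see the effect in isolation, here is a minimal sketch, assuming the JDK's built-in (Xerces-derived) DOM parser and the Xerces-style feature URI http://apache.org/xml/features/disallow-doctype-decl. The class and method names are made up for illustration, and other JAXP implementations may not recognize this feature.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of any library.
class DisallowDoctypeExample {

    // Returns true if the hardened parser rejects the document, false if it parses it.
    static boolean rejects(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Assumption: the underlying parser understands this Xerces-style feature URI.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            // The parser refused the document (here: because of its DOCTYPE).
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        String withDoctype = "<!DOCTYPE r [<!ENTITY e \"v\">]><r>&e;</r>";
        String plain = "<r>v</r>";
        System.out.println("DOCTYPE rejected: " + rejects(withDoctype));
        System.out.println("plain XML rejected: " + rejects(plain));
    }
}
```

Note that even a harmless internal entity is refused here, because the whole DOCTYPE is forbidden. That is the trade-off of this strategy.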

Disabling external entity declarations

A less strict fix is to allow DOCTYPE declarations and only prohibit external entity declarations. In this configuration, the XML processor raises an exception if an external entity is found, but processes other DTD declarations normally. Parameter and general external entities are disabled by setting both of the following features to false:

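factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);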
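A quick way to check this configuration is to point an external entity at a throwaway local file and verify that its content never reaches the DOM. The following is only a sketch under assumptions: it uses the standard SAX feature URIs for general and parameter external entities, the class name and the TOP-SECRET marker are invented for the demo, and it treats both possible outcomes (the parser raising an error, or the reference being left unresolved) as safe, since the exact behavior can vary between parsers.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of any library.
class ExternalEntitiesDisabledExample {

    // Parses a document whose external entity points at `target` and returns
    // the root element's text, or null if the parser rejects the document.
    static String parseWithExternalEntity(Path target, boolean harden) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        if (harden) {
            factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
            factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        }
        String xml = "<!DOCTYPE r [<!ENTITY x SYSTEM \"" + target.toUri() + "\">]><r>&x;</r>";
        try {
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException e) {
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        Path secret = Files.createTempFile("xxe-demo", ".txt");
        Files.write(secret, "TOP-SECRET".getBytes(StandardCharsets.UTF_8));
        try {
            String text = parseWithExternalEntity(secret, true);
            boolean leaked = text != null && text.contains("TOP-SECRET");
            System.out.println("file content leaked: " + leaked);
        } finally {
            Files.deleteIfExists(secret);
        }
    }
}
```

Calling the same method with harden set to false gives a baseline for how your parser behaves without the two features.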

PHP's libxml library is safe by default because external entities are disabled unless the LIBXML_NOENT parameter is explicitly set to allow them:

$doc = simplexml_load_string($xml, "SimpleXMLElement", LIBXML_NOENT); // !XXE enabled!
$doc = simplexml_load_string($xml, "SimpleXMLElement"); // XXE disabled

Note: the LIBXML_NOENT parameter name is misleading. The "NOENT" suffix means that no entity reference nodes are created in the DOM tree; instead, each entity is substituted with its content.

Enabling secure processing

The Java JAXP Feature for Secure Processing (FSP) can be explicitly enabled as follows:

factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

It is the central Java mechanism for configuring an XML processor securely by applying restrictions to prevent potential risks such as XML denial of service attacks and XXE vulnerabilities. 

By default, FSP is partially enabled and prevents XML denial of service attacks. However, it is only when FSP is explicitly and fully enabled, by calling the setFeature method to set the FSP property to true, that external connections are also expected to be disallowed. Unfortunately, this is not the case for all XML processors. For instance, on Apache Xerces, FSP doesn’t restrict external connections and thus doesn’t protect against XXE vulnerabilities.

Therefore, be sure to test FSP behavior with regard to XXE vulnerabilities and use additional properties, such as the others we present in this post, to explicitly and directly disable or restrict XXEs.
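One way to run such a test is a small probe that enables FSP and then checks whether the content of an external entity still ends up in the parsed document. This is a sketch, not a complete test suite: the class name, method name and the PROBE-MARKER string are invented here, it only covers file-based general entities on a DOM parser, and the result depends on which JAXP implementation is on your classpath.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical probe class, not part of any library.
class SecureProcessingProbe {

    // Returns true if a parser built from `factory` lets the content of an
    // external entity (a temporary local file) reach the DOM.
    static boolean leaksExternalEntity(DocumentBuilderFactory factory) throws Exception {
        Path secret = Files.createTempFile("fsp-probe", ".txt");
        Files.write(secret, "PROBE-MARKER".getBytes(StandardCharsets.UTF_8));
        String xml = "<!DOCTYPE r [<!ENTITY x SYSTEM \"" + secret.toUri() + "\">]><r>&x;</r>";
        try {
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent().contains("PROBE-MARKER");
        } catch (SAXException e) {
            // The parser refused to process the external entity.
            return false;
        } finally {
            Files.deleteIfExists(secret);
        }
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        System.out.println("parser factory: " + factory.getClass().getName());
        System.out.println("XXE still possible with FSP: " + leaksExternalEntity(factory));
    }
}
```

Printing the factory class name helps confirm which implementation was actually tested, since a Xerces jar on the classpath can replace the JDK's built-in parser.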

Disabling entity reference expansion

For each entity reference (&entityname;) found in the XML document, a DOM XML parser either replaces the reference with its value or creates an “empty” entity reference node in the DOM tree, depending on its configuration. The mechanism of replacing entity references with their value, also known as “expanding entity references”, can disclose sensitive information if a maliciously crafted XML file is parsed, as we discussed in the first post in this series.

In Java, the setExpandEntityReferences method of the DocumentBuilderFactory class is used to configure how entity references are handled:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setExpandEntityReferences(false);

When called with false, entity references are not expanded, preventing XXE vulnerabilities. 

An important caveat is that the Xerces processor provided with OpenJDK prior to version 13 doesn’t honor setting the expandEntityReferences property to false; entity references are always expanded. The best course is to upgrade OpenJDK, but if you can't do that, rule S2755 is able to detect the affected configurations.
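The difference between the two modes can be observed by looking at the node the parser creates for a reference. The sketch below uses a harmless internal entity so that it stays self-contained; the class name and the greet entity are invented for the demo. It only inspects the type of the first child node, so on an OpenJDK older than 13 you would additionally want to check whether that node carries expanded children.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of any library.
class EntityReferenceNodeExample {

    // Returns the DOM node type of the first child of the root element.
    static short firstChildType(boolean expand) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setExpandEntityReferences(expand);
        String xml = "<!DOCTYPE r [<!ENTITY greet \"hello\">]><r>&greet;</r>";
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getFirstChild().getNodeType();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("expand=true  -> text node: "
                + (firstChildType(true) == Node.TEXT_NODE));
        System.out.println("expand=false -> entity reference node: "
                + (firstChildType(false) == Node.ENTITY_REFERENCE_NODE));
    }
}
```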

The equivalent feature with the C/C++ Xerces library is:

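xercesc::XercesDOMParser *DOMparser = new xercesc::XercesDOMParser();
DOMparser->setCreateEntityReferenceNodes(true); // keep references as nodes instead of expanding them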

Creating entity reference nodes means that entity references are not expanded and thus don’t result in external content disclosures. Unfortunately, getting these settings configured correctly can be difficult because the method names are not self-explanatory and it is easy to get confused. For example, we recently contributed to an improvement of the OWASP C++ guidelines. Previously, they wrongly recommended setting this parameter to false instead of true.

Therefore, rule S2755 for C++ will be triggered if you still rely on the old OWASP recommendation. This was, for example, the case for the msix-packaging Microsoft open-source project, a C++ tool to pack and unpack MSIX packages.

Expanding (or not) external entity references occurs after the external content has already been fetched. So even if expansion is disabled and attackers cannot exfiltrate data, requests to external resources are still performed. In this situation a security risk exists, but it could be considered low since an attacker cannot do much more than a “blind SSRF” attack. If that is not acceptable in your context, then you should consider one of the solutions discussed above.

Note: a blind SSRF happens when an attacker can trick the server, in this case the XML processor, into performing an arbitrary request without being able to retrieve the response content. Suppose, for example, that an internal API is accessible to the XML processor. An attacker can then make the processor send a request to one of its endpoints. In the worst case, depending on the XML processor's error handling, the attacker may be able to guess the existence of a username.


In this post, we saw how to configure your XML parser to prevent XXE vulnerabilities, from disabling XXE declarations, if you don't need them at all, to disabling reference expansion, when you want to allow XXE declarations and fetching but not substitution. But sometimes your project may require an even more flexible and precise fix that limits resolution to specific XXEs only, the ones that you expect and that are safe. This is what we'll see in a third and final blog post.
