Tuesday, May 26, 2015

Java XML Processing Vulnerabilities

Last week I was fixing issues in my pet project Scribble. I use Sonar for capturing issues in my code. Since April this year, the Findbugs plugin for Sonar includes rules for finding security bugs. Two of the bugs it found were related to XML processing using Java's XML APIs for XPath and DOM parsing. The security issues themselves are not new; both were discovered some years ago. But they were new to me, as I was not aware of them at all. For my pet project they are not that critical, as it is just a framework for writing tests, and nothing keeps anyone using that framework from writing vulnerable code themselves. But for me it was a good case for studying the issues, to avoid them when it really matters.


XPath Injection

XPath injection follows the same principle as SQL injection: parameter values that are used in an XPath expression contain characters that have a meaning in the XPath syntax, in order to break out of the path defined by the expression.

The Attack

Suppose you have an XML document containing sensitive data

  <user id="reader">
  <user id="writer">
and an XPath expression with a parameter that is filled in at runtime:
Let's assume the attacker has authenticated successfully as reader and now tries to query for the private key, manipulating its own user id to that value:


The injected value leaves the reader user's subpath, traverses one level up and then down into the writer user's subpath, and thereby delivers the privateKey of that user. A variation of this attack applies if the authentication data of a webapp is stored in XML, e.g. in an XML database. With a forged userId the system can be tricked into authenticating without a proper password.
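The document and the expression of the example are not preserved above, so the following is only a minimal sketch of the attack under assumptions: the XML layout (a users root with user elements that each hold a privateKey), the key values, the expression, and the class and method names are all made up for illustration. It shows how a value that is concatenated into the expression changes the path that gets evaluated.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the Scribble project.
final class XPathInjectionDemo {

  // Assumed document layout; element names and key values are invented.
  static final String USERS_XML =
      "<users>"
      + "<user id=\"reader\"><privateKey>reader-key</privateKey></user>"
      + "<user id=\"writer\"><privateKey>writer-key</privateKey></user>"
      + "</users>";

  // VULNERABLE on purpose: the user id is concatenated into the expression.
  static String lookupPrivateKey(String userId) {
    final XPath xp = XPathFactory.newInstance().newXPath();
    final String expression = "/users/user[@id='" + userId + "']/privateKey";
    try {
      // evaluates the expression and returns the string value of the first match
      return xp.evaluate(expression, new InputSource(new StringReader(USERS_XML)));
    } catch (XPathExpressionException e) {
      throw new IllegalArgumentException("Invalid expression: " + expression, e);
    }
  }
}
```

With the regular id reader, the lookup yields that user's own key. With the forged id reader']/../user[@id='writer the concatenated expression becomes /users/user[@id='reader']/../user[@id='writer']/privateKey and selects the writer's key instead.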

The Defense

The only effective defense is to sanitize the user input! Typically, a regex pattern helps by allowing only input of a certain form, e.g. only alphanumeric characters within a specific length range (5 to 15 characters):
if (!userId.matches("[a-zA-Z0-9]{5,15}")) {
  throw new Exception("Invalid Input");
}
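If reserved characters should be allowed, you may escape them. Note that XPath 1.0, which is what javax.xml.xpath supports, has no escape sequence for quotes inside a string literal; only XPath 2.0 allows a quote to be escaped by doubling it:
String escapedUserId = userId.replace("'", "''");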
That, however, may be prone to further injection that circumvents the escaping, so a self-implemented escaping should be tested thoroughly. Both pattern matching and escaping can be encapsulated in a javax.xml.xpath.XPathVariableResolver that is registered at the XPath instance. The following example shows a sanitizing variable resolver that accepts a set of regular expressions to check the parameters that should be resolved:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathVariableResolver;

public class SanitizingVariableResolver implements XPathVariableResolver {
  //create a map to contain the variable values
  private final Map<QName, String> variables = new HashMap<>();
  //keep a list of all valid patterns
  private final List<Pattern> validationPatterns;

  //constructor accepting regular expression patterns
  public SanitizingVariableResolver(String... regexPatterns) {
    this.validationPatterns = new ArrayList<>();
    for (String regexPattern : regexPatterns) {
      this.validationPatterns.add(Pattern.compile(regexPattern));
    }
  }
  //method to add variable value on which the sanity check is applied
  public void addVariable(String name, String value) {
    for (Pattern pattern : validationPatterns) {
      if (pattern.matcher(value).matches()) {
        variables.put(new QName(name), value);
        return;
      }
    }
    //don't accept invalid values
    throw new IllegalArgumentException(
        "The value '" + value + "' is not allowed for a variable");
  }
  //resolve the variable to its previously checked value
  public Object resolveVariable(QName variableName) {
    return this.variables.get(variableName);
  }
}
Next, you'll have to apply this resolver to your XPath instance and use an XPath expression with a variable placeholder:
//create new xpath instance
final XPath xp = XPathFactory.newInstance().newXPath();

//instantiate the resolver with an alphanumeric pattern
final SanitizingVariableResolver resolver = 
  new SanitizingVariableResolver("[a-zA-Z0-9]{4,15}");

//add the user id value
resolver.addVariable("userId", userId);

//assign the resolver to the xpath instance
xp.setXPathVariableResolver(resolver);

//apply the xpath expression with variable
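Put together, the steps above could look like the following self-contained sketch. The XML document, the expression /users/user[@id=$userId]/privateKey and the class name are assumptions for illustration only. To stay runnable on its own, the sketch uses an inline lambda in place of the SanitizingVariableResolver and performs the pattern check before binding the variable.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; document layout and expression are assumptions.
final class XPathVariableDemo {

  static final String USERS_XML =
      "<users>"
      + "<user id=\"reader\"><privateKey>reader-key</privateKey></user>"
      + "<user id=\"writer\"><privateKey>writer-key</privateKey></user>"
      + "</users>";

  // alphanumeric pattern, 4 to 15 characters
  static final Pattern USER_ID = Pattern.compile("[a-zA-Z0-9]{4,15}");

  static String lookupPrivateKey(String userId) {
    // reject anything that is not a plain alphanumeric id
    if (!USER_ID.matcher(userId).matches()) {
      throw new IllegalArgumentException("Invalid user id");
    }
    final XPath xp = XPathFactory.newInstance().newXPath();
    // resolve $userId to the checked value; the value is handed over as data
    // and never becomes part of the expression text
    xp.setXPathVariableResolver(
        name -> "userId".equals(name.getLocalPart()) ? userId : null);
    try {
      return xp.evaluate("/users/user[@id=$userId]/privateKey",
          new InputSource(new StringReader(USERS_XML)));
    } catch (XPathExpressionException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A call with the id reader returns that user's key, while a forged id containing quotes or slashes is rejected with an IllegalArgumentException before any expression is evaluated.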

As an alternative to sanitizing the input yourself, you may use other technologies such as XQuery, which provides an abstraction layer on top of XPath and offers means to sanitize parameter input.



XML External Entity (XXE)

XML documents have to be well-formed and may be validated. For validation, there are two options for declaring a structure against which the document is validated: Document Type Definition (DTD) or XML Schema. A DTD may be embedded in the document itself. XML has the concept of entities to describe characters or values that are parsed and replaced by the XML processor. A common example is the &amp; entity for describing an ampersand character ('&'), because '&' is a reserved character in XML. Within a DTD, custom entities can be declared. Values for those entities can be characters, but also the content of external resources indicated by a URI.

The Attack

In an XXE attack, the attacker sends a prepared XML file containing a malicious entity. The entity points to an external resource containing a secret, e.g. /etc/passwd. Depending on what the service actually does with the parsed document, the attacker may easily read the secret from it.
A prepared XML document may be

<?xml version="1.0"?>
<!DOCTYPE document [
    <!-- placeholder for the attacked file url -->
    <!ENTITY xxe SYSTEM "/etc/passwd" >
    <!-- the external entity is replaced with the injected value -->
When processed by a DocumentBuilder, &xxe; is resolved to the content of /etc/passwd, which is then accessible as the text content of the element. The attack works just as well when the XML is processed with a SAX parser.

The Defense

There are several options to fix this vulnerability. Probably the easiest one is to use XML Schema only for validation and to disallow DOCTYPE declarations altogether by setting the DocumentBuilderFactory feature http://apache.org/xml/features/disallow-doctype-decl to true:
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
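To check that the setting has the intended effect, a small sketch like the following can help. The class name, the method name and the test document are made up for illustration; the parser is expected to abort with a fatal error as soon as it encounters the DOCTYPE declaration, so the referenced file is never read.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class for illustration.
final class DoctypeRejectionDemo {

  // Returns true if the hardened parser refuses to parse the document.
  static boolean isRejected(String xml) {
    try {
      DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
      f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
      f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
      return false;
    } catch (SAXParseException e) {
      // fatal parse error, in this scenario caused by the DOCTYPE declaration
      return true;
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

A document that carries a DOCTYPE with an external entity is rejected, while a plain document without a DOCTYPE is still parsed.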

This feature, however, is only supported by Xerces 2. If you're on Xerces 1, or you cannot disallow DOCTYPE declarations, you could disable the features

Xerces 1
Xerces 2
SAX in general

and set on the DocumentBuilderFactory the flags
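The concrete feature names and flags are not preserved in the text above. As a rough sketch, assuming the standard SAX feature URIs for external entities, the Xerces feature for loading external DTDs and the JAXP factory setters (these may differ from the list originally given here), such a configuration could look like this. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; the selection of features is an assumption.
final class ExternalEntityFeaturesDemo {

  static DocumentBuilderFactory newRestrictedFactory() throws ParserConfigurationException {
    DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
    // do not include external general or parameter entities
    f.setFeature("http://xml.org/sax/features/external-general-entities", false);
    f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    // do not load an external DTD when not validating
    f.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    // flags on the factory itself
    f.setXIncludeAware(false);
    f.setExpandEntityReferences(false);
    return f;
  }

  // Parses the document and returns the text content of its root element.
  static String parseText(String xml) {
    try {
      Document doc = newRestrictedFactory().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
    } catch (ParserConfigurationException | SAXException | IOException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

With this configuration the DOCTYPE is still accepted, but the external entity is skipped instead of being resolved, so the text of the element does not contain the content of the referenced file.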
Oracle proposes two alternative approaches. The first is to perform the parse operation in a privileged context with a no-permission ProtectionDomain, so that the Java security policy is in effect and prevents access to restricted system files. The second is to use an EntityResolver and allow only entities that match a certain pattern.
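The second approach could be sketched as follows. This is not Oracle's code but a minimal illustration under assumptions: the whitelist pattern, the example.org URL and the class and method names are made up, and entities that do not match the pattern make the parse fail instead of being silently skipped.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class for illustration.
final class WhitelistEntityResolverDemo {

  // Made-up whitelist: only entities below this location are allowed.
  static final Pattern ALLOWED = Pattern.compile("https://example\\.org/entities/.*");

  static DocumentBuilder newBuilder() throws ParserConfigurationException {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    builder.setEntityResolver((publicId, systemId) -> {
      if (systemId != null && ALLOWED.matcher(systemId).matches()) {
        return null; // null lets the parser resolve the entity as usual
      }
      throw new SAXException("External entity not allowed: " + systemId);
    });
    return builder;
  }

  // Returns true if parsing fails, e.g. because an entity was not allowed.
  static boolean isBlocked(String xml) {
    try {
      newBuilder().parse(new InputSource(new StringReader(xml)));
      return false;
    } catch (SAXException e) {
      return true;
    } catch (ParserConfigurationException | IOException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

The resolver is consulted before the parser opens the external resource, so an entity pointing to a local file fails the parse without the file being read.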
Further attacks against DTD, Schema and entities, and how to defend against them, are discussed in "XML Schema, DTD, and Entity Attacks" (pdf).


All examples, including JUnit tests that can be used as templates to test your own code, can be found at https://github.com/gmuecke/whoopdicity/tree/master/examples