Last week I was fixing issues for my pet project Scribble. I use
Sonar for capturing issues in my code. Since April this year, the
FindBugs plugin for Sonar includes rules for finding security bugs.
Two of the bugs found were related to XML processing using Java's XML
APIs for XPath and DOM parsing. The security issues themselves are
not new; both of them were discovered some years ago. But they were
new to me, as I was not aware of them at all. For my pet project they
are not that critical, as it is just a framework for writing tests,
and nothing keeps users of that framework from writing vulnerable
code themselves. But for me it was a good occasion to study the
issues so that I can avoid them when it really matters.
XPath Injection
XPath injection follows the same principle as SQL injection:
parameter values that are used in an XPath expression contain
characters that have a meaning in XPath syntax, and thereby break
out of the path defined by the expression.
The Attack
Suppose you have an XML document containing sensitive data:
<technical-users>
  <user id="reader">
    <privateKey>ABC</privateKey>
  </user>
  <user id="writer">
    <privateKey>123</privateKey>
  </user>
</technical-users>
and an XPath expression with a parameter that is filled in at
runtime:
"//technical-users/user[@id='" + userId + "']/privateKey"
Let's assume the attacker has authenticated successfully as reader
and now tries to query for the private key, manipulating their own
user id to this value:
reader']/../user[@id='writer
The injected value closes the predicate for the reader user,
traverses one level up and then down into the writer user's subpath,
thereby delivering the privateKey of that user.
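The attack described above can be reproduced with a few lines of Java. This is only a minimal sketch: the class name XPathInjectionDemo and the method lookupPrivateKey are made up for illustration, and the XML document is the one from the example, inlined as a string.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Scribble.
class XPathInjectionDemo {

    // The sensitive document from the example above.
    static final String XML =
          "<technical-users>"
        + "<user id=\"reader\"><privateKey>ABC</privateKey></user>"
        + "<user id=\"writer\"><privateKey>123</privateKey></user>"
        + "</technical-users>";

    // Vulnerable on purpose: the user id is concatenated into the expression.
    static String lookupPrivateKey(String userId) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        String expression =
            "//technical-users/user[@id='" + userId + "']/privateKey";
        return xp.evaluate(expression, new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception {
        // Legitimate lookup for the authenticated user.
        System.out.println(lookupPrivateKey("reader"));
        // Forged user id that breaks out of the predicate.
        System.out.println(lookupPrivateKey("reader']/../user[@id='writer"));
    }
}
```

The first call returns the reader's key (ABC), the second one the writer's key (123), because the forged id turns the expression into //technical-users/user[@id='reader']/../user[@id='writer']/privateKey.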
A variation of this attack applies if the authentication data of a
web application is stored in XML, e.g. in an XML database. With a
forged userId, the system can be tricked into authenticating a user
without a proper password.
The Defense
The only effective defense is to sanitize the user input!
Typically, a regex pattern helps by allowing only input of a certain
form, e.g. only alphanumeric characters within a specific length
range (5 to 15 characters):
if (!userId.matches("[a-zA-Z0-9]{5,15}")) {
    throw new Exception("Invalid Input");
}
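If reserved characters have to be allowed, you may escape them.
Note that XPath 1.0, which the JDK's javax.xml.xpath implementation
supports, has no escape sequence for quotes inside a string literal;
XPath 2.0 escapes a quote by doubling it:
String escapedUserId = userId.replace("'", "''");
Self-implemented escaping may itself be prone to further injection
that circumvents it, so it should be tested thoroughly.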
Both pattern matching and escaping can be encapsulated in a
javax.xml.xpath.XPathVariableResolver that is registered with the
XPath instance. The following example shows a sanitizing variable
resolver that accepts a set of regular expressions against which the
values of the variables to be resolved are checked:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathVariableResolver;

public class SanitizingVariableResolver implements XPathVariableResolver {
    //create a map to contain the variable values
    private final Map<QName, String> variables = new HashMap<>();
    //keep a list of all valid patterns
    private final List<Pattern> validationPatterns;

    //constructor accepting regular expression patterns
    public SanitizingVariableResolver(String... regexPatterns) {
        this.validationPatterns = new ArrayList<>();
        for (String regexPattern : regexPatterns) {
            this.validationPatterns.add(Pattern.compile(regexPattern));
        }
    }

    //method to add variable value on which the sanity check is applied
    public void addVariable(String name, String value) {
        //accept the value as soon as one of the patterns matches it entirely
        for (Pattern pattern : validationPatterns) {
            if (pattern.matcher(value).matches()) {
                variables.put(new QName(name), value);
                return;
            }
        }
        //don't accept invalid values
        throw new IllegalArgumentException(
            "The value '" + value + "' is not allowed for a variable");
    }

    @Override
    public Object resolveVariable(QName variableName) {
        return this.variables.get(variableName);
    }
}
Next, you'll have to apply this resolver to your XPath instance
and use an XPath expression with a variable placeholder:
//create new xpath instance
final XPath xp = XPathFactory.newInstance().newXPath();
//instantiate the resolver with an alphanumeric pattern
final SanitizingVariableResolver resolver =
    new SanitizingVariableResolver("[a-zA-Z0-9]{4,15}");
//add the user id value
resolver.addVariable("userId", userId);
//assign the resolver to the xpath instance
xp.setXPathVariableResolver(resolver);
//apply the xpath expression with variable
xp.evaluate("//technical-users/user[@id=$userId]/privateKey", source);
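It may help to see why a variable placeholder is safer than string concatenation even before any pattern is checked: the value of $userId is handed to the XPath engine as a plain string and never parsed as part of the expression. The following sketch illustrates this with a deliberately minimal resolver written as a lambda; the class name XPathVariableDemo and the method lookupPrivateKey are hypothetical, and the document is the one from the attack example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Scribble.
class XPathVariableDemo {

    static final String XML =
          "<technical-users>"
        + "<user id=\"reader\"><privateKey>ABC</privateKey></user>"
        + "<user id=\"writer\"><privateKey>123</privateKey></user>"
        + "</technical-users>";

    static String lookupPrivateKey(String userId) throws Exception {
        XPath xp = XPathFactory.newInstance().newXPath();
        // Minimal resolver without any validation: it only binds $userId.
        xp.setXPathVariableResolver(
            name -> "userId".equals(name.getLocalPart()) ? userId : null);
        // The expression itself is constant; the value is compared as a string.
        return xp.evaluate("//technical-users/user[@id=$userId]/privateKey",
                           new InputSource(new StringReader(XML)));
    }

    public static void main(String[] args) throws Exception {
        System.out.println("[" + lookupPrivateKey("reader") + "]");
        System.out.println("[" + lookupPrivateKey("reader']/../user[@id='writer") + "]");
    }
}
```

The legitimate id still yields ABC, while the forged id is compared literally against the id attributes, matches no user and yields an empty string. Validating the value in addition, as the SanitizingVariableResolver does, rejects such input early instead of silently returning nothing.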
As an alternative to sanitizing the input yourself, you may use
XQuery, which builds on XPath and lets you pass external values as
declared variables instead of concatenating them into the expression.
XML External Entity (XXE)
XML documents have to be well-formed and may additionally be
validated. For validation, there are two options for declaring the
structure against which the document is validated: a Document Type
Definition (DTD) or an XML Schema. A DTD may be embedded in the
document itself. XML also has the concept of entities: named
placeholders for characters or values that the XML processor parses
and replaces. A common example is the &amp; entity, which describes
an ampersand character ('&') because '&' is a reserved character in
XML. Within a DTD, custom entities can be declared. The value of such
an entity can be a string of characters, but also the content of an
external resource identified by a URI.
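To make the entity mechanism concrete, here is a small sketch that parses a document with a harmless internal entity declared in an embedded DTD. The class name EntityExpansionDemo, the method parseText and the entity name greeting are made up for this illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Scribble.
class EntityExpansionDemo {

    // A document with an embedded DTD that declares a custom internal entity.
    static final String XML =
          "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE message ["
        + "  <!ENTITY greeting \"Hello\">"
        + "]>"
        + "<message>&greeting; &amp; welcome</message>";

    // Parses the document with default settings and returns the root's text.
    static String parseText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Both the custom entity and the predefined &amp; are replaced.
        System.out.println(parseText(XML)); // prints "Hello & welcome"
    }
}
```

The application code never sees the entity references, only the replaced text. That is convenient for internal entities like this one, and it is exactly what makes external entities dangerous.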
The Attack
In an XXE attack, the attacker sends a prepared XML file
containing a malicious entity. The entity points to an external
resource containing a secret, e.g. /etc/passwd. Depending on what the
service actually does with the parsed document, the attacker may
easily read the secret from it.
A prepared XML document may look like this:
<?xml version="1.0"?>
<!DOCTYPE document [
  <!-- placeholder for the attacked file url -->
  <!ENTITY xxe SYSTEM "/etc/passwd" >
]>
<document>
  <!-- the external entity is replaced with the injected value -->
  <property>&xxe;</property>
</document>
When the document is processed by a DocumentBuilder, the &xxe;
reference is resolved to the content of /etc/passwd, which is then
accessible as the text content of the <property> element.
The attack also works when the XML is processed with a SAX parser.
The Defense
There are several options to fix this vulnerability. Probably the
easiest one is to use XML Schemas only for XML validation and to
disallow DOCTYPE declarations altogether by setting the
DocumentBuilderFactory feature
http://apache.org/xml/features/disallow-doctype-decl to true:
DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
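The effect of the disallow-doctype-decl feature can be checked with a short sketch like the following. The class name DisallowDoctypeDemo and the method isRejected are hypothetical; the document is the prepared XML from the attack example, inlined as a string.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class, not part of Scribble.
class DisallowDoctypeDemo {

    // The prepared document from the attack example.
    static final String XXE_XML =
          "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE document [<!ENTITY xxe SYSTEM \"/etc/passwd\">]>"
        + "<document><property>&xxe;</property></document>";

    // Returns true if the hardened parser refuses to parse the document.
    static boolean isRejected(String xml) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        try {
            f.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            // The parser fails as soon as it encounters the DOCTYPE declaration.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isRejected(XXE_XML));
        System.out.println(isRejected("<document><property>ok</property></document>"));
    }
}
```

The prepared document is rejected before any entity is resolved, while a document without a DOCTYPE declaration still parses normally. The price is that legitimate documents carrying a DTD are rejected as well.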
The disallow-doctype-decl feature, however, is only supported by
Xerces 2. If you're on Xerces 1, or you cannot disallow DOCTYPE
declarations, you can instead set the following features to false:
Xerces 1
http://xerces.apache.org/xerces-j/features.html#external-general-entities
http://xerces.apache.org/xerces-j/features.html#external-parameter-entities
Xerces 2
http://xerces.apache.org/xerces2-j/features.html#external-general-entities
http://xerces.apache.org/xerces2-j/features.html#external-parameter-entities
SAX in general
http://xml.org/sax/features/external-general-entities
http://xml.org/sax/features/external-parameter-entities
and additionally set the following flags on the DocumentBuilderFactory:
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);
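Put together, the generic SAX features and the two flags could be applied as in the sketch below. It assumes the parser bundled with the JDK, which accepts the SAX feature names on the DocumentBuilderFactory; the class name ExternalEntitiesOffDemo, the helper methods and the temporary file standing in for the secret are made up for illustration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Scribble.
class ExternalEntitiesOffDemo {

    // Builds a factory that keeps DOCTYPE support but skips external entities.
    static DocumentBuilderFactory hardenedFactory() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    // Parses the document with the hardened factory and returns the root's text.
    static String parseText(String xml) throws Exception {
        Document doc = hardenedFactory().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Builds the prepared document with an entity pointing at the given file.
    static String xxeDocumentFor(Path file) {
        return "<?xml version=\"1.0\"?>"
             + "<!DOCTYPE document [<!ENTITY xxe SYSTEM \"" + file.toUri() + "\">]>"
             + "<document><property>&xxe;</property></document>";
    }

    public static void main(String[] args) throws Exception {
        // A temporary file stands in for the secret, e.g. /etc/passwd.
        Path secret = Files.createTempFile("secret", ".txt");
        Files.write(secret, "top-secret".getBytes(StandardCharsets.UTF_8));
        try {
            String text = parseText(xxeDocumentFor(secret));
            System.out.println("secret leaked: " + text.contains("top-secret"));
        } finally {
            Files.delete(secret);
        }
    }
}
```

In contrast to disallowing DOCTYPE declarations, documents with a DTD are still accepted here; only the external entity is left unresolved, so the content of the file does not end up in the parsed document.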
Oracle proposes two alternative approaches. The first is to
perform the parse operation in a privileged context with a
no-permission ProtectionDomain, so that the Java security policy is
effective and prevents access to restricted system files. The second
is to use an EntityResolver and allow only entities that match a
certain pattern.
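The EntityResolver approach could look roughly like the following sketch. It is not Oracle's sample code: the class name AllowListEntityResolverDemo, the allow-list pattern with its example.org URL and the decision to substitute empty content for everything else are assumptions made for illustration.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of Scribble.
class AllowListEntityResolverDemo {

    // Hypothetical allow-list: only entities below this URL may be loaded.
    static final Pattern ALLOWED = Pattern.compile("https://example\\.org/dtd/.*");

    // Parses the document and returns the text content of the root element.
    static String parseText(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver((publicId, systemId) -> {
            if (systemId != null && ALLOWED.matcher(systemId).matches()) {
                // null tells the parser to resolve the entity the default way
                return null;
            }
            // everything else is replaced by empty content
            return new InputSource(new StringReader(""));
        });
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement().getTextContent();
    }

    // Builds the prepared document with an entity pointing at the given file.
    static String xxeDocumentFor(Path file) {
        return "<?xml version=\"1.0\"?>"
             + "<!DOCTYPE document [<!ENTITY xxe SYSTEM \"" + file.toUri() + "\">]>"
             + "<document><property>&xxe;</property></document>";
    }

    public static void main(String[] args) throws Exception {
        // A temporary file stands in for the secret, e.g. /etc/passwd.
        Path secret = Files.createTempFile("secret", ".txt");
        Files.write(secret, "top-secret".getBytes(StandardCharsets.UTF_8));
        try {
            String text = parseText(xxeDocumentFor(secret));
            System.out.println("secret leaked: " + text.contains("top-secret"));
        } finally {
            Files.delete(secret);
        }
    }
}
```

Instead of returning empty content, the resolver could also throw a SAXException to reject the whole document; which of the two is preferable depends on whether such input should be tolerated or treated as an attack.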
Further attacks against DTDs, schemas and entities, and how to defend
against them, are discussed in "XML Schema, DTD, and Entity Attacks"
(pdf).
References
All examples, including JUnit tests that can be used as a template to test your own code, can be found at
https://github.com/gmuecke/whoopdicity/tree/master/examples