Fixing XXE Vulnerabilities in Nokogiri

Written by:
Tim Kadlec

February 14, 2017

We recently added a pair of high-severity XML External Entities (XXE) vulnerabilities found in the Nokogiri library to our vulnerability database. This post explains how the vulnerabilities work and how to mitigate them in your application.

Nokogiri is a very popular library for parsing and extracting data from XML and HTML documents, with SAX and Reader interfaces also available. Nokogiri uses libraries like libxml2 and libxslt so that users can easily query these documents using XPath or even CSS3 selectors.

Understanding XML External Entities attacks

To parse a string as XML, you first pass it to Nokogiri using the XML method:

require 'nokogiri'

xml = <<-EOX
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root>
EOX

# Parse the string into a Nokogiri::XML::Document
doc = Nokogiri::XML(xml)
puts doc.to_xml

# Outputs:
# <?xml version="1.0" encoding="UTF-8"?>
# <!DOCTYPE root>

The XML standard supports external entities, which are defined by a URI. When an XML document is being parsed, the parser can make a request to that URI and include the content it finds there inside the XML document. For example, we could include an external entity located at http://0.0.0.0:8080/evil.dtd:

xml = <<-EOX
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [ <!ENTITY % remote SYSTEM "http://0.0.0.0:8080/evil.dtd"> %remote;]>
EOX

If an XML parser is set to include external entities, it opens the door for attackers to execute an XML External Entities attack by injecting a malicious entity. The results can be significant: XXE attacks have led to denial of service, port scanning, and the disclosure of confidential information.

The safest way to prevent an XXE attack is to configure your XML parser to not include external DTDs at all. Unfortunately, the underlying library that Nokogiri uses for XML parsing (libxml2) can leave applications wide open to this attack.

The vulnerability

In versions of Nokogiri prior to 1.5.4, when you parse a string that defines an external entity, Nokogiri will request that entity by default. So if we attempt to parse the previously mentioned string, a request will be made to http://0.0.0.0:8080 for evil.dtd.

Versions of Nokogiri from 1.5.4 onward have put some safeguards in place to limit the exposure of the vulnerability. Nokogiri provides two options that are related to this attack. The first is the DTDLOAD option, which defines whether or not Nokogiri should attempt to load any DTDs discovered while parsing XML.

The other option is the NONET option. If the NONET option is set, then no unknown documents can be loaded from the network.

For Nokogiri 1.5.4 and later, the default configuration has the DTDLOAD option set to false and the NONET option set to true. In other words, by default, Nokogiri will not attempt to load any DTDs defined and will not load documents over the network, which means the vulnerability cannot be exploited.

However, if a user sets the DTDLOAD option to true and also sets the NONET option to false, the vulnerability can be exploited by an attacker.

How to remediate

The issues were discovered by the Snyk security research team and disclosed to Nokogiri on January 11th. The Nokogiri team quickly triaged the issue, but unfortunately in this case Nokogiri is sort of stuck. The issue isn’t with Nokogiri itself, but with the underlying libxml2 library. Nokogiri is waiting for the libxml2 maintainers to patch the issue so that it can update accordingly.

In the meantime, if you discover that your project includes this vulnerability, there are a few steps you can take to mitigate the issue.

First, make sure that you’re using Nokogiri version 1.5.4 or later. As we discussed, versions prior to 1.5.4 are vulnerable by default.
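As a rough illustration of that version check, the sketch below compares a version string against 1.5.4 using Ruby's built-in Gem::Version. The helper name safe_default_version? and the constant are hypothetical names chosen for this example.

```ruby
require 'rubygems' # provides Gem::Version; loaded by default in modern Ruby

# First release with the safer defaults, per the discussion above.
MINIMUM_SAFE_VERSION = Gem::Version.new('1.5.4')

# Hypothetical helper: true when the version string is 1.5.4 or later.
def safe_default_version?(version_string)
  Gem::Version.new(version_string) >= MINIMUM_SAFE_VERSION
end

puts safe_default_version?('1.5.3')   # false
puts safe_default_version?('1.5.4')   # true
puts safe_default_version?('1.7.0.1') # true
```

In an application that already loads Nokogiri, you could pass Nokogiri::VERSION to the helper to check the installed release.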

Once you’ve updated Nokogiri, double check your settings to make sure that you haven’t configured DTDLOAD to be true and NONET to be false. By default you should be set, but if you’ve made both of those changes you’re currently vulnerable. Note that changing just one of the two does not expose you: the vulnerability is only exposed when both have been changed.

Taking these steps now will protect you from the vulnerability. If you’re monitoring your project, we’ll alert you when a fix becomes available.
