XPath and Default Namespace handling
Many questions about XPath expressions not returning the expected results turn out to be related to the (ab)use of namespaces, and most often to so-called "default namespaces". This article explains the problem and provides solutions using three popular XPath implementations: Jaxen, the JAXP XPathFactory and XSLT.
What's the Problem?
Let's assume the following XML:
<catalog>
  <cd>
    <artist>Sufjan Stevens</artist>
    <title>Illinois</title>
    <src>http://www.sufjan.com/</src>
  </cd>
  <cd>
    <artist>Stoat</artist>
    <title>Future come and get me</title>
    <src>http://www.stoatmusic.com/</src>
  </cd>
  <cd>
    <artist>The White Stripes</artist>
    <title>Get behind me satan</title>
    <src>http://www.whitestripes.com/</src>
  </cd>
</catalog>
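To select all 'cd' elements we can use the XPath expression '//cd', which returns all 'cd' elements that are not declared in a namespace.
Now let's take the same XML, but this time with all elements defined in the 'http://www.edankert.com/examples/' namespace.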
And instead of prefixing all the different elements (although this would cause the same problem), we're declaring a so-called default namespace at the root element.
So the XML now looks like:
<catalog xmlns="http://www.edankert.com/examples/">
  <cd>
    <artist>Sufjan Stevens</artist>
    <title>Illinois</title>
    <src>http://www.sufjan.com/</src>
  </cd>
  <cd>
    <artist>Stoat</artist>
    <title>Future come and get me</title>
    <src>http://www.stoatmusic.com/</src>
  </cd>
  <cd>
    <artist>The White Stripes</artist>
    <title>Get behind me satan</title>
    <src>http://www.whitestripes.com/</src>
  </cd>
</catalog>
When we now use the same XPath expression '//cd', we notice that nothing is returned. This is because the specified XPath returns all 'cd' elements that have not been declared in a namespace, and in the example above all the 'cd' elements are declared in the 'http://www.edankert.com/examples/' namespace.
Namespace-Prefix mappings
We need some way to specify in our XPath expression that we are looking for all 'cd' elements in the 'http://www.edankert.com/examples/' namespace.
To handle this, the XPath specification allows us to use a QName to specify an element or an attribute. A QName can be either a name on its own ('element') or a name with a prefix ('pre:element'). This prefix, however, needs to be mapped to a namespace URI. Mapping the 'pre' prefix to the 'http://www.edankert.com/test' namespace URI allows us to find all 'element' elements defined in the 'http://www.edankert.com/test' namespace.
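Java's javax.xml.namespace.QName class models the same idea, so it can serve as a small illustration of what a prefixed name stands for. The sketch below is only an analogy to how XPath treats QNames, and the class name QNameDemo and the 'foo' prefix are made up for this example.

```java
import javax.xml.namespace.QName;

// Hypothetical demo class; only javax.xml.namespace.QName comes from the JDK.
class QNameDemo {

    // A name on its own: no prefix, no namespace.
    static final QName PLAIN = new QName("element");

    // 'pre:element' with 'pre' mapped to the namespace URI.
    static final QName PREFIXED =
            new QName("http://www.edankert.com/test", "element", "pre");

    // Same namespace URI and local name, but written with another prefix.
    static final QName OTHER_PREFIX =
            new QName("http://www.edankert.com/test", "element", "foo");

    public static void main(String[] args) {
        // The prefix is not part of the name's identity.
        System.out.println(PREFIXED.equals(OTHER_PREFIX)); // true
        // An unprefixed name is in no namespace, so it is a different name.
        System.out.println(PREFIXED.equals(PLAIN));        // false
    }
}
```

The namespace URI and the local name identify the element, while the prefix is just a local shorthand for the URI.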
In this case we could, for instance, use the 'edx' prefix and map this prefix to the 'http://www.edankert.com/examples/' namespace URI. This results in the following XPath expression, which should return all 'cd' elements that are declared in the 'http://www.edankert.com/examples/' namespace: '//edx:cd'.
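As a minimal sketch of the difference, the following uses the JAXP XPath API that ships with the JDK (described in the JAXP XPathFactory section of this article) to run both expressions against a shortened version of the namespaced catalog. The class name DefaultNamespaceDemo, the count helper and the trimmed XML are assumptions made for illustration, and the NamespaceContext is deliberately simplified.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the article's sample archive.
class DefaultNamespaceDemo {

    static final String NS = "http://www.edankert.com/examples/";

    // Shortened catalog that declares a default namespace on the root.
    static final String XML =
            "<catalog xmlns=\"" + NS + "\">"
            + "<cd><title>Illinois</title></cd>"
            + "<cd><title>Future come and get me</title></cd>"
            + "<cd><title>Get behind me satan</title></cd>"
            + "</catalog>";

    // Counts the nodes selected by the expression, with 'edx' mapped to NS.
    static int count(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Simplified mapping: only the 'edx' prefix is known.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String prefix) {
                return "edx".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String namespaceURI) {
                return NS.equals(namespaceURI) ? "edx" : null;
            }

            @Override
            public Iterator<String> getPrefixes(String namespaceURI) {
                return NS.equals(namespaceURI)
                        ? Collections.singletonList("edx").iterator()
                        : Collections.<String>emptyIterator();
            }
        });

        NodeList nodes = (NodeList) xpath.evaluate(
                expression, document, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("//cd     -> " + count("//cd"));     // 0
        System.out.println("//edx:cd -> " + count("//edx:cd")); // 3
    }
}
```

The unprefixed expression only matches elements in no namespace, so it finds nothing, while the prefixed expression finds all three 'cd' elements.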
All XPath processors allow you to specify prefix-to-namespace mappings, but how this is done depends on the specific implementation. See below for examples of how to map namespaces and prefixes using Jaxen (JDOM/dom4j/XOM), JAXP and XSLT.
Jaxen and Dom4J
The following code reads an XML document from the file system into an org.dom4j.Document and searches this document for 'cd' elements defined in the 'http://www.edankert.com/examples/' namespace.
try {
    // Read the document into a dom4j Document.
    SAXReader reader = new SAXReader();
    Document document = reader.read("file:catalog.xml");

    // Map the 'edx' prefix to the namespace URI.
    Map<String, String> map = new HashMap<String, String>();
    map.put("edx", "http://www.edankert.com/examples/");

    // Create the dom4j-specific Jaxen XPath and hand it the mapping.
    XPath xpath = new Dom4jXPath("//edx:cd");
    xpath.setNamespaceContext(new SimpleNamespaceContext(map));

    List<?> nodes = xpath.selectNodes(document);
    ...
} catch (JaxenException e) {
    // An error occurred parsing or executing the XPath
    ...
} catch (DocumentException e) {
    // the document is not well-formed.
    ...
}
The first step is to create a SAXReader, which is used to read the 'catalog.xml' document from the file system and create a dom4j-specific Document from it.
The next step is the same for all Jaxen implementations: create a HashMap of prefixes and namespace URIs.
To be able to use the Jaxen XPath functionality with dom4j, we need to create a dom4j-specific XPath object (Dom4jXPath), passing our XPath expression into the constructor.
Now that we have created the XPath object, we can provide the map of prefixes and namespace URIs to the XPath engine, wrapping this map in a SimpleNamespaceContext object, the default implementation of the Jaxen NamespaceContext interface.
The last step is to perform the search by calling the 'selectNodes()' method on the XPath object, passing the complete dom4j Document as the context node for this method.
Jaxen and XOM
XOM is the newest kid on the block of the simplified Java DOM APIs; its design promises an interface that is easy to use and easy to learn.
try {
    // Read the document into a XOM Document.
    Builder builder = new Builder();
    Document document = builder.build("file:catalog.xml");

    // Map the 'edx' prefix to the namespace URI.
    Map<String, String> map = new HashMap<String, String>();
    map.put("edx", "http://www.edankert.com/examples/");

    // Create the XOM-specific Jaxen XPath and hand it the mapping.
    XPath xpath = new XOMXPath("//edx:cd");
    xpath.setNamespaceContext(new SimpleNamespaceContext(map));

    List<?> nodes = xpath.selectNodes(document);
    ...
} catch (JaxenException e) {
    // An error occurred parsing or executing the XPath
    ...
} catch (IOException e) {
    // An error occurred opening the document
    ...
} catch (ParsingException e) {
    // An error occurred parsing the document
    ...
}
We need to create a Builder object to read the 'catalog.xml' document from the file system and to create a XOM-specific Document.
Next we create the HashMap of prefixes and namespace URIs.
To be able to use the Jaxen XPath functionality with XOM, we need to create a XOM-specific XPath object (XOMXPath), passing our XPath expression into the constructor.
After we have created the XPath object, we again provide the map of prefixes and namespace URIs to the XPath engine, wrapping this map in a SimpleNamespaceContext object.
Finally we perform the search by calling the 'selectNodes()' method on the XPath object, passing the XOM Document as the context node for this method.
Jaxen and JDOM
JDOM was the first of the simplified XML APIs.
try {
    // Read the document into a JDOM Document.
    SAXBuilder builder = new SAXBuilder();
    Document document = builder.build("file:catalog.xml");

    // Map the 'edx' prefix to the namespace URI.
    Map<String, String> map = new HashMap<String, String>();
    map.put("edx", "http://www.edankert.com/examples/");

    // Create the JDOM-specific Jaxen XPath and hand it the mapping.
    XPath xpath = new JDOMXPath("//edx:cd");
    xpath.setNamespaceContext(new SimpleNamespaceContext(map));

    List<?> nodes = xpath.selectNodes(document);
    ...
} catch (JaxenException e) {
    // An error occurred parsing or executing the XPath
    ...
} catch (IOException e) {
    // An error occurred opening the document
    ...
} catch (JDOMException e) {
    // An error occurred parsing the document
    ...
}
First we create a JDOM-specific Document using the SAXBuilder object.
Next we create the HashMap of prefixes and namespace URIs and a JDOM-specific XPath object (JDOMXPath).
After this, we can provide the map of prefixes and namespace URIs to the XPath engine, wrapping this map in a SimpleNamespaceContext object.
Finally we perform the search by calling the 'selectNodes()' method on the XPath object, passing the JDOM Document as the context node for this method.
JAXP XPathFactory
Since version 1.3, JAXP also provides a generic mechanism to perform XPath searches on XML object models.
try {
    // Build a namespace-aware DOM Document.
    DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
    domFactory.setNamespaceAware(true);
    DocumentBuilder builder = domFactory.newDocumentBuilder();
    Document document = builder.parse(new InputSource("file:catalog.xml"));

    XPathFactory factory = XPathFactory.newInstance();
    XPath xpath = factory.newXPath();
    xpath.setNamespaceContext(new NamespaceContext() {
        public String getNamespaceURI(String prefix) {
            if (prefix.equals("edx")) {
                return "http://www.edankert.com/examples/";
            } else if ...
                ...
            }
            // NULL_NS_URI is defined on javax.xml.XMLConstants.
            return XMLConstants.NULL_NS_URI;
        }

        public String getPrefix(String namespaceURI) {
            if (namespaceURI.equals("http://www.edankert.com/examples/")) {
                return "edx";
            } else if ...
                ...
            }
            return null;
        }

        public Iterator<String> getPrefixes(String namespaceURI) {
            List<String> list = new ArrayList<String>();
            if (namespaceURI.equals("http://www.edankert.com/examples/")) {
                list.add("edx");
            } else if ...
                ...
            }
            return list.iterator();
        }
    });

    Object nodes = xpath.evaluate("//edx:cd", document.getDocumentElement(), XPathConstants.NODESET);
    ...
} catch (ParserConfigurationException e) {
    ...
} catch (XPathExpressionException e) {
    ...
} catch (SAXException e) {
    ...
} catch (IOException e) {
    ...
}
First we build an org.w3c.dom.Document using the JAXP DocumentBuilderFactory functionality, making sure namespace processing is enabled.
We can now search this document by creating an XPath object using the XPathFactory.
To provide a map of prefixes and namespace URIs to the XPath engine we need to implement the NamespaceContext interface, because there is currently no default implementation available. This means implementing the getNamespaceURI, getPrefix and getPrefixes methods, making sure the methods return the correct values, also for the 'xmlns' and 'xml' namespace prefixes.
After we have provided the NamespaceContext to the XPath engine, we can evaluate our XPath expression using the evaluate method, providing our XPath expression, using the root element as the starting context and specifying XPathConstants.NODESET as the desired return type, which yields a NodeList.
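Because the interface has to be implemented by hand, it can help to write it once around a map, much like Jaxen's SimpleNamespaceContext. The following is a minimal sketch under that assumption: MapNamespaceContext and JaxpXPathExample are hypothetical names that are not part of JAXP, and the catalog is shortened and embedded as a string so the example is self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical example class, not part of JAXP or the sample archive.
class JaxpXPathExample {

    static final String NS = "http://www.edankert.com/examples/";

    static final String XML =
            "<catalog xmlns=\"" + NS + "\">"
            + "<cd><title>Illinois</title></cd>"
            + "<cd><title>Future come and get me</title></cd>"
            + "<cd><title>Get behind me satan</title></cd>"
            + "</catalog>";

    // Hypothetical helper: a NamespaceContext backed by a prefix-to-URI map.
    static class MapNamespaceContext implements NamespaceContext {
        private final Map<String, String> prefixToUri = new LinkedHashMap<>();

        MapNamespaceContext(Map<String, String> mappings) {
            prefixToUri.putAll(mappings);
            // The 'xml' and 'xmlns' prefixes have fixed bindings.
            prefixToUri.put(XMLConstants.XML_NS_PREFIX, XMLConstants.XML_NS_URI);
            prefixToUri.put(XMLConstants.XMLNS_ATTRIBUTE,
                    XMLConstants.XMLNS_ATTRIBUTE_NS_URI);
        }

        @Override
        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("prefix must not be null");
            }
            return prefixToUri.getOrDefault(prefix, XMLConstants.NULL_NS_URI);
        }

        @Override
        public String getPrefix(String namespaceURI) {
            Iterator<String> prefixes = getPrefixes(namespaceURI);
            return prefixes.hasNext() ? prefixes.next() : null;
        }

        @Override
        public Iterator<String> getPrefixes(String namespaceURI) {
            if (namespaceURI == null) {
                throw new IllegalArgumentException("namespaceURI must not be null");
            }
            List<String> result = new ArrayList<>();
            for (Map.Entry<String, String> entry : prefixToUri.entrySet()) {
                if (entry.getValue().equals(namespaceURI)) {
                    result.add(entry.getKey());
                }
            }
            return Collections.unmodifiableList(result).iterator();
        }
    }

    // Returns the text of every title selected by '//edx:cd/edx:title'.
    static List<String> titles() throws Exception {
        DocumentBuilderFactory domFactory = DocumentBuilderFactory.newInstance();
        domFactory.setNamespaceAware(true);
        Document document = domFactory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(
                new MapNamespaceContext(Collections.singletonMap("edx", NS)));

        NodeList nodes = (NodeList) xpath.evaluate("//edx:cd/edx:title",
                document.getDocumentElement(), XPathConstants.NODESET);

        List<String> titles = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            titles.add(nodes.item(i).getTextContent());
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        for (String title : titles()) {
            System.out.println(title);
        }
    }
}
```

The cast to NodeList is safe here because XPathConstants.NODESET was requested as the return type. Additional prefixes can be supported by passing a larger map to the constructor.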
XSLT
XPath was originally designed to be used with XSLT. This, and perhaps the fact that XSLT is itself an XML vocabulary, might explain why declaring prefix-to-namespace-URI mappings in XSLT seems very natural.
<xsl:stylesheet version="1.1" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="//edx:cd" xmlns:edx="http://www.edankert.com/examples/">
    <xsl:apply-templates/>
  </xsl:template>
</xsl:stylesheet>
To specify the mapping, we can simply declare a namespace URI for the 'edx' prefix, using the normal XML namespace declaration mechanism.
To get the same output as for the previous examples, we can use an xsl:template that matches our //edx:cd XPath expression.
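To try this from Java, a stylesheet along these lines can be run with the JDK's javax.xml.transform API. The sketch below makes a few assumptions for the sake of a compact, predictable result: it uses a shortened catalog without whitespace between elements, text output, and a template body that prints each title followed by a semicolon rather than the apply-templates call shown above. The class name XsltNamespaceDemo is made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of the article's sample archive.
class XsltNamespaceDemo {

    static final String XML =
            "<catalog xmlns=\"http://www.edankert.com/examples/\">"
            + "<cd><artist>Sufjan Stevens</artist><title>Illinois</title></cd>"
            + "<cd><artist>Stoat</artist><title>Future come and get me</title></cd>"
            + "<cd><artist>The White Stripes</artist><title>Get behind me satan</title></cd>"
            + "</catalog>";

    // The 'edx' prefix is declared on the template with a normal xmlns
    // attribute, so it is in scope for both the match and the select.
    static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
            + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
            + "<xsl:output method=\"text\"/>"
            + "<xsl:template match=\"//edx:cd\""
            + " xmlns:edx=\"http://www.edankert.com/examples/\">"
            + "<xsl:value-of select=\"edx:title\"/><xsl:text>;</xsl:text>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the stylesheet to the catalog and returns the text output.
    static String transform() throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(XML)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Prints: Illinois;Future come and get me;Get behind me satan;
        System.out.println(transform());
    }
}
```

The built-in template rules walk from the document root down to the 'cd' elements, where the namespaced template takes over. Without the xmlns:edx declaration the stylesheet would fail to compile, because the 'edx' prefix would be unbound.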
Conclusion
So, to be able to use XPath expressions on XML content defined in a (default) namespace, we need to specify a namespace prefix mapping. As we have seen, it does not matter what prefix the namespace is mapped to.
This same mechanism can also be used to search for elements that have been defined using a different prefix. This means that the above examples will also work on the following XML where instead of using a default namespace, the namespace has been mapped to the 'examples' prefix:
<examples:catalog xmlns:examples="http://www.edankert.com/examples/">
  <examples:cd>
    <examples:artist>Sufjan Stevens</examples:artist>
    <examples:title>Illinois</examples:title>
    <examples:src>http://www.sufjan.com/</examples:src>
  </examples:cd>
  <examples:cd>
    <examples:artist>Stoat</examples:artist>
    <examples:title>Future come and get me</examples:title>
    <examples:src>http://www.stoatmusic.com/</examples:src>
  </examples:cd>
  <examples:cd>
    <examples:artist>The White Stripes</examples:artist>
    <examples:title>Get behind me satan</examples:title>
    <examples:src>http://www.whitestripes.com/</examples:src>
  </examples:cd>
</examples:catalog>
Using the XPath expression '//edx:cd' and the namespace prefix mapping from the examples above will again return all 'cd' elements that are declared in the 'http://www.edankert.com/examples/' namespace.
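This prefix independence can be checked with a small JAXP sketch. It runs the same kind of query against a default-namespace document and against an 'examples'-prefixed document, and also tries a different query prefix. The class name PrefixIndependenceDemo, the 'foo' prefix and the reduced documents (empty 'cd' elements only) are assumptions made to keep the example short.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the article's sample archive.
class PrefixIndependenceDemo {

    static final String NS = "http://www.edankert.com/examples/";

    // Same namespace, declared once as default and once with a prefix.
    static final String DEFAULT_NS_XML =
            "<catalog xmlns=\"" + NS + "\"><cd/><cd/><cd/></catalog>";
    static final String PREFIXED_XML =
            "<examples:catalog xmlns:examples=\"" + NS + "\">"
            + "<examples:cd/><examples:cd/><examples:cd/>"
            + "</examples:catalog>";

    // Counts matches for the expression, with queryPrefix mapped to NS.
    static int count(String xml, final String queryPrefix, String expression)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Simplified mapping that only knows the single query prefix.
        xpath.setNamespaceContext(new NamespaceContext() {
            @Override
            public String getNamespaceURI(String p) {
                return queryPrefix.equals(p) ? NS : XMLConstants.NULL_NS_URI;
            }

            @Override
            public String getPrefix(String uri) {
                return NS.equals(uri) ? queryPrefix : null;
            }

            @Override
            public Iterator<String> getPrefixes(String uri) {
                return NS.equals(uri)
                        ? Collections.singletonList(queryPrefix).iterator()
                        : Collections.<String>emptyIterator();
            }
        });

        NodeList nodes = (NodeList) xpath.evaluate(
                expression, document, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count(DEFAULT_NS_XML, "edx", "//edx:cd")); // 3
        System.out.println(count(PREFIXED_XML, "edx", "//edx:cd"));   // 3
        System.out.println(count(PREFIXED_XML, "foo", "//foo:cd"));   // 3
    }
}
```

In all three cases only the namespace URI has to match; neither the prefix used in the document nor the prefix used in the query affects the result.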
Sample Code
Download any of the archives to try out the examples above.
The archives consist of the ./catalog.xml document and 4 Java code examples (in the ./src directory) to search the document using DOM, JDOM, dom4j and XOM.
To run these examples, please use the following command lines (the ';' classpath separator shown is the Windows form; on Unix-like systems use ':' instead):
Model | Command Line |
---|---|
DOM | java -cp xpath-examples.jar com.edankert.examples.dom.XPathExample |
JDOM | java -cp xpath-examples.jar;lib/jdom.jar;lib/jaxen-1.1.1.jar com.edankert.examples.jdom.XPathExample |
dom4j | java -cp xpath-examples.jar;lib/dom4j-1.6.1.jar;lib/jaxen-1.1.1.jar com.edankert.examples.dom4j.XPathExample |
XOM | java -cp xpath-examples.jar;lib/xom-1.0.jar;lib/jaxen-1.1.1.jar com.edankert.examples.xom.XPathExample |
The archive also contains the example XSLT stylesheet (./catalog.xsl). To process the XML with the stylesheet, invoke your favorite XSLT processor from the command line or use the transform.xhp project included in the ./xmlhammer-projects directory.
To be able to process the transform.xhp project and the xpath.xhp project that is also included, you will need to have the XML Hammer application installed. This can be downloaded from:
Resources
- Extensible Markup Language (XML) 1.0 (Third Edition)
  http://www.w3.org/TR/REC-xml/
- Namespaces in XML
  http://www.w3.org/TR/REC-xml-names/
- XML Path Language (XPath) Version 1.0
  http://www.w3.org/TR/xpath
- XSL Transformations (XSLT) Version 1.0
  http://www.w3.org/TR/xslt
- dom4j
  http://www.dom4j.org/
- XOM
  http://www.xom.nu/
- JDOM
  http://www.jdom.org/
- Jaxen
  http://www.jaxen.org/
- Java 5.0
  http://java.sun.com/j2se/1.5.0/
Source - http://www.edankert.com/defaultnamespaces.html#Jaxen_and_Dom4J