XPath tutorial
This article or chapter is incomplete and its contents need further attention. Some information may be missing or may be wrong, spelling and grammar may have to be improved, use your judgment!
Web technology tutorials series
Intermediate
Page created by Daniel K. Schneider, 25 November 2007
Last modified by Daniel K. Schneider, 11 September 2015
1 Introduction
XPath is a language for addressing parts of an XML document. A basic understanding of XPath is needed for XSLT and XQuery programming. In this piece, we shall show how to create XSLT stylesheets that use some moderately complex XPath s
Tip: Read the free O'Reilly book: http://commons.oreilly.com/wiki/index.php/XPath_and_XPointer
Learning Objectives
Understand how to use XPath expressions, in order to use XSLT and XQuery more effectively
Learn some XSLT programming constructions (conditions and loops)
Be able to cope with most XML to HTML transformations
Prerequisites
Editing XML tutorial (being able to use a simple DTD).
XSLT Tutorial - Basics, introductory XSLT (xsl:template, xsl:apply-templates and xsl:value-of)
Next steps
XSLT to generate SVG tutorial
XQuery tutorial - basics
Materials
We use the same XML document in most examples. You can download the file or look at a wiki page:
xpath-jungle.xml
XPath tutorial - basics/XML example code
Directory: http://tecfa.unige.ch/guides/xml/examples/xpath/
Disclaimer
This is an unfinished, not very nice, introductory XPath tutorial. Cut/paste from slides with a few fixes. It needs more work, since right now it's more like a list of XPath features ... - Daniel K. Schneider
There may be typos (sorry) and mistakes (sorry again)
2 Introduction to XML Path Language
2.1 Definition and history
XPath is a language for addressing parts of an XML document. In support of this primary purpose, it also provides basic facilities for manipulation of strings, numbers and booleans.
Its 2.0 edition is described as follows: “XPath 2.0 is an expression language that allows the processing of values conforming to the data model defined in XQuery/XPath Data Model (XDM). The data model provides a tree representation of XML documents as well as atomic values such as integers, strings, and booleans, and sequences that may contain both references to nodes in an XML document and atomic values. The result of an XPath expression may be a selection of nodes from the input documents, or an atomic value, or more generally, any sequence allowed by the data model. The name of the language derives from its most distinctive feature, the path expression, which provides a means of hierarchic addressing of the nodes in an XML tree.” (XML Path Language (XPath) 2.0, W3C Recommendation 23 January 2007, retrieved 16:38, 9 February 2010 (UTC)).
XPath uses a compact non-XML syntax (to facilitate use of XPath within URIs and XML attribute values).
XPath gets its name from its use of a path notation as in URLs for navigating through the hierarchical structure of an XML document.
XPath was defined at the same time as XSLT (November 1999).
Initially, it was developed to support XSLT and XPointer (XML Pointer Language used for XLink, XInclude, etc.). Today it is also used by XQuery and other applications. Many programming languages support an XPath library, e.g. PHP5.
In plain English, XPath allows retrieving parts of an XML document. XPath expressions can be as simple as the name of an XML element or so complicated that only experts can understand them.
Specifications
XPath 1.0 (November 1999): http://www.w3.org/TR/xpath
XPath 2.0 (January 2007): http://www.w3.org/TR/xpath20/
XPath 2.0 Functions and Operators: http://www.w3.org/TR/xquery-operators/
XQuery 1.0 and XPath 2.0 Data Model (XDM): http://www.w3.org/TR/xpath-datamodel/
XPath 1.0 is used by XSLT 1.0, i.e. by the XSLT processors included in virtually every web browser as of January 2010 (and since the early 2000s in IE and Mozilla).
XPath 2.0 is, with few exceptions, a superset of XPath 1.0 and is used by XSLT 2.0, XQuery and other specifications.
Reference manuals
Transforming XML with XSLT (includes a useful overview of XSLT/XPath functions)
2.2 XSLT, XQuery and XPath
Each time a given XSLT or XQuery instruction needs to address (refer to) parts of an XML document, we use XPath expressions. XPath expressions can also contain functions, simple math and boolean expressions.
Within XSLT, XPath expressions are typically used in match, select and test attributes:
XPath expressions in an XSLT template
Below is an XQuery example taken from the XQuery tutorial - basics
for $t in fn:doc("catalog09.xml")//c3msbrick
let $n := count($t//c3mssoft)
where ($n > 1)
order by $n
return <result> {$t/title/text()} owns {$n} bricks </result>
2.3 Playtime
Most XML editors will display the XPath of a selected element. In addition, you can search with XPath expressions, i.e. test an expression that you then would use in your XSLT or XQuery code.
Below is a screenshot of the XML Exchanger editor. As you can see we entered the //participant expression in the search box. The result is a so-called node-set that is displayed in the XPath pane at the bottom.
XPath search in the Exchanger editor
3 The XPath Syntax and the document model
3.1 XPath Syntax
XPath expressions can be quite simple or very complex. An XPath expression may include a location path (i.e. "where to look"), node tests (i.e. identifying a node) and predicates (i.e. additional tests).
There are two notations for location paths.
(1) abbreviated
This simple notation allows you to locate the current node itself, its children, its parents, its attributes and combinations of these. The direction in which you move through the tree (e.g. up or down) is called an axis. The abbreviated form covers only a limited set of axes.
para means: all "para" child elements of the current context node
(2) unabbreviated
The unabbreviated location path notation allows you to search along more axes than just parents, children, and siblings.
child::para is identical to para above.
Abbreviated location paths look a bit like file paths. E.g. the following expression:
/section/title
means: find all title elements that are children of the section element directly below the root node
Syntax overview of the primary (relatively simple) XPath expressions
The picture below is not entirely correct, i.e. the "green" elements are part of the so called node-test
The result of an XPath expression can be of various data types, e.g. a set of nodes, a single node, a number, etc. Most often, the result is a set of nodes.
The formal specification of an XML path is very complex, i.e. it has about 39 clauses and is very difficult to understand.
Some expressions shown here are beyond the scope of this tutorial, don't panic!
3.2 The document model of XPath
XPath sees an XML document as a tree structure
Each piece of information (XML elements, attributes, text, etc.) is called a node. This is fairly similar to the W3C DOM model that an XML or XSLT processor would use.
Nodes that XPath can see
Root node. ATTENTION: The root node is not the XML root element. It is the parent of the root element and of any top-level comments and processing instructions (e.g. a stylesheet declaration), which are also nodes.
Elements and attributes
Special nodes like comments, processing instructions, namespace declarations.
Nodes XPath can't see
XPath looks at the final document, therefore it can't see entities and document type declarations.
The XML context
What a given XPath expression means is always defined by a given XML context, i.e. the current node in the XML tree that the processor is looking at.
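To make the idea of a context node concrete, here is a minimal Python sketch. It assumes the standard-library module xml.etree.ElementTree, which implements only a small subset of XPath, and a cut-down document that is our own simplification of the example file. The same relative path, title, yields different nodes depending on the element it is evaluated from.

```python
import xml.etree.ElementTree as ET

# Cut-down document, assumed for illustration only.
XML = """<project>
  <title>The Xpath project</title>
  <problems>
    <problem><title>Initial problem</title></problem>
  </problems>
</project>"""

project = ET.fromstring(XML)                # context node 1: <project>
problem = project.find("problems/problem")  # context node 2: <problem>

# The same relative location path, evaluated in two different contexts.
from_project = [t.text for t in project.findall("title")]
from_problem = [t.text for t in problem.findall("title")]

print(from_project)
print(from_problem)
```

In XSLT the processor sets the context for you: inside a template, the context is the node that the template matched.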
4 Using simple location paths
Below, we present a few expressions for locating nodes using the simple abbreviated syntax. As we said before, location paths can be horribly complex, but simple location paths look a bit like the file paths that you would use in HTML links or in an operating system like Unix, Windows or MacOS.
4.1 List of simple location paths
Document root node - returns the document root (which is not necessarily the XML root!)
/
Direct child element
XML_element_name
Direct child of the root node
/XML_element_name
Child of a child
XML_element_name/XML_element_name
Descendant of the root
//XML_element_name
Descendant of a node
XML_element_name//XML_element_name
Parent of a node
..
A far cousin of a node
../../XML_element_name/XML_element_name/XML_element_name
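The location paths listed above can be tried out with a few lines of Python. This is only a sketch: it assumes the standard-library module xml.etree.ElementTree (which supports a limited XPath subset and only relative paths on an element) and a shortened copy of the example document.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the tutorial's example document (assumption: trimmed for brevity).
XML = """<project>
  <title>The Xpath project</title>
  <problems>
    <problem><title>Initial problem</title></problem>
    <problem><title>Next problem</title></problem>
  </problems>
</project>"""

# fromstring() returns the root *element* (<project>), which serves as the
# context node for the relative paths below.
project = ET.fromstring(XML)

# Direct child element: title
child_titles = [t.text for t in project.findall("title")]

# Child of a child (of a child): problems/problem/title
problem_titles = [t.text for t in project.findall("problems/problem/title")]

# Descendants of the context node: .//title
# (ElementTree rejects absolute paths such as //title on an element.)
all_titles = [t.text for t in project.findall(".//title")]

# Parent of a node: .//problem/..
parents = [p.tag for p in project.findall(".//problem/..")]

print(child_titles)
print(problem_titles)
print(all_titles)
print(parents)
```

A full XPath processor, such as the one inside an XSLT engine, would also accept the absolute forms /project/title and //title.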
4.2 Example - Extracting titles from an XML file with XSLT
Let us recall that we use the same XML document in most examples. You can download the file or look at a wiki page:
xpath-jungle.xml
XPath tutorial - basics/XML example code
<?xml version="1.0"?>
<project>
  <title>The Xpath project</title>
  <participants>
    <participant>
      <FirstName>Daniel</FirstName>
      <qualification>8</qualification>
      <description>Daniel will be the tutor</description>
      <FoodPref picture="dolores_001.jpg">Sea Food</FoodPref>
    </participant>
    <participant>
      <FirstName>Jonathan</FirstName>
      <qualification>5</qualification>
      <FoodPref picture="dolores_002.jpg">Asian</FoodPref>
    </participant>
    <participant>
      <FirstName>Bernadette</FirstName>
      <qualification>8</qualification>
      <description>Bernadette is an arts major</description>
    </participant>
    <participant>
      <FirstName>Nathalie</FirstName>
      <qualification>2</qualification>
    </participant>
  </participants>
  <problems>
    <problem>
      <title>Initial problem</title>
      <description>We have to learn something about Location Path</description>
      <difficulty level="5">This problem should not be too hard</difficulty>
    </problem>
    <solutions>
      <item val="low">Buy a XSLT book</item>
      <item val="low">Find an XSLT website</item>
      <item val="high">Register for a XSLT course and do exercices</item>
    </solutions>
    <problem>
      <title>Next problem</title>
      <description>We have to learn something about predicates</description>
      <difficulty level="6">This problem is a bit more difficult</difficulty>
    </problem>
    <solutions>
      <item val="low">Buy a XSLT book</item>
      <item val="medium">Read the specification and do some exercises</item>
      <item val="high">Register for a XPath course and do exercices</item>
    </solutions>
  </problems>
</project>
Task
We would like to get a simple list of problem titles
Solution
XSLT template (file: xpath-jungle-1.xsl)
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/project">
    <html>
      <body bgcolor="#FFFFFF">
        <h1><xsl:value-of select="title" /></h1>
        Here are the titles of our problems:
        <ul>
          <xsl:apply-templates select="problems/problem" />
        </ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="problems/problem">
    <li><xsl:value-of select="title" /></li>
  </xsl:template>

</xsl:stylesheet>
(1) XSLT template for the root element
The XPath of the "match" means: apply the template to the project element node, which is a direct child of the root node
The execution context of this template is therefore the element "project"
The xsl:apply-templates expression will now tell the processor to find a rule for the problems/problem descendant.
(2) XSLT template for the problems/problem element
This second rule will be triggered by the first rule, because problems/problem is indeed a location that can be found in project element. The processor then can extract the value of the title element.
Alternatively we could have written this rule as:
<xsl:template match="problems/problem/title">
  <li><xsl:apply-templates/></li>
</xsl:template>
or
<xsl:template match="problems/problem/title">
  <li><xsl:value-of select="."/></li>
</xsl:template>
In both cases the xsl:apply-templates in the first rule should then select problems/problem/title; otherwise the built-in rules would also output the text of the description and difficulty elements.
(3) Result HTML
<html>
  <body bgcolor="#FFFFFF">
    <h1>The Xpath project</h1>
    Here are the titles of our problems:
    <ul>
      <li>Initial problem</li>
      <li>Next problem</li>
    </ul>
  </body>
</html>
Live example:
xpath-jungle-1.xml
4.3 Attribute Location Paths
Of course, XPath also allows you to locate attributes. We shall show the principle using a few examples.
(1) To find an attribute of the element in the current context use:
@attribute_name
Example:
@val
(2) Find attributes of an element in a longer location path starting from root
/element_name/element_name/@attribute_name
Example:
/project/problems/solutions/item/@val
(3) Find attributes in the whole document: //@attribute_name
As you can see you can combine element location with attribute identification.
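As a rough illustration of attribute location paths, the following Python sketch reads the val attributes from a fragment of the example document. It assumes the standard-library module xml.etree.ElementTree, which cannot return attribute nodes directly; the sketch therefore selects the owning elements and reads the attribute from each of them.

```python
import xml.etree.ElementTree as ET

# Fragment of the example document (assumption: only one <solutions> block is kept).
XML = """<project>
  <problems>
    <solutions>
      <item val="low">Buy a XSLT book</item>
      <item val="medium">Read the specification and do some exercises</item>
      <item val="high">Register for a XPath course and do exercices</item>
    </solutions>
  </problems>
</project>"""

project = ET.fromstring(XML)

# XPath: /project/problems/solutions/item/@val
# Select the item elements, then read the attribute from each one.
vals = [item.get("val") for item in project.findall("problems/solutions/item")]

# XPath: //@val, approximated by visiting every element that carries the attribute.
all_vals = [e.get("val") for e in project.iter() if "val" in e.attrib]

print(vals)
print(all_vals)
```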
4.4 Example - Create an html img link from an attribute
XML fragment
Same as above
Task
Display a list of First Names plus their food preferences
XSLT (file xpath-jungle-2.xsl)
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body bgcolor="#FFFFFF">
        <h1>What do we know about our participants ?</h1>
        Here are some food preferences:
        <ul>
          <xsl:apply-templates select=".//participant" />
        </ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="participant">
    <li><xsl:value-of select="FirstName"/>
      <xsl:apply-templates select="FoodPref"/>
    </li>
  </xsl:template>

  <xsl:template match="FoodPref">
    prefers <xsl:value-of select="."/>.
    <img src="{@picture}"/> <br clear="all"/>
  </xsl:template>

</xsl:stylesheet>
The second rule will display names of participants and launch a template for FoodPref
Note: Not all participants have a FoodPref element. If it is absent it will just be ignored.
The third rule (FoodPref) displays the text (contents) of FoodPref and then makes an HTML img tag
Parts of the result
<h1>What do we know about our participants ?</h1>
Here are some food preferences:
<ul>
  <li>Daniel prefers Sea Food. <img src="dolores_001.jpg"><br clear="all"></li>
  <li>Jonathan prefers Asian. <img src="dolores_002.jpg"><br clear="all"></li>
  <li>Bernadette</li>
  <li>Nathalie</li>
</ul>
Live example:
xpath-jungle-2.xml
4.5 Location wildcards
Sometimes (but not often!), it is useful to work with wildcards.
You have to understand that only one rule will be applied per element. Rules with wildcards have lower priority, which is also why "your rules" are applied before the system default rules.
Find all child nodes of type XML element
*
Find all child nodes (including comments, etc.)
node()
Find all element attributes
@*
Find all text nodes
text()
Combine locations
use the "|" operator, an example is just below.
Example: XSLT includes two built-in default rules. They rely on using these wildcards.
This rule applies to the document root and all other elements
<xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>
Text and attribute values are just copied
<xsl:template match="text()|@*"> <xsl:value-of select="."/> </xsl:template>
5 XPaths with predicates
Let us now scale up a bit.
A predicate is an expression that can be true or false
It is appended within [...] to a given location path and will refine results
More than one predicate can be appended to and within (!) a location path
Expressions can contain mathematical or boolean operators
Find element number N in a list
Syntax: XML_element_name [ N ]
/project/participants/participant[2]
/project/participants/participant[2]/FirstName
Find elements that have a given attribute
Syntax: XML_element_name [ @attribute_name ]
//difficulty[@level]
Find elements that have a given element as child
Syntax: XML_element_name [ XML_element_name ]
//participant[FoodPref]
Note: this is not the same as //participant/FoodPref. The latter would return a list of FoodPref elements, whereas the former returns a list of participant elements.
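The three kinds of predicates above happen to fall within the XPath subset that Python's standard-library xml.etree.ElementTree understands, so they can be tried directly. The sketch below assumes a trimmed copy of the example document and relative paths (ElementTree does not accept absolute paths on an element).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document (assumption: unused elements removed).
XML = """<project>
  <participants>
    <participant>
      <FirstName>Daniel</FirstName>
      <FoodPref picture="dolores_001.jpg">Sea Food</FoodPref>
    </participant>
    <participant>
      <FirstName>Jonathan</FirstName>
      <FoodPref picture="dolores_002.jpg">Asian</FoodPref>
    </participant>
    <participant>
      <FirstName>Bernadette</FirstName>
    </participant>
  </participants>
  <problems>
    <problem><difficulty level="5">This problem should not be too hard</difficulty></problem>
  </problems>
</project>"""

project = ET.fromstring(XML)

# Element number N: participants/participant[2]/FirstName
second = project.find("participants/participant[2]/FirstName").text

# Elements that have a given attribute: .//difficulty[@level]
levels = [d.get("level") for d in project.findall(".//difficulty[@level]")]

# Elements that have a given child: .//participant[FoodPref] returns participants ...
with_food = [p.findtext("FirstName") for p in project.findall(".//participant[FoodPref]")]

# ... whereas .//participant/FoodPref returns the FoodPref elements themselves.
food_prefs = [f.text for f in project.findall(".//participant/FoodPref")]

print(second, levels, with_food, food_prefs)
```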
Mathematical expressions
Use the standard operators, except div instead of / (since / is already used as the path separator)
+ - * div mod
mod is interesting if you want to display a long list in table format
5 mod 2 returns 1, as will "7 mod 2" and "3 mod 2"
Boolean operators (comparison, and, or)
List of operators (in decreasing order of precedence)
<=, <, >=, >
=, !=
and
or
Examples
Return all exercise titles with a grade bigger than 5.
//exercise[note>5]/title
Find elements that have a given attribute with a given value
Syntax reminder: XML_element_name [ @attribute_name="value" ]
//solutions/item[@val="low"]
Example XSLT template that will match all item elements with val="low".
<xsl:template match="//item[@val='low']">
  <xsl:value-of select="." />
</xsl:template>
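The two kinds of value tests shown above can be sketched in Python as follows. The attribute test is supported directly by the standard-library xml.etree.ElementTree module; a numeric comparison is not, so the sketch does that test in plain Python. The expression //participant[qualification>5]/FirstName is our own adaptation of the //exercise[note>5]/title example to the tutorial's document.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example document (assumption: unused elements removed).
XML = """<project>
  <participants>
    <participant><FirstName>Daniel</FirstName><qualification>8</qualification></participant>
    <participant><FirstName>Jonathan</FirstName><qualification>5</qualification></participant>
    <participant><FirstName>Bernadette</FirstName><qualification>8</qualification></participant>
    <participant><FirstName>Nathalie</FirstName><qualification>2</qualification></participant>
  </participants>
  <problems>
    <solutions>
      <item val="low">Buy a XSLT book</item>
      <item val="low">Find an XSLT website</item>
      <item val="high">Register for a XSLT course and do exercices</item>
    </solutions>
  </problems>
</project>"""

project = ET.fromstring(XML)

# XPath: //solutions/item[@val='low'] (evaluated by ElementTree itself)
low_items = [i.text for i in project.findall(".//solutions/item[@val='low']")]

# XPath: //participant[qualification>5]/FirstName
# ElementTree has no numeric comparison in predicates, so the test is done in Python.
qualified = [
    p.findtext("FirstName")
    for p in project.findall(".//participant")
    if float(p.findtext("qualification")) > 5
]

print(low_items)
print(qualified)
```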
Usually expressions also contain functions as we shall see below, examples:
Return the last five elements of a list
author[((last() - 4) <= position()) and (position() <= last())]
Return all participant nodes whose FirstName content is longer than 7 characters:
//participant[string-length(FirstName) >= 8]
5.1 Example: Retrieve selected elements
The following example will retrieve the following:
The first name of the second participant
All persons who do have a food preference
All items that have a "val" attribute set to "high"
The XSLT stylesheet (file xpath-jungle-3.xsl)
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body bgcolor="#FFFFFF">
        <h1>Retrieve selected elements</h1>
        Here is the name of participant two:
        <ul><li><xsl:value-of select=".//participant[2]/FirstName"/></li></ul>
        Here are all participant's firstnames that have a food preference:
        <ul><xsl:apply-templates select=".//participant[FoodPref]"/></ul>
        Here are all items that have a value of "high"
        <ul><xsl:apply-templates select=".//item[@val='high']"/></ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="participant">
    <li><xsl:value-of select="FirstName"/></li>
  </xsl:template>

  <xsl:template match="item">
    <li><xsl:value-of select="."/></li>
  </xsl:template>

</xsl:stylesheet>
HTML result
<html>
  <body bgcolor="#FFFFFF">
    <h1>Retrieve selected elements</h1>
    Here is the name of participant two:
    <ul>
      <li>Jonathan</li>
    </ul>
    Here are all participant's firstnames that have a food preference:
    <ul>
      <li>Daniel</li>
      <li>Jonathan</li>
    </ul>
    Here are all items that have a value of "high"
    <ul>
      <li>Register for a XSLT course and do exercices</li>
      <li>Register for a XPath course and do exercices</li>
    </ul>
  </body>
</html>
Live example:
xpath-jungle-3.xml
xpath-jungle-3.xsl
Below is a more complex example. One in short notation and the other (to do) in its equivalent long notation:
/outputTree/command/pivotTable/dimension//category[@text='Measurement']/dimension/category/cell[@text='Nominal']
6 XPath functions
XPath defines a certain number of functions. You can recognize a function because it has appended "()".
Functions are programming constructs that return various kinds of information, e.g.
true / false
a number
a string
a list of nodes
It is not always obvious what all of these functions do. For example, there are restrictions on how you can use functions (stick to examples or the reference).
last()
last() gives the number of nodes within a context
position()
position() returns the position of an element with respect to other children in the same parent
Warning: The result will include empty nodes created from whitespace between elements. To avoid this you should strip the parent nodes using
<xsl:strip-space elements="name_of_the_parent_node"/>
count(node-set)
count gives the number of nodes in a node set
We got <xsl:value-of select="count(//problem)"/> problems, i.e. it counts the problem elements retrieved and not the child elements of problem
Live example: xpath-jungle-6.xml - xpath-jungle-6.xsl
starts-with(string, string)
returns TRUE if the first string starts with the second string
//participant[starts-with(FirstName,'Berna')]
contains(string, string)
returns TRUE if the second string is part of the first
//participant[contains(FirstName,'nat')]
string-length(string)
returns the length of a string
number(string)
transforms a string into a number
sum(node-set)
computes the sum of a given set of nodes.
If necessary, does string conversion with number()
round(number)
round a number, e.g. 1.4 becomes 1 and 1.7 becomes 2
translate(string1, string2, string3)
translates string1 by replacing each character that occurs in string2 with the character at the same position in string3
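For readers who know Python, the sketch below pairs some of the functions above with rough Python counterparts. These are analogies rather than an XPath evaluation: the standard-library xml.etree.ElementTree module is used only to load the data, and the helper xpath_round is a hypothetical name introduced here to mimic the tie-breaking rule of XPath 1.0 round().

```python
import math
import xml.etree.ElementTree as ET

# Participant data from the example document (assumption: other elements omitted).
XML = """<participants>
  <participant><FirstName>Daniel</FirstName></participant>
  <participant><FirstName>Jonathan</FirstName></participant>
  <participant><FirstName>Bernadette</FirstName></participant>
  <participant><FirstName>Nathalie</FirstName></participant>
</participants>"""

participants = ET.fromstring(XML).findall("participant")
names = [p.findtext("FirstName") for p in participants]

node_count = len(participants)                       # count(//participant)
last_name = names[-1]                                # //participant[last()]/FirstName
berna = [n for n in names if n.startswith("Berna")]  # starts-with(FirstName,'Berna')
long_names = [n for n in names if len(n) >= 8]       # string-length(FirstName) >= 8


def xpath_round(x):
    """Hypothetical helper: XPath 1.0 round() returns the closest integer and,
    on a tie, the one closer to positive infinity."""
    return math.floor(x + 0.5)


print(node_count, last_name, berna, long_names)
print(xpath_round(1.4), xpath_round(1.7), xpath_round(2.5))
```

Note that Python's built-in round() uses a different tie-breaking rule (round(2.5) gives 2), which is why the sketch does not use it.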
6.1 Example: Computation of an average
We would like to compute the average of the participants' qualifications
<participant>
  <FirstName>Daniel</FirstName>
  <qualification>8</qualification>
</participant>
The XSLT stylesheet (file xpath-jungle-4.xsl)
We compute the sum of a node-set and then divide by the number of nodes
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body bgcolor="#FFFFFF">
        <h1>Qualification level of participants</h1>
        Average is
        <xsl:value-of select="sum(.//participant/qualification) div count(.//participant/qualification)"/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
HTML result
<html>
  <body bgcolor="#FFFFFF">
    <h1>Qualification level of participants</h1>
    Average is 5.75
  </body>
</html>
Live example:
xpath-jungle-4.xml
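The average shown in the result can be checked independently. The following Python sketch (an assumption-laden cross-check, not part of the original example) loads the four qualification values with the standard-library xml.etree.ElementTree module and repeats the sum div count computation.

```python
import xml.etree.ElementTree as ET

# Qualification values from the example document (assumption: other elements omitted).
XML = """<project>
  <participants>
    <participant><FirstName>Daniel</FirstName><qualification>8</qualification></participant>
    <participant><FirstName>Jonathan</FirstName><qualification>5</qualification></participant>
    <participant><FirstName>Bernadette</FirstName><qualification>8</qualification></participant>
    <participant><FirstName>Nathalie</FirstName><qualification>2</qualification></participant>
  </participants>
</project>"""

project = ET.fromstring(XML)

# Same node selection as in the stylesheet: .//participant/qualification
quals = [float(q.text) for q in project.findall(".//participant/qualification")]

# sum(...) div count(...)
average = sum(quals) / len(quals)

print(average)
```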
6.2 Example: Find first names containing 'nat'
The XSLT stylesheet (file xpath-jungle-5.xsl)
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <xsl:output method="html"/>

  <xsl:template match="/">
    <html>
      <body bgcolor="#FFFFFF">
        <h1>Do we have a "nat" ?</h1>
        First Names that contain "nat":
        <ul><xsl:apply-templates select=".//participant[contains(FirstName,'nat')]"/></ul>
        First Names that contain "nat" and "Nat":
        <ul><xsl:apply-templates select=".//participant[contains(translate(FirstName,'N','n'),'nat')]"/></ul>
      </body>
    </html>
  </xsl:template>

  <xsl:template match="participant">
    <li><xsl:value-of select="FirstName"/></li>
  </xsl:template>

</xsl:stylesheet>
Live example
xpath-jungle-5.xml
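Since no result listing is given for this example, here is a small Python sketch of what the two selections should pick out. It is an analogy under stated assumptions: the `in` operator stands in for contains(), and str.translate with str.maketrans stands in for translate(FirstName,'N','n'); the four first names are taken from the example document.

```python
# First names from the example document.
names = ["Daniel", "Jonathan", "Bernadette", "Nathalie"]

# contains(FirstName,'nat') is case-sensitive.
plain = [n for n in names if "nat" in n]

# contains(translate(FirstName,'N','n'),'nat'): map 'N' to 'n' first, then test.
fold_n = str.maketrans("N", "n")
folded = [n for n in names if "nat" in n.translate(fold_n)]

print(plain)
print(folded)
```

Only the capital N is folded here, exactly as in the stylesheet; a name such as "NATHALIE" would still not match.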
7 Union of XPaths
Union XPaths combine more than one XPath (and all the resulting nodes are returned). A typical example that we already introduced above is the built-in default rule, which means that the template matches either the root node (i.e. "/") or just any element:
<xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>
Often, this construct is used to simplify apply-templates or even templates themselves. E.g. the following rule applies to both "description" and "para" elements.
<xsl:template match="para|description"> <p><xsl:apply-templates/></p> </xsl:template>
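The effect of such a union can be imitated in Python. The standard-library xml.etree.ElementTree module does not support the "|" operator, so this sketch (using a small hypothetical document, not the tutorial's example file) walks the tree in document order and keeps every element whose name is in the wanted set.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, made up for this illustration.
XML = """<doc>
  <description>First</description>
  <section><para>Second</para><note>skipped</note></section>
  <para>Third</para>
</doc>"""

root = ET.fromstring(XML)

# Rough equivalent of //para | //description
wanted = {"para", "description"}
matches = [e.text for e in root.iter() if e.tag in wanted]

print(matches)
```

As with a real XPath union, each node appears once and the result follows document order.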
8 List of commonly used XPath expressions
Syntax element  Type of path                          Example path        Example matches
name            child element name                    project             <project> ...... </project>
/               child / child                         project/title       <project> <title> ... </title>
/               (root element)
//              descendant                            project//title      <project><problem> <title>....</title>
                                                      //title             <root>... <title>..</title> (any place)
*               "wildcard"                            */title             <bla> <title>..</title> and <bli> <title>...</title>
|               "or" operator                         title|head          <title>...</title> or <head> ...</head>
                                                      *|/|@*              All elements: root, children and attributes
.               current element                       .
../             parent element                        ../problem          <project>
@attr           attribute name                        @id                 <xyz id="test">...</xyz>
element/@attr   attribute of child                    project/@id         <project id="test" ...> ... </project>
@attr='value'   value of attribute                    list[@type='ol']    <list type="ol"> ...... </list>
position()      position of element in parent         position()
last()          number of elements within a context   last()
                                                      position()!=last()
Important:
The XML standard requires that an XML parser passes on the whitespace between elements, and XPath sees it as whitespace-only text nodes. In other words, functions such as position() or last() will also count these nodes whenever the nodes being processed include text nodes! Tell XSLT to remove them within the elements within which you need to count.
Good code:
<xsl:strip-space elements="name_of_the_parent"/>
Bad code:
you forgot to use this ..
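The whitespace nodes mentioned above can be made visible with a short Python sketch. It assumes the standard-library xml.dom.minidom module, which, like an XSLT processor without xsl:strip-space, keeps whitespace-only text nodes; the two-item list is a made-up example.

```python
from xml.dom.minidom import parseString

# Made-up list with the usual line breaks and indentation between elements.
doc = parseString("<list>\n  <item>a</item>\n  <item>b</item>\n</list>")
children = doc.documentElement.childNodes

# All child nodes, including whitespace-only text nodes.
all_nodes = len(children)

# Only the element children, which is what remains after stripping whitespace.
element_nodes = [n for n in children if n.nodeType == n.ELEMENT_NODE]

# The whitespace-only text nodes that xsl:strip-space would remove.
whitespace_nodes = [
    n for n in children if n.nodeType == n.TEXT_NODE and not n.data.strip()
]

print(all_nodes, len(element_nodes), len(whitespace_nodes))
```

Without stripping, the second item is the fourth child node of list, which explains surprising position() values.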
Example fragment where position in a list is used to compute position on the screen with SVG in HTML5:
<xsl:strip-space elements="list"/>
<xsl:template match="list">
  <rect x="10" y="105" width="{10 * count(item)}" height="5" fill="black" stroke="red"/>
  <xsl:apply-templates/>
</xsl:template>
Live example:
intro-html5.xml
9 Links
9.1 Introductory tutorials
Xpath (Wikipedia)
Zvon tutorial (lots of examples)
XPath for .NET Developers by Darshan Singh
9.2 Other
XPath Visualizer: a Windows program you can install to practice. Alternatively, just use an XML editor with XPath support.
Liquid XML has an XPath builder