Reading XML with the DOM parser in Java is a lot of work, hard to read, and inflexible. This example shows how to use Groovy's convenience class XmlParser to simplify parsing XML from a file.
The San Francisco Bay Area Rapid Transit (BART) system exposes its public data through an API for Google Transit and other developers using the General Transit Feed Specification (GTFS). We will get the Dublin/Pleasanton station train route information by calling http://api.bart.gov/api/route.aspx?cmd=routeinfo&route=11; an example of the XML is provided below. We made one small modification, adding a val attribute to each station element, for the snippets below.
Route XML to parse
<?xml version="1.0" encoding="UTF-8"?>
<root>
    <uri><![CDATA[http://api.bart.gov/api/route.aspx?cmd=routeinfo&route=11]]></uri>
    <sched_num>34</sched_num>
    <routes>
        <route>
            <name>Dublin/Pleasanton - Daly City</name>
            <abbr>DUBL-DALY</abbr>
            <routeID>ROUTE 11</routeID>
            <number>11</number>
            <origin>DUBL</origin>
            <destination>DALY</destination>
            <direction />
            <color>#0099cc</color>
            <holidays>1</holidays>
            <num_stns>17</num_stns>
            <config>
                <station val="1">DUBL</station>
                <station val="2">WDUB</station>
                <station val="3">CAST</station>
                <station val="4">BAYF</station>
            </config>
        </route>
    </routes>
    <message />
</root>
XmlParser in action
We first locate the XML file using the Java 7 Paths API, then pass the file to the parse() method. The parser returns a groovy.util.Node, and GPath expressions let us walk the tree. The test highlights four common operations: the first assert gets the name of the root element, the second walks the tree to get the text of the routeID node, the third validates that each station's value is contained in a list, and the last finds the station whose val attribute is '1'.
import java.nio.file.Path
import java.nio.file.Paths
import org.junit.Test

@Test
void parse_xml_file() {
    Path xmlFilePath = Paths
            .get("src/test/resources/com/levelup/groovy/xml/bart-route-dublin.xml")
            .toAbsolutePath()

    def root = new XmlParser().parse(xmlFilePath.toFile())

    // get the root node's name
    assert "root" == root.name()

    // get the text of the routeID element
    assert "ROUTE 11" == root.routes.route[0].routeID.text()

    // verify that every station is in the expected list
    assert root.routes.route[0].config.station.every {
        it.text() in ['DUBL', 'WDUB', 'CAST', 'BAYF']
    }

    // find the station whose val attribute is '1'
    assert "DUBL" == root.routes.route[0].config.station.find { it['@val'] == '1' }.text()
}
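To try the same GPath expressions without a file on disk, a minimal sketch can use parseText() on an inline string. It assumes Groovy 3 or later, where XmlParser lives in groovy.xml; on Groovy 2.x, drop the import, since groovy.util.XmlParser is imported automatically. The XML is a trimmed copy of the route document above, and the variable names (stations, abbreviations, byVal) are only illustrative.

```groovy
import groovy.xml.XmlParser  // Groovy 3+ (assumption); omit on Groovy 2.x

// Trimmed copy of the route XML, inlined so the script is self-contained
def xml = '''
<root>
  <routes>
    <route>
      <routeID>ROUTE 11</routeID>
      <config>
        <station val="1">DUBL</station>
        <station val="2">WDUB</station>
        <station val="3">CAST</station>
        <station val="4">BAYF</station>
      </config>
    </route>
  </routes>
</root>
'''

def root = new XmlParser().parseText(xml)
def stations = root.routes.route[0].config.station

// Spread operator: call text() on every station node
def abbreviations = stations*.text()
assert abbreviations == ['DUBL', 'WDUB', 'CAST', 'BAYF']

// Build a map from each val attribute to its station abbreviation
def byVal = stations.collectEntries { [(it.attribute('val')): it.text()] }
assert byVal['3'] == 'CAST'
```

The spread operator and collectEntries work here because the station expression yields a list of nodes, so the usual Groovy collection methods apply.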