XML

Extensible Markup Language (XML) is a data format commonly used in modern web applications. XML uses custom tags to describe data and is encoded as plain text, making it both flexible and easy to work with. The GWT class library contains a set of types designed for processing XML data.

  1. XML types
  2. Parsing XML
  3. Building an XML document

XML types

The XML types provided by GWT can be found in the com.google.gwt.xml.client package. In order to use these in your application, you’ll need to add the following <inherits> tag to your module XML file:

<inherits name="com.google.gwt.xml.XML" />

Parsing XML

To demonstrate how to parse XML with GWT, we’ll use the following XML document that contains an email message:

<?xml version="1.0" ?>
<message>
  <header>
    <to displayName="Richard" address="rick@school.edu" />
    <from displayName="Joyce" address="joyce@website.com" />
    <sent>2007-05-12T12:03:55Z</sent>
    <subject>Re: Flight info</subject>
  </header>
  <body>I'll pick you up at the airport at 8:30.  See you then!</body>
</message>

Suppose that you’re writing an email application and need to extract the name of the sender, the subject line, and the message body from the XML. Here is sample code that will do just that (we’ll explain the code in just a bit):

private void parseMessage(String messageXml) {
  try {
    // parse the XML document into a DOM
    Document messageDom = XMLParser.parse(messageXml);

    // find the sender's display name in an attribute of the <from> tag
    Node fromNode = messageDom.getElementsByTagName("from").item(0);
    String from = ((Element)fromNode).getAttribute("displayName");
    fromLabel.setText(from);

    // get the subject using Node's getNodeValue() function
    String subject = messageDom.getElementsByTagName("subject").item(0).getFirstChild().getNodeValue();
    subjectLabel.setText(subject);

    // get the message body by explicitly casting to a Text node
    Text bodyNode = (Text)messageDom.getElementsByTagName("body").item(0).getFirstChild();
    String body = bodyNode.getData();
    bodyLabel.setText(body);

  } catch (DOMException e) {
    Window.alert("Could not parse XML document.");
  }
}

The first step is to parse the raw XML text into an XML DOM structure we can use to navigate the data. GWT’s XML parser is contained in the XMLParser class. Call its parse(String) static method to parse the XML and return a Document object. If an error occurs during parsing (for example, if the XML is not well-formed), the XMLParser will throw a DOMException.

If parsing succeeds, the Document object we receive represents the XML document in memory. It is a tree composed of generic Node objects. A node in the XML DOM is the basic unit of data in an XML document. GWT contains several subinterfaces of Node which provide specialized methods for processing the various types of nodes:

  • Element - represents DOM elements, which are specified by tags in XML: <someElement></someElement>.

  • Text - represents the text between the opening and closing tag of an element: <someElement>Here is some text.</someElement>.

  • Comment - represents an XML comment: <!-- notes about this data -->.

  • Attr - represents an attribute of an element: <someElement myAttribute="123" />.

Refer to the documentation for the Node interface for a complete list of types that derive from Node.

To get to the DOM nodes from the Document object, we can use one of three methods. The getDocumentElement() method retrieves the document element (the top element at the root of the DOM tree) as an Element. We can then use the navigation methods of the Node class from which Element derives (e.g., getChildNodes(), getNextSibling(), getParentNode(), etc.) to drill down and retrieve the data we need.

We can also go directly to a particular node or list of nodes using the getElementById(String) and getElementsByTagName(String) methods. The getElementById(String) method will retrieve the Element with the specified ID. If you want to use ID’s in your XML, you’ll need to supply the name of the attribute to use as the ID in the DTD of the XML document (just setting an attribute named id will not work). The getElementsByTagName(String) method is useful if you want to retrieve one or more elements with a particular tag name. The list of elements will be returned in the form of a NodeList object, which can be iterated over to get the individual Nodes it contains.

In the example code, we use the getElementsByTagName(String) method to retrieve the necessary elements from the XML containing the email message. The sender’s name is stored as an attribute of the <from> tag, so we use getAttribute(String). The subject line is stored as text inside the <subject> tag, so we first find the subject element, and then retrieve its first (and only) child node and call getNodeValue() on it to get the text. Finally, the message body is stored in the same way (text within the <body> tag), but this time we explicitly cast the Node to a Text object and extract the text using getData().

Building an XML document

In addition to parsing existing documents, the GWT XML types can also be used to create and modify XML. To create a new XML document, call the static createDocument() method of the XMLParser class. You can then use the methods of the resulting Document to create elements, text nodes, and other XML nodes. These nodes can be added to the DOM tree using the appendChild(Node) and insertBefore(Node, Node) methods. Node also has methods for replacing and removing child nodes (replaceChild(Node, Node) and removeChild(Node), respectively).