Symmetric Queries in XML


Also in the XML session, Shuohao Zhang from Washington State University spoke on Symmetrically Exploiting XML. This paper was nominated as a best student paper. (See the paper.)

XML queries are asymmetric because they're hierarchical. Rearranging the hierarchy requires changing the query. This work is aimed at making a single query work across multiple structures. This is useful when you don't know what the schema is, for heterogeneous or irregular data, or when the schema evolves.

Axes (parent, child, ancestor, descendent, preceding, following, etc.) are all directional. This work proposes a non-directional axes called closest. The semantics is a function that takes the context node and returns a sequences of closest nodes. You can ask for closest::* and get all the nodes on any axis that are closest. Asking for closest::price will return the closest price node on any axis. Node selection is limited by a minimum distance between nodes.

The naive approach to implementation is to compute Closest for every node. This has time complexity O(sn2) where s is the number of nodes in the signature and n is the number of nodes.

A better approach is to convert non-directional expressions into directional queries. The advantage of this is that you can use an existing directional query engine a the basis for implementing this. Conversion is linear and fast enough relative to the query to add very little to the overall computation.

This looks like a pretty simple and efficient addition to XPath (and hence into XQuery and XSLT) that gives much increased flexibility.


Please leave comments using the Hypothes.is sidebar.

Last modified: Thu Oct 10 12:47:18 2019.