It looks like the XPointer library here:
http://www.cs.unibo.it/projects/xslt%2B%2B/
is a viable starting point for Topaz's development. From the PLoS One perspective, it would need to support identifying arbitrary ranges within a document, including ranges that span nodes. An example of such an expression is:
start-point(string-range(id("x20060728a")/p[1],"",288,1))/range-to(end-point(string-range(id("x20060801a")/h3[1],"",39,1)))
For a single region (i.e. the start and end points share the same parent node), the expression is simpler:
string-range(/article[1]/body[1]/sec[1]/p[2],"",194,344)
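Since both forms follow a fixed pattern, they could be generated rather than written by hand. The following is only a sketch of that idea: the class and method names (XPointerExpressions, stringRange, crossNodeRange) are made up for illustration and are not part of the library linked above, and it assumes the location paths and match strings contain nothing that needs escaping.

```java
// Hypothetical helper, not part of the XPointer library discussed here.
// It only concatenates strings; nothing is evaluated against a document.
final class XPointerExpressions {
    private XPointerExpressions() {}

    /**
     * Builds string-range(path,"match",offset,length).
     * Assumption: 'match' contains no double quotes (no escaping is done).
     */
    static String stringRange(String locationPath, String match, int offset, int length) {
        return "string-range(" + locationPath + ",\"" + match + "\"," + offset + "," + length + ")";
    }

    /**
     * Builds a range whose start and end points live under different nodes,
     * using the start-point(...)/range-to(end-point(...)) pattern with
     * one-character string-ranges at each end.
     */
    static String crossNodeRange(String startPath, int startOffset, String endPath, int endOffset) {
        return "start-point(" + stringRange(startPath, "", startOffset, 1) + ")"
                + "/range-to(end-point(" + stringRange(endPath, "", endOffset, 1) + "))";
    }

    public static void main(String[] args) {
        // Reproduces the single-region expression from this note.
        System.out.println(stringRange("/article[1]/body[1]/sec[1]/p[2]", "", 194, 344));
        // Reproduces the cross-node expression from this note.
        System.out.println(crossNodeRange("id(\"x20060728a\")/p[1]", 288,
                "id(\"x20060801a\")/h3[1]", 39));
    }
}
```

Keeping the two builders separate mirrors the two cases above: one call for a region under a single parent, and a start/end pair when the selection spans nodes.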
Note that because browsers may treat white space differently, these offsets may have to be adjusted slightly to identify points accurately. One option is to pass a word or phrase instead of the empty string, and to record which occurrence of it is meant, so that the result is less sensitive to white space.
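To make the phrase-plus-occurrence idea concrete, here is a rough sketch of how a position could be located after collapsing white space. The names (PhraseAnchor, normalizeWhitespace, positionOfOccurrence) are hypothetical, and the normalization rule (every run of white space becomes one space, ends trimmed) is an assumption; it is not known to match what any particular browser does.

```java
// Hypothetical helper for illustration only.
final class PhraseAnchor {
    private PhraseAnchor() {}

    /** Assumed normalization: trim the ends and collapse each run of white space to one space. */
    static String normalizeWhitespace(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    /**
     * Returns the 1-based character position, in the normalized text, of the
     * n-th (1-based) occurrence of the phrase, or -1 if there is no such occurrence.
     */
    static int positionOfOccurrence(String text, String phrase, int occurrence) {
        String haystack = normalizeWhitespace(text);
        String needle = normalizeWhitespace(phrase);
        if (needle.length() == 0 || occurrence < 1) {
            return -1;
        }
        int from = 0;
        int index = -1;
        for (int i = 0; i < occurrence; i++) {
            index = haystack.indexOf(needle, from);
            if (index < 0) {
                return -1; // fewer occurrences than requested
            }
            from = index + 1; // step past this match start; overlapping matches are counted
        }
        return index + 1; // convert Java's 0-based index to a 1-based position
    }

    public static void main(String[] args) {
        String text = "The  cat\n sat on\tthe cat mat";
        System.out.println(positionOfOccurrence(text, "cat", 2)); // prints 20
    }
}
```

Because both the text and the phrase go through the same normalization, extra spaces, tabs, or line breaks in the source do not shift the reported position.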
The current implementation doesn't support the empty argument "" in string-range, which according to the spec means it should count from the beginning of the region. It also needs to be brought up to date for JDK 1.4 (or 1.5) and possibly extended to support DOM Level 3. In addition, it relies on some outdated libraries that have since been folded into other projects or renamed. I was able to get things running by using older versions of those libraries. Updating to the latest Xalan libraries may also be a task.
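As a rough illustration of the behaviour we want for the empty argument, the sketch below counts characters from the beginning of a node's text. It is not taken from the library: the class and method names (EmptyStringRange, resolve, parse) are hypothetical, it uses only the JDK's built-in parser, and it relies on Node.getTextContent(), which is a DOM Level 3 method and so needs JDK 1.5 or later. It returns the selected text rather than a real range object, and it treats offsets as 1-based, as in the expressions above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the library's implementation.
final class EmptyStringRange {
    private EmptyStringRange() {}

    /**
     * Returns the 'length' characters starting at 1-based 'offset' in the
     * text of 'region', counting from the beginning of the region.
     */
    static String resolve(Node region, int offset, int length) {
        String value = region.getTextContent(); // DOM Level 3: text of all descendants
        int start = offset - 1;                 // convert to Java's 0-based index
        if (offset < 1 || length < 0 || start + length > value.length()) {
            throw new IllegalArgumentException("range falls outside the region's text");
        }
        return value.substring(start, start + length);
    }

    /** Parses an XML string with the JDK's default parser. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<p>Hello <b>brave</b> new world</p>");
        // The text of <p> is "Hello brave new world"; position 7 is the 'b' of "brave".
        System.out.println(resolve(doc.getDocumentElement(), 7, 5)); // prints brave
    }
}
```

Note that the count runs over the text of child elements as well, which is what lets an offset land inside nested markup such as the <b> element in the example.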