Topaz Public REST Interface

This page discusses the design for exposing a REST-style programming interface from Topaz.

We plan to let applications built on top of the Topaz platform (for example, the PLoS Publishing System) quickly expose REST-like interfaces to the Web. We believe this is feasible given the new architecture of Topaz, which falls into a category of software we call OTM (Object to Triples Mapping). OTM is analogous to ORM (Object to Relational Mapping), except that the persistent store is a triples database (RDF, RDFS, etc.). Application developers define their business objects and annotate them with appropriate markers so that Topaz can persist the objects in the underlying RDF store. A very useful side effect is that Topaz has a meta-level view of the business objects: it knows the structure of the application's Java objects, including their fields and methods.

Please note that the URIs shown on this page are not syntactically correct; for now they are only meant to convey the general idea.

Hopefully this will be clearer with a specific example. PLoS Publishing System has a business object called "Article" which is roughly designed as follows:

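import java.util.List;

public class Article {
    private String          title;
    private List<String>    authors;
    // "abstract" is a reserved word in Java, so the abstract field needs another name
    private String          articleAbstract;
    private Annotations[]   annotations;    // Annotations is another business object
}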

Once this class is registered with Topaz, Topaz knows that Article has the fields title, authors, abstract, and annotations, of type String, List<String>, String, and Annotations[] respectively. This knowledge allows for very powerful construction of dynamic URIs (the key to REST interfaces). For example, http://www.plos.org/articles/author="John Smith" could be attached to a dynamically generated action that fetches all articles authored by John Smith. This is possible because Topaz knows that Article has an authors field and can generate the query that returns the matching object(s) dynamically. Alternate syntaxes such as http://www.plos.org/Articles.author="John Smith" are also possible. This convention-based programming style is becoming quite popular (Ruby on Rails is one example) and will make it easier for external developers to exploit the information within the Topaz platform. The convention-based rules we create should cover two key aspects:

  • The URI design to access the information
  • Format of the return data value(s)
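As a rough illustration of the meta-level view described above, the sketch below uses plain Java reflection to list the fields of a business object, which is the kind of information a URI-to-query mapping would need. This is not Topaz code: the class FieldIntrospection, the describe method, and the trimmed-down Article stand-in are all assumptions made for this example.

```java
import java.lang.reflect.Field;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Topaz: lists the fields of a class via reflection.
final class FieldIntrospection {

    // Trimmed-down stand-in for the Article business object.
    static class Article {
        String title;
        List<String> authors;
        String articleAbstract;   // "abstract" itself is a reserved word in Java
    }

    // Maps each declared field name to the simple name of its type, sorted by field name.
    static Map<String, String> describe(Class<?> type) {
        Map<String, String> fields = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue;   // skip compiler-generated fields
            }
            fields.put(field.getName(), field.getType().getSimpleName());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Prints one line per field, for example "authors : List".
        describe(Article.class).forEach((name, type) -> System.out.println(name + " : " + type));
    }
}
```

A dispatcher could consult such a map to decide whether a path segment like author or title names a real field before generating a query for it.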

For the first release we will focus mostly on READ/GET operations. The one possible exception is allowing external parties to add annotations; more on this specific case later.


Example from NCBI

What is this thing?

The NCBI Resource Locator provides stable, uniform addressing for NCBI content, making it easy to link to individual records. Some NCBI resources also provide services (like search) through these URLs.

How does it work?

Each URL has the form

http://view.ncbi.nlm.nih.gov/<noun>/<verb>/<expression>

Where:

  • <noun> is an NCBI resource (e.g., pubmed, gene, nucleotide, etc.)
  • <verb> is the action to perform (e.g., search, get, etc.). If <verb> is missing, the default verb "get" is used.
  • <expression> is data used by the action to perform the request
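The noun/verb/expression split, including the default verb, could be sketched roughly as follows. This is only an illustration of the URL shape described above, not NCBI's implementation: the class ResourceLocator is hypothetical, and the set of recognized verbs is assumed to be just the two named in the text.

```java
import java.util.Set;

// Hypothetical parser for paths of the form /<noun>/<verb>/<expression>.
final class ResourceLocator {
    // Assumption: only the two verbs named in the text are recognized.
    private static final Set<String> KNOWN_VERBS = Set.of("search", "get");

    final String noun;
    final String verb;
    final String expression;

    private ResourceLocator(String noun, String verb, String expression) {
        this.noun = noun;
        this.verb = verb;
        this.expression = expression;
    }

    static ResourceLocator parse(String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        String[] nounAndRest = trimmed.split("/", 2);
        String noun = nounAndRest[0];
        String rest = nounAndRest.length > 1 ? nounAndRest[1] : "";

        String[] verbAndExpression = rest.split("/", 2);
        if (KNOWN_VERBS.contains(verbAndExpression[0])) {
            String expression = verbAndExpression.length > 1 ? verbAndExpression[1] : "";
            return new ResourceLocator(noun, verbAndExpression[0], expression);
        }
        // No recognized verb: fall back to the default verb "get".
        return new ResourceLocator(noun, "get", rest);
    }

    public static void main(String[] args) {
        ResourceLocator locator = parse("/pubmed/search/cancer");   // made-up expression
        System.out.println(locator.noun + " " + locator.verb + " " + locator.expression);
    }
}
```

Note that with an optional verb, an expression that happens to start with a verb name is ambiguous; this sketch always treats it as the verb.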

Some examples:


READ/GET URIs

The design idea is to keep the majority of read operations intuitive, so that developers do not have to *remember* URIs. Based on their knowledge of the business objects, they can build and test URIs on the fly. Some ideas:

wget http://www.plos.org/article/ could return the name and type of fields within the Article object

wget http://www.plos.org/article/graph could return the name of the Mulgara model we store articles in

This simple structure will hopefully cover the most common use cases. For more complex cases we should allow users (within defined security limits) to execute OQL/iTQL queries. Some ideas:

wget http://www.plos.org/query?itql="select...."

wget http://www.plos.org/query?oql="select...."

Of course, if someone knows of a good, user-friendly way to embed logical expressions within a URI, the two design patterns above could be merged.

PUT/POST URIs

This one is harder, because a generic PUT/POST is much more difficult to design than a generic GET. Nevertheless, we already have a very useful case for it. We would like external parties (such as the Protein Database folks) to be able to retrieve PLoS article XML, run it through their tools, and create a list of annotations that link each protein name in the article to the corresponding protein molecule in the Protein DB. It would be nice to provide them with a simple URI through which (with appropriate security controls, of course) they can upload the annotations back to PLoS, associated with the article. If this process can be automated, PLoS can form partnerships with quite a few other useful databases.


URI Grammar

It is possible to model the resources PLoS exposes (Journals, Articles, Users, and Annotations), along with their sub-resources and relations, as a resource structure graph. This provides a way to formalize a URI tree: walking the primary paths of the graph derives a URI.

  • each node type may have multiple representations
  • each resource type has defined actions

More concretely:

URI => (Journals | Articles | Users | Annotations) ["/" Representation]

Articles => "/article" ["/" Criteria]

Criteria => (Doi | Query)
Doi => "info:doi/10.1371/" LITERAL
Query => Field "=" ValueList (";" Field "=" ValueList)*
Field => "category" | "author" | "startDate" | "endDate"
ValueList => LITERAL | ("'" LITERAL (";" LITERAL)* "'")

Representation => "atom" | "xml" | "pdf"
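One possible reading of the Query production is sketched below: ";" separates Field "=" ValueList pairs outside quotes and separates LITERALs inside quotes. The grammar does not define LITERAL, so the sketch assumes it contains no "'", ";" or "=" characters. The class QueryCriteria and its parse method are hypothetical names for this example only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical parser for: Query => Field "=" ValueList (";" Field "=" ValueList)*
final class QueryCriteria {
    private static final Set<String> FIELDS = Set.of("category", "author", "startDate", "endDate");

    // Returns field -> list of literal values, in the order the fields appear.
    static Map<String, List<String>> parse(String query) {
        Map<String, List<String>> criteria = new LinkedHashMap<>();
        for (String pair : splitOutsideQuotes(query)) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + pair);
            }
            String field = pair.substring(0, eq);
            if (!FIELDS.contains(field)) {
                throw new IllegalArgumentException("Unknown field: " + field);
            }
            String value = pair.substring(eq + 1);
            if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
                // Quoted form: one or more literals separated by ";"
                String inner = value.substring(1, value.length() - 1);
                criteria.put(field, Arrays.asList(inner.split(";")));
            } else {
                criteria.put(field, List.of(value));
            }
        }
        return criteria;
    }

    // Splits on ";" but ignores any ";" that sits between single quotes.
    private static List<String> splitOutsideQuotes(String text) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : text.toCharArray()) {
            if (c == '\'') {
                quoted = !quoted;
            }
            if (c == ';' && !quoted) {
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("category='Ecology;Nutrition';author=Smith"));
    }
}
```

Because ";" plays two roles in the grammar, the quoting is what keeps a multi-valued field from being read as several separate criteria.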

GETs on /article

Request                                                     Response
/article                                                    Atom Publishing Protocol Service Discovery Document
/article/${digital object identifier}                       article with doi=${digital object identifier} as Atom Feed
/article/info%3Adoi%2F10.1371%2Fjournal.pone.0000457/xml    article with doi=info:doi/10.1371/journal.pone.0000457 in XML
/article/category='Ecology'                                 current articles in Ecology as Atom Feed

Note: digital object identifiers contain chars that must be escaped in URIs:

Character   Escaped as
:           %3A
/           %2F
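A client could produce the escaped form with the standard library rather than by hand. The sketch below uses java.net.URLEncoder; the class DoiEscaper and its methods are hypothetical names. URLEncoder performs form encoding (it would turn a space into "+"), which is assumed to be acceptable here because the identifiers shown on this page contain no spaces.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that escapes a DOI for use as a single URI path segment.
final class DoiEscaper {

    // Percent-encodes ":" and "/" (among others) so the DOI fits in one path segment.
    static String escape(String doi) {
        return URLEncoder.encode(doi, StandardCharsets.UTF_8);
    }

    // Builds the /article/<escaped doi>/xml path used in the table above.
    static String articleXmlPath(String doi) {
        return "/article/" + escape(doi) + "/xml";
    }

    public static void main(String[] args) {
        // Prints /article/info%3Adoi%2F10.1371%2Fjournal.pone.0000457/xml
        System.out.println(articleXmlPath("info:doi/10.1371/journal.pone.0000457"));
    }
}
```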

Get Atom Feed

Request                                                     Response
http://plosone.org/article/                                 new articles, all categories
http://plosone.org/article/category/${category}             new articles, category=${category}
http://plosone.org/article/category/Ecology                 new articles, category=Ecology
http://plosone.org/article/category/Ecology;Nutrition       new articles, category=Ecology and Nutrition
http://plosone.org/article/category/or(Ecology;Nutrition)   new articles, category=Ecology or Nutrition

Get Article Search Results

Request/Response

Request                                                           Response
http://plosone.org/article/${parm}/${value}/${parm}/${value}/...  articles that meet the search criteria in the specified or default format

Parms

parm            value                                                             default
category        single category | ";"-separated list (implicit "and") | or(list)  all categories
author          single author | ";"-separated list (implicit "and") | or(list)    all authors
startDate       date; returned articles have date >= startDate                    all dates
endDate         date; returned articles have date <= endDate                      all dates
maxResults      number; maximum number of results to return                       no maximum
representation  Atom | XML                                                        Atom
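To make the ${parm}/${value} path convention concrete, here is a minimal sketch of how a server might turn such a path into a parameter map. It is an assumption-laden illustration, not the PLoS implementation: the class SearchPath is hypothetical, values are kept as raw strings, and only the representation default is applied because the other defaults mean "no filter".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical parser for /article/${parm}/${value}/${parm}/${value}/...
final class SearchPath {
    private static final String PREFIX = "/article/";
    private static final Set<String> PARMS =
            Set.of("category", "author", "startDate", "endDate", "maxResults", "representation");

    static Map<String, String> parse(String path) {
        if (!path.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not an article search path: " + path);
        }
        Map<String, String> parms = new LinkedHashMap<>();
        parms.put("representation", "Atom");   // default from the Parms table

        String rest = path.substring(PREFIX.length());
        if (rest.isEmpty()) {
            return parms;                      // no criteria: all articles
        }
        String[] segments = rest.split("/");
        if (segments.length % 2 != 0) {
            throw new IllegalArgumentException("Each parm needs a value: " + path);
        }
        for (int i = 0; i < segments.length; i += 2) {
            if (!PARMS.contains(segments[i])) {
                throw new IllegalArgumentException("Unknown parm: " + segments[i]);
            }
            parms.put(segments[i], segments[i + 1]);
        }
        return parms;
    }

    public static void main(String[] args) {
        System.out.println(parse("/article/category/Ecology;Nutrition/maxResults/10"));
    }
}
```

A path whose first segment is an escaped DOI rather than a parm name is rejected by this sketch; a real dispatcher would need to try the "Get Object" form as well.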

Get "Object"

Request/Response

Request                                                                                     Response
http://plosone.org/${object}/${digital object identifier}                                   ${object} with doi=${digital object identifier} in default format
http://plosone.org/${object}/${digital object identifier}/representation/${representation}  ${object} with doi=${digital object identifier} in ${representation} format

Parms

${object}   ${digital object identifier}           ${representation}
article     info:doi/10.1371/journal.pone.#######  XML (default) | Atom | PDF
annotation  info:doi/10.1371/annotation/#          XML (default) | Atom
user        info:doi/10.1371/account/#             XML (default) | Atom

TODO: support semantic bootstrapping, e.g. a way to get the list of fields/types in an Object.


Generic Query

Request                                                                     Response
http://plosone.org/${object}/oql/${query}                                   query results in default format
http://plosone.org/${object}/oql/${query}/representation/${representation}  query results in ${representation} format

Parms

parm            value                                                    default
object          article | annotation | user                              required
query           where <condition> [<order>] [<limit>] [<offset>]         required
representation  article: XML | Atom | PDF; annotation, user: XML | Atom  XML

Conventions

  • dates should be formatted as YYYY-MM-DD (ISO 8601)
  • a list of values is an implicit "and"
  • or(value;value) is used to "or" a list of values
  • ";" is used to separate values where order is not important
  • "," is used to separate values where order is important
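The "and"/"or" and date conventions above could be interpreted along the following lines. This is a sketch under stated assumptions: the class ValueList and its members are hypothetical, individual values are assumed not to contain ";" or parentheses, and the ","-separated ordered form is not handled.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a parameter value: a list of values joined by "and" or "or".
final class ValueList {
    enum Operator { AND, OR }

    final Operator operator;
    final List<String> values;

    private ValueList(Operator operator, List<String> values) {
        this.operator = operator;
        this.values = values;
    }

    // "a;b" is an implicit "and"; "or(a;b)" is an explicit "or".
    static ValueList parse(String raw) {
        Operator operator = Operator.AND;
        String body = raw;
        if (raw.startsWith("or(") && raw.endsWith(")")) {
            operator = Operator.OR;
            body = raw.substring(3, raw.length() - 1);
        }
        return new ValueList(operator, Arrays.asList(body.split(";")));
    }

    // LocalDate.parse accepts the ISO 8601 YYYY-MM-DD form and throws
    // DateTimeParseException for anything else.
    static LocalDate parseDate(String raw) {
        return LocalDate.parse(raw);
    }

    public static void main(String[] args) {
        ValueList categories = parse("or(Ecology;Nutrition)");
        System.out.println(categories.operator + " " + categories.values);
        System.out.println(parseDate("2007-06-01"));
    }
}
```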

Experimental

This interface is experimental and may change before the 1.0 release of PLoS.

We are actively soliciting feedback on how to expose PLoS ONE in a REST-like fashion to meet your needs. Also, we enjoy hearing about how you are developing on top of PLoS ONE. Please contact us with your ideas.