Topaz Public REST Interface
This page discusses the design for exposing a REST-style programming interface from Topaz.
We plan to let applications built on top of the Topaz platform (for example, the PLoS Publishing System) quickly expose REST-like interfaces to the Web. We believe this is feasible given the new architecture of Topaz, which falls into a category of software we call OTM (Object to Triples Mapping). OTM is analogous to ORM (Object to Relational Mapping), except that the persistent store is a triples database (RDF/RDFS etc.). Application developers define their business objects and "annotate" them with appropriate markers so that Topaz can persist the objects in the underlying RDF store. A very useful side effect is that Topaz has a meta-level view of the business objects: it knows the application's Java objects in detail, including their fields and methods.
Please note that the URIs shown on this page are not syntactically correct; for now they are only meant to convey the general idea.
Hopefully this will be clearer with a specific example. The PLoS Publishing System has a business object called "Article", which is roughly designed as follows:
import java.util.List;

public class Article {
    private String title;
    private List<String> authors;
    private String abstractText;       // the article abstract; "abstract" is a reserved word in Java
    private Annotations[] annotations;
}
Once this class is registered with Topaz, Topaz knows that Article has fields for the authors, title, annotations, and abstract, which are of type List<String>, String, Annotations[], and String respectively. This knowledge allows very powerful construction of dynamic URIs (the key to REST interfaces). For example, http://www.plos.org/articles/author="John Smith" could be attached to a dynamically generated action that fetches all articles authored by John Smith. This is possible because Topaz knows that author is a field of Article and can potentially generate the query dynamically to return the object(s). Alternate syntaxes such as http://www.plos.org/Articles.author="John Smith" are also possible. This "convention based" programming style is becoming quite popular (for example, Ruby on Rails) and will make it easier for external developers to exploit the information within the Topaz platform. The convention-based rules we create should cover two key aspects:
- The URI design to access the information
- Format of the return data value(s)
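As a rough illustration of how field knowledge could drive such URIs, the sketch below uses plain Java reflection to map a URI segment like author="John Smith" onto a declared field. It is not the Topaz implementation: FilterResolver, Filter, parseFilter, and the trimmed Article stand-in are hypothetical names, and the rule that a singular key may address a plural field is an assumption made only for this example.

```java
import java.lang.reflect.Field;
import java.util.List;

// Hypothetical helper: maps a URI segment such as author="John Smith"
// onto a declared field of a business class. Not part of Topaz.
final class FilterResolver {

    // Trimmed-down stand-in for the Article business object described above.
    static class Article {
        private String title;
        private List<String> authors;
    }

    // A resolved filter: the matching field name and the requested value.
    static final class Filter {
        final String fieldName;
        final String value;

        Filter(String fieldName, String value) {
            this.fieldName = fieldName;
            this.value = value;
        }
    }

    static Filter parseFilter(Class<?> type, String segment) {
        int eq = segment.indexOf('=');
        if (eq <= 0) {
            throw new IllegalArgumentException("expected field=value, got: " + segment);
        }
        String key = segment.substring(0, eq);
        String value = stripQuotes(segment.substring(eq + 1));

        // Assumption: a singular key ("author") may address a plural field ("authors").
        Field field = findField(type, key);
        if (field == null) {
            field = findField(type, key + "s");
        }
        if (field == null) {
            throw new IllegalArgumentException(type.getSimpleName() + " has no field for: " + key);
        }
        return new Filter(field.getName(), value);
    }

    // Returns the declared field with the given name, or null if there is none.
    private static Field findField(Class<?> type, String name) {
        try {
            return type.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            return null;
        }
    }

    // Removes one pair of surrounding double quotes, if present.
    private static String stripQuotes(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }
}
```

A real OTM layer would presumably consult its own annotation metadata rather than raw reflection, and would then translate the resolved filter into a store query.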
For the first release we will focus mostly on READ/GET operations. The one possible exception is allowing external parties to add annotations; more on this specific case later.
Example from NCBI
What is this thing?
The NCBI Resource Locator provides stable, uniform addressing for NCBI content, making it easy to link to individual records. Some NCBI resources also provide services (like search) through these URLs.
How does it work?
Each URL has the form
http://view.ncbi.nlm.nih.gov/<noun>/<verb>/<expression>
Where:
- <noun> is an NCBI resource (e.g., pubmed, gene, nucleotide, etc.)
- <verb> is the action to perform (e.g., search, get, etc.). If <verb> is missing, the default verb "get" is used.
- <expression> is data used by the action to perform the request
Some examples:
- http://view.ncbi.nlm.nih.gov/pubmed/12345 Show the PubMed record with pmid 12345
- http://view.ncbi.nlm.nih.gov/pubmed/search/cancer Search PubMed for "cancer"
- http://view.ncbi.nlm.nih.gov/gene/search/human+p53 Search Gene for "human p53"
- http://view.ncbi.nlm.nih.gov/homologene/search/dystrophin Search Homologene for "dystrophin"
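The noun/verb/expression pattern above, including the default "get" verb, could be parsed along these lines. This is a minimal sketch: ResourceLocator is a hypothetical name, and the rule that a two-segment path means "verb omitted" is an assumption, so the sketch cannot distinguish a verb that has no expression.

```java
import java.net.URI;

// Hypothetical parser for URLs of the form http://host/<noun>/<verb>/<expression>.
final class ResourceLocator {
    final String noun;
    final String verb;
    final String expression;

    private ResourceLocator(String noun, String verb, String expression) {
        this.noun = noun;
        this.verb = verb;
        this.expression = expression;
    }

    static ResourceLocator parse(String url) {
        String path = URI.create(url).getPath();   // e.g. "/pubmed/search/cancer"
        if (path == null || path.length() < 2) {
            throw new IllegalArgumentException("no resource path in: " + url);
        }
        // Split into at most three parts so the expression may itself contain "/".
        String[] parts = path.substring(1).split("/", 3);
        if (parts.length == 2) {
            // Assumption: two segments mean the verb was omitted, so default to "get".
            return new ResourceLocator(parts[0], "get", parts[1]);
        }
        if (parts.length == 3) {
            return new ResourceLocator(parts[0], parts[1], parts[2]);
        }
        throw new IllegalArgumentException("expected <noun>/[<verb>/]<expression>: " + path);
    }
}
```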
READ/GET URIs
The design idea is to keep the majority of read operations intuitive, so that developers do not have to *remember* URIs. Based on their knowledge of the business objects, they can build and test URIs on the fly. Some ideas:
wget http://www.plos.org/article/ could return the name and type of fields within the Article object
wget http://www.plos.org/article/graph could return the name of the Mulgara model we store articles in
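The first idea, returning the name and type of each field, can be approximated with standard reflection. The sketch below is only an illustration under that assumption: FieldDescriber and describe are hypothetical names, the nested Article class is a trimmed stand-in, and the actual response format (XML, Atom, or something else) is left open.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper that lists the instance fields of a business class.
final class FieldDescriber {

    // Trimmed-down stand-in for the Article business object.
    static class Article {
        private String title;
        private List<String> authors;
    }

    // Returns field name -> declared type name, sorted by field name.
    static Map<String, String> describe(Class<?> type) {
        Map<String, String> result = new TreeMap<>();
        for (Field field : type.getDeclaredFields()) {
            // Skip compiler-generated and static fields; only object state is of interest.
            if (field.isSynthetic() || Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            // getGenericType keeps type arguments, e.g. List<String> rather than List.
            result.put(field.getName(), field.getGenericType().getTypeName());
        }
        return result;
    }
}
```

A handler for /article/ could then serialize the map returned by FieldDescriber.describe(FieldDescriber.Article.class) into whatever representation is chosen.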
This simple structure will hopefully cover the most common use cases. For more complex cases we should allow users (within defined security limits) to execute OQL/iTQL queries. Some ideas:
wget http://www.plos.org/query?itql="select...."
wget http://www.plos.org/query?oql="select...."
Of course, if someone knows of a good, user-friendly way of embedding logical expressions within a URI, the two design patterns above could be merged.
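Because query text contains spaces, quotes, and other reserved characters, a client would need to encode it before placing it in the query string. A minimal sketch, assuming the /query endpoint and the oql/itql parameter names from the ideas above; QueryUriBuilder is a hypothetical name, and the query text is passed through as an opaque string without being checked against OQL or iTQL syntax.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical builder for the query-style URIs sketched above.
final class QueryUriBuilder {

    static String build(String baseUrl, String language, String query) {
        if (!language.equals("oql") && !language.equals("itql")) {
            throw new IllegalArgumentException("unsupported query language: " + language);
        }
        try {
            // Form-style encoding: spaces become "+", reserved characters become %XX.
            return baseUrl + "/query?" + language + "=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java runtime.
            throw new IllegalStateException(e);
        }
    }
}
```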
PUT/POST URIs
This one is harder to deal with, because a generic PUT/POST is much more difficult to design. Unfortunately, we already have a very useful case for it. We would like external parties (such as the Protein Database folks) to be able to retrieve PLoS article XML, run it through their tools, and create a list of annotations, each providing an annotated link from a protein name to the protein molecule within the Protein DB. It would be nice to give them a simple URI through which (with appropriate security controls, of course) they can upload the annotations back to PLoS, associated with the article. If this process can be automated, PLoS can form partnerships with quite a few other useful databases.
URI Grammar
It is possible to model the resources PLoS exposes (Journals, Articles, Users, and Annotations), together with their sub-resources and relations, as a resource structure graph. This provides a way to formalize a URI tree, in which walking the primary paths derives a URI.
- each node type may have multiple representations
- each resource type has defined actions
More concretely:
URI => (Journals | Articles | Users | Annotations) ["/" Representation]
Articles => "/article" ["/" Criteria]
Criteria => (Doi | Query)
Doi => "info:doi/10.1371/" whatever
Query => Field "=" ValueList (";" Field "=" ValueList)*
Field => "category" | "author" | "startDate" | "endDate"
ValueList => LITERAL | ("'" LITERAL (";" LITERAL)* "'")
Representation => "atom" | "xml" | "pdf"
GETS on /article
| Request | Response |
| /article | Atom Publishing Protocol Service Discovery Document |
| /article/${digital object identifier} | article with doi=${digital object identifier} as Atom Feed |
| /article/info%3Adoi%2F10.1371%2Fjournal.pone.0000457/xml | article with doi=info:doi/10.1371/journal.pone.0000457 in XML |
| /article/category='Ecology' | current articles in Ecology as Atom Feed |
Note: digital object identifiers contain characters that must be escaped in URIs:
| Character | Escaped |
| : | %3A |
| / | %2F |
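Clients do not need to escape DOIs by hand. The sketch below uses the standard URLEncoder and URLDecoder classes to produce and reverse the escapes listed above; DoiPath and articlePath are hypothetical names, and the /article/<doi>/<representation> layout is taken from the request table. Note that URLEncoder performs form-style encoding, which yields the same result as path escaping for DOIs of the shape shown here but would turn a space into "+".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper for embedding a DOI in an /article path.
final class DoiPath {

    // Builds /article/<escaped doi>/<representation>.
    static String articlePath(String doi, String representation) {
        return "/article/" + escape(doi) + "/" + representation;
    }

    // Escapes ":" as %3A and "/" as %2F (among other reserved characters).
    static String escape(String doi) {
        try {
            return URLEncoder.encode(doi, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);   // UTF-8 is always available
        }
    }

    // Reverses escape(), as a server-side handler would.
    static String unescape(String escaped) {
        try {
            return URLDecoder.decode(escaped, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```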
Get Atom Feed
| Request | Response |
| http://plosone.org/article/ | new articles, all categories |
| http://plosone.org/article/category/${category} | new articles, category=${category} |
| http://plosone.org/article/category/Ecology | new articles, category=Ecology |
| http://plosone.org/article/category/Ecology;Nutrition | new articles, category=Ecology and Nutrition |
| http://plosone.org/article/category/or(Ecology;Nutrition) | new articles, category=Ecology or Nutrition |
Get Article Search Results
Request/Response
| Request | Response |
| http://plosone.org/article/${parm}/${value}/${parm}/${value}/... | articles that meet the search criteria in the specified or default format |
Parms
| parm | value | default |
| category | single category, ";" separated list of categories (implicit "and"), or(list of categories) | all categories |
| author | single author, ";" separated list of authors, or(list of authors) | all authors |
| startDate | date; returned articles have a date >= this value | all dates |
| endDate | date; returned articles have a date <= this value | all dates |
| maxResults | number, maximum number of results to return | no maximum |
| representation | Atom or XML | Atom |
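A client could assemble a search URI from these parameters as in the following sketch. SearchUri is a hypothetical name, and the parameter names come from the table above. Values are appended as given, matching the examples on this page; a production client would also have to decide how to escape reserved characters inside values.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical builder for /article/${parm}/${value}/... search URIs.
final class SearchUri {

    private static final Set<String> PARMS = new HashSet<>(Arrays.asList(
            "category", "author", "startDate", "endDate", "maxResults", "representation"));

    // Appends each parm/value pair in the map's iteration order.
    static String build(String baseUrl, Map<String, String> parms) {
        StringBuilder uri = new StringBuilder(baseUrl).append("/article");
        for (Map.Entry<String, String> parm : parms.entrySet()) {
            if (!PARMS.contains(parm.getKey())) {
                throw new IllegalArgumentException("unknown search parm: " + parm.getKey());
            }
            uri.append('/').append(parm.getKey()).append('/').append(parm.getValue());
        }
        return uri.toString();
    }
}
```

Passing a LinkedHashMap keeps the pairs in insertion order, which makes the resulting URI predictable.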
Get "Object"
Request/Response
| Request | Response |
| http://plosone.org/${object}/${digital object identifier} | ${object} with doi=${digital object identifier} in default format |
| http://plosone.org/${object}/${digital object identifier}/representation/${representation} | ${object} with doi=${digital object identifier} in ${representation} format |
Parms
| ${object} | ${digital object identifier} | ${representation} |
| article | info:doi/10.1371/journal.pone.####### | XML (default), Atom, PDF |
| annotation | info:doi/10.1371/annotation/# | XML (default), Atom |
| user | info:doi/10.1371/account/# | XML (default), Atom |
TODO: support semantic bootstrapping, e.g. a way to get the list of fields/types in an Object.
Generic Query
| Request | Response |
| http://plosone.org/${object}/oql/${query} | query results in default format |
| http://plosone.org/${object}/oql/${query}/representation/${representation} | query results in ${representation} format |
Parms
| parm | value | default |
| object | article, annotation, or user | required |
| query | where <condition> [<order>] [<limit>] [<offset>] | required |
| representation | XML, Atom, or PDF for article objects; XML or Atom for annotations and users | XML |
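The representation rules in this table (XML as the default, PDF only for articles) could be enforced before a query is run. The following is a sketch under those rules only: RepresentationRules, resolve, and queryPath are hypothetical names, and the query is assumed to be already URI-escaped by the caller.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical check of which representations each object type supports.
final class RepresentationRules {

    static final String DEFAULT = "XML";

    private static final Map<String, List<String>> SUPPORTED = new HashMap<>();
    static {
        SUPPORTED.put("article", Arrays.asList("XML", "Atom", "PDF"));
        SUPPORTED.put("annotation", Arrays.asList("XML", "Atom"));
        SUPPORTED.put("user", Arrays.asList("XML", "Atom"));
    }

    // Returns the representation to use, falling back to the default when none is requested.
    static String resolve(String object, String requested) {
        List<String> allowed = SUPPORTED.get(object);
        if (allowed == null) {
            throw new IllegalArgumentException("unknown object type: " + object);
        }
        String representation = (requested == null) ? DEFAULT : requested;
        if (!allowed.contains(representation)) {
            throw new IllegalArgumentException(object + " does not support " + representation);
        }
        return representation;
    }

    // Builds /${object}/oql/${query}[/representation/${representation}].
    static String queryPath(String object, String escapedQuery, String requested) {
        String representation = resolve(object, requested);
        String path = "/" + object + "/oql/" + escapedQuery;
        return (requested == null) ? path : path + "/representation/" + representation;
    }
}
```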
Conventions
- dates should be formatted as YYYY-MM-DD (ISO 8601)
- a list of values is an implicit "and"
- or(value;value) is used to "or" a list of values
- ";" is used to separate values where order is not important
- "," is used to separate values where order is important
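The "and"/"or" list convention and the date format above might be interpreted on the server side roughly as follows. This is a sketch only: ValueListParser is a hypothetical name, it handles just the ";" separated forms, and it assumes individual values contain neither ";" nor parentheses.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;

// Hypothetical parser for value lists such as "Ecology;Nutrition" or "or(Ecology;Nutrition)".
final class ValueListParser {
    final boolean isOr;          // true for or(...), false for the implicit "and"
    final List<String> values;

    private ValueListParser(boolean isOr, List<String> values) {
        this.isOr = isOr;
        this.values = values;
    }

    static ValueListParser parse(String raw) {
        boolean isOr = raw.startsWith("or(") && raw.endsWith(")");
        // Strip the or( ... ) wrapper when present, then split on the ";" separator.
        String body = isOr ? raw.substring(3, raw.length() - 1) : raw;
        return new ValueListParser(isOr, Arrays.asList(body.split(";")));
    }

    // LocalDate.parse expects the ISO 8601 calendar date form YYYY-MM-DD by default.
    static LocalDate parseDate(String raw) {
        return LocalDate.parse(raw);
    }
}
```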
Experimental
This interface is experimental and may change before the 1.0 release of PLoS.
We are actively soliciting feedback on how to expose PLoS ONE in a REST-like fashion to meet your needs. Also, we enjoy hearing about how you are developing on top of PLoS ONE. Please contact us with your ideas.
