Citation: Bourne PE, Beran B, Bi C, Bluhm W, Dunbrack R, Prlić A, et al. (2010) Will Widgets and Semantic Tagging Change Computational Biology? PLoS Comput Biol 6(2): e1000673. doi:10.1371/journal.pcbi.1000673
Editor: Ruth Nussinov, National Cancer Institute, United States of America and Tel Aviv University, Israel
Published: February 26, 2010
Copyright: © 2010 Bourne et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This article received no specific funding.
Competing interests: The authors have declared that no competing interests exist.
We argue here, through the use of several examples from our work in support of structural biology, that the answer to the question posed by the title of this Perspective is a resounding yes. The discussion that follows is aimed primarily at those of the journal's readers who are biological resource developers and Web page developers interested in developing the richest possible Web pages. However, those of you who simply use biological resources might find this a helpful discussion in understanding what is on the horizon. Whatever your interest, please let us hear your opinion on the question posed by this Perspective through the associated comment feature.
We define a widget as a simple piece of code that can be embedded into a Web page or desktop to provide functionality that is derived from another Web site. To put widgets into perspective with other technologies, widgets share the portability and usability of an applet but are typically simpler. Similarly, widgets provide some of the functionality of products like Microsoft SharePoint, but are usually nonproprietary. Here a semantic tag is defined as a specific type of widget that brings some semantic information into the Web page or desktop from another Web site. To illustrate the point, consider several simple examples (one desktop widget and the rest launched through a Web browser) that we have developed recently; these and others can be found on the RCSB Protein Data Bank (PDB) [1] Web site at http://www.pdb.org/pdb/static.do?p=widgets/widgetShowcase.jsp.
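As a rough illustration of what "embedded into a Web page" can mean in practice, the sketch below generates an HTML fragment that a page author could paste into a page to pull in functionality served by another site. It is a minimal sketch under stated assumptions: the function name `embed_snippet`, the use of an `iframe`, and the `widgets.example.org` address are all hypothetical, and the widgets discussed here may well be delivered differently (for example, through a script include).

```python
from html import escape


def embed_snippet(widget_url: str, title: str, width: int = 300, height: int = 200) -> str:
    """Return an HTML fragment that embeds a remotely hosted widget.

    The iframe approach is an assumption made for illustration; it is not
    taken from any of the widgets described in the article.
    """
    return (
        f'<iframe src="{escape(widget_url, quote=True)}" '
        f'title="{escape(title, quote=True)}" '
        f'width="{width}" height="{height}"></iframe>'
    )


if __name__ == "__main__":
    # Hypothetical widget address, used only to show the shape of the output.
    print(embed_snippet("https://widgets.example.org/kb?view=new&limit=5", "What's new"))
```

The hosting page carries only this short fragment; the content and logic stay on the originating site, which is what keeps a widget simple to adopt.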
The first example widget was developed by the Protein Structure Initiative (PSI) Knowledgebase (KB) project (http://kb.psi-structuralgenomics.org/) [2]. The KB widget automatically detects new articles, structures, and features when the site updates on the third Thursday of each month. Embedded in any Web page, it provides immediate access to the new features at the KB. The second example widget is a dashboard widget contributed by Brian Weitzner and Roland Dunbrack (http://dunbrack.fccc.edu/dashpdb/). This Mac OS X dashboard widget provides a simple way of querying the RCSB PDB or downloading a specific structure from the Macintosh dashboard application, and as such represents a useful shortcut. The third example is illustrated in more detail in Figure 1.
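The update schedule mentioned for the KB widget (the third Thursday of each month) is easy to compute, and a widget or its host page could use such a calculation to decide when to check for new content. The following is a minimal sketch of the date arithmetic only; the function names are hypothetical, and it does not describe how the KB widget itself detects updates.

```python
import calendar
from datetime import date


def third_thursday(year: int, month: int) -> date:
    """Return the date of the third Thursday of the given month."""
    # With Monday as the first weekday, each week is a list of seven day
    # numbers (0 marks a day outside the month) and Thursday is index 3.
    weeks = calendar.Calendar(firstweekday=calendar.MONDAY).monthdayscalendar(year, month)
    thursdays = [week[calendar.THURSDAY] for week in weeks if week[calendar.THURSDAY] != 0]
    return date(year, month, thursdays[2])


def next_update(today: date) -> date:
    """Return the first third-Thursday update falling on or after `today`."""
    candidate = third_thursday(today.year, today.month)
    if candidate >= today:
        return candidate
    # Roll over to the following month, wrapping December into January.
    if today.month == 12:
        return third_thursday(today.year + 1, 1)
    return third_thursday(today.year, today.month + 1)


if __name__ == "__main__":
    print(next_update(date.today()))
```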
Assuming widgets are a compelling development for the field, what is the potential and why would people use them? Addressing the latter question, Web resource developers are driven by getting eyeballs on their site; good Web statistics are a prerequisite for grant renewals. With widgets, the eyeballs may no longer be directed at the site itself, but at a small component of the site integrated remotely. Nevertheless, the developers' work is potentially made accessible to a larger number of users than would otherwise be the case, and it still registers Web hits on their own site. Addressing the former question, the potential, in our opinion, is the opportunity to make some order out of the chaos that exists today. Users of computational biology resources face a bewildering array of resources, different interfaces, and a lot of features they will likely never use. Google obviously knows how to do it best, as proven by its search engine interface. Such simplicity is not always possible for complex biological applications, but extracting suitable subsets of functionality into widgets may be. Further, and a dream perhaps, a few standards for widget development, on both the server side and the presentation side, could provide a productivity gain for the life sciences community, which is increasingly dependent on these computational resources. Being optimistic, it might even bring to light in new ways the most authoritative and dependable resources. Being very optimistic, we might see an end to the lack of persistence in computational biology resources that has been discussed previously [3]. Resources would use persistent URLs (PURLs), and their Application Programming Interfaces (APIs) would conform to agreed-upon standards.
Talk is cheap; consider the specific example of semantic tagging, which we believe makes the argument for widgets even more compelling (Figure 2). Again here is an example from the RCSB Protein Data Bank.
Figure 2. The box labeled usage shows how these tags are used in the HTML document. On the left is such a Web page (in italics) that has been semantically tagged. The Web page was created by David Goodsell as part of the RCSB PDB “Molecule of the Month” feature.
There are four tags used in the document that illustrate how new life and comprehension can be brought to Web pages. This is done without any loss of context. As the reader mouses over the Web page itself, new information is brought forth from other resources defined by the Web page author. The author tag tied to Stanley, W.A. will return all entries in the PDB database that have been authored by the same person. The menu tag tied to entry 2tmv, a specific structure in the PDB, brings up a menu of options associated with access to that entry, for example, display the sequence of the entry, display a summary page from the PDB describing the structure, and so on. The keyword tag, attached in this case to RNA, will search for all instances of that keyword in the PDB and return associated structures. Finally, the rcsb_id tag, attached here to the term capsid protein, will provide a thumbnail view of the molecule that can be selected for a more detailed view from the RCSB PDB. The intent here is to illustrate that a simple text document can be enriched to benefit the reader. Certainly such tags can be ignored, but they can also provide additional insights into the work described in the document through direct and contextual information from elsewhere, which in this case just happens to be a database of protein structures. If hyperlinks are considered powerful and the core of what makes the World Wide Web, this type of semantic tagging adds a new dimension to the Web. If semantic tagging were to take off, issues of name space might appear, but for now, having exemplars that illustrate the power of the medium would seem an excellent first step. Imagine the day when such tags are added to research articles as they are written and are carried through to the final published paper. Perhaps the promise of the semantic Web will be realized. Time will tell; for now, it would be good to see more computational biologists embrace and promote this technological development. What do you think?
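To make the four tag types described above more concrete, the sketch below scans an HTML fragment for tagged spans and reports what each tag would do. It is a minimal sketch under stated assumptions, not the RCSB PDB implementation: the `data-kind` and `data-value` attributes are hypothetical markup invented for this example (the real syntax is the one shown in Figure 2), the class and function names are hypothetical, and the entry ID given to the rcsb_id tag is an assumption, since the text names only the tagged term.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import List, Optional

# The four tag kinds named in the text, with the behaviour the text gives each.
ACTIONS = {
    "author": "list all PDB entries authored by {value}",
    "menu": "show a menu of options for entry {value}",
    "keyword": "search the PDB for keyword {value}",
    "rcsb_id": "show a thumbnail of entry {value}",
}


@dataclass
class SemanticTag:
    kind: str   # one of the keys of ACTIONS
    value: str  # the identifier or search term carried by the tag
    text: str   # the visible text the tag is attached to

    def describe(self) -> str:
        return ACTIONS[self.kind].format(value=self.value)


class TagCollector(HTMLParser):
    """Collect <span data-kind=... data-value=...> elements (hypothetical markup).

    Nested spans are not handled; this is only meant to show the idea.
    """

    def __init__(self) -> None:
        super().__init__()
        self.tags: List[SemanticTag] = []
        self._open: Optional[SemanticTag] = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        kind = attributes.get("data-kind")
        if tag == "span" and kind in ACTIONS:
            self._open = SemanticTag(kind, attributes.get("data-value") or "", "")

    def handle_data(self, data):
        if self._open is not None:
            self._open.text += data

    def handle_endtag(self, tag):
        if tag == "span" and self._open is not None:
            self.tags.append(self._open)
            self._open = None


def collect_tags(html: str) -> List[SemanticTag]:
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


# Sample fragment; the rcsb_id value "2tmv" is an assumption for illustration.
SAMPLE = (
    '<p><span data-kind="author" data-value="Stanley, W.A.">Stanley, W.A.</span>; '
    '<span data-kind="menu" data-value="2tmv">2tmv</span>; '
    '<span data-kind="keyword" data-value="RNA">RNA</span>; '
    '<span data-kind="rcsb_id" data-value="2tmv">capsid protein</span></p>'
)

if __name__ == "__main__":
    for found in collect_tags(SAMPLE):
        print(f"{found.text!r}: {found.describe()}")
```

The point of the design is that the page author writes only a small annotation around ordinary text, while the lookup behaviour is supplied by the remote resource.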
Thanks to Scott Markel, Richard Cave, and John Westbrook for their insightful input. The RCSB PDB is operated by Rutgers, The State University of New Jersey, and the San Diego Supercomputer Center and the Skaggs School of Pharmacy and Pharmaceutical Sciences at the University of California, San Diego. It is supported by funds from the National Science Foundation, the National Institute of General Medical Sciences, the Office of Science, Department of Energy, the National Library of Medicine, the National Cancer Institute, the National Institute of Neurological Disorders and Stroke, and the National Institute of Diabetes and Digestive and Kidney Diseases.
- 1. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, et al. (2000) The Protein Data Bank. Nucleic Acids Res 28: 235–242.
- 2. Berman HM, Westbrook JD, Gabanyi MJ, Tao W, Shah R, et al. (2009) The protein structure initiative structural genomics knowledgebase. Nucleic Acids Res 37: D365–D368.
- 3. Veretnik S, Fink JL, Bourne PE (2008) Computational biology resources lack persistence and usability. PLoS Comput Biol 4: e1000136. doi:10.1371/journal.pcbi.1000136.