HTML.

That’s it: the simple HyperText Markup Language, the backbone of the Web, will be the BioGPS data exchange protocol.

First, let’s step back a moment. Extensibility has always been one of the two central themes of BioGPS, since the lack of effective extensibility has led to the fragmented landscape of gene annotation resources. Although DAS (the Distributed Annotation System) can be used to extend some gene portals, its uptake in the broader biology community has been pretty limited. We wanted a protocol that would be simple and flexible, and DAS is neither of those.

So, why use HTML? Well, first consider that it doesn’t get any simpler than HTML for presenting data. A complete novice can figure out in an afternoon how to make their first HTML web page. And a beginner programmer can probably figure out in a week how to programmatically create web pages using basic CGI. Thousands of biologists (or more?) have these skills, and it’s not just limited to the hardcore bioinformaticians.
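To give a feel for how little is involved, here is a minimal sketch of a CGI-style script in Python that renders predictions for one gene as an HTML table. Everything specific in it is an assumption made for illustration: the `gene` query parameter, the `render_page` helper, and the placeholder records are invented here and are not part of any BioGPS specification.

```python
#!/usr/bin/env python3
"""Minimal CGI-style sketch: show gene annotation as a plain HTML table."""
import os
from html import escape
from urllib.parse import parse_qs

# Made-up placeholder records: (gene symbol, predicted function, score).
# A real script would read these from a file or database.
PREDICTIONS = [
    ("GENE1", "placeholder predicted function A", 0.91),
    ("GENE1", "placeholder predicted function B", 0.47),
    ("GENE2", "placeholder predicted function C", 0.78),
]


def render_page(gene, predictions):
    """Return a complete HTML page listing the predictions for one gene."""
    rows = [p for p in predictions if p[0] == gene]
    # Escape anything that came from the request before putting it in HTML.
    title = f"Predicted functions for {escape(gene)}"
    if rows:
        body_rows = "\n".join(
            f"      <tr><td>{escape(func)}</td><td>{score:.2f}</td></tr>"
            for _, func, score in rows
        )
        content = (
            "    <table>\n"
            "      <tr><th>Predicted function</th><th>Score</th></tr>\n"
            f"{body_rows}\n"
            "    </table>"
        )
    else:
        content = "    <p>No predictions found.</p>"
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"  <head><title>{title}</title></head>\n"
        "  <body>\n"
        f"    <h1>{title}</h1>\n"
        f"{content}\n"
        "  </body>\n"
        "</html>"
    )


def main():
    # Under CGI, the web server passes the query string in the environment.
    query = parse_qs(os.environ.get("QUERY_STRING", ""))
    gene = query.get("gene", ["GENE1"])[0]  # hypothetical parameter name
    # A CGI response is the headers, a blank line, then the body.
    print("Content-Type: text/html; charset=utf-8")
    print()
    print(render_page(gene, PREDICTIONS))


if __name__ == "__main__":
    main()
```

Dropped into a web server’s CGI directory, a script along these lines would answer a request such as `?gene=GENE2` with a page that any browser can display. Run from the command line with no query string, it falls back to the default gene and prints the same page to standard output.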

Second, HTML is as flexible as the web is. Modern biology utilizes data from a range of technology platforms, and gene annotation can’t be expressed solely as coordinates on a genome assembly. Have novel predictions of gene function? Great, display them as free text or in a table. If your data is best expressed as a network diagram, HTML allows you to display images. If users need to interact with that network, no problem: create a Java/DHTML/Flash interface. An HTML interface can handle anything that can be rendered in a browser.

So what does it mean that we’re using HTML as our data interface? Sure, there are some downsides to using a completely unstructured format, but it means that BioGPS is incredibly extensible. More on this in a future post, or check out the BioGPS plugin library.