Recently, GenomeWeb did an extensive write-up of BioGPS. I think it’s a detailed and accurate view of how BioGPS fits into the sphere of online biological resources.

The only part I take a bit of issue with is the series of quotes from Larry Moran, who then followed up with a more detailed blog post. Among his comments:

Larry Moran, a biochemist at the University of Toronto, told BioInform by e-mail that he had looked at a few of his “favorite genes” in the portal. “I don’t think it’s a very useful database,” he said, since it is a summary of information gleaned from other databases with “no attempt at annotation.”

This is a great opportunity to clarify BioGPS’s use cases and our target audience…

Larry has focused his entire scientific career on the study of HSP70 family genes. People like Larry, who care about only a handful of genes, really don’t have a great need for gene portals. They know their genes backward and forward, and they get their information directly by following the primary literature. Relative to that, every gene portal will be missing important information.

But BioGPS targets researchers doing genome-scale science, which has become increasingly popular in the last decade or so. Suppose you’ve done a microarray experiment comparing tumor tissue to a matched control, and maybe you’ve identified 100 differentially expressed genes. You can’t pick in advance which genes you’ll find, and undoubtedly the list will include genes you’ve never studied before. What kind of resources help you quickly evaluate which are the most promising for follow-up studies? Gene portals.

And gene portals are not just for gene expression analyses. Scientists doing copy-number analysis, methylation profiling, proteomics, or functional genomics all face a similar issue. For people doing “data-driven science”, gene portals are essential for quickly learning about unfamiliar genes.

Larry goes on to say:

The point is whether taking the expression data and adding links from other sources makes BioGPS a valuable resource.

Not as far as I can see.

Well, Larry has essentially described SymAtlas (BioGPS’s precursor), and that site gets about 1.7 million hits per year worldwide. So yes, many people do think that’s useful. Moreover, BioGPS extends that model by enabling users to aggregate data from multiple gene portals, emphasizing community extensibility and user customizability. Based on his characterization above, it seems Larry hasn’t explored these features yet.

Larry’s right: we’re not attempting to do annotation, so BioGPS might not be useful to him. But he’s not our target audience either…