We’ve been hard at work mining the logs for the Dizeez game (see past posts for context). To summarize the take-home message: the Dizeez game resulted in the identification of several novel gene-disease annotations.
We used a pseudo-gold standard set of 3439 candidate gene-disease links that were mined from the Gene Wiki as the input data set for the Dizeez game. The game randomly selects one of these links (the “right” answer) and hides the disease among four randomly chosen diseases. Whether or not the player guesses the right answer determines whether they get points in the game, but all answers are logged by the system as gene-disease “assertions”. Within one week of the game release, over 350 games had been played to completion by over 100 unique individuals. Summed over all games, players provided 3069 guesses that spanned 2762 unique gene-disease assertions.
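The round mechanics described above can be sketched roughly as follows. This is only an illustration, not the actual Dizeez code: the function names (`make_round`, `log_guess`, `score`), the data layout (links as `(gene, disease)` tuples), and the reading that the right disease is shown alongside four distractors are all assumptions.

```python
import random


def make_round(links, diseases, n_distractors=4, rng=random):
    """Pick one candidate link and hide its disease among random distractors.

    `links` is a list of (gene, disease) tuples and `diseases` is the pool
    to draw distractors from. Both layouts are hypothetical.
    """
    gene, answer = rng.choice(links)
    # Distractors are drawn from every disease except the right answer.
    pool = [d for d in diseases if d != answer]
    options = rng.sample(pool, n_distractors) + [answer]
    rng.shuffle(options)
    return gene, answer, options


def log_guess(log, gene, guess):
    """Record every guess, right or wrong, as a gene-disease assertion."""
    log.append((gene, guess))


def score(answer, guess):
    """Only a correct guess earns a point (the point value is assumed)."""
    return 1 if guess == answer else 0
```

The point of separating `log_guess` from `score` is that the log keeps every guess, including the "wrong" ones, and those are the assertions that get mined afterwards.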
We started by examining the gene-disease assertions that were provided most often by game players. Two assertions were provided six times by game players. First, the gene WRN (“Werner syndrome, RecQ helicase-like”) was linked to the disease Werner syndrome. Second, the gene CRYGC (“crystallin, gamma C”) was linked to cataracts. Both of these annotations were previously known via OMIM. Great, nothing new discovered, but finding these examples at the top of the list was a good positive control.
We then filtered out all gene-disease links that had been previously annotated in either OMIM or PharmGKB. After removing those entries, there were six assertions that were provided by players four or more times. Five out of those six could easily be verified by a quick literature search. Those results are summarized here:
| Gene Symbol | Disease name | Disease Ontology ID | PubMed support |
|---|---|---|---|
| SOX8 | mental retardation | DOID:1059 | 18076105, 10662550 |
| ITGAL | leukocyte adhesion deficiency | DOID:6612 | 11753075, 7628754 |
| TG | Graves’ Disease | DOID:12361 | 18385936, 11788684 |
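The tallying and filtering steps described above can be sketched in a few lines. This is a minimal illustration, not the analysis code we actually ran: the function names, the tuple layout, and the toy data are assumptions. `GENE_X` and `GENE_Z` are placeholders, and only the WRN count of six comes from the logs.

```python
from collections import Counter


def tally_assertions(guesses):
    """Count how often each (gene, disease) assertion was provided."""
    return Counter(guesses)


def novel_candidates(counts, known, min_count=4):
    """Keep assertions seen at least `min_count` times and not already known.

    `known` stands in for the set of links already annotated in OMIM or
    PharmGKB. How that set is built is outside this sketch.
    """
    return [
        (pair, n)
        for pair, n in counts.most_common()
        if n >= min_count and pair not in known
    ]


# Toy data: placeholder genes and diseases, except for WRN.
guesses = (
    [("WRN", "Werner syndrome")] * 6
    + [("GENE_X", "disease Y")] * 4
    + [("GENE_Z", "disease Q")] * 2
)
known = {("WRN", "Werner syndrome")}

counts = tally_assertions(guesses)
candidates = novel_candidates(counts, known)
```

With this toy input, the known WRN link is filtered out and the only candidate left is the placeholder assertion that was provided four times. Each surviving candidate still needs a manual literature check.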
To summarize, even after very limited game playing, the Dizeez game resulted in the identification of several novel gene-disease annotations. These gene-disease links are well established in the literature, but they are not yet captured as structured annotations, which is what would allow them to be used in bioinformatics and statistical analyses.
Clearly there is more to be done to improve the game, but we are growing increasingly excited about games as a mechanism to harness community intelligence to structure biological knowledge.