A simple way to write Wikidata bots

Introduction: Sulab has proven its love for Wikipedia by creating the GeneWiki[1]. Wikipedia is primarily a collection of free-text pages, which can also contain some structured data in the form of infoboxes. To improve the handling and representation of structured data in the MediaWiki universe, Wikidata was conceived and finally rolled out to the community in late 2012....

Detecting Cheaters in a Crowdsourcing Task

Similar to how Amazon Mechanical Turk was used to help us learn about concept recognition in the context of crowdsourcing, our lab has been using the CrowdFlower platform to learn about relationship extraction. Unfortunately, we found a small subset of task participants (henceforth workers) whom we suspected of cheating and not actually doing the requested task. With his IPython Notebook...

Network of Biothings Hackathon at UC San Diego

Can you code? Are you interested in the intersection of computer science and biology (bioinformatics)? Do you want to meet interesting people? Are you excited about building new pieces of software that could change the face of science and medicine? Do you want to win a cash prize for your open source code? Then it's clearly time to: Sign up for the next Network of Biothings Hackathon! Location: UC...

Distribution of GO annotations among human genes

Recently, someone emailed me to ask how I got the data behind this figure that I often use in the introductions of my talks: This figure shows that while a few genes are very well annotated (100s of GO annotations manually assigned by biocurators), the degree of annotation falls off very quickly. At the time I last did the analysis, 65% of genes had 5 or fewer GO annotations, and...