As seen in our previous spotlight on InterMine, FlyMine is the prototypical InterMine database and one of two mines currently maintained by the InterMine team. This week, we’ll delve into FlyMine and how this resource has added value to the fly and mosquito research communities. Yo Yehudi (developer) and Rachel Lyne (biologist) were kind enough to answer our questions for this post.
- In one tweet or less, introduce us to your tool:
FlyMine integrates Drosophila data from multiple sources, covering a broad range of curated and high-throughput data.
- Who is your target audience?
Our aim is to be easy-to-use for any biologist interested in fly gene data in general. Perhaps more specifically, though, we provide advanced data querying mechanisms (like our query builder and templates) and web APIs with client libraries tailored to allow bioinformaticians – or anyone who is comfortable with programming – to retrieve and manipulate our data.
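For readers curious what programmatic access might look like, here is a minimal sketch that only builds a request URL for a gene lookup, without sending it. The base URL, the `/query/results` path, the `query` and `format` parameters, and the XML layout are assumptions based on how InterMine web services are generally described, and `build_gene_query_url` is a hypothetical helper; check the FlyMine API documentation before relying on any of it.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumed base URL of the FlyMine web service (not taken from the interview).
BASE_URL = "https://www.flymine.org/flymine/service"


def build_gene_query_url(symbol, base_url=BASE_URL):
    """Build a URL that asks for genes whose symbol equals `symbol`.

    Hypothetical helper: the endpoint path and parameter names are assumptions.
    """
    # quoteattr escapes the value and wraps it in quotes for use as an XML attribute.
    query_xml = (
        '<query model="genomic" view="Gene.primaryIdentifier Gene.symbol">'
        f'<constraint path="Gene.symbol" op="=" value={quoteattr(symbol)}/>'
        '</query>'
    )
    params = urlencode({"query": query_xml, "format": "json"})
    return f"{base_url}/query/results?{params}"


url = build_gene_query_url("zen")
print(url)
```

The client libraries mentioned above wrap this kind of request, so most users would not need to assemble URLs by hand.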
- Why did you create your tool?
FlyMine was developed to address one of the important challenges of modern biology: how to integrate and make use of the diversity and volume of current biological data.
- Why is your tool unique and special? How does FlyMine compare/interact with other fly resources like FlyAtlas?
Other fly resources are tremendously valuable data sources, and we couldn’t do what we do without them. The main difference is that we load a subset of their data and integrate it with several other data sources (about 30; see our Data Sources page for the full list). This lets people search across and investigate many different data types using our advanced searching tools.
We also link back to the originating data sources; whilst FlyMine provides an aggregated overview, sometimes the best way to delve deeper into specifics is to go directly to the source itself.
- What level of traffic does FlyMine typically see?
Looking at data for the past year, we average around 10,000 page views a month – sometimes a bit more, sometimes a bit less. December was our quietest month, unsurprisingly.
One really cool thing we noticed while looking up the stats for this question is that we’ve had visitors literally from all over the world. Nigeria, Chile, Indonesia – you name it, people from almost every country have visited us.
- What is your greatest success story so far?
One of the biggest successes of FlyMine was probably migrating it from a purpose-built database focusing solely on flies, and genericizing it to be used for multiple different biological data types and organisms, eventually becoming InterMine with (currently) over 28 public mines, covering plants, animals, mitochondrial proteins, drug targets, DNA repeats, etc. There are also other mines that are run privately.
- What improvements are coming in the future?
Some of the most important improvements are the updates to our data, which might be almost invisible to the end user. Data sources and aggregators have a real responsibility to ensure their data is still correct and relevant – a recent paper pre-print showed that in 2015, 80% of papers used data sources that were old enough to only have data on 20% of currently known pathways.
We run a new ‘build’ of data – aggregating our various sources using the most up-to-date version and putting it into FlyMine – every few months. This is often a lot more work than it looks. Some of our data build process is automated, but when you have a large number of varying sources, it usually requires some human intervention to integrate them all successfully. Invariably at least one of our sources will have changed the way they format their data since the last build, as well!
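To illustrate why a format change in one source calls for human attention, here is a small sketch of a header check that could run before a source file is integrated. It is not FlyMine's actual build code: the source name, the column names, and the sample rows are all made up for the example.

```python
import csv
import io

# Hypothetical expected columns per source; names are illustrative only.
EXPECTED_COLUMNS = {
    "expression_source": ["gene_id", "tissue", "signal"],
}


def check_header(source_name, file_obj, delimiter="\t"):
    """Return (missing, unexpected) column names for one source file."""
    # Read only the first row; an empty file yields an empty header.
    header = next(csv.reader(file_obj, delimiter=delimiter), [])
    expected = EXPECTED_COLUMNS[source_name]
    missing = [col for col in expected if col not in header]
    unexpected = [col for col in header if col not in expected]
    return missing, unexpected


# Made-up sample in which the source renamed "signal" to "intensity".
sample = io.StringIO("gene_id\ttissue\tintensity\ngene_a\thead\t12.5\n")
missing, unexpected = check_header("expression_source", sample)
print("missing:", missing)
print("unexpected:", unexpected)
```

A check like this can flag that something changed, but deciding whether a renamed column still means the same thing is the part that usually needs a person.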
- Who is the team behind your tool?
This is a tricky one to answer – we mentioned the current team in our previous spotlight article, but there’s more to the story. FlyMine is actually quite mature as bioinformatics tools go. Its earliest incarnations began in 2002, meaning that the team has changed faces a few times over the years. In addition, we’ve had a lot of help from others in the community, who contribute source code pull requests.
In total, our GitHub contributors page says we have 20 contributors, but there are over 15 other past FlyMine members who don’t appear on the list. Of course, many of the important contributions to InterMine come from non-coders, too, in the form of ideas, guidance, and support.
Thanks to Yo and Rachel for guiding us through this extremely useful and FREE tool. Be sure to check out their plugin in the plugin library. If you use FlyMine in your research, be sure to cite their publication:
FlyMine: an integrated database for Drosophila and Anopheles genomics
Rachel Lyne, Richard Smith, Kim Rutherford, Matthew Wakeling, Andrew Varley, Francois Guillier, Hilde Janssens, Wenyan Ji, Peter Mclaren, Philip North, Debashis Rana, Tom Riley, Julie Sullivan, Xavier Watkins, Mark Woodbridge, Kathryn Lilley, Steve Russell, Michael Ashburner, Kenji Mizuguchi and Gos Micklem. Genome Biology 2007, 8:R129 doi:10.1186/gb-2007-8-7-r129
pubmed id: 17615057
It’s open access so you can actually read up on the nitty gritty details of this great resource at your leisure!