Science Game Lab: tool for the unification of biomedical games with a purpose

Posted on Jun 16, 2017 in games, genegames, gwaps, Science Game Lab, SGL, sulab

Scripps team: Benjamin M. Good, Ginger Tsueng, Andrew I Su
Playmatics team: Sarah Santini, Margaret Wallace, Nicholas Fortugno, John Szeder, Patrick Mooney
With helpful ideas from: Jerome Waldispuhl, Melanie Stegman
Games with a purpose and other kinds of citizen science initiatives demonstrate great potential for advancing biomedical science and improving STEM education. Articles documenting the success of projects such as Foldit and Eyewire in high-impact journals have raised wide interest in new applications of the distributed human intelligence that these systems have tapped into. However, the path from a good idea to a successful citizen science game remains highly challenging. Apart from the scientific difficulties of identifying suitable problems and appropriate human-powered solutions, the games still need to be created, need to be fun, and need to reach a large audience that remains engaged for the long term. Here, we describe Science Game Lab (SGL), a platform for bootstrapping the production, facilitating the publication, and boosting both the fun and the value of the user experience for scientific games with a purpose.
Ever since the Foldit project famously demonstrated that teams of human game players could often outperform supercomputers at the challenging problem of 3D protein structure prediction, so-called ‘games with a purpose’ have seen increasing attention from the biomedical research community. A few other games in this genre include: Phylo for multiple sequence alignment, EteRNA for RNA structure design, Eyewire for mapping neural connectivity, The Cure for breast cancer prognosis prediction, Dizeez for gene annotation, and MalariaSpot for image analysis. Apart from tapping into human intelligence at scale, these efforts have also produced valuable educational opportunities. Many of these games are now used to introduce their underlying concepts in classroom settings, where games in all forms are increasingly working their way into curriculums. Concomitant with the rise of these ‘serious games’, citizen science efforts such as the Zooniverse and Mark2Cure have sought similar aims but have packaged their work as volunteer tasks, analogous to unpaid crowdsourcing tasks, rather than as elements of games.
Many of these initiatives have succeeded in independently addressing challenging technical problems through human computation, improving science education, and generally raising scientific awareness.  However, with so much interest from the scientific community and a booming ecosystem of game developers, there are actually relatively few of these games in operation now.  Recognizing the opportunity, various groups have attempted to push the area forward through new funding opportunities and through various ‘game jams’ such as the one that produced the game ‘genes in space’ for use in analyzing microarray data in cancer.  Here, we take a different approach towards expanding the ecosystem of games with a scientific purpose.  Rather than attempting to seed the genesis of specific new game-changing games, we hope to lower the barrier to entry for new games and related citizen science tasks to generally promote the development of the entire field.  With this high-level aim in mind, we developed Science Game Lab (SGL) to make it easier for developers to create successful scientific games or game-like learning and volunteer experiences.  Specifically, SGL is intended to address the challenges of recruiting players and volunteers, keeping them engaged for the long term, and reducing the development costs associated with creating a scientific gaming experience.
The Science Game Lab Web application
SGL is a unique, open-source portal supporting the integration of games and volunteer experiences meant to advance science and science education. Unlike other related sites that act more like volunteer management and/or project directory services, such as SciStarter and Science Game Center, SGL is not simply a listing of related websites. Rather, it is an attempt to create a user experience that takes place directly within the SGL context yet still incorporates content from third parties. The system is largely inspired by game industry portals such as Kongregate that enable developers to incorporate their games directly into a unified metagame experience.
Players can use the portal to find and play games, with their in-game achievements tracked on site-wide high score lists and achievement boards (Figure 1). Players earn the SGL points that drive these leaderboards for actions taken in different games. In this way, SGL provides developers with access to a metagame that can be used to encourage players in addition to the incentives offered within individual games (Figure 2). This metagame can also be used by the system administrators to help direct the player community’s attention to particular games or particular tasks within games. For example, actions taken on new games might earn more points than actions taken on more established games as a way to ‘spread the wealth’ generated by successful games.

Figure 1.  SGL home page demonstrating site-wide high score list, game listing, and links to achievements, help, and user profile information.
Figure 2.  Badges displayed on user’s profile page.  Available badges not yet achieved are greyed out.
Developers interact with SGL by incorporating a small JavaScript library into their application and using the SGL ‘developer dashboard’ to pair up events in their game with points, badges and quests managed by the SGL server. At this time, SGL only supports games that operate online as Web applications. The games are hosted by the developers and rendered in the SGL context within an iframe. The SGL iframe provides a ‘heads up display’ that gives game players real-time feedback on events sent back to the SGL server, such as earning points, gathering badges, or progressing through the stages of a quest (Figure 3). This display provides developers with the ability to add game mechanics to sites that are not overtly games. For example, WikiPathways incorporated a pathway editing tutorial into SGL, using the heads up display to reward users with SGL points and badges for completing various stages of the tutorial. The tutorial also took advantage of the SGL quest-building tool (Figure 4). Games are submitted by developers for approval by SGL administrators. Once approved, the games appear in the public view and can be accessed by any player.
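To make the pairing of game events with rewards more concrete, here is a minimal sketch in Python. It is not the SGL API: the event names, the `RewardRule` and `PlayerState` structures, and the `handle_event` function are all hypothetical, and the real system does this through a JavaScript library and the developer dashboard. The sketch only illustrates the idea that a game reports an event and the portal answers with points, badges, and feedback messages for the heads up display.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class RewardRule:
    """Reward attached to one game event (hypothetical structure)."""
    points: int = 0
    badge: Optional[str] = None


@dataclass
class PlayerState:
    """Site-wide totals for one player."""
    points: int = 0
    badges: Set[str] = field(default_factory=set)


# Hypothetical event names and rewards, standing in for dashboard settings.
RULES: Dict[str, RewardRule] = {
    "tutorial_step_completed": RewardRule(points=10),
    "tutorial_finished": RewardRule(points=50, badge="Pathway Editor"),
}


def handle_event(state: PlayerState, event_name: str,
                 rules: Dict[str, RewardRule] = RULES) -> List[str]:
    """Apply the reward for one event and return messages for the display."""
    rule = rules.get(event_name)
    if rule is None:
        # Events with no configured reward are ignored.
        return []
    messages = []
    if rule.points:
        state.points += rule.points
        messages.append(f"+{rule.points} points")
    # A badge is only awarded (and announced) the first time it is earned.
    if rule.badge and rule.badge not in state.badges:
        state.badges.add(rule.badge)
        messages.append(f"Badge earned: {rule.badge}")
    return messages
```

In this sketch the game itself never stores points or badges; it only names events, which mirrors the division of labor described above.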

Figure 3.  The heads up display provided by the SGL iframe.  Shows events captured by the API and provides users with immediate feedback.   

Figure 4.  Tasks in SGL can be grouped into quests.  The figure shows a particular user’s progress through various quests available within the system.

If a critical initial mass of effective games can be integrated, SGL could strongly benefit new developers by providing immediate access to a large player population. Site-level status, identity and community features can help with the even greater challenge of long-term player engagement, a noted problem in the field. Within the context of science-related gaming, such status icons might eventually be used as practically useful, real-world marks of achievement in line with the notion of ‘Open Badges’. As demonstrated by the WikiPathways tutorial application, SGL can be used to replace the need for developers to host their own login systems, user tracking databases, and reward systems, all of which can be accomplished using the SGL developer tools. Citizen scientists are not homogeneous in their motivations, and designing to be inclusive of gamers and non-gamers can be challenging. By offering an alternative means of experiencing a web-based citizen science application, SGL allows developers to cater to both their gaming and non-gaming contributor audiences. Together, these features raise the overall potential for growth within the world of citizen science and scientific gaming.
Future directions
SGL is currently functional, but so far has attracted only a small number of developers willing to integrate their content into the portal.  Future work would need to address the challenge of raising the perceived value of integration with the site while lowering the perceived difficulty.  Looking forward, key challenges for the future of SGL include better support for:
  • games meant for mobile devices
  • development of quests that span multiple games
  • teachers to build SGL-focused lesson plans and track student progress
  • creating new ‘SGL-native’ games
  • integration with external authentication systems
None of these are insurmountable challenges, but they all require significant continued investment in software development. SGL is an open source project, and we encourage contributions from anyone who shares in our vision of spreading and doing science through the grand unifying principle of fun.

Building communities of knowledge with Wikidata

Posted on Jun 16, 2017 in crowdsourcing, Gene Wiki, semantic wikipedia, sulab, wiki, Wikidata, wikipedia

As the Wikimedia Movement works to define its strategy for the next fifteen years, it is worthwhile to consider how its recent product Wikidata may fit into that strategy.  As its homepage states,
“Wikidata is a free and open knowledge base that can be read and edited by both humans and machines.”
Wikidata is a particular kind of database designed to capture statements about items in the world with references that support those statements. Because Wikidata is a database, its contents are meant to be viewed in the context of software that retrieves the data through queries and then renders the data to meet the needs of a user in a certain context. The same data can thus be viewed on Wikidata-specific pages and in the infoboxes of Wikipedia articles. Importantly, Wikidata content can also be used in applications outside of the Wikimedia family.
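As a rough illustration of "statements with references, rendered differently per context", here is a small Python sketch. It does not use the Wikidata API or its real data model; the `Statement` and `Reference` classes, the item and property labels, and the sample values are all made-up placeholders. The point is only that one set of referenced statements can feed more than one view.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Reference:
    """Where a statement came from (placeholder fields)."""
    source: str
    retrieved: str


@dataclass(frozen=True)
class Statement:
    """One claim about an item, plus the references that support it."""
    item: str
    prop: str
    value: str
    references: Tuple[Reference, ...] = ()


# Made-up sample data; none of these labels are real database entries.
EXAMPLE_REF = Reference("Example Gene Database", "2017-06-01")
STATEMENTS = [
    Statement("Example gene", "gene symbol", "EXG1", (EXAMPLE_REF,)),
    Statement("Example gene", "found in taxon", "Homo sapiens", (EXAMPLE_REF,)),
    Statement("Example disease", "genetic association", "Example gene"),
]


def query(statements: List[Statement], item: str) -> List[Statement]:
    """Retrieve every statement about one item."""
    return [s for s in statements if s.item == item]


def render_infobox(statements: List[Statement], item: str) -> str:
    """Compact view, in the spirit of an article infobox."""
    lines = [item]
    for s in query(statements, item):
        lines.append(f"  {s.prop}: {s.value}")
    return "\n".join(lines)


def render_referenced_list(statements: List[Statement], item: str) -> List[str]:
    """Detailed view that also shows the supporting sources."""
    rendered = []
    for s in query(statements, item):
        sources = ", ".join(r.source for r in s.references) or "unreferenced"
        rendered.append(f"{s.prop} = {s.value} [{sources}]")
    return rendered
```

Both renderers call the same `query` function, so a correction to the underlying statements shows up in every view at once.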
Examples of Wikidata use now include:
The molecular biology community (and in particular the Gene Wiki group) has embraced Wikidata as a global platform for knowledge integration and distribution.  To help envision how Wikidata may fit into the strategic vision of the WMF movement, it is worth taking a look at how and why this particular community is using Wikidata.  
History of the Gene Wiki initiative
The sequencing of the human genome at the beginning of this century, and the consequent rush of data and new technology for producing even more data, fundamentally changed how research in biology is conducted. Before the year 2000, research typically proceeded with a single-gene focus. A typical PhD thesis would entail the analysis of the genetics or function of one gene or protein at a time. A few years after the first genome, however, it became possible to measure the activity of tens of thousands of genes at once, resulting in an omnipresent problem of interpreting experimental results containing hundreds of genes. While a scientist may come to grasp the literature surrounding a single gene quite well, it is not possible to know everything there is to know about all 20,000+ genes in the genome, particularly when this knowledge is expanding on a minute-by-minute basis. As a consequence, there arose a need to produce summaries of what was known about each gene so that researchers could quickly grasp its nature and easily find links to more detailed references as needed. By 2008, many different research groups had published wikis attempting to allow the scientific community to generate the required articles, e.g. WikiProteins, WikiGenes, and the Gene Wiki. The Gene Wiki project was unique among this group as it anchored itself directly to Wikipedia and, likely as a result of that decision, has enjoyed long-term success. This initiative works within the English Wikipedia community to encourage and support the collection of articles about human genes. Its main contributions are the infobox seen on the right-hand side of these articles and software for generating new article stubs using that template.
Wikidata and the Gene Wiki project
For the past several years, the Gene Wiki core team (funded by an NIH grant) has focused primarily on seeding Wikidata with biomedical knowledge.  In comparison to managing data via direct inclusion and parsing of infobox templates as before, this makes the data much easier to maintain automatically and, importantly, opens it up for use by other applications.  As a result, Wikipedia isn’t the only application that can use this structured information.   One of the first products of that process was a new module (Infobox_gene) that draws all the needed data to render the gene infobox dynamically from Wikidata, greatly reducing the technical challenge of keeping the data presented there in sync with primary sources.  
In addition to the relatively simple collection of gene identifiers and links off to key public databases that are presented in the infoboxes, Wikidata now has an extensive and growing network of knowledge linking genes to proteins, proteins to drugs, drugs to diseases, diseases to pathogens, pathogens to places, places to events, events to people, and so on and so on.  This unique, open, referenced, knowledge graph may eventually become the closest thing to ‘the sum of all human knowledge’.  Capturing knowledge in this structured form makes it possible to use it in all kinds of applications, each with their own community-specific user experiences.  As a case in point, the Gene Wiki group created Wikigenomes based primarily on data loaded into Wikidata.  This was followed quickly by Chlambase, an application specifically focused on distributing and collecting knowledge about different Chlamydia genomes.  These applications provide domain-specific user interface components such as genome browsers that are needed to present the relevant information effectively and thereby attract the attention of specialist users.  These users, in turn, have the opportunity to contribute their knowledge back to the broader community through contributions to Wikidata that can be mediated by the same software.  
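The value of such a linked network is that software can follow chains of statements from one kind of thing to another. The sketch below shows this with a breadth-first search over a handful of (subject, relation, object) triples. The node names and relation labels are invented placeholders, not real Wikidata content, and a real application would query Wikidata rather than an in-memory list.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Invented placeholder edges that mimic the gene -> protein -> drug -> disease chain.
EDGES: List[Triple] = [
    ("gene A", "encodes", "protein A"),
    ("drug X", "targets", "protein B"),
    ("protein A", "is targeted by", "drug X"),
    ("drug X", "treats", "disease Y"),
    ("disease Y", "is caused by", "pathogen Z"),
]


def find_path(edges: List[Triple], start: str, goal: str) -> Optional[List[Triple]]:
    """Return the shortest chain of triples from start to goal, or None."""
    # Index outgoing edges by subject so each lookup is a dictionary access.
    outgoing: Dict[str, List[Triple]] = {}
    for triple in edges:
        outgoing.setdefault(triple[0], []).append(triple)

    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for triple in outgoing.get(node, []):
            target = triple[2]
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [triple]))
    return None
```

For example, `find_path(EDGES, "gene A", "disease Y")` returns the three triples that connect the gene to the disease through its protein and a drug. Edges are treated as directed here, which is a simplification.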
Wikidata and the world
The molecular biology research community, as represented by the Gene Wiki project, is an early adopter of Wikidata as a community platform for the collaborative curation and distribution of structured knowledge, but it is not alone. The same fundamental patterns are already being applied by other communities, e.g. those interested in digital preservation and open bibliography. In each case, we see communities working to transition from the current dominant paradigm of private knowledge management towards the knowledge commons approach made possible by Wikidata. This is not unlike the transition from the world of the Encyclopedia Britannica to the world of Wikipedia. The only important difference is that the knowledge in question is structured in a way that makes it easier to reuse in different ways and in different applications.

Wikidata provides a mechanism for massively increasing the global good generated by the Wikimedia Foundation’s work by capturing knowledge in a form that can be agilely used to empower all manner of software with the sum of human knowledge.  

Happy Memorial Day weekend!

Posted on May 26, 2017 in citizen science, CitSci2017, Cochrane Crowd, conference, mark2cure, MedLitBlitz, poster, presentation

The last few weeks have been a bit hectic, so we've got plenty of news and info to share with you.

First of all, if you haven't seen it yet, Cochrane Crowd has posted about our joint webinar and the #MedLitBlitz. If you missed the webinar or had technical difficulties/time zone issues with it, it's available on YouTube. The prize packages for the top three participants of #MedLitBlitz are packed and will be shipped either today or early next week (depending on whether today's shipments have already been picked up).

Secondly, Mark2Cure was at the Citizen Science Association conference from 2017.05.15-2017.05.20, and was fortunate enough to share YOUR work with an audience of scientists who LOVE citizen science! More than a few researchers stopped to introduce themselves to me and spoke highly of our community! Although it's always weird to hear a recording of your own voice, I recorded my presentation because it wouldn't be fair to talk about the amazing work you've done without sharing it with you! You can find my presentation for the biomedical session on our YouTube channel. On a side note, I know the audio quality isn't the best, which is why I've transcribed it using YouTube's captioning software. If you have trouble hearing the presentation, please turn on the closed captions.

Max also delivered two lightning talks for the event, which I hope to upload soon.
Not available yet, but will be soon.

In addition to the talks, we had a poster for Mark2Cure and a table at two public events.
Max spreading the love for Mark2Cure

We were especially pleased to be so close to our buddy at Cochrane Crowd for this event.
Cochrane Crowd looking good

Lastly, it looks like one of the missions was completed just as I was settling back in after the conference. A HUGE thanks to everyone who helped complete the carpingly mission. A new mission has been launched in its place, so check it out if you have some free time.

MedLit Blitz, Mark2Curathon Results and More

Posted on May 16, 2017 in citizen science, Cochrane Crowd, mark2cure, MedLitBlitz

Mark2Curathon Results

Sorry for the delay; the Mark2Curathon results are finally in! During the Mark2Cure portion of MedLit Blitz, we had 34 participants contribute over 16,000 annotations. Because both the entity recognition and the relationship extraction tasks are very different from Cochrane's screening task, we had to take some additional considerations into account when tallying the results.

For the Relationship Extraction module, multiple annotations per abstract were possible, as each abstract could have any number of concept pairings. Hence, each relationship extraction annotation submitted counted as one task unit.

For the Entity Recognition module, only one submission was possible per abstract, but users needed to identify three different types of entities. Hence, each abstract completed counted as three task units (one for each concept type--genes, treatments, diseases). Additionally, a tiered bonus multiplier (of an additional 2% to 15%) was applied for users who submitted high quality annotations.

The RE and ER task units were then added together for each user and sorted from highest to lowest to determine user ranking for the event. Without further ado, these were the top 15 participants in the Mark2Curathon:
1. ckrypton
2. Dr-SR
3. TAdams
4. hwiseman
5. Kien Pong Yap
6. skye
7. ScreenerDB
8. priyakorni
9. Judy E
10. pennnursinglib
11. Calico
12. AJ_Eckhart
13. uellis
14. sueandarmani
15. nclairoux

A huge thanks to you all, and everyone who participated for making our first adventure with Cochrane Crowd so successful!

To qualify for the MedLit Blitz prize, Mark2Curators had to have contributed to the Cochrane Screening Challenge as well.
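For anyone curious how the tally described above fits together, here is a minimal sketch in Python. The unit rules (one unit per relationship extraction annotation, three units per entity recognition abstract) and the 2% to 15% bonus range come from this post. The quality score, the tier thresholds, the middle tier, and the choice to apply the bonus to the entity recognition units only are assumptions made for illustration; this is not the script we actually ran.

```python
from typing import List, Tuple

ER_UNITS_PER_ABSTRACT = 3  # one unit each for genes, treatments, diseases

# ASSUMED tiers: (minimum quality score, bonus). Only the 2%-15% range is from the post.
BONUS_TIERS = [(0.90, 0.15), (0.80, 0.08), (0.70, 0.02)]


def bonus_multiplier(quality: float) -> float:
    """Return the multiplier for the highest tier the quality score reaches."""
    for threshold, bonus in BONUS_TIERS:
        if quality >= threshold:
            return 1.0 + bonus
    return 1.0


def task_units(re_annotations: int, er_abstracts: int, quality: float) -> float:
    """Total task units for one user."""
    re_units = re_annotations  # each RE annotation counts as one unit
    # ASSUMPTION: the bonus applies to the entity recognition units only.
    er_units = er_abstracts * ER_UNITS_PER_ABSTRACT * bonus_multiplier(quality)
    return re_units + er_units


def rank_users(records: List[Tuple[str, int, int, float]]) -> List[Tuple[str, float]]:
    """Sort (user, RE annotations, ER abstracts, quality) records by total units."""
    totals = [(user, task_units(re_n, er_n, q)) for user, re_n, er_n, q in records]
    return sorted(totals, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a user with 10 relationship annotations and 5 completed abstracts at a low quality score would have 10 + 5 × 3 = 25 task units.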

MedLit Blitz Results

We are in the process of contacting the winners and hope to have an update about this soon.

Mark2Cure at Citizen Science Association Conference 2017

Max and I have arrived in the Twin Cities, Minnesota for the Citizen Science conference. Mark2Cure was accepted as part of the symposium on biomedical citizen science. Mark2Cure was also accepted for a poster presentation and for the project slam. If that doesn't sound busy enough, Mark2Cure was accepted for a table at the 'Night in the Cloud' event (open to the public). If you are in town, please stop by our table!

About the prizes

Winners will receive a Mark2Cure mug, marker, and novelty item, in addition to any prizes that Cochrane has prepared for this event.

The Mark2Curathon starts now!

Posted on May 11, 2017 in citizen science, Cochrane Crowd, mark2cure, MedLitBlitz

Our anniversary celebration with Cochrane Crowd is well under way. #MedLitBlitz started with a webinar on Monday, and was followed by the Cochrane screening challenge from Tuesday to Wednesday. During that challenge, over 100 MedLit Blitzers screened 29,494 citations--over nine THOUSAND more than the initial goal of 20,000!

But the celebrations aren't over yet. It's now time for the Mark2Curathon portion of #MedLitBlitz!

For this part, we've launched 3 new missions in the Entity Recognition module. To be clear, all annotations (regardless of whether they were submitted via the Entity Recognition or Relationship Extraction module) will count towards #MedLitBlitz as long as they fall within the time frame of the event. If you don't see the new ER missions, log out, clear your cache and log back in.

As with Cochrane Crowd, we will be active on Twitter; however, we know that many of our most ardent Mark2Curators do not use Twitter. For this reason, we will also be sharing updates via our chat channel. As with our previous Mark2Curathons, no sign up is required to chat on this channel, and we encourage you to join us there.

If you participated in the Cochrane screening challenge as part of #MedLitBlitz, we'd love to hear about it! It's been really fun working with Anna and Emily over at Cochrane Crowd, and we definitely look forward to working with them in the future. If you've enjoyed our collaborative effort, feel free to ping some praise to @AnnaNoelStorr and @cochrane_crowd.

Webinar, Mark2Curathon, and more

Posted on Apr 28, 2017 in citizen science, Cochrane Crowd, mark2cure, scistarter

It’s citizen science season and we’re in the thick of it!

First off, welcome new users! If you found us from the latest SciStarter campaign, feel free to ping us on Twitter to let us know so we can pass our thanks to the @SciStarter team! We’re very excited to be featured as part of SciStarter’s recent focus on biomedical citizen science! Note, if you complete your SciStarter profile this month, the SciStarter team will send you a free digital copy of The Rightful Place of Science: Citizen Science. See their post for more details.

Citizen science has enormous potential, and we’re glad that Mark2Curators are helping us explore its application towards biomedical discovery.

As mentioned last week, we’re not the only ones who need your help for dealing with the biomedical literature. Cochrane Crowd is reaching its first anniversary in joining this domain of citizen science, and we’re celebrating together! We will be jointly hosting a webinar on May 8th and there will be two 24hr screening challenges. There will be prizes for the top three contributors who take part in both the Cochrane Crowd and Mark2Cure screening challenges. Here are the details:

Mark2Cure/Cochrane Crowd Webinar:

Date/Time: May 08, 2017, 9:00am – 10:00am PDT

Tentative agenda:

  1. Intro (5 minutes)
  2. Mark2Cure presentation (15 mins)
  3. Cochrane Crowd presentation (15 mins)
  4. MedLit Blitz (5 minutes)
  5. Audience Q&A (15-20 mins)

Interested in participating in the webinar? You’ll need to register first! Hurry, space is limited (due to licensing restrictions of the webinar software). Register here

MedLit Blitz (2 x 24 hr screening challenges):

Cochrane Challenge: Help Cochrane Crowd identify studies that provide the best possible evidence of the effectiveness of a health treatment. Once identified by the Crowd the studies go into a central register where health researchers and practitioners can access them. The more studies identified by the Crowd, the more high-quality evidence is available to help health practitioners treat their clients.

Challenge Start: May 9th, 2017 10am GMT + 1 (UK time zone) / 2am (PDT)

Challenge Finish: May 10th, 2017 10am GMT + 1 (UK time zone) / 2am (PDT)

Mark2Curathon: Join the search for clues on a rare disease by identifying genes, diseases, drugs, and the relationships between these based on literature surrounding NGLY1.

Challenge Start: May 11th, 2017 7pm GMT + 1 (UK time zone) / 11am (PDT)

Challenge Finish: May 12th, 2017 7pm GMT + 1 (UK time zone) / 11am (PDT)

Get ready to use your reading skills to make a difference in biomedical science and health!!!

Celebrating the application of citizen science towards biomedical literature

Posted on Apr 14, 2017 in citizen science, mark2cure

Upcoming event — Med Lit Blitz!

Extracting information from biomedical literature is a huge problem that many researchers are trying to solve computationally. Mark2Cure approaches the biomedical literature problem with citizen science in hopes of enhancing computational methods. Happily, we are no longer alone in this regard! In fact, a year after Mark2Cure officially launched, another citizen science project (Cochrane Crowd) officially launched in order to identify randomized controlled trial papers from the biomedical literature. Since both projects were launched in May, Mark2Cure and Cochrane Crowd will be celebrating our anniversaries together!

Join us in celebrating the project anniversaries and the amazing way citizen scientists and volunteers have been helping to address issues in biomedical literature. We will be having a slew of joint events with Cochrane Crowd the week of May 8th, which will include a webinar, a Cochrane Crowd marathon, and a Mark2Curathon. Details on the webinar and Med Lit Blitz should be announced next week.

New Entity Recognition Mission now available:

Peripheral Myelin Protein 22 (PMP22) is an N-glycosylated transmembrane protein that is mainly found in the nervous system. It was identified by multiple users in many documents spread across several different missions. Perturbations in this protein's homeostasis have been linked to Charcot-Marie-Tooth (CMT) disease. Many cases of CMT are actually caused by a PMP22 gene duplication, which results in overexpression of the gene. NGLY1 functions to de-glycosylate cytoplasmic proteins, enabling them to be recycled. Can we learn about the mechanisms behind the neurological symptoms of NGLY1 deficiency from the literature on N-glycosylated proteins like PMP22? Help us explore the literature around an interesting clue that YOU found.

The Disease Ontology license converted to CC0

Posted on Apr 12, 2017 in bio-ontologies, classification, disease ontology, Ontologies

[Editor note: This guest blog post is from Lynn Schriml, who is an Associate Professor at the University of Maryland School of Medicine, the PI of the Disease Ontology, and a close collaborator on the Gene Wiki project.]

Licensing bioinformatics resources under Creative Commons licenses enables free distribution of their content, and thus open sharing, use, and expansion (derivative works) of that content.

This month (as of April 5, 2017) the Disease Ontology (DO) project has updated our data content licensing from CC BY 3.0 (Attribution) to CC0 (the most open license) to enhance collaboration and sharing. While we will continue to encourage users of the DO to cite our publications (available on our DO website), broader licensing will encourage greater usage of this biomedical ontology. It is important to point out that attribution demonstrates usage of bioinformatic resources, which is critical for showing utility and a broad user community in grant applications that fund project development. But we are convinced by the argument that requests for attribution that are encoded in the legal license are both ineffective and counterproductive. (And other links in this discussion reinforced our conclusion.)

With the development of CC0 licensing and its recent adoption by other important biomedical resources (e.g., CIVIC, WikiPathways, ECO), it has become clear that for our biomedical ontology to be most useful, it has to be free of content licensing restrictions. The DO was created and shared for it to be used, thus open content licensing is the most appropriate license for this project. Classification of human diseases is a complex endeavor, one that is best approached in an open, collaborative and community-data-driven environment.


New Entity Recognition Mission Available

Posted on Mar 31, 2017 in citizen science, mark2cure

Pilocarpine is a drug that was frequently identified by our Mark2Cure citizen scientists and volunteers in the previous sets of biomedical literature. It was often found in the context of seizures or tears—both of which are symptoms associated with NGLY1 deficiency. In humans, pilocarpine is used to pre-operatively treat some forms of glaucoma, and to treat the dry eyes associated with Sjögren's syndrome. Pilocarpine is also used to stimulate secretion of saliva and sweat and a “paucity of sweating” was noted in one case of an NGLY1-deficient patient. Are there underlying mechanisms in the pilocarpine literature that might help elucidate the symptoms of NGLY1-deficiency? Help us find out.
Pilocarpine appeared in the previously completed missions: ATGS, MATG, HSPS, Alacrima, HSPM, MATGS, HSPG and was identified by over 48 users such as AJEckhart, AnxietyAttacked, aprilwent, BridgetDS, cbologa, cburkham, cheryllaos, chu2k, CitizenSubflexa2, Ckrypton, Darkversev, Dmatsumoto, GaboGR, gajin4065, ggoom, GrantRVD, hallm21, hampton11235, HArielle, Isabelle, Jbm, JudyE, Klgmd, kuhno1980, LaraineAitken, lcb123, ldouglas5, Manabu, mariomar_it, metaphor, ok8080, rkaramch, sciguy29, skye, socalpam, sueandarmani, TAdams, Vsmalladi, Yanggan, and more!

Interested in viewing the knowledge networks for the missions (i.e., document sets) listed above? Just copy/paste this URL into your browser and append the mission’s symbol.

E.g., to view the description and network for the ATGS mission, go to:

Regarding Greg’s side project, which was mentioned in the previous newsletter: Greg wasn’t able to release the preliminary interface before his trip to the Biocuration conference because of identifier import issues. Since his return, he has fixed the issue and is making some final adjustments before turning this loose.

Happy St. Patty’s day!

Posted on Mar 17, 2017 in citizen science, mark2cure

Current progress in Mark2Cure

Thanks to our wonderful community of citizen scientists and volunteers, the NFE2L1 entity recognition mission is now 99% complete! If you haven’t already completed quests in this mission, please help us finish it! The other entity recognition missions are also over two thirds complete and are in need of your reading skills. If entity recognition is too easy, please try out our relationship extraction module which can be quite challenging and may require more critical reading skills.

New project in development needs Mark2Curators!

Greg Stupp, a research scientist in the Su Lab here at TSRI, is working on structuring clinical/drug indication data which is currently in free-form, unstructured text. This data has important implications for bioinformatics research aimed at drug repurposing, but we can’t build this database without your help. Greg will be building a new MediaWiki-based platform for crowdsourcing the annotation of clinical/drug indication text, which has the added advantage of structuring information so that it can be widely (and openly) disseminated by importing it into Wikidata.

What needs to be done?

Greg is currently working on the interface, and will need a few volunteers to provide feedback on the beta version as soon as it is available. After that, we will need our community of Mark2Curators to put what they’ve learned into practice. Annotating clinical/drug indication text will need Mark2Curators with BOTH the entity recognition AND relationship extraction skills. The rules for entity recognition are expected to be very similar for this task; however, we expect that more detailed relationships will be available so additional relationship extraction training may be necessary.
The first data set is expected to be generated from 1,100 paragraphs which will be very short but densely packed with information.

An example text may look like this:

MEKINIST® is indicated, as a single agent or in combination with dabrafenib, for the treatment of patients with unresectable or metastatic melanoma with BRAF V600E or V600K mutations, as detected by an FDA-approved test [see Clinical Studies (14.1)].
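To give a feel for what "structuring" a paragraph like this could mean, here is a rough sketch in Python. The `Indication` record, its field names, and the relation labels are hypothetical and are not the schema Greg's interface will use; the values are taken only from the example text above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Indication:
    """Hypothetical structured form of one drug indication paragraph."""
    drug: str
    disease: str
    disease_qualifiers: List[str]
    biomarkers: List[str]
    combination_partners: List[str]


# Values read off the example paragraph; the structure itself is an assumption.
EXAMPLE = Indication(
    drug="MEKINIST",
    disease="melanoma",
    disease_qualifiers=["unresectable", "metastatic"],
    biomarkers=["BRAF V600E mutation", "BRAF V600K mutation"],
    combination_partners=["dabrafenib"],
)


def to_triples(ind: Indication) -> List[Tuple[str, str, str]]:
    """Flatten the record into (subject, relation, object) statements."""
    triples = [(ind.drug, "indicated for", ind.disease)]
    triples += [(ind.disease, "has qualifier", q) for q in ind.disease_qualifiers]
    triples += [(ind.disease, "requires biomarker", b) for b in ind.biomarkers]
    triples += [(ind.drug, "may be combined with", p) for p in ind.combination_partners]
    return triples
```

Each triple pairs two entities with a relationship, which is why this task is expected to draw on both the entity recognition and the relationship extraction skills.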

How will this work?

The interface is still under construction, but will primarily use queries and drop down selection menus so that you can select the best representation of the concepts in the text and how they are related. As a Mark2Curator, you’ve already been trained to recognize most of the entities involved in this task; however, the interface may require a little more detail in your selection. We’ll delve into the interface a little more as it gets fleshed out (it’s still very preliminary at this point).

How can you help?

If you are interested in providing feedback for improving the interface, please contact me. I will be compiling a list of beta testers for improving the user experience prior to the launch of the task. Given that the project is being built in the MediaWiki framework, there may be huge limitations in the features we can modify; however, your feedback will be crucial in ensuring that we provide users with the information they need to successfully complete the task in spite of any constraints/limitations caused by the interface.