When artificial intelligence (AI) is unable to achieve required accuracy levels in biomedical analysis involving very large data sets, the models are often discarded and the research is severely hindered. To solve this problem we created a crowd-powered citizen science game, Stall Catchers, to integrate the cognition of public volunteers with AI methods to achieve rapid, expert-like analysis of Alzheimer's research data aimed at accelerating the discovery of a therapeutic treatment.
In scientific research, a huge bottleneck in testing a hypothesis can be the time required by an expert to manually analyze very large datasets. In our case, Alzheimer’s research data collected over the course of a single week would take trained laboratory technicians six months to a year to analyze. Sometimes automated methods, such as machine learning, are used to speed up the analysis, but these require very large volumes of training data to be sufficiently accurate. When such training data aren’t available, an alternative for speeding up the analysis is a hybrid approach that combines the complementary capabilities of humans and machines. The Human Computation Institute specializes in such approaches to solving problems, and much of this innovation has been driven by our flagship project, Stall Catchers.
Stall Catchers is an online citizen science game that uses crowd-power to accelerate Alzheimer's disease research at Cornell University, funded by the U.S. National Institutes of Health. Since its inception in 2016, 45,000 global volunteers have contributed 13.5 million annotations, accomplishing two decades of data analysis in four years. Their expert-like analysis has supported several discoveries reported in top tier journals, including the identification of a therapeutic agent that restores memories in transgenic Alzheimer's mice.
Two key innovations have been fundamental to these successes. The first was the development of a theoretically optimal consensus algorithm that quadrupled analytic throughput. The second is a hybrid intelligence approach that enables imperfect machine learning (ML) models to automate part of the analysis. So while our game, Stall Catchers, is a crowdsourced hybrid-intelligence solution to the challenge of data analysis volume, our second innovation, brought to life in 2021 and called “CrowdBots,” derives utility from imperfect machine learning models that would previously have been discarded.
Stall Catchers players analyze brain capillaries through an online virtual microscope to determine if they are flowing or stalled. We combine answers from several different people about the same blood vessel to produce a single expert-like crowd answer. This approach allows individual errors to wash out, ensuring consistently high data quality. Improvements to the methods we use to combine answers allowed us to reduce the number of individual answers we had to collect for each movie from 20 to approximately five. The combination of more users and efficiency improvements increased our analytic throughput, effectively speeding up the research by a factor of four. Our goal, however, is to reach a ten-fold improvement over the lab in order to reduce the potential time to a treatment target to just a few years.
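A minimal sketch of how several answers about one vessel might be combined, and why fewer answers can suffice once answers are weighted. This is an illustration, not the Stall Catchers consensus algorithm, whose details are not given here. It assumes each player has a known, symmetric accuracy strictly between 0 and 1, that players err independently, and that collection stops once a confidence threshold is reached. The names `crowd_answer`, `threshold`, and `prior` are hypothetical.

```python
import math


def log_odds(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def crowd_answer(votes, threshold=0.95, prior=0.5):
    """Combine votes about one vessel, stopping early when confident.

    votes: iterable of (is_stalled, accuracy) pairs in arrival order.
           `accuracy` is an ASSUMED per-player accuracy in (0, 1).
    Returns (label, confidence, votes_used).
    """
    score = log_odds(prior)  # running log-odds that the vessel is stalled
    used = 0
    for is_stalled, accuracy in votes:
        weight = log_odds(accuracy)  # reliable players move the score more
        score += weight if is_stalled else -weight
        used += 1
        p_stalled = 1.0 / (1.0 + math.exp(-score))
        if max(p_stalled, 1.0 - p_stalled) >= threshold:
            break  # enough evidence; no further answers needed
    p_stalled = 1.0 / (1.0 + math.exp(-score))
    label = "stalled" if p_stalled >= 0.5 else "flowing"
    return label, max(p_stalled, 1.0 - p_stalled), used


if __name__ == "__main__":
    # Made-up votes: two reliable players agree, so collection stops early.
    print(crowd_answer([(True, 0.9), (True, 0.9), (False, 0.6)]))
```

Under these assumptions, two agreeing players with 90% accuracy already clear a 95% threshold, whereas players with 70% accuracy need four agreeing answers. That gap is the intuition behind collecting far fewer than 20 answers per movie.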
Achieving this goal may be possible via our new CrowdBots innovation. To investigate the “CrowdBots” idea we ran a MathWorks-sponsored machine learning (ML) competition in partnership with DrivenData, which attracted 900 ML developers and resulted in 70 diverse ML models that were trained using 12.4 million crowd-generated annotations from Stall Catchers. Some of these models demonstrated unprecedented accuracy on our Alzheimer’s classification task compared with past automation approaches, but still did not meet our stringent data quality requirements. To derive utility from these models, we investigated a new mode of human/machine collaboration, which endows these models with agency alongside humans in our crowdsourcing systems. This “CrowdBot” approach speeds up data analysis, reduces reliance on human annotators, and improves data quality, while providing open source tools that reduce barriers for using ML in biomedical and other domain-specific data analysis.
Stall Catchers was inspired by the stardust@home project, where space scientists were trying to detect interstellar dust particles embedded in aerogel that was brought back to Earth by a satellite flown through the tail of a comet. This required looking through a million microscopic images of aerogel, which the scientists estimated would take approximately 100 years. They created an online activity that allowed volunteer participants to look through a virtual microscope to try to find the stardust. With over 30,000 so-called “dusters” participating, they were able to sort through one million images and find seven interstellar dust particles in just a few years, a finding reported in the journal Science.
Recalling my stardust@home experience, I wondered if the virtual microscope could be adapted to look at blood flow in mouse brains. I got so excited about this that I phoned the project leader, Andrew Westphal, and told him about the Alzheimer’s research and how we might be able to use his virtual microscope interface to help speed up Cornell’s research. He agreed that it could work. Having lost his father to Alzheimer’s, he was also personally motivated to help.
What Makes Your Project Innovative?
Our "CrowdBots" innovation combines existing analytic methods (Crowdsourcing and Machine Learning) to reduce reliance on human annotators and augment analytic capacity by creating a hybrid crowd of humans and bots that work together. Machine Learning (ML) has been used to accelerate data analysis but relies upon a large training corpus, which isn't always available. In the absence of such training data, ML prediction accuracy may be insufficient to support research needs.
In such cases, crowdsourcing via citizen science has been used to analyze very large datasets that lend themselves to being decomposed and gamified into accessible microtasks. However, curating a large community of volunteers to participate in citizen science is a time-consuming and costly endeavor that is difficult to sustain. By applying our consensus methods to cohorts of humans and ML-powered bots, we have been able to reuse old data, increase platform sustainability, and boost analytic throughput.
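The idea of a hybrid crowd can be sketched by treating each bot as one more annotator whose answers are weighted by how it performs on calibration items with known answers. This is a hedged illustration, not the CrowdBots implementation. The `Annotator` class, the Laplace-smoothed accuracy estimate, and the log-odds weighting are all assumptions introduced here, and annotators are assumed to err independently.

```python
import math
from dataclasses import dataclass


@dataclass
class Annotator:
    """A crowd member, human or bot (hypothetical structure)."""
    name: str
    kind: str          # "human" or "bot"
    correct: int = 0   # calibration items answered correctly
    seen: int = 0      # calibration items answered in total

    def record_calibration(self, answer, truth):
        """Update the track record with one known-answer item."""
        self.seen += 1
        self.correct += int(answer == truth)

    @property
    def accuracy(self):
        # Laplace smoothing keeps the estimate strictly between 0 and 1.
        return (self.correct + 1) / (self.seen + 2)

    @property
    def weight(self):
        # Log-odds weight: a 50%-accurate annotator contributes nothing.
        return math.log(self.accuracy / (1.0 - self.accuracy))


def hybrid_consensus(answers):
    """answers: list of (Annotator, is_stalled). Returns (label, p_stalled)."""
    score = sum(a.weight if stalled else -a.weight for a, stalled in answers)
    p_stalled = 1.0 / (1.0 + math.exp(-score))
    return ("stalled" if p_stalled >= 0.5 else "flowing", p_stalled)


if __name__ == "__main__":
    # Made-up track records: two mediocre bots and one reliable human.
    bot_a = Annotator("bot-a", "bot", correct=7, seen=10)
    bot_b = Annotator("bot-b", "bot", correct=7, seen=10)
    human = Annotator("player-1", "human", correct=18, seen=20)
    print(hybrid_consensus([(bot_a, True), (bot_b, True), (human, False)]))
```

In this toy case the single reliable human outweighs two weaker bots that disagree with them. An imperfect model can therefore still contribute votes without being trusted to decide alone.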
What is the current status of your innovation?
Stall Catchers is at the stage of diffusing lessons, so that other scientific researchers may employ our methods and even see the benefit of adding our process to their research agenda. For CrowdBots, we have completed two validation studies and now seek to extend this work to new theories/proposals for ways in which hybrid intelligence ensembles can tackle different biomedical imagery problems.
Some specific goals are to (1) develop an open source toolkit for transforming ML models into citizen science “bots,” enabling a direct pathway for integrating substandard ML models into an existing crowd-powered analytic pipeline without extensive re-engineering; (2) investigate various human/AI partnership modalities and configurations, such as relative expertise, AI intervention sequence, and ensembles vs. human/AI dyads, while prototyping our Human-AI experimentation toolkit alongside this endeavor; and (3) test the generality of CrowdBots on new analytic tasks and datasets in our other projects.
Collaborations & Partnerships
Stall Catchers collaborators included Andrew Westphal (stardust@home) from UC Berkeley, EyeWire creators Sebastian Seung and Amy Robinson (Princeton U.), the Schaffer-Nishimura Lab at Cornell University, SciStarter, and BrightFocus Foundation. The CrowdBots machine learning challenge was run with DrivenData and sponsored by MathWorks. Our subsequent CrowdBot validation studies were sponsored by the National Institute on Aging at the U.S. National Institutes of Health.
Users, Stakeholders & Beneficiaries
Currently, Stall Catchers brings over 45,000 global citizens on six continents together across cultural and geographic boundaries to tackle a common human problem: Alzheimer's disease. Stall Catchers has analyzed over 30 research datasets and has formal commitments from four new universities to support disease research, both with Stall Catchers itself and with new hybrid platforms we are developing that build on Stall Catchers to accelerate research on sickle cell disease and cerebral small vessel disease.
Results, Outcomes & Impacts
Alzheimer's disease affects 24 million people worldwide and is the only "top 5 killer" without an effective treatment or cure. Via crowdsourcing and hybrid intelligence innovations, our Stall Catchers citizen science platform has accelerated Cornell University's Alzheimer's research, leading to the discovery of prospective Alzheimer's treatment targets that are FDA drug analogs. These drug analogs reduce capillary stalling in mouse models of Alzheimer's and restore memories and other cognitive functions.
Our innovations made it possible to reduce the time needed to make these discoveries from approximately 20 years to 5 years. Related biomedical results have been reported in top tier journals, including Nature, Nature Neuroscience, PLOS One, and Brain (https://www.nature.com/articles/s41598-020-65908-y). We are in the process of generalizing these methods to other biomedical analyses to accelerate dementia and other disease research including sickle cell disease.
Challenges and Failures
To validate Stall Catchers before using it on new research data, we applied it to an old dataset that had been previously analyzed by experts. We were surprised to discover that the crowd answers did not have as much agreement with the experts as we anticipated based on our pilot studies. To address this we convened a meeting with the experts who had generated the gold standard data used in the validation study.
The experts acknowledged that in almost every case of disagreement, the crowd answer was correct and the expert had been wrong. Moreover, we learned that there was greater agreement between the crowd and each individual expert than there was agreement between the experts.
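The comparison described above can be made concrete with a simple pairwise agreement rate. The sketch below uses raw percent agreement, an assumption here since the actual metric used in the validation study is not stated. The labels are invented solely to show the calculation, and `agreement` and `compare_crowd_to_experts` are hypothetical helper names.

```python
from itertools import combinations


def agreement(a, b):
    """Fraction of items on which two equal-length label sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def compare_crowd_to_experts(crowd, experts):
    """Return crowd-vs-expert and expert-vs-expert agreement rates.

    crowd:   list of crowd labels, one per vessel
    experts: dict mapping expert name -> list of labels for the same vessels
    """
    crowd_vs_expert = {
        name: agreement(crowd, labels) for name, labels in experts.items()
    }
    expert_vs_expert = {
        (n1, n2): agreement(experts[n1], experts[n2])
        for n1, n2 in combinations(experts, 2)
    }
    return crowd_vs_expert, expert_vs_expert


if __name__ == "__main__":
    # Invented labels (1 = stalled, 0 = flowing), for illustration only.
    crowd = [1, 0, 0, 1, 0, 1]
    experts = {"A": [1, 0, 0, 1, 1, 1], "B": [1, 0, 1, 1, 0, 1]}
    print(compare_crowd_to_experts(crowd, experts))
```

With these invented labels, the crowd agrees with each expert more often than the experts agree with each other, the same pattern the validation study revealed. A chance-corrected statistic such as Cohen's kappa would be a natural refinement.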
Another challenge we faced was that our "wisdom of the crowd" consensus methods required 20 different people to evaluate the same blood vessel, which was very inefficient. We therefore improved our methods to aggregate answers from different players more sensibly, which reduced that number from 20 to 5.
Conditions for Success
Key success conditions entailed 1) choosing a societal problem to tackle that affects many people and has impacted stakeholders across sectors, 2) identifying win-win-win opportunities, 3) leveraging media visibility to boost participation, 4) maintaining a culture of humor and humility, 5) always listening and responding to members of our community, 6) viewing our projects not as our own but as projects that belong to humanity, and 7) viewing our role in the development and maturation of these projects as a privilege.
Several researchers including Oliver Bracko at University of Miami and David Boas at Boston University have obtained federal grants (NIH) that include provisions for using Stall Catchers to accelerate the analysis of their research data, and for extending Stall Catchers to include new types of data analysis. Additionally, we developed an experimentation platform based on Stall Catchers, which was used by Microsoft Research to collaborate with us on human-AI partnership studies (publication in TOCHI).
NIH has funded us to develop a new platform based on Stall Catchers to support Alzheimer's research at UC Davis and UC San Francisco. The platform uses crowd-power to assess amyloid burden on whole slide images of human brain tissue, to help understand why Alzheimer's is more prevalent in Hispanics than in white non-Hispanics. In collaboration with U. Pittsburgh, we are developing a three-stage crowd-based analysis for 7 Tesla MRI data to support sickle cell disease and dementia gender disparity research.
I used to think I needed permission to succeed. I worked in a highly innovative environment where I expressed some of my ideas about hybrid intelligence. These ideas were met with skepticism and I was discouraged from pursuing them. Eventually, I realized there was nothing preventing me from pursuing them on my own, so I began to build a community around these ideas of combining human and machine intelligence through a book project with 117 authors from 23 countries.
Next I won a grant to organize a 3-day workshop at the Wilson Center with White House participation, and subsequently formed the Human Computation Institute, where these ideas are now accepted by the scientific community and being actively used to accelerate biomedical research. The lesson is: give yourself permission to succeed, and then you won't need to find it elsewhere.
An interesting next step for the CrowdBots innovation is to enable continuous learning of machine learning models that are members of a hybrid crowd. This would result in systems that gradually become faster and more autonomous over time.
- Evaluation - understanding whether the innovative initiative has delivered what was needed
- Diffusing Lessons - using what was learnt to inform other projects and understanding how the innovation can be applied in other ways
9 November 2022