Crowdsourcing, Scientific Method and Intellectual Property

The consequences of digital networking for our ways and means of processing complex information are only beginning to emerge. Yet one can already see with great clarity that digital networking will change not only the type of problems that may be addressed, but also the methods used and the way credit is assigned. Concepts of intellectual property will never be the same. By acknowledging the substantial and often critical contribution of others to the evolution of thoughts, ideas, questions and solutions, we are led to depart from a “star system” that glorifies individual genius and contribution toward a more realistic acknowledgment of multiple credits for a potentially vast number of contributors, without whom certain problems could be solved only by engaging vastly greater resources of time and funding.

In science, crowdsourcing means outsourcing research and development tasks to a mass of voluntary but sometimes unaware users, in some instances through “games” that superficially serve an entertainment purpose. Crowdsourcing works particularly well if scientific knowledge can be transferred to an application in so elegant a manner that users need not understand it.

With crowdsourcing, individual leadership and ingenuity take on a different dimension and purpose: as they turn into more of a managerial task, the emphasis shifts to finding a way to harness the intellectual resources of the masses and a quid pro quo that permits access to them. In a digitally networked world, this reflects “open innovation,” a changed view of the scientific process, one that anticipates the participation of as many individuals as possible in research and development as an increasingly natural form of an efficient division of labor. This is especially true of superficially tedious routine work. Zooniverse is a good example: it enables laymen to analyze cell tissue for cancer research, categorize galaxies, or sort through weather records in 19th century marine log entries for purposes of climate research. Sometimes tasks outsourced to the masses of users are rewarded monetarily, for example through Amazon Mechanical Turk or Crowdflower.

Relying on the contributions of many is hardly new in human endeavors: the pyramids, the Panama Canal or Neil Armstrong’s moon walk each engaged a collaborative effort of approximately 100,000 individuals. Crowdsourcing, that is, relying on the intellectual resources of internet users, may enable the involvement of 100 million or more. Duolingo, a platform offering language learning resources in English, French, German, Italian, Spanish and Portuguese, is free but has an ulterior motive: by practicing, students “translate the Internet,” especially Wikipedia, into the languages they aim to learn.

In 2008, David Baker at the University of Washington created a three-dimensional “puzzle” named Foldit, a “game with a purpose,” the purpose being to fold proteins spatially, a task that requires immense computing resources but somehow comes a lot easier to humans. At least to some of them: among 100,000 Foldit aficionados worldwide playing regularly, some particularly talented players turned out to be 13 years old, intuitively performing tasks that push supercomputers to their limits. Fifty years of molecular biology are packed into Foldit – but users only need to turn their models in a variety of directions on their computer screens.

Protein structures may be conceived as networks, and it is conceivable that users could be tasked with changing protein networks in a way that strips them of the characteristics they show in cancerous cells, thereby inaugurating a breakthrough in cancer therapy. If this concept proves as successful as Foldit, it could result in a 100 times greater output.

There is no shortage of “citizen science” projects: MIT seeks to enable users to “map” the brain through Eyewire. The University of Munich has created Artigo, which sets up a competition between users to provide keywords for cataloging archived works of art. With Geo-wiki, the International Institute for Applied Systems Analysis addressed a notorious deficiency in the automated analysis of aerial photographs for the classification of land in connection with potential use for ethanol production. This in turn inspired the creation of computer games designed to draw a broader user base. Recaptcha, since acquired by Google, is a method based on the reverse application of captcha, the technology used to authenticate human users online by having them identify distorted characters or words. Recaptcha was designed by Luis von Ahn at Carnegie Mellon University to harness, without their conscious participation, the resources of 750 million computer users worldwide to digitize annually 2.5 million books that cannot be machine-read.

Game design needs to be based on reward and recognition of performance. To date, this is typically achieved when different users arrive at identical solutions. Needless to say, this creates a risk of rewarding congruent nonsense, an outcome for which non-trivial solutions have yet to be designed. In spite of such shortcomings, game results can still improve data quality.
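The agreement principle described above can be illustrated with a minimal sketch. It assumes each user submits a single label for a task and that a reward is granted only when a sufficient share of users give the same answer. The function names and the 60 percent threshold are hypothetical, chosen for illustration, and do not describe how any of the platforms mentioned here actually score their users.

```python
from collections import Counter


def consensus_label(labels, min_agreement=0.6):
    """Return (label, share): the most common answer and its share of all answers.

    The label is None when no answer reaches the (assumed) agreement threshold.
    """
    if not labels:
        return None, 0.0
    top, count = Counter(labels).most_common(1)[0]
    share = count / len(labels)
    return (top if share >= min_agreement else None), share


def rewarded_users(answers, min_agreement=0.6):
    """Given a dict of user -> answer, return the users who matched the consensus."""
    label, _ = consensus_label(list(answers.values()), min_agreement)
    if label is None:
        return set()
    return {user for user, answer in answers.items() if answer == label}


# Two of three users agree, so those two are rewarded.
print(rewarded_users({"ann": "spiral", "bob": "spiral", "cho": "elliptical"}))

# The weakness noted in the text: agreement on a meaningless answer is rewarded too.
print(rewarded_users({"ann": "asdf", "bob": "asdf"}))
```

The second call shows why agreement alone cannot tell a correct answer from congruent nonsense: the check only measures whether users coincide, not whether they are right.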

It is easy to imagine that Open Innovation will eventually require a revolutionary change in the protection and reward of the intellectual property thus created. Among the difficulties this presents are the relative anonymity of the web, the small size of individual contributions, and the random, haphazard, or playful nature of at least some, if not most, of the contributions. But similar challenges have already been resolved in the design of the class action system: there, benefits to individual plaintiffs are also typically too small to justify pursuit by traditional methods, and the reward largely accrues to the organizers of the effort. But the social purpose, namely the disgorgement of profits of a mass tortfeasor, may well be compared to the creation of another social good in the form of R&D resulting from, or at least significantly augmented by, a large number of only marginally interested contributors.

Collective reasoning and collaborative creativity may yet ring in an era of division of labor and profit by a mass collective that is organized not along political ideology, but around the opportunities and incentives created by networked technology and pooled human talent.

