Understanding information diversity in the era of repurposable crowdsourced data
Abstract
Organizations successfully leverage information technology to acquire knowledge for decision-making through information crowdsourcing, that is, gathering information from a group of people about a phenomenon of interest to the crowdsourcer. Information crowdsourcing has been used to drive business insight and scientific research, giving crowdsourcers access to information outside their traditional reach. Crowdsourcers seek high-quality data for their information crowdsourcing projects and require contributors who can provide data that meet predetermined requirements. To ensure the quality of the data they collect, crowdsourcers recruit contributors with high levels of relevant knowledge or train contributors. However, when crowdsourced data need to fit more than one usage scenario, either because the project's requirements have changed or because the data are repurposed for tasks other than those for which they were initially collected, the ability of contributors to provide diverse data that can meet multiple requirements is also desirable. In this thesis, I investigate how the domain knowledge a contributor possesses affects the diversity and quality of the data they report. In an experiment in which 84 students, randomly assigned to three knowledge conditions, reported information about artificial stimuli, I found that explicitly trained contributors provided less diverse data than either implicitly trained or untrained contributors. In addition, I examined the longitudinal effect of knowledge on the diversity and quality of data reported by contributors, using review data from Amazon.com and organism sighting data from NLNature.com (a citizen science data crowdsourcing platform). The results show that experience reduced the diversity and usefulness of contributed data.
The study provides insights for crowdsourcers in industry and academia on how to manage and utilize their crowds effectively to collect high-quality reusable data.
