We are the Database Group at Tsinghua University, and we are delighted to share our research results on crowdsourcing. On this page you can find our research papers, talks, tutorials, books, source code, systems, and other useful resources. Now, enjoy yourself!
If you have any questions or concerns, please contact Prof. Guoliang Li.



Datasets in Crowdsourcing

Nov 2, 2019 (last update)

This is a collection of crowdsourcing datasets in two parts:

  • Part 1: Datasets with ground truth and workers' answers
  • Part 2: Datasets with only ground truth (no workers' answers)
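
As an illustration of how a Part 1 dataset might be used, the sketch below aggregates workers' answers by simple majority voting and scores the result against the ground truth. The record layout (task id, worker id, label) and the function names are assumptions made for this example; they are not the format of any particular dataset listed here.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Infer one label per task from (task_id, worker_id, label) records.

    Hypothetical record layout. Ties go to the label seen first for that task.
    """
    votes = defaultdict(Counter)
    for task_id, _worker_id, label in answers:
        votes[task_id][label] += 1
    return {task_id: counts.most_common(1)[0][0] for task_id, counts in votes.items()}


def accuracy(inferred, truth):
    """Fraction of ground-truth tasks whose inferred label matches the truth."""
    scored = [task_id for task_id in truth if task_id in inferred]
    if not scored:
        return 0.0
    correct = sum(inferred[task_id] == truth[task_id] for task_id in scored)
    return correct / len(scored)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    answers = [
        ("q1", "w1", "yes"), ("q1", "w2", "yes"), ("q1", "w3", "no"),
        ("q2", "w1", "no"), ("q2", "w2", "no"), ("q2", "w3", "yes"),
    ]
    truth = {"q1": "yes", "q2": "yes"}
    inferred = majority_vote(answers)
    print(inferred, accuracy(inferred, truth))
```

Majority voting is only a baseline; datasets with both workers' answers and ground truth also make it possible to compare more elaborate aggregation methods on the same footing.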



ChinaCrowds: a Crowdsourcing Database Platform

ChinaCrowds aims to address machine-hard queries, e.g., labeling and translation. Requesters can crowdsource their requirements, e.g., needed services, ideas, or content, and get answers from a large group of people, especially an online community, rather than from traditional employees or suppliers. Workers can earn money by answering requesters' questions.


CDB: a Crowd-powered Database System

CDB is a crowd-powered database system that supports crowd-based query optimizations, with a focus on join and selection. CDB differs fundamentally from existing systems in two ways. First, CDB employs a graph-based query model that provides more fine-grained query optimization. Second, CDB adopts a unified framework to perform multi-goal optimization based on the graph model. We have implemented our system and deployed it on Amazon Mechanical Turk, CrowdFlower, and ChinaCrowds.


CrowdOTA: an Online Task Assignment System in Crowdsourcing

We have developed an online task assignment system, CrowdOTA. When a worker requests tasks, CrowdOTA selects k tasks on the fly and assigns them to the worker. CrowdOTA implements multiple online task assignment algorithms, and requesters can select any of them to assign their tasks.
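
To make the request-and-assign loop concrete, here is a minimal sketch of one possible online assignment policy: each requesting worker receives the k tasks that have the fewest answers so far and that the worker has not already been given. This is a simple baseline chosen for illustration, not one of CrowdOTA's actual algorithms, and the class and method names are hypothetical.

```python
from collections import defaultdict


class LeastAnsweredAssigner:
    """Hypothetical baseline policy: assign the k least-answered unseen tasks."""

    def __init__(self, task_ids, k):
        self.k = k
        self.answer_counts = {task_id: 0 for task_id in task_ids}
        self.assigned = defaultdict(set)  # worker_id -> tasks already given out

    def request_tasks(self, worker_id):
        """Called when a worker asks for work; returns at most k task ids."""
        candidates = [
            task_id for task_id in self.answer_counts
            if task_id not in self.assigned[worker_id]
        ]
        # Fewest answers first; task id breaks ties so the order is deterministic.
        candidates.sort(key=lambda task_id: (self.answer_counts[task_id], task_id))
        chosen = candidates[: self.k]
        self.assigned[worker_id].update(chosen)
        return chosen

    def submit_answer(self, worker_id, task_id, answer):
        """Record that a task received one more answer (the answer is not stored here)."""
        self.answer_counts[task_id] += 1


if __name__ == "__main__":
    assigner = LeastAnsweredAssigner(["t1", "t2", "t3"], k=2)
    for task_id in assigner.request_tasks("w1"):
        assigner.submit_answer("w1", task_id, "yes")
    print(assigner.request_tasks("w2"))
```

Because the choice is made at request time, the policy can react to answers collected so far; a more refined algorithm would replace only the sort key, for example with an estimate of how much each task's result would benefit from the requesting worker's answer.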