The Problem of Majority Voting in Crowdsourcing with Binary Classes

Joni Salminen, Ahmed Mohamed Sayed Kamel, Soon-Gyo Jung, Bernard James Jansen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

When there are two classes, a majority vote can always be obtained with three labelers. Researchers can utilize this property to obtain a false sense of confidence in their ground truth labels. We demonstrate such a case with 3000 crowdsourced labels for an online hate dataset. Evaluating with percentage agreement, Gwet’s AC1, and Krippendorff’s alpha, results show that using more raters teases out the hidden nuances in raters’ preferences. We show that full agreement among the raters monotonically decreases from three raters (28.4%) to nine raters (19.5%). Ten raters have a higher agreement than any other number of raters, which supports the idea of increasing the number of raters for subjective labeling tasks. Nevertheless, while beneficial, increasing the number of raters cannot be considered as a fundamental solution to the issue of agreement in subjective crowdsourcing tasks, as even with ten raters, there is a non-negligible number of ties (4.11%). We suggest having a small sample of the data labeled by five or more raters to evaluate the stability of agreement among the raters.
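
The point about three labelers can be illustrated with a minimal sketch. It is not the authors' code: the function names (`majority_vote`, `full_agreement`, `summarize`) and the toy labels are hypothetical, chosen only to show that with two classes an odd number of raters can never tie, while an even number can, and that full agreement can drop as raters are added.

```python
from collections import Counter
from typing import Dict, Hashable, Optional, Sequence


def majority_vote(labels: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the most frequent label, or None if the top two labels tie."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: no majority label exists
    return counts[0][0]


def full_agreement(labels: Sequence[Hashable]) -> bool:
    """True when every rater gave the same label."""
    return len(set(labels)) == 1


def summarize(items: Sequence[Sequence[Hashable]]) -> Dict[str, float]:
    """Share of items with full agreement and with tied votes (items must be non-empty)."""
    n = len(items)
    return {
        "full_agreement": sum(full_agreement(x) for x in items) / n,
        "ties": sum(majority_vote(x) is None for x in items) / n,
    }


# Hypothetical toy data: four items, each labeled by four raters.
items_4_raters = [
    ("hate", "hate", "not", "not"),
    ("hate", "hate", "hate", "hate"),
    ("not", "not", "not", "hate"),
    ("not", "hate", "not", "hate"),
]
# The same items, keeping only the first three raters.
items_3_raters = [item[:3] for item in items_4_raters]

print(summarize(items_3_raters))  # {'full_agreement': 0.5, 'ties': 0.0}
print(summarize(items_4_raters))  # {'full_agreement': 0.25, 'ties': 0.5}
```

In this toy example, three raters always yield a majority label, yet adding a fourth rater halves full agreement and exposes ties that the three-rater vote concealed.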
Original language: English
Title of host publication: Proceedings of 19th European Conference on Computer-Supported Cooperative Work
Publication status: Published - 2021
