Crowdsourcing Image Extraction and Annotation: Software Development and Case Study

Authors: Ana Jofre, SUNY Polytechnic
Vincent Berardi, Chapman University
Kathleen P.J. Brennan, University of Queensland
Aisha Cornejo, Chapman University
Carl Bennett, General Motors
John Harlan, Shiprite Software

Reposted from: Digital Humanities Quarterly, 14.2, 2020, http://www.digitalhumanities.org/dhq/vol/14/2/000469/000469.html

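This article describes the development of web-based software that facilitates large-scale, crowdsourced image extraction and annotation within image-heavy corpora of interest to the digital humanities. It then details and evaluates an application of the software through a case study, deployed on Amazon Mechanical Turk, in which faces were extracted from the archives of Time magazine and annotated. The annotation labels included categories such as age, gender, and race, which were subsequently used to train machine learning models. Within this case study, the article details how the crowdsourced data collection and worker quality verification procedures were systematized. It outlines a data verification methodology that uses validation images and requires only two annotations per image to produce high-fidelity data, with results comparable to those obtained from five annotations per image. Finally, the article gives instructions for customizing the software to meet the needs of other studies, with the aim of offering this resource to researchers analyzing objects in other image-heavy archives.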

About the authors:

Ana Jofre 

Dr. Ana Jofre is an Assistant Professor in Creative Arts and Technology at SUNY Polytechnic in Utica, NY. She has a PhD in Physics from the University of Toronto and an MFA in Interdisciplinary Arts Media and Design from OCAD University. Her publications and conference presentations cover a wide range of intellectual interests, from physics to critical theory, and she has exhibited her artwork internationally. Her creative and research interests include figurative sculpture, interactive new media, internet art, human-computer interaction, and data visualization.

Vincent Berardi 

Dr. Vincent Berardi is an Assistant Professor of Computational Psychology at Chapman University (Orange, CA) and is the director of the Computational Analysis of Health Behavior Laboratory (CAHB Lab). His work focuses on identifying trends in intensive longitudinal data, both in digital humanities studies and in health behavior interventions.

Kathleen P.J. Brennan 

Dr. Kathleen P.J. Brennan is a Postdoctoral Research Fellow in the School of Political Science and International Studies (POLSIS) at the University of Queensland (Brisbane, Australia). She completed her PhD in Political Science at the University of Hawai’i at Mānoa in 2016 and her MSc in International Relations Theory at the London School of Economics in 2009. Her work draws on the intersections of political theory, IR, popular culture, and media studies.

Aisha Cornejo

Aisha Cornejo is a recent graduate of Chapman University, with a double major in psychology and philosophy.

Carl Bennett 

Carl Bennett is a recent graduate of SUNY Polytechnic, with a BS in Computer Science, and is currently a software developer at General Motors.

John Harlan 

John Harlan is a recent graduate of SUNY Polytechnic, with a BS in Interactive Media and Game Design. He is a computer programmer, currently at Shiprite Software, specializing in procedural design and user interfaces.