Leveraging Complementary Contributions of Different Workers for Efficient Crowdsourcing of Video Captions

Paper URL: http://dl.acm.org/citation.cfm?doid=3025453.3026032

Paper abstract: Hearing-impaired people and non-native speakers rely on captions for access to video content, yet most videos remain uncaptioned or have machine-generated captions with high error rates. In this paper, we present the design, implementation and evaluation of BandCaption, a system that combines automatic speech recognition with input from crowd workers to provide a cost-efficient captioning solution for accessible online videos. We consider four stakeholder groups as our source of crowd workers: (i) individuals with hearing impairments, (ii) second-language speakers with low proficiency, (iii) second-language speakers with high proficiency, and (iv) native speakers. Each group has different abilities and incentives, which our workflow leverages. Our findings show that BandCaption enables crowd workers who have different needs and strengths to accomplish micro-tasks and make complementary contributions. Based on our results, we outline opportunities for future research and provide design suggestions to deliver cost-efficient captioning solutions.

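Summary (translated from Japanese):

Four groups of workers, ranging from native speakers to people with hearing impairments, create captions for YouTube videos. The study found that workers with hearing impairments can contribute to punctuation, second-language speakers to grammar, and native speakers to expressions such as jokes and proverbs.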

(106 characters in Japanese)

Presentation slides: