Session: "Talking with machines"

Single or Multiple Conversational Agents?: An Interactional Coherence Comparison

Paper URL: http://dl.acm.org/citation.cfm?doid=3173574.3173765

Abstract: Chatbots focusing on a narrow domain of expertise are in great rise. As several tasks require multiple expertise, a designer may integrate multiple chatbots in the background or include them as interlocutors in a conversation. We investigated both scenarios by means of a Wizard of Oz experiment, in which participants talked to chatbots about visiting a destination. We analyzed the conversation content, users' speech, and reported impressions. We found no significant difference between single- and multi-chatbots scenarios. However, even with equivalent conversation structures, users reported more confusion in multi-chatbots interactions and adopted strategies to organize turn-taking. Our findings indicate that implementing a meta-chatbot may not be necessary, since similar conversation structures occur when interacting to multiple chatbots, but different interactional aspects must be considered for each scenario.

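Summary:

When integrating the functions of several chatbots, the human-to-chatbot relationship can be either one-to-one (single) or one-to-many (multi). This study investigated whether users perceive a difference between the two.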

Exploring the Role of Conversational Cues in Guided Task Support with Virtual Assistants

Paper URL: http://dl.acm.org/citation.cfm?doid=3173574.3173782

Abstract: Voice-based conversational assistants are growing in popularity on ubiquitous mobile and stationary devices. Cortana, as well as Google Home, Amazon Echo, and others, can provide support for various tasks from managing reminders to booking a hotel. However, with few exceptions, user input is limited to explicit queries or commands. In this work, we explore the role of implicit conversational cues in guided task completion scenarios. In a Wizard of Oz study, we found that, for the task of cooking a recipe, nearly one-quarter of all user-assistant exchanges were initiated from implicit conversational cues rather than from plain questions. Given that these implicit cues occur in such high frequency, we conclude by presenting a set of design implications for the design of guided task experiences in contemporary conversational assistants.

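Summary:

The study investigated and categorized implicit conversational cues that arise when using smart speakers and similar devices (for example, saying "OK" rather than "Next step" after finishing a step).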

"Play PRBLMS": Identifying and Correcting Less Accessible Content in Voice Interfaces

Paper URL: http://dl.acm.org/citation.cfm?doid=3173574.3173870

Abstract: Voice interfaces often struggle with specific types of named content. Domain-specific terminology and naming may push the bounds of standard language, especially in domains like music where artistic creativity extends beyond the music itself. Artists may name themselves with symbols (e.g. M S C RA) that most standard automatic speech recognition (ASR) systems cannot transcribe. Voice interfaces also experience difficulty surfacing content whose titles include non-standard spellings, symbols or other ASCII characters in place of English letters, or are written using a non-standard dialect. We present a generalizable method to detect content that current voice interfaces underserve by leveraging differences in engagement across input modalities. Using this detection method, we develop a typology of content types and linguistic practices that can make content hard to surface. Finally, we present a process using crowdsourced annotations to make underserved content more accessible.

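Summary:

The paper proposes a method that allows song titles and artist names (content) that cannot be pronounced or read in English to be transcribed by automatic speech recognition (ASR). By attaching aliases to such content, input given under an alternative name is corrected to the proper name.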
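The alias idea in this summary can be illustrated with a minimal sketch. It is not the authors' implementation: the alias table, the `normalize` helper, and the `resolve_title` function are all hypothetical names, and the single entry mapping the spoken form "problems" to the title "PRBLMS" is assumed from the paper's title for illustration only.

```python
# Hypothetical alias table: spoken/transcribed form -> canonical content name.
# In the paper such aliases come from crowdsourced annotations; here the
# single entry is an assumption made purely for illustration.
ALIASES = {
    "problems": "PRBLMS",
}


def normalize(transcript: str) -> str:
    """Lowercase the ASR transcript and collapse runs of whitespace."""
    return " ".join(transcript.lower().split())


def resolve_title(transcript: str, aliases: dict = ALIASES) -> str:
    """Return the canonical name if the transcript matches a known alias.

    Transcripts with no alias entry are returned unchanged.
    """
    return aliases.get(normalize(transcript), transcript)


if __name__ == "__main__":
    print(resolve_title("Problems"))
    print(resolve_title("Yesterday"))
```

A plain dictionary lookup is only the simplest possible choice here; a real system would likely need fuzzy matching and per-artist disambiguation, which this sketch does not attempt.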

Designing Pronunciation Learning Tools: The Case for Interactivity against Over-Engineering

Paper URL: http://dl.acm.org/citation.cfm?doid=3173574.3173930

Abstract: Paired role-play is a common collaborative activity in language learning classrooms, adding meaning and cultural context to the learning process. This is complemented by teachers' immediate and explicit feedback. Interactive tools that provide explicit feedback during collaborative learning are scarce, however. More commonly, supporting dialogue practice takes the form of computer-aided single-student read-and-record activities. This limitation is partly due to the complexity of processing language learners' speech in unconstrained tasks. In this paper, we assess the value of pronunciation error detection algorithms within a realistic, software-aided, paired role-playing task with beginning learners of French. We found that students' pronunciations improve regardless of the type of error detector employed -- even for those using simple heuristics. We suggest that speech technologies for language learning have been too focused on engineering goals. Instead, new interactive designs supporting collaboration may be used to overcome engineering limitations and properly support students' engagement.

Summary:

In foreign-language learning, role-play practice with immediate feedback is effective, but few tools support it. This study examined the quality of PED (Pronunciation Error Detection).