Paper abstract: Chatbots focusing on a narrow domain of expertise are on the rise. As many tasks require multiple areas of expertise, a designer may integrate multiple chatbots in the background or include them as interlocutors in a conversation. We investigated both scenarios by means of a Wizard of Oz experiment, in which participants talked to chatbots about visiting a destination. We analyzed the conversation content, users' speech, and reported impressions. We found no significant difference between single- and multi-chatbot scenarios. However, even with equivalent conversation structures, users reported more confusion in multi-chatbot interactions and adopted strategies to organize turn-taking. Our findings indicate that implementing a meta-chatbot may not be necessary, since similar conversation structures occur when interacting with multiple chatbots, but different interactional aspects must be considered for each scenario.
Paper abstract: Voice-based conversational assistants are growing in popularity on ubiquitous mobile and stationary devices. Cortana, as well as Google Home, Amazon Echo, and others, can provide support for various tasks, from managing reminders to booking a hotel. However, with few exceptions, user input is limited to explicit queries or commands. In this work, we explore the role of implicit conversational cues in guided task completion scenarios. In a Wizard of Oz study, we found that, for the task of cooking a recipe, nearly one-quarter of all user-assistant exchanges were initiated from implicit conversational cues rather than from plain questions. Given that these implicit cues occur with such high frequency, we conclude by presenting a set of design implications for guided task experiences in contemporary conversational assistants.
Paper abstract: Voice interfaces often struggle with specific types of named content. Domain-specific terminology and naming may push the bounds of standard language, especially in domains like music, where artistic creativity extends beyond the music itself. Artists may name themselves with symbols (e.g. M S C RA) that most standard automatic speech recognition (ASR) systems cannot transcribe. Voice interfaces also have difficulty surfacing content whose titles include non-standard spellings, symbols or other ASCII characters in place of English letters, or are written in a non-standard dialect. We present a generalizable method to detect content that current voice interfaces underserve by leveraging differences in engagement across input modalities. Using this detection method, we develop a typology of content types and linguistic practices that can make content hard to surface. Finally, we present a process that uses crowdsourced annotations to make underserved content more accessible.
Paper abstract: Paired role-play is a common collaborative activity in language learning classrooms, adding meaning and cultural context to the learning process. This is complemented by teachers' immediate and explicit feedback. Interactive tools that provide explicit feedback during collaborative learning are scarce, however. More commonly, support for dialogue practice takes the form of computer-aided, single-student read-and-record activities. This limitation is partly due to the complexity of processing language learners' speech in unconstrained tasks. In this paper, we assess the value of pronunciation error detection algorithms within a realistic, software-aided, paired role-playing task with beginning learners of French. We found that students' pronunciation improves regardless of the type of error detector employed, even for those using simple heuristics. We suggest that speech technologies for language learning have been too focused on engineering goals. Instead, new interactive designs supporting collaboration may be used to overcome engineering limitations and properly support students' engagement.
In foreign language learning, role-play-based practice with rapid feedback is effective, but such tools are scarce. This work investigated the quality of PED (Pronunciation Error Detection).