Paper abstract: Performing as a musical ensemble that combines human musicians and computers is a challenging task. We achieve this with concert-quality synchronization using machine learning. Our system recognizes the position in a given song from the human performance using microphone and camera inputs, and responds in real time with audio and visual feedback as an ensemble partner. We address three crucial requirements of a musical ensemble system. First, our system interacts with human players through both audio and visual cues, the conventional modes of coordination among musicians. Second, our system synchronizes with human performances while retaining its intended musical expression. Third, our system prevents failures during a concert caused by bad tracking, by displaying an internal confidence measure and allowing a backstage human operator to "intervene" when the system's confidence is low. We show the feasibility of the system with several experiments, including a professional concert.
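
The third requirement, a confidence measure combined with operator intervention, can be illustrated with a minimal sketch. The abstract does not say how the confidence is computed or how the operator's input is applied, so everything here is an assumption made for illustration: the names `TrackerOutput` and `select_position`, the confidence range of 0 to 1, the threshold of 0.5, and the rule that the operator's position is used only when confidence falls below the threshold.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TrackerOutput:
    """Hypothetical output of the score-position tracker."""
    position_beats: float  # estimated position in the song (assumed unit: beats)
    confidence: float      # assumed to lie in [0, 1]; higher means more certain


def select_position(tracker: TrackerOutput,
                    operator_position: Optional[float],
                    threshold: float = 0.5) -> Tuple[float, str]:
    """Pick the position to play from and report which source supplied it.

    The threshold and the fallback rule are illustrative assumptions,
    not the method described in the paper.
    """
    if tracker.confidence >= threshold:
        # The tracker is confident enough: follow it.
        return tracker.position_beats, "tracker"
    if operator_position is not None:
        # Low confidence and the operator has intervened: follow the operator.
        return operator_position, "operator"
    # Low confidence and no intervention: keep the estimate but flag it.
    return tracker.position_beats, "tracker (low confidence)"


if __name__ == "__main__":
    print(select_position(TrackerOutput(12.0, 0.9), operator_position=11.5))
    print(select_position(TrackerOutput(12.0, 0.2), operator_position=11.5))
    print(select_position(TrackerOutput(12.0, 0.2), operator_position=None))
```

Returning the source label alongside the position reflects the abstract's point that the confidence is displayed: a backstage view could show when the system is following the tracker and when it is relying on the operator.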