Belief Updating in Task-Oriented Spoken Dialog Systems

Dr. Dan Bohus

Over the last decade, advances in natural language processing technologies have paved the way for the emergence of complex, task-oriented spoken dialog systems. One of the main problems in the development of these systems is their lack of robustness when faced with understanding errors. The majority of these errors stem from the speech recognition process. The inherent difficulties of automatic speech recognition are compounded by the conditions under which these systems typically operate: spontaneous speech, increasingly large vocabularies, and large variations in user populations and in the quality of the input lines. Unless they are mitigated by better error awareness and robust recovery mechanisms, speech recognition errors have a serious negative impact on the overall quality and success of the interactions.

To increase robustness, spoken dialog systems must accurately monitor the reliability of the information they acquire. In general, spoken dialog systems use confidence scores generated by the speech recognizer to guard against potential misunderstandings. The confidence score for each utterance is used to form an initial assessment of how reliable the information contained in that utterance is. However, an ideal system should continue to update and improve the accuracy of its beliefs throughout the conversation by using information available in subsequent user turns.

In this talk, I will describe a scalable machine-learning approach for this belief-updating problem. The proposed approach uses a compressed belief representation and casts the belief-updating problem as a multinomial regression task. Experimental results show that this approach significantly outperforms the heuristic rules typically used for this task in current systems. To evaluate the impact of these models on global dialog performance, we performed a user study with a mixed-initiative spoken dialog system. The belief-updating models produced significant gains in both task success and the efficiency of the interactions across a wide range of recognition error rates.
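
To make the idea of casting belief updating as multinomial regression more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the models presented in the talk: the three outcome classes, the feature set, the function name `update_belief`, and all weight values are hypothetical and hand-set, whereas a real model would estimate its weights from labeled dialog data.

```python
import math

# Assumed compressed belief: after the user's next turn, the correct value is
# either the system's current top hypothesis, the newly heard hypothesis, or
# something else. These three classes are an assumption made for this sketch.
CLASSES = ("initial_top", "new_hypothesis", "other")

# Hypothetical, hand-set weights for illustration only (not learned values).
# Feature order: [bias, initial_confidence, user_said_yes, user_said_no, new_confidence]
WEIGHTS = {
    "initial_top":    [0.0, 2.0, 3.0, -3.0, -1.0],
    "new_hypothesis": [-1.0, -1.0, -2.0, 1.0, 3.0],
    "other":          [0.0, 0.0, 0.0, 0.0, 0.0],  # reference class
}


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def update_belief(initial_confidence, user_said_yes, user_said_no, new_confidence):
    """Return the updated belief as a probability for each outcome class."""
    features = [
        1.0,                     # bias term
        initial_confidence,      # recognizer confidence in the current top hypothesis
        float(user_said_yes),    # user's next turn contained a confirmation
        float(user_said_no),     # user's next turn contained a rejection
        new_confidence,          # recognizer confidence in a newly heard value
    ]
    scores = [sum(w * x for w, x in zip(WEIGHTS[c], features)) for c in CLASSES]
    return dict(zip(CLASSES, softmax(scores)))


# The user confirms the system's hypothesis.
confirmed = update_belief(0.6, True, False, 0.0)
# The user rejects it and supplies a different value.
corrected = update_belief(0.6, False, True, 0.8)
```

With these illustrative weights, a confirmation shifts most of the probability mass to the current top hypothesis, while a rejection accompanied by a confidently recognized alternative shifts it to the new hypothesis.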

Dan Bohus is a Ph.D. student working under the supervision of Dr. Alex Rudnicky and Prof. Roni Rosenfeld in the Computer Science Department at Carnegie Mellon University. He received a B.S. in Computer Science from "Politechnica" University of Timisoara, Romania.