Building and Evaluating Conversational Agents

João Sedoc, University of Pennsylvania

Abstract

With the advent of personal assistants such as Siri and Alexa, there has been a renewed focus on dialog systems, including open-domain conversational agents, which ideally can converse about any topic. Although many conversational "chit-chat" agents are not task-driven, they often have a goal such as engagement, entertainment, trust-building, or seeking information about the conversational partner. Recent end-to-end dialog generation methods use recurrent neural network encoder-decoder models (Vinyals and Le, 2015). However, these neural dialog generation (NDG) models lack the ability to maintain topical and stylistic coherence in conversations. Furthermore, evaluating these methods has proven difficult because of a lack of automatic metrics for assessing conversational agents. This dissertation attempts to tackle two primary areas of research: (1) developing novel deep learning architectures to build chatbots with topical and stylistic coherence, and (2) producing methods for the systematic evaluation and comparison of novel chatbot models. The need to maintain topical and stylistic coherence motivates our theoretical research into novel deep learning methods. Specifically, we address topical coherence with a novel hierarchical ensemble NDG model that explicitly models topical sequencing. Next, we address two different aspects of stylistic coherence, starting with maintaining branded style and then focusing on demographic matching. To this end, we create an NDG agent that mimics Star Trek style, and Youngbot, which evinces youthful style in both Twitter and chat. The second half of the thesis focuses on evaluation, presenting a theoretical framework for the systematic evaluation of open-domain conversational agents, including the use of Item Response Theory (Lord and Novick, 1968) for efficient chatbot evaluation and evaluation set creation. We also present ChatEval, a software tool for chatbot evaluation.
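To make the Item Response Theory idea concrete, the following is a minimal sketch rather than the dissertation's method. It assumes a two-parameter logistic (2PL) model, which the abstract does not specify, and treats each evaluation prompt as an "item" with a discrimination `a` and a difficulty `b`, and each chatbot as having a latent ability `theta`. The function names, the prompt names, and the parameter values are all hypothetical.

```python
import math


def p_success(theta, a, b):
    """2PL item response function (assumed model): probability that a
    chatbot with latent ability `theta` succeeds on a prompt with
    discrimination `a` and difficulty `b`."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability `theta`.
    Prompts with higher information separate chatbots near `theta` better."""
    p = p_success(theta, a, b)
    return a * a * p * (1.0 - p)


def most_informative(items, theta, k):
    """Return the names of the k prompts with the highest information at `theta`."""
    ranked = sorted(
        items,
        key=lambda name: item_information(theta, *items[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical prompts, given as (discrimination a, difficulty b).
prompts = {
    "prompt_easy": (1.0, -2.0),
    "prompt_medium": (1.5, 0.0),
    "prompt_hard": (1.0, 2.0),
}

if __name__ == "__main__":
    # Choose the single prompt that best discriminates among average-ability chatbots.
    print(most_informative(prompts, theta=0.0, k=1))
```

Under these assumptions, a smaller evaluation set could be built by keeping only the prompts that are most informative in the ability range of the chatbots being compared, which is one way to read "efficient chatbot evaluation and evaluation set creation."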

Subject Area

Computer science

Recommended Citation

Sedoc, João, "Building and Evaluating Conversational Agents" (2019). Dissertations available from ProQuest. AAI22615466.
https://repository.upenn.edu/dissertations/AAI22615466
