ArXiv Preprint
We present BotSIM, a data-efficient end-to-end Bot SIMulation toolkit for
commercial text-based task-oriented dialog (TOD) systems. BotSIM consists of
three major components: 1) a Generator that can infer semantic-level dialog
acts and entities from bot definitions and generate user queries via
model-based paraphrasing; 2) an agenda-based dialog user Simulator (ABUS) that
simulates conversations with the dialog agents; 3) a Remediator that analyzes the
simulated conversations, visualizes bot health reports, and provides
actionable remediation suggestions for bot troubleshooting and improvement. We
demonstrate BotSIM's effectiveness in end-to-end evaluation, remediation and
multi-intent dialog generation via case studies on two commercial bot
platforms. BotSIM's "generation-simulation-remediation" paradigm accelerates
the end-to-end bot evaluation and iteration process by: 1) reducing the effort
of creating test cases manually; 2) enabling a holistic assessment of the bot's NLU
and end-to-end performance via extensive dialog simulation; 3) improving the
bot troubleshooting process with actionable suggestions. A demo of our system
can be found at https://tinyurl.com/mryu74cd and a demo video at
https://youtu.be/qLi5iSoly30.
Guangsen Wang, Samson Tan, Shafiq Joty, Gang Wu, Jimmy Au, Steven Hoi
2022-11-22