In my PhD research, I conducted a series of studies on task-oriented interactions, both between humans and between humans and machines. Central to this work was the notion of mutual understanding, known in psycholinguistics as grounding: in particular, how people establish understanding in conversation and which interactional phenomena arise in that process. To address the gap in computational models of understanding, the interactions in my studies were observed through multisensory input and evaluated with statistical and machine-learning models. Miscommunication is ordinary in human conversation, and embodied computer interfaces that interact with humans are therefore subject to a large number of conversational failures. Investigating how such interfaces can evaluate human responses to determine whether spoken utterances have been understood is one of the central contributions of my dissertation.

Some of our work describes studies on how humans establish understanding incrementally and how they co-produce utterances to resolve misunderstandings in joint-construction tasks. Utilising interaction paradigms from human-human settings, we conducted studies of collaborative interaction between humans and machines with two central manipulations: embodiment and conversational failures. These methods investigated whether embodiment affects grounding behaviours among speakers and which verbal and non-verbal channels are used in response and recovery to miscommunication. For applications in robotics and conversational user interfaces, we developed failure-detection systems that predict user uncertainty in real time, paving the way for new multimodal computer interfaces that are aware of dialogue breakdown and system failures.
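To make the idea of predicting user uncertainty from multimodal signals concrete, the following is a minimal sketch, not the dissertation's actual system: it trains a logistic-regression classifier on synthetic feature vectors. The feature names (pause duration, pitch deviation, gaze aversion), the synthetic distributions, and the function `predict_uncertain` are all illustrative assumptions; a real detector would use learned features from recorded interaction data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical multimodal features per user response:
# [pause duration (s), pitch deviation, gaze-aversion ratio].
# Synthetic data only; real systems would extract these from audio/video.
n = 200
X_certain = rng.normal([0.3, 0.5, 0.2], 0.1, size=(n, 3))
X_uncertain = rng.normal([1.0, 1.2, 0.6], 0.1, size=(n, 3))
X = np.vstack([X_certain, X_uncertain])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 1 = user appears uncertain

# Logistic regression fitted by plain gradient descent.
w, b = np.zeros(3), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    w -= 0.5 * (X.T @ (p - y)) / len(y)      # gradient step on weights
    b -= 0.5 * np.mean(p - y)                # gradient step on bias

def predict_uncertain(features):
    """Return True if a response's features suggest uncertainty."""
    return bool(1.0 / (1.0 + np.exp(-(features @ w + b))) > 0.5)

# Example: long pause, high pitch deviation, frequent gaze aversion.
predict_uncertain(np.array([1.1, 1.3, 0.7]))
```

In a deployed system this decision would be made incrementally as the response unfolds, so the interface can initiate repair before the user finishes speaking.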