Contextualizing Explanations in Fluid Collaboration
Spontaneous collaboration in everyday tasks, such as cooking together, requires quick adaptation to partners and situations. The resulting collaboration is fluid in that a dynamic and spontaneous mode of interaction emerges in which participants continuously adapt their roles and responsibilities. Unlike in structured environments such as surgical operations, where roles are predefined, fluid collaboration (FC) thrives on a combination of implicit coordination through actions and explicit communication. In this paper, we explore the use of explanations within FC and how they are contextualized with respect to, e.g., the partners' collaborative actions, mutual understanding, and coordination.
In collaboration, explanations aid the coordination of actions by creating transparency and improving team performance. They help partners understand each other's actions and intentions, facilitating smoother interactions and building trust in the partner. More importantly, explicit explanations can serve as a repair mechanism when implicit coordination fails. We have developed the Cooperative Cuisine (CoCu) environment to study fluid collaboration in human-agent teams. In CoCu, teams need to prepare meals for fast-paced, incoming orders, similar to the game Overcooked!. In contrast to similar environments, CoCu focuses on engaging human participants in the collaboration process through continuous movement and controllable, more complex tasks. We conducted a preliminary study on human-human teams in FC: 30 participants collaborated in dyads across four levels with different layouts, playing each level for five minutes. Both the collaboration and the communication were recorded.
Explanations can be categorized based on the explainer (the one explaining), the explainee (the one receiving the explanation), and the explanandum (the subject being explained). In our domain, any team member can be the explainer, and the other(s) are the explainee(s). The explanation takes place during the collaboration: the explanandum is the behavior of the explainer (cf. explainable robots). Because the collaboration is ongoing, the explanation often affects the ongoing interaction. More precisely, it affects the understanding of team partners' actions, their underlying action model, and potentially the current task and resource responsibilities (relevant for effective FC). Explanations can be provided in an interaction based on a specific request by the explainee or proactively by the explainer to explain their behavior, for example, to prevent misunderstandings or to facilitate coordination. In our preliminary study, we observed both behaviors in human-human collaboration. Thus, if artificial agents participate in this interaction, they must understand and respond to explanation requests. Furthermore, they must recognize situations in which explaining their own behavior supports effective FC. Gao et al. showed that human-like explanations improve collaboration performance in a fully separated scenario (cf. fourth scenario in Fig. 1) in an Overcooked!-like environment.
Fig. 1. The four scenarios played by the dyads in the study.
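To make this categorization concrete, the following minimal Python sketch models a single explanation episode; all names and values are illustrative assumptions, not part of CoCu's actual API or study data.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Trigger(Enum):
    """How an explanation entered the interaction."""
    REACTIVE = auto()   # given in response to an explicit request by the explainee
    PROACTIVE = auto()  # offered by the explainer, e.g., to prevent misunderstanding


@dataclass
class ExplanationEvent:
    """One explanation episode during fluid collaboration."""
    explainer: str         # team member giving the explanation
    explainees: list[str]  # team member(s) receiving it
    explanandum: str       # the explainer's behavior being explained
    trigger: Trigger       # reactive (requested) or proactive (self-initiated)
    utterance: str         # the verbalized explanation itself


# Illustrative instance (values invented, not taken from the study data):
event = ExplanationEvent(
    explainer="P1",
    explainees=["P2"],
    explanandum="switched from chopping to serving",
    trigger=Trigger.PROACTIVE,
    utterance="I'll serve this first, the order is about to expire.",
)
```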
In FC, the context of an explanation comprises four major parts, which can alter the explanation even if the explanandum remains unchanged. In the following, we refer to examples of observed explanations reported in the Appendix. First, knowledge about the tasks and their structure, including necessary subtasks (e.g., recipes in CoCu), is relevant for justifying one's actions or directing attention to solutions (D03 (6)). Second, the current state of the environment can contain elements relevant to the explained action as well as other resources and events that shape an explanation; for example, combining task knowledge with the current state can produce counterfactual explanations (D10 (2)). Third, aligning explanations with the explainee's expectations depends on understanding the shared beliefs and task assignments between partners. Fourth, an agent should adapt the explanation to its knowledge of the team's recent actions, environmental events, and past interactions (D05).
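As a sketch, the four context parts could be bundled as follows (Python; the field names and the example rule are hypothetical illustrations, not CoCu data types):

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ExplanationContext:
    """The four context parts that can alter an explanation in FC."""
    task_knowledge: dict[str, Any]     # task/subtask structure, e.g., recipes and their steps
    environment_state: dict[str, Any]  # current objects, resources, and events in the scene
    shared_beliefs: dict[str, Any]     # assumed mutual beliefs and task assignments
    interaction_history: list[str]     # recent team actions, events, and past utterances


def contextualize(explanandum: str, ctx: ExplanationContext) -> str:
    """Illustrative only: the same explanandum yields a different explanation
    depending on context, e.g., a counterfactual that combines task knowledge
    with the current state (cf. D10 (2))."""
    missing = ctx.environment_state.get("missing_ingredients", [])
    if missing:
        return (f"I {explanandum} because without {missing[0]} "
                f"the current recipe cannot be completed.")
    return f"I {explanandum}."
```

Depending on which context parts apply, the same explanandum thus surfaces either as a plain statement or as a counterfactual justification.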
To enable an agent to keep track of these parts of the explanation's context, the following architectural components can be leveraged: an episodic memory to query past event knowledge; a Theory of Mind component that tracks the partners' beliefs about the collaboration state and can provide a surprise measure for the agent's own actions (necessary for proactive explanations); collaboration planning components that select actions adapted to the assumed collaboration state (i.e., the responsibilities of team partners); and natural language understanding components to recognize explanation requests and to understand complex requests, e.g., via reference resolution to past actions and resources.
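The following Python sketch outlines these components as interfaces; the method names and signatures are our assumptions for illustration, not an existing implementation:

```python
from typing import Protocol


class EpisodicMemory(Protocol):
    def query(self, description: str) -> list[str]:
        """Retrieve past events matching a description (e.g., 'last onion chopped')."""


class TheoryOfMind(Protocol):
    def partner_belief(self, partner: str, topic: str) -> str:
        """Estimate of a partner's current belief about the collaboration state."""

    def surprise(self, partner: str, action: str) -> float:
        """How unexpected the agent's action is to the partner; a high value
        can trigger a proactive explanation."""


class CollaborationPlanner(Protocol):
    def next_action(self, responsibilities: dict[str, str]) -> str:
        """Select an action adapted to the assumed division of responsibilities."""


class NaturalLanguageUnderstanding(Protocol):
    def is_explanation_request(self, utterance: str) -> bool:
        """Detect whether an utterance asks the agent to explain its behavior."""

    def resolve_references(self, utterance: str) -> dict[str, str]:
        """Map referring expressions to past actions and resources."""
```

In such a design, a proactive explanation could be triggered whenever the surprise measure for the agent's next action exceeds a threshold, while reactive explanations start from a detected explanation request.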
Enabling agents to handle reactive, proactive, and complex explanations requires implementing these components, since explanation and general collaboration are interconnected capabilities. FC and CoCu offer a valuable testbed for studying these links in fast, highly situated (hence, contextualized) explanations that support the collaboration process in dynamic environments. Achieving fully explainable agents requires integrating collaboration components with the explanation process, all within the constraints of rapid human-agent interaction.
Appendix
Table 1. Example discourses with explicit explanation requests from the interactions in our preliminary FC study in CoCu, translated to English.
Table 2. Original German example discourses with explicit explanation requests from the interactions in our preliminary FC study in CoCu.
Presentation "Contextualizing Explanations in Fluid Collaboration" held at the 3rd TRR 318 Conference: Contextualizing Explanations, 17 June 2025, Bielefeld, Germany.