Bahavathy Kathirgamanathan, Gennady Andrienko, Natalia Andrienko
Contextualizing Explanations

Development of a Human Knowledge Integrated Workflow for Context-aware Machine Learning

Introduction 

Visual analytics (VA) has the potential to aid in the acquisition of expert knowledge, both existing domain knowledge and new insights. However, there is a notable lack of VA systems specifically designed to externalize expert knowledge for machine learning (ML) integration. Prior research has explored the role of VA in conceptual model development, even if not explicitly for ML applications. Our proposed methodology builds on this approach, leveraging VA to extract and formalize expert knowledge, which is then incorporated into the model-building process. This integration not only improves the interpretability and reliability of the resulting models but also enhances explanation generation, ultimately leading to more robust and transparent ML systems. Human-driven model building fosters a collaborative approach in which domain knowledge and machine learning techniques complement each other, thereby achieving solutions that neither could achieve independently. Our approach ensures that domain knowledge is embedded directly into the learning process, resulting in models and explanations that better align with expert reasoning.

Knowledge-injected Workflow 

In this workflow, user expertise is integrated into the decision-support process, with visual analytics serving as the medium for eliciting and incorporating relevant domain knowledge into the model. This human-centered modeling approach ensures that the model aligns more closely with real-world expectations. Key steps of this proposed workflow are:

  • Data abstraction and definition: Converting raw data into structured representations, such as aggregating time-series data into semantically meaningful episodes (e.g. partitioning a time series into intervals).

  • Contextualization: Incorporating temporal, spatial, or relational context into the model to improve predictive accuracy and interpretability (e.g. preceding values or trends). 

  • Synoptic feature generation: Summarizing complex data dynamics into interpretable features (e.g. statistical summaries or trend indicators).

  • Iterative expert-guided refinement: Using visual analytics and domain insights to iteratively improve the model, ensuring that its behavior aligns with real-world expectations (e.g. correcting issues in the raw data or introducing new features).
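The first three steps above can be sketched in code. The following is a minimal illustration, not the workflow's actual implementation: the level thresholds, the episode representation, and the helper names (`to_episodes`, `synoptic_features`) are all hypothetical assumptions introduced here to show how a raw time series might be abstracted into level-based episodes, summarized into synoptic features, and contextualized with the preceding episode's level.

```python
from statistics import mean

def to_episodes(series, thresholds):
    """Data abstraction: map each raw value to one of len(thresholds)+1
    ordered levels, then merge consecutive equal levels into episodes.
    (Hypothetical helper; the level boundaries are an assumption.)"""
    def level(v):
        return sum(v > t for t in thresholds)
    episodes = []
    for v in series:
        lv = level(v)
        if episodes and episodes[-1]["level"] == lv:
            episodes[-1]["values"].append(v)
        else:
            episodes.append({"level": lv, "values": [v]})
    return episodes

def synoptic_features(episode, preceding):
    """Synoptic feature generation plus contextualization:
    summarize an episode and attach the preceding episode's
    level as temporal context."""
    vals = episode["values"]
    return {
        "level": episode["level"],
        "duration": len(vals),            # episode length
        "mean": mean(vals),               # statistical summary
        "trend": vals[-1] - vals[0],      # simple trend indicator
        "prev_level": preceding["level"] if preceding else None,
    }

# Toy daily-count series abstracted into four levels
series = [2, 3, 3, 9, 11, 10, 4, 2]
episodes = to_episodes(series, thresholds=[5, 8, 12])
rows = [synoptic_features(e, episodes[i - 1] if i else None)
        for i, e in enumerate(episodes)]
```

Each `rows` entry is one interval-level training example; the `prev_level` field is the kind of contextual feature the workflow injects alongside the purely descriptive summaries.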

The established framework was applied to two distinct datasets. The first case study examines mobility patterns in Spain during the COVID-19 pandemic. In this dataset, both temporal and spatial aspects were transformed into categorical data. Daily trip counts within and between provinces were encoded as sequences of consecutive events, categorizing mobility into four levels. Similarly, the time series of disease counts was converted into consecutive events, with COVID-19 cases classified into four severity levels. The second dataset focuses on predicting movement patterns of 71 fishing vessels operating northwest of France. The dataset contains vessel trajectories recorded as time series of the vessels' geographic positions. This data was also transformed into meaningful episodes using the expert-guided workflow, from which interval-based features that effectively summarize vessel behaviour were obtained. Random Forest models trained on both feature sets yielded improved performance, demonstrating the suitability of this approach for modelling.
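The modelling step can be sketched as follows. This is an illustrative outline only, assuming scikit-learn and entirely synthetic data: the feature names (duration, mean speed, trend) and the binary behaviour label are hypothetical stand-ins for the study's actual interval-based features, not the real vessel or mobility data.

```python
import random
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

random.seed(0)

# Hypothetical episode-level feature vectors:
# (episode duration, mean speed, speed trend)
X = [[random.randint(1, 20), random.uniform(0.0, 12.0), random.uniform(-3.0, 3.0)]
     for _ in range(200)]

# Illustrative binary label, e.g. "fishing" vs "transit" behaviour,
# defined here by a made-up rule so the example is self-contained
y = [1 if x[1] < 4.0 and x[0] > 5 else 0 for x in X]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
```

Because the features are human-interpretable episode summaries, `clf.feature_importances_` can be read directly in domain terms, which is what makes the subsequent explanation step more effective.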

By structuring data and models in a way that aligns with human logic, explanation techniques become more effective. The presence of human-interpretable features allows explanations to be presented with richer context, making them more insightful and actionable for users. 

Conclusion 

By integrating human knowledge into the ML workflow, our approach enhances the transparency and trustworthiness of AI systems. The structured embedding of domain expertise results in models that are not only technically robust but also capable of providing explanations that users find meaningful and actionable. This dual focus on informed modelling and context-aware explanation generation represents a significant step towards human-centered AI. 

Presentation "Development of a Human Knowledge Integrated Workflow for Context-aware Machine Learning" held at the 3rd TRR 318 Conference: Contextualizing Explanations on 17 June 2025 in Bielefeld, Germany
