Towards Symbolic XAI – Explanation Through Human Understandable Logical Relationships Between Features
Machine learning (ML) models are increasingly used in science, industry, and for solving everyday problems. However, their predictions — particularly those of neural networks — are generally not traceable by the user. The field of explainable artificial intelligence (XAI) aims to address this issue by providing additional evidence for why a model made a particular prediction. For most existing methodologies, such evidence is typically provided in the form of importance scores for individual input features, which can be too rigid to capture domain-specific queries by the user, such as evidence for the complex interplay between features. Alternatively, surrogate models such as decision trees, trained to mimic the model’s predictions, have been proposed to provide a human-readable but detailed explanation. However, they often lack faithfulness since they do not have access to the original model’s complex mechanisms and feature interactions, only its output.
We propose an explanation framework that attributes relevance to arbitrary logical dependencies between features and can be customized to the user’s needs. We call this framework Symbolic XAI (SymbXAI). The relevance of a logical formula incorporates information from all possible feature interactions contributing to the prediction, thus providing a sample-wise and faithful explanation. For cases where the appropriate logical formula is unknown, the framework automatically searches for the formula that is most relevant or that best summarizes the model’s prediction strategy.
The conceptual basis is to decompose the model’s prediction into distinct contributions of all possible feature combinations, called a multi-order decomposition. It can be computed with perturbation-based methods, in which case the terms correspond to the Harsanyi dividends from cooperative game theory, or with propagation-based methods, for which walk relevance values from GNN-LRP are aggregated. A logical query, composed of conjunctions and negations of input features, specifies which feature sets satisfy the logical expression through the presence or absence of features. Aggregating the multi-order terms of the feature sets selected by the query yields the relevance of the logical expression. For more details on the methodology, we refer to Schnake et al., Section 3.
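The perturbation-based variant of this decomposition can be sketched for a toy set function. The following is a minimal illustration, not the paper’s implementation: the function names `harsanyi_dividends` and `formula_relevance`, and the toy model `f`, are hypothetical. The dividend of a feature set S is computed by inclusion–exclusion over its subsets, and a logical query is represented as a predicate over feature sets; the relevance of the query is the sum of the dividends of all sets that satisfy it.

```python
from itertools import chain, combinations

def powerset(features):
    """All subsets of a feature set, as Python sets."""
    return [set(s) for s in chain.from_iterable(
        combinations(features, r) for r in range(len(features) + 1))]

def harsanyi_dividends(f, features):
    """Multi-order decomposition of f via Harsanyi dividends:
    d(S) = sum over T subset of S of (-1)^(|S|-|T|) * f(T)."""
    subsets = powerset(features)
    return {frozenset(S): sum((-1) ** (len(S) - len(T)) * f(T)
                              for T in subsets if T <= S)
            for S in subsets}

def formula_relevance(dividends, satisfies):
    """Aggregate the multi-order terms of all feature sets
    selected by the logical query `satisfies`."""
    return sum(d for S, d in dividends.items() if satisfies(S))

# Toy model: additive part plus a pairwise interaction between 'a' and 'b'.
f = lambda T: len(T) + (2 if {"a", "b"} <= T else 0)
divs = harsanyi_dividends(f, {"a", "b", "c"})

# Query "a AND b": both features present; captures their joint interaction.
rel_ab = formula_relevance(divs, lambda S: {"a", "b"} <= S)  # -> 2
```

By the defining property of Harsanyi dividends, summing all terms recovers the prediction on the full feature set, which makes the decomposition, and hence any query-based aggregation of it, faithful by construction.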
SymbXAI has broad applicability and is, in principle, not limited to a specific domain or model architecture. We discuss a few exemplary findings in the domains of NLP, computer vision, and chemistry, for which we used Transformer models and graph neural networks. For a more detailed discussion, we refer to Schnake et al., Section 4.
In NLP, especially sentiment analysis, SymbXAI detects grammatical constructions such as negation. It captures contextually dependent word relevance, which classical methods are unable to express. For complex grammatical structures like contrastive conjunctions, SymbXAI shows that models have learned these patterns without being explicitly trained on them.
In computer vision, particularly facial emotion recognition (FER), SymbXAI reveals the relevance of absent facial parts (e.g., masked mouth, sunglasses), which is usually assessed via computationally expensive data augmentation. In object detection, models rely on recurring logical combinations of sub-parts, such as identifying both tires and the frame to detect a mountain bike.
In drug design, SymbXAI aligns with chemical intuition; for example, the presence of a toxicophore implies the presence of a detoxifying substructure for a molecule to be classified as safe. Such logical implications can be extracted automatically.
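The chemistry finding is itself a logical query over a molecule’s set of substructures: the implication “toxicophore present → detoxifier present” is equivalent to “NOT toxicophore OR detoxifier.” A minimal sketch of such a query predicate, with the illustrative (hypothetical) feature names "tox" and "detox":

```python
def safe_pattern(S, tox="tox", detox="detox"):
    """Logical query 'tox -> detox', i.e. NOT tox OR detox,
    evaluated on a molecule's substructure set S."""
    return tox not in S or detox in S
```

A feature set containing the toxicophore alone violates the query, while one containing both substructures (or neither) satisfies it.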
We demonstrate that prior knowledge about expected feature interactions enhances interpretability and trust: human-aligned logical interplays increase the utility of explanations in both NLP and chemistry. Our experiments underline that user modeling is key, and that classical methods lack this capacity for adaptation. SymbXAI enables logical abstractions over features, an essential capability in complex or high-stakes settings such as computer vision or medicinal chemistry.
SymbXAI tailors explanations to the user’s context. It captures grammatical rules in NLP and substructure logic in chemistry. Its flexibility and interpretability allow users to adapt the explanation process to their specific needs.
In summary, the Symbolic XAI framework provides a local understanding of a model’s decision-making process that is both flexible for user customization and human-readable through logical formulas.
Presentation “Towards Symbolic XAI — Explanation through Human Understandable Logical Relationships Between Features”, held at the 3rd TRR 318 Conference: Contextualizing Explanations, 17 June 2025, Bielefeld, Germany.