Nowadays, Artificial Intelligence systems have expanded their competence field from research to industry and daily life, so understanding how they make decisions is becoming fundamental to reducing the lack of trust between users and machines and increasing the transparency of the model. This paper aims to automate the generation of explanations for model-free Reinforcement Learning algorithms by answering “why” and “why not” questions. To this end, we use Bayesian Networks in combination with the NOTEARS algorithm for automatic structure learning. This approach complements an existing framework very well and demonstrates thus a step towards generating explanations with as little user input as possible. This approach is computationally evaluated in three benchmarks using different Reinforcement Learning methods to highlight that it is independent of the type of model used and the explanations are then rated through a human study. The results obtained are compared to other baseline explanation models to underline the satisfying performance of the framework presented in terms of increasing the understanding, transparency and trust in the action chosen by the agent.

A Bayesian Network Approach to Explainable Reinforcement Learning with Distal Information

Milani R.;De Leone R.;
2023-01-01

Abstract

Nowadays, Artificial Intelligence systems have expanded their competence field from research to industry and daily life, so understanding how they make decisions is becoming fundamental to reducing the lack of trust between users and machines and increasing the transparency of the model. This paper aims to automate the generation of explanations for model-free Reinforcement Learning algorithms by answering “why” and “why not” questions. To this end, we use Bayesian Networks in combination with the NOTEARS algorithm for automatic structure learning. This approach complements an existing framework very well and demonstrates thus a step towards generating explanations with as little user input as possible. This approach is computationally evaluated in three benchmarks using different Reinforcement Learning methods to highlight that it is independent of the type of model used and the explanations are then rated through a human study. The results obtained are compared to other baseline explanation models to underline the satisfying performance of the framework presented in terms of increasing the understanding, transparency and trust in the action chosen by the agent.
2023
Bayesian Network
causal explanation
Explainable Reinforcement Learning
human study
model-free methods
262
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11581/487747
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 0
social impact