Managing Variability of Large Public Administration Event Log Collections: Dealing with Concept Drift
Corradini F.;Luciani C.;Morichetta A.;Piangerelli M.
2023-01-01
Abstract
The analysis of large event log collections aimed at variability management requires an intensive pre-processing phase. Obsolete behaviour that may be present in the logs must be removed in order to gain insight into the collection. Changes in the information system may generate obsolete behaviour. In public administration in particular, a change in the law may imply a change in the process, which must then be updated in the information system. The logs containing the updated behaviour can then be used in variability management practices, such as the creation of configurable models. This type of analysis raises numerous critical issues, one of which is the difficulty of obtaining an effective representation of the process without making the resulting model excessively complex. Obsolete behaviour unnecessarily increases complexity and should therefore be removed. This paper introduces an event log analysis and visualisation technique based on the notion of complexity introduced by Lempel and Ziv. The visualisation enables process analysts to identify concept drift in the logs, thereby facilitating the removal of outdated behaviour. Furthermore, reaching equilibrium indicates that the behaviour is representative of the entire log. Consequently, during variability analysis, it becomes possible to prune the log, reducing computational complexity.
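The abstract does not specify how the complexity measure is computed or plotted, so the following is only a minimal sketch of one plausible reading: the Lempel-Ziv (1976) phrase count of a growing prefix of an activity sequence. The function names `lz76_complexity` and `complexity_curve`, the `step` parameter, and the idea of treating the log as a flat sequence of activity labels are all assumptions made for illustration, not the authors' method.

```python
def _occurs(pattern, history):
    """Return True if `pattern` appears as a contiguous run inside `history`."""
    m = len(pattern)
    return any(history[j:j + m] == pattern
               for j in range(len(history) - m + 1))


def lz76_complexity(sequence):
    """Count the phrases in the Lempel-Ziv (1976) exhaustive parsing.

    Each phrase is the shortest run starting at the current position that
    cannot be copied from the text preceding its own last symbol.
    Naive implementation, intended for illustration only.
    """
    s = list(sequence)
    n = len(s)
    i = 0       # start of the current phrase
    count = 0   # number of phrases found so far
    while i < n:
        length = 1
        # Extend the phrase while it can still be reproduced from earlier text.
        while i + length <= n and _occurs(s[i:i + length], s[:i + length - 1]):
            length += 1
        count += 1
        i += length
    return count


def complexity_curve(events, step=1):
    """Hypothetical helper: complexity of growing prefixes of the event sequence.

    Returns (prefix_length, complexity) pairs that could be plotted. A curve
    that stops growing is one possible interpretation of 'equilibrium'.
    """
    events = list(events)
    return [(n, lz76_complexity(events[:n]))
            for n in range(step, len(events) + 1, step)]
```

Under this reading, a curve that flattens would suggest that later events add no new patterns, while a renewed rise could hint at a change in behaviour. Whether this matches the paper's visualisation or its equilibrium criterion cannot be determined from the abstract alone.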