Knowledge Graphs as a Semantic Layer for Understanding Robotic Video
Corradini F.; Re B.; Rossi L.; Sampaolo M.
2026-01-01
Abstract
Robotic systems are increasingly deployed in industrial environments, where understanding and analyzing their operations is essential for monitoring and optimizing automated processes. Techniques such as process mining offer powerful tools for analyzing operational workflows, but they require structured representations of activities that are difficult to extract from raw sensory data. Among available data sources, video streams capture the temporal evolution of robotic activities, yet interpreting robotic behavior directly from video remains challenging. In this paper, we propose the use of knowledge graphs as a semantic layer to support the interpretation of robotic video streams. The proposed approach separates visual perception from semantic reasoning through a modular architecture. A perception module extracts structured observations from video frames, while a knowledge graph encodes domain knowledge about the robotic environment, including objects, states, and possible interactions. This semantic layer supports the reasoning process used to interpret robot activities observed in the video. The resulting framework enables the extraction of structured representations of robotic activities from video streams, supporting event-based descriptions of robot behavior that can be used for process analysis.
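The separation between perception and semantic reasoning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the triple vocabulary (`instance_of`, `applies_to`, `from_state`, `to_state`), the object `gripper_1`, the activities `grasp` and `release`, and the function names are all hypothetical and chosen only for illustration. The sketch assumes the perception module has already produced per-frame observations of object states, and it uses a small set of triples as a stand-in for the knowledge graph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One structured observation, as a perception module might emit it."""
    frame: int
    obj: str
    state: str


# Hypothetical knowledge graph as (subject, predicate, object) triples.
KG = {
    ("gripper_1", "instance_of", "Gripper"),
    ("grasp", "applies_to", "Gripper"),
    ("grasp", "from_state", "open"),
    ("grasp", "to_state", "closed"),
    ("release", "applies_to", "Gripper"),
    ("release", "from_state", "closed"),
    ("release", "to_state", "open"),
}


def objects_of(kg, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in kg if s == subject and p == predicate}


def find_activity(kg, obj, before, after):
    """Look up an activity that explains obj changing from `before` to `after`."""
    types = objects_of(kg, obj, "instance_of")
    # Sorted so the result is deterministic if several activities match.
    for s, p, o in sorted(kg):
        if p == "applies_to" and o in types:
            if (before in objects_of(kg, s, "from_state")
                    and after in objects_of(kg, s, "to_state")):
                return s
    return None


def interpret(observations, kg):
    """Turn frame-level observations into a list of activity events."""
    last_state = {}
    events = []
    for obs in sorted(observations, key=lambda o: o.frame):
        previous = last_state.get(obs.obj)
        if previous is not None and previous != obs.state:
            activity = find_activity(kg, obs.obj, previous, obs.state)
            if activity is not None:
                events.append(
                    {"frame": obs.frame, "object": obs.obj, "activity": activity}
                )
        last_state[obs.obj] = obs.state
    return events


observations = [
    Observation(0, "gripper_1", "open"),
    Observation(1, "gripper_1", "open"),
    Observation(2, "gripper_1", "closed"),
    Observation(3, "gripper_1", "open"),
]
events = interpret(observations, KG)
```

In this sketch the perception side only reports states, while all knowledge about which state changes count as activities lives in the graph, so new activities could be added by extending the triples without touching the interpretation loop. The resulting list of events has the frame, object, and activity fields that an event-based description for process analysis would need.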


