GSK-C2F: Graph Skeleton Modelization for Action Segmentation and Recognition using a Coarse-to-Fine strategy
Conference: Poster presentation at an international or national conference
Locating the temporal boundaries of actions as they are performed, especially in an Industry 5.0 context, is challenging for several reasons: the complexity of the industrial environment, the similarity between actions of different classes, the large variation in how actions of the same class are executed by operators with different levels of expertise, and the under- or over-representation of particular actions. To address these challenges, this paper proposes a novel approach, Graph Skeleton Modelization for Action Segmentation and Recognition using a Coarse-to-Fine strategy (GSK-C2F). Unlike previous work on sequence segmentation, it explores the benefits of modeling spatiotemporal actions using several modalities extracted from skeleton data.
The method integrates a backbone that models the spatiotemporal dependencies of several skeleton representations, including joint positions, bone angles, and joint velocities. It then employs an encoder-decoder architecture that combines coarse and fine decoder outputs at different temporal resolutions to recognize actions and localize the start and end frames of action sequences. The approach was validated on both the IKEA-ASM and InHARD datasets: it achieved first place on InHARD and results comparable to state-of-the-art methods on IKEA-ASM, demonstrating its robustness and good generalizability.
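The pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It only shows one plausible way to derive the three skeleton modalities, merge frame-wise predictions from decoders at different temporal resolutions, and read off start and end frames. The function names, the bone list format, the definition of bone angles as angles to the coordinate axes, nearest-neighbor upsampling, and fusion by averaging probabilities are all assumptions made for illustration.

```python
import numpy as np


def skeleton_modalities(joints, bones):
    """Derive three modalities from joints of shape (T, J, 3).

    `bones` is a hypothetical list of (parent, child) joint-index pairs.
    """
    # Joint velocities: frame-to-frame displacement, zero for the first frame.
    velocities = np.zeros_like(joints)
    velocities[1:] = joints[1:] - joints[:-1]

    # Bone vectors point from the parent joint to the child joint.
    parents = np.array([p for p, _ in bones])
    children = np.array([c for _, c in bones])
    bone_vecs = joints[:, children] - joints[:, parents]  # (T, B, 3)

    # Assumed angle definition: angle between each bone and the x, y, z axes.
    lengths = np.linalg.norm(bone_vecs, axis=-1, keepdims=True)
    cosines = bone_vecs / np.maximum(lengths, 1e-8)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return joints, angles, velocities


def fuse_coarse_to_fine(logits_per_scale, num_frames):
    """Average class probabilities from outputs of shape (T_s, C)."""
    num_classes = logits_per_scale[0].shape[1]
    fused = np.zeros((num_frames, num_classes))
    for logits in logits_per_scale:
        # Nearest-neighbor upsampling of a coarse output to full length.
        idx = np.arange(num_frames) * logits.shape[0] // num_frames
        z = logits[idx]
        # Numerically stable softmax over the class axis.
        z = z - z.max(axis=1, keepdims=True)
        probs = np.exp(z)
        probs /= probs.sum(axis=1, keepdims=True)
        fused += probs
    return fused / len(logits_per_scale)


def frames_to_segments(labels):
    """Turn frame-wise labels into (label, start_frame, end_frame) tuples."""
    labels = np.asarray(labels)
    change = np.flatnonzero(labels[1:] != labels[:-1]) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(labels)])) - 1
    return [(int(labels[s]), int(s), int(e)) for s, e in zip(starts, ends)]
```

In this sketch, the frame-wise labels would come from `fuse_coarse_to_fine(...).argmax(axis=1)`, and `frames_to_segments` then yields the inclusive start and end frame of each predicted action. A learned, weighted combination of the decoder outputs could replace the plain average used here.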