ArXiv Preprint
Action segmentation is a challenging task in high-level process analysis,
typically performed on video or kinematic data obtained from various sensors.
In the context of surgical procedures, action segmentation is a critical
component of workflow analysis algorithms. This work presents two contributions related to
action segmentation on kinematic data. Firstly, we introduce two multi-stage
architectures, MS-TCN-BiLSTM and MS-TCN-BiGRU, specifically designed for
kinematic data. Each architecture consists of a prediction generator with
intra-stage regularization followed by bidirectional LSTM- or GRU-based refinement
stages. Secondly, we propose two new data augmentation techniques, World Frame
Rotation and Horizontal-Flip, which utilize the strong geometric structure of
kinematic data to improve algorithm performance and robustness. We evaluate our
models on three datasets of surgical suturing tasks: the Variable Tissue
Simulation (VTS) Dataset and the newly introduced Bowel Repair Simulation (BRS)
Dataset, two open surgery simulation datasets that we collected, as
well as the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a
well-known benchmark in robotic surgery. Our methods achieve state-of-the-art
performance on all benchmark datasets and establish a strong baseline for the
BRS dataset.
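
To make the geometric augmentations concrete, below is a minimal sketch of the two ideas named in the abstract: a world-frame rotation about the vertical axis and a horizontal flip of 3D kinematic trajectories. The function names, the (T, 3) position layout, and the choice of rotation axis and mirror plane are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def world_frame_rotation(positions, angle_rad):
    """Rotate 3D kinematic positions about the vertical (z) axis.

    positions: (T, 3) array of x, y, z coordinates over T time steps.
    Illustrative sketch; the paper's exact transform may differ.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    # Row-vector convention: each time step is rotated by R.
    return positions @ R.T

def horizontal_flip(positions):
    """Mirror positions across the y-z plane (negate x).

    Assumes x is the horizontal axis being flipped.
    """
    flipped = positions.copy()
    flipped[:, 0] *= -1.0
    return flipped
```

Both transforms preserve the relative geometry of the motion, so the action labels remain valid, which is what makes them usable for augmenting kinematic training data.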
Adam Goldbraikh, Omer Shubi, Or Rubin, Carla M Pugh, Shlomi Laufer
2023-03-14