In IEEE transactions on pattern analysis and machine intelligence ; h5-index 127.0
Among the greatest of the challenges of Minimally Invasive Surgery (MIS) is the inadequate visualisation of the surgical field through keyhole incisions. Moreover, occlusions caused by instruments or bleeding can completely obfuscate anatomical landmarks, reduce surgical vision and lead to iatrogenic injury. The aim of this paper is to propose an unsupervised end-to-end deep learning framework, based on Fully Convolutional Neural networks to reconstruct the view of the surgical scene under occlusions and provide the surgeon with intraoperative see-through vision in these areas. A novel generative densely connected encoder-decoder architecture has been designed which enables the incorporation of temporal information by introducing a new type of 3D convolution, the so called 3D partial convolution, to enhance the learning capabilities of the network and fuse temporal and spatial information. To train the proposed framework, a unique loss function has been proposed which combines perceptual, reconstruction, style, temporal and adversarial loss terms, for generating high fidelity image reconstructions. Advancing the state-of-the-art, our method can reconstruct the underlying view obstructed by irregularly shaped occlusions of divergent size, location and orientation. The proposed method has been validated on in-vivo MIS video data, as well as natural scenes on a range of occlusion-to-image (OIR) ratios.
Tukra Samyakh, Marcus Hani J, Giannarou Stamatia