ArXiv Preprint
Automatic parsing of human anatomies at the instance level from 3D computed
tomography (CT) scans is a prerequisite step for many clinical applications.
Pathologies, broken structures, or a limited field of view (FOV) can all make
anatomy parsing algorithms vulnerable. In this work, we explore how to apply
the successful detection-then-segmentation paradigm to 3D medical data, and
propose a steerable, robust, and efficient computing
framework for detection, identification, and segmentation of anatomies in CT
scans. Considering the complicated shapes, sizes, and orientations of
anatomies, and without loss of generality, we present a nine
degrees-of-freedom (9-DoF) pose
estimation solution in full 3D space using a novel single-stage,
non-hierarchical forward representation. Our whole framework is executed in a
steerable manner where any anatomy of interest can be directly retrieved to
further boost the inference efficiency. We have validated the proposed method
on three medical image parsing tasks: ribs, spine, and abdominal organs.
For rib parsing, CT scans have been annotated at the rib instance level for
quantitative evaluation, and likewise for spine vertebrae and abdominal organs.
Extensive experiments on 9-DoF box detection and rib instance segmentation
demonstrate the effectiveness of our framework (with an identification rate of
97.0% and a segmentation Dice score of 90.9%) and its high efficiency, comparing
favorably with several strong baselines (e.g., CenterNet, FCOS, and
nnU-Net). For spine identification and segmentation, our method achieves a new
state-of-the-art result on the public CTSpine1K dataset. Lastly, we report
highly competitive results in multi-organ segmentation in the FLARE22
competition. Our
annotations, code and models will be made publicly available at:
https://github.com/alibaba-damo-academy/Med_Query.
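
To make the 9-DoF box notion concrete, the following minimal sketch shows one common way such a box can be parameterized: three values for the center, three for the size, and three for the orientation. The abstract does not specify the rotation convention or the box encoding, so the Euler-angle order (R = Rz Ry Rx) and the function names `rotation_matrix` and `box_corners` are assumptions made for illustration only and are not taken from the authors' code.

```python
import math
from itertools import product


def rotation_matrix(rx, ry, rz):
    """3x3 rotation from Euler angles in radians, composed as Rz @ Ry @ Rx.

    The composition order is an assumed convention, not the paper's.
    """
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def box_corners(center, size, angles):
    """Return the 8 corners of an oriented box given its 9 parameters.

    center: (x, y, z) position, 3 DoF
    size:   (w, h, d) full edge lengths, 3 DoF
    angles: (rx, ry, rz) Euler angles in radians, 3 DoF
    """
    rot = rotation_matrix(*angles)
    corners = []
    for signs in product((-1.0, 1.0), repeat=3):
        # Corner in the box's local frame: half the edge length along each axis.
        local = [s * d / 2.0 for s, d in zip(signs, size)]
        # Rotate into the scan frame, then translate to the box center.
        world = tuple(
            center[i] + sum(rot[i][j] * local[j] for j in range(3))
            for i in range(3)
        )
        corners.append(world)
    return corners
```

With all three angles set to zero the box is axis-aligned, which is the 6-DoF special case; the three rotation parameters are what allow a box to follow an obliquely oriented structure such as a rib.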
Heng Guo, Jianfeng Zhang, Ke Yan, Le Lu, Minfeng Xu
2022-12-05