In BMC medical informatics and decision making ; h5-index 38.0

BACKGROUND : Cardiogenic stroke has increasing morbidity in China and places an economic burden on patients' families. In the diagnosis of cardiogenic stroke, echocardiography is one of the most important examinations. Sonographers examine the patient's heart by echocardiography and describe their findings in echocardiograph reports. In this study, we developed a machine learning model that automatically identifies diagnostic evidence of cardiogenic stroke in these reports and provides it to neurologists for clinical decision making.

METHODS : We collected 4188 free-text Chinese echocardiograph reports from 4018 patients, with an average length of 177 Chinese characters. In collaboration with neurologists and sonographers, we compiled 149 phrases describing diagnostic evidence of cardiogenic stroke, such as the Chinese phrases for severe mitral stenosis and aortic valve degeneration. We then built an annotated corpus by mapping the 149 phrases onto the 4188 reports. We selected the 11 most frequent types of diagnostic evidence, such as mitral stenosis, for identification. The generated corpus was divided into a training set and a testing set at a ratio of 8:2 and used to train and validate a BiLSTM-CRF model that identifies evidence of cardiogenic stroke.
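The phrase-mapping and 8:2 split described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the character-level BIO tagging scheme, the longest-match rule, the label names, the random shuffle, and the three dictionary entries are all assumptions made for the example (the study's 149 phrases are not reproduced here).

```python
import random

# Illustrative phrase-to-type dictionary. These entries and label names are
# assumptions for the sketch, not the 149 phrases compiled in the study.
PHRASES = {
    "重度二尖瓣狭窄": "MITRAL_STENOSIS",        # severe mitral stenosis
    "二尖瓣狭窄": "MITRAL_STENOSIS",            # mitral stenosis
    "主动脉瓣退行性变": "AORTIC_DEGENERATION",  # aortic valve degeneration
}


def annotate(report, phrases=PHRASES):
    """Return one BIO tag per character using longest-match dictionary lookup."""
    tags = ["O"] * len(report)
    ordered = sorted(phrases, key=len, reverse=True)  # try longer phrases first
    i = 0
    while i < len(report):
        match = next((p for p in ordered if report.startswith(p, i)), None)
        if match is None:
            i += 1
            continue
        label = phrases[match]
        tags[i] = "B-" + label                     # first character of the phrase
        for j in range(i + 1, i + len(match)):
            tags[j] = "I-" + label                 # remaining characters
        i += len(match)
    return tags


def split_corpus(samples, train_ratio=0.8, seed=0):
    """Shuffle the samples and split them into training and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical report snippets, used only to exercise the functions.
reports = ["二尖瓣狭窄，主动脉瓣退行性变。", "左室收缩功能正常。"]
corpus = [(report, annotate(report)) for report in reports]
train_set, test_set = split_corpus(corpus)
```

Each `(report, tags)` pair is the kind of sequence-labeling sample a BiLSTM-CRF tagger could be trained on. Sorting the phrases by length keeps a longer phrase such as "severe mitral stenosis" from being split by its shorter substring.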

RESULTS : On diagnostic evidence identification, our machine learning method achieved average performance scores of 98.03%, 90.17% and 93.94%, respectively. In addition, our method is able to identify novel descriptions of diagnostic evidence of cardiogenic stroke that are not among the summarized phrases, such as variant wordings of mitral stenosis and aortic valve calcification, among others.

CONCLUSIONS : In this study, we analyzed the structure of echocardiograph reports and summarized 149 phrases describing diagnostic evidence of cardiogenic stroke. We used the phrases to generate an annotated corpus automatically, which greatly reduces the cost of manual annotation. The model trained on this corpus also performs well on the testing set. The proposed method for automatically identifying diagnostic evidence of cardiogenic stroke will be further refined in practice.
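The abstract does not name the three scores reported in RESULTS. One plausible reading, offered here purely as an assumption, is precision, recall and F1: the third figure equals the harmonic mean of the first two to two decimal places. Because the harmonic mean is symmetric, this check cannot tell which of the first two values would be precision and which recall. The function name `f1_score` below is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same unit, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: the first two reported values are precision and recall, in either order.
f1 = f1_score(98.03, 90.17)
print(round(f1, 2))  # prints 93.94, matching the third reported value
```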

Qin Lu, Xu Xiaowei, Ding Lingling, Li Zixiao, Li Jiao


BiLSTM-CRF, Cardiogenic stroke, Chinese echocardiograph reports, Diagnosis evidences