In International journal of radiation oncology, biology, physics
Natural language processing (NLP), which aims to convert human language into expressions that can be analyzed by computers, is one of the most rapidly developing and widely used technologies in the field of artificial intelligence (AI). NLP algorithms convert unstructured free text data into structured data that can be extracted and analyzed at scale. In medicine, unlocking the rich, expressive data within clinical free text in the electronic medical record (EMR) will help realize the full potential of big data for research and clinical purposes. Recent major algorithmic advances have significantly improved NLP performance, leading to a surge in academic and industry interest in developing tools to automate information extraction and phenotyping from clinical texts. Thus, these technologies are poised to transform medical research and alter clinical practice. Radiation oncology stands to benefit from NLP algorithms if they are appropriately developed and deployed, as they may enable advances such as automated inclusion of radiotherapy details in cancer registries, discovery of novel insights about cancer care, and improved patient data curation and presentation at the point of care. However, challenges remain before the full value of NLP is realized, such as the plethora of radiation oncology-specific jargon, nonstandard nomenclature, the lack of publicly available labeled data for model development, and interoperability limitations between radiotherapy data silos. Successful development and implementation of high-quality, high-value NLP models for radiation oncology will require close collaboration between computer scientists and the radiation oncology community.
Here, we present a primer on AI algorithms in general and NLP algorithms in particular; provide a guide on how to assess the performance of such algorithms; review prior research on NLP algorithms for oncology; and describe future avenues for NLP in radiation oncology research and clinical practice.
Bitterman Danielle S, Miller Timothy A, Mak Raymond H, Savova Guergana K