In Clinical science (London, England : 1979)
OBJECTIVE : Existing strategies to identify relevant studies for systematic review may not perform equally well across research domains. We compare four approaches, based on human or automated screening of either title and abstract or full text, and report the training of a machine learning algorithm to identify in vitro studies from bibliographic records.
METHODS : We used a systematic review of oxygen-glucose deprivation (OGD) in PC-12 cells to compare approaches. For human screening, two reviewers independently screened studies based on title and abstract or full text, with disagreements reconciled by a third reviewer. For automated screening, we applied text mining to either title and abstract or full text. We trained a machine learning algorithm with decisions from 2,000 randomly selected PubMed Central records enriched with a dataset of known in vitro studies.
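The dual-screening rule described above can be sketched as follows. This is a minimal illustration only: the function name `reconcile` and the boolean include/exclude encoding are assumptions, not taken from the study.

```python
from typing import Optional


def reconcile(reviewer_a: bool, reviewer_b: bool,
              reviewer_c: Optional[bool] = None) -> bool:
    """Return the final include (True) / exclude (False) decision for one record.

    Hypothetical helper: two independent decisions are accepted when they
    agree; otherwise a third reviewer's decision settles the disagreement.
    """
    if reviewer_a == reviewer_b:
        return reviewer_a
    if reviewer_c is None:
        raise ValueError("reviewers disagree; a third decision is required")
    return reviewer_c


# Example: the two reviewers disagree and the third reviewer excludes the record.
final_decision = reconcile(True, False, reviewer_c=False)
```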
RESULTS : Full text approaches performed best, with human screening (sensitivity 0.990, specificity 1.000, precision 0.994) outperforming text mining (sensitivity 0.972, specificity 0.980, precision 0.764). For title and abstract, text mining (sensitivity 0.890, specificity 0.995, precision 0.922) outperformed human screening (sensitivity 0.862, specificity 0.998, precision 0.975). At our target sensitivity of 95%, the algorithm achieved a specificity of 0.850 and a precision of 0.700.
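For readers less familiar with these measures, the sketch below shows how sensitivity, specificity and precision follow from a confusion matrix, and how a score cut-off might be chosen to meet a target sensitivity. It is illustrative only: the function names, the toy scores and the cut-off rule are assumptions, and the abstract does not state how the study's threshold was selected.

```python
def confusion_counts(scores, labels, threshold):
    """Count TP, FP, TN, FN when records scoring >= threshold are included."""
    tp = fp = tn = fn = 0
    for score, relevant in zip(scores, labels):
        predicted = score >= threshold
        if predicted and relevant:
            tp += 1
        elif predicted and not relevant:
            fp += 1
        elif not predicted and relevant:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def screening_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and precision from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of relevant studies retained
        "specificity": tn / (tn + fp),  # share of irrelevant studies excluded
        "precision": tp / (tp + fp),    # share of included studies that are relevant
    }


def threshold_for_sensitivity(scores, labels, target=0.95):
    """Highest cut-off that keeps at least `target` of the relevant records.

    Assumed rule for illustration, not the study's documented procedure.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    if not positives:
        raise ValueError("no relevant records in the validation set")
    for kept, score in enumerate(positives, start=1):
        if kept / len(positives) >= target:
            return score
    return positives[-1]


# Toy validation data (hypothetical): classifier scores and true relevance.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
labels = [1, 1, 0, 1, 1, 0]
cutoff = threshold_for_sensitivity(scores, labels, target=0.75)
metrics = screening_metrics(*confusion_counts(scores, labels, cutoff))
```

Lowering the cut-off raises sensitivity at the cost of specificity and precision, which is the trade-off reflected in the figures reported at the 95% target.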
CONCLUSION : In this in vitro systematic review, human screening based on title and abstract erroneously excluded 14% of relevant studies, perhaps because title and abstract provide an incomplete description of methods used. Our algorithm might be used as a first selection phase in in vitro systematic reviews to limit the extent of full text screening required.
Wilson Emma, Cruz Florenz, Maclean Duncan, Ghanawi Joly, McCann Sarah K, Brennan Paul M, Liao Jing, Sena Emily S, Macleod Malcolm R
2023-Jan-11
Meta-research, automation, in vitro models, machine learning, systematic review