In Analytical Chemistry
A major challenge for metabolomic analysis is to obtain an unambiguous identification of the metabolites detected in a sample. Among metabolomics techniques, NMR spectroscopy is a sophisticated, powerful, and generally applicable spectroscopic tool that can be used to ascertain the correct structure of newly isolated biogenic molecules. However, accurate structure prediction using computational NMR techniques depends on how much of the relevant conformational space of a particular compound is considered. It is intrinsically challenging to calculate NMR chemical shifts using high-level DFT when the conformational space of a metabolite is extensive. In this work, we developed NMR chemical shift calculation protocols using a machine learning model in conjunction with standard DFT methods. The pipeline encompasses the following steps: (1) conformation generation using a force field (FF) based method, (2) filtering the FF-generated conformations using the ASE-ANI machine learning model, (3) clustering of the optimized conformations based on structural similarity to identify chemically unique conformations, (4) DFT structural optimization of the unique conformations, and (5) DFT NMR chemical shift calculation. This protocol can calculate the NMR chemical shifts of a set of molecules using any available combination of DFT level of theory, solvent model, and NMR-active nuclei, using user-selected reference compounds, linear regression methods, or both. Our protocol reduces the overall computational time by 2 orders of magnitude (see Figure 1) relative to methods that optimize the conformations using fully ab initio methods, while still producing good agreement with experimental observations. The complete protocol is designed to make the computation of chemical shifts tractable for a large number of conformationally flexible metabolites.
Das Susanta, Edison Arthur S, Merz Kenneth M
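
The abstract mentions two ways of turning computed shieldings into chemical shifts: user-selected reference compounds and linear regression. The following is a minimal sketch of both conversions, not the authors' implementation. The function names, the `delta_ref` parameter, and the use of an ordinary least-squares fit are assumptions made for illustration; the abstract does not specify how the regression is performed.

```python
from typing import List, Sequence, Tuple


def shifts_from_reference(
    shieldings: Sequence[float], sigma_ref: float, delta_ref: float = 0.0
) -> List[float]:
    """Convert isotropic shieldings (ppm) to chemical shifts (ppm) using a
    reference compound computed at the same level of theory.

    sigma_ref: computed shielding of the reference nucleus (assumed input).
    delta_ref: experimental shift of the reference (0.0 for a TMS-like standard).
    """
    return [sigma_ref - sigma + delta_ref for sigma in shieldings]


def fit_linear_scaling(
    shieldings: Sequence[float], exp_shifts: Sequence[float]
) -> Tuple[float, float]:
    """Ordinary least-squares fit of exp_shift = slope * shielding + intercept.

    Assumption: a plain least-squares line over a calibration set; the
    abstract does not describe the regression in more detail.
    """
    n = len(shieldings)
    if n != len(exp_shifts) or n < 2:
        raise ValueError("need two or more paired shielding/shift values")
    mean_s = sum(shieldings) / n
    mean_d = sum(exp_shifts) / n
    sxx = sum((s - mean_s) ** 2 for s in shieldings)
    if sxx == 0.0:
        raise ValueError("shieldings must not all be identical")
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(shieldings, exp_shifts))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_s
    return slope, intercept


def shifts_from_regression(
    shieldings: Sequence[float], slope: float, intercept: float
) -> List[float]:
    """Apply a previously fitted linear scaling to new computed shieldings."""
    return [slope * sigma + intercept for sigma in shieldings]
```

In this sketch, the slope and intercept would be fitted once on a calibration set of computed shieldings paired with experimental shifts, then reused for every molecule computed with the same DFT method, solvent model, and nucleus. The reference-compound route needs only one extra shielding calculation but does not correct for systematic scaling errors.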