To complement established rational and evolutionary protein design approaches, significant efforts are being made to exploit computational modeling and the diversity of naturally occurring protein sequences. Here, we combine structural biology, genomic mining, and computational modeling to identify structural features critical to aldehyde deformylating oxygenases (ADOs), an enzyme family with significant implications for synthetic biology and chemoenzymatic synthesis. Through these efforts, we discovered latent ADO-like function across the Ferritin-like superfamily in various species of Bacteria and Archaea. We created a machine learning model that uses protein structural features to discriminate ADO-like activity. Computational enzyme design tools were then used to introduce ADO-like activity into the small subunit of E. coli Class I ribonucleotide reductase. This integrated approach of genomic mining, structural biology, molecular modeling, and machine learning has the potential to enable rapid discovery and modulation of functions across enzyme families.
Mak Wai Shun, Wang Xiaokang, Arenas Rigoberto, Cui Youtian, Bertolani Steve J, Deng Wen Qiao, Tagkopoulos Ilias, Wilson David K, Siegel Justin B