Healthcare (Commonwealth Union) – In a sign of the growing significance of AI in science, researchers have developed a new artificial intelligence tool that can assess how effectively an enzyme interacts with a specific target, helping scientists identify the optimal enzyme–substrate pair for uses ranging from catalysis and drug development to industrial manufacturing.
The project, led by Huimin Zhao, professor of chemical and biomolecular engineering at the University of Illinois Urbana-Champaign, introduced EZSpecificity, a machine learning model built on newly compiled enzyme–substrate data. The tool, which is freely available online, is detailed in a recent publication in Nature.
Zhao said that when the goal is to produce a specific compound with an enzyme, it is crucial to choose the most compatible enzyme and substrate combination. He added that EZSpecificity analyzes enzyme sequences to predict which substrates fit best, and that it complements the team's CLEAN AI model, released more than two years ago, which predicts enzyme function from sequence data.
Enzymes are large proteins that speed up molecular reactions by binding to target molecules known as substrates. These substrates fit into specific pocket-like regions on the enzyme. The degree to which an enzyme and its substrate fit together is known as specificity. This interaction is often compared to a lock and key, where only the correct key can open the lock, but in reality enzyme function is far more complex, Zhao pointed out.
“It is challenging to figure out the best combination because the pocket is not static,” he explained. “The enzyme actually changes conformation when it interacts with the substrate. It is more of an induced fit. And some enzymes are promiscuous and can catalyze different types of reactions. That makes it very hard to predict. That’s why we need a machine learning model and experimental data that really prove which pairing will work best.”
Although several models for enzyme specificity have been proposed, their accuracy and ability to predict different types of enzymatic reactions remain limited.
To improve the model's predictive performance, Zhao’s team recognized the need for a larger and more diverse dataset. They collaborated with Diwakar Shukla, a professor of chemical and biomolecular engineering at the University of Illinois, whose group carried out docking simulations across multiple enzyme classes. This effort produced a comprehensive database that includes not only enzyme sequences and structures but also insights into how enzymes of various types adapt to different substrates.
Shukla said that experimental methods for revealing enzyme–substrate interactions tend to be slow and intricate.
He added that the team ran large-scale docking simulations to enrich the existing data. Focusing on atomic-level interactions, they carried out millions of docking calculations, which supplied the missing link needed to build a highly accurate model of enzyme specificity.
The researchers compared EZSpecificity directly with ESP — the current top-performing model — across four test scenarios designed to simulate real-world use. In every case, EZSpecificity delivered superior results. To further confirm its effectiveness, the team experimentally tested the tool using eight halogenase enzymes — a relatively unexplored enzyme class gaining attention for producing bioactive compounds — and 78 different substrates. EZSpecificity reached an impressive 91.7% accuracy in identifying the best enzyme–substrate matches, while ESP achieved only 58.3%.
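For readers who want to see how a figure of this kind could be computed, the following is a minimal sketch and not the team's evaluation code. It assumes one plausible reading of "accuracy in identifying the best match": for each enzyme, the model's highest-scoring substrate is compared with the substrate that worked best in the experiment. The function name, the enzyme and substrate labels, and all scores are hypothetical toy values, not data from the study.

```python
from typing import Dict


def top_match_accuracy(
    scores: Dict[str, Dict[str, float]],
    best_known: Dict[str, str],
) -> float:
    """Fraction of enzymes whose top-scored substrate equals the experimental best.

    scores:     enzyme -> {substrate -> predicted fit score} (hypothetical model output)
    best_known: enzyme -> substrate found best in the experiment (hypothetical labels)
    """
    hits = 0
    for enzyme, true_best in best_known.items():
        substrate_scores = scores[enzyme]
        # Pick the substrate with the highest predicted score for this enzyme.
        predicted_best = max(substrate_scores, key=substrate_scores.get)
        if predicted_best == true_best:
            hits += 1
    return hits / len(best_known)


# Toy data for illustration only; none of these values come from the paper.
toy_scores = {
    "enzyme_A": {"sub_1": 0.91, "sub_2": 0.40},
    "enzyme_B": {"sub_1": 0.15, "sub_2": 0.77},
    "enzyme_C": {"sub_1": 0.60, "sub_2": 0.55},
}
toy_best = {"enzyme_A": "sub_1", "enzyme_B": "sub_2", "enzyme_C": "sub_2"}

accuracy = top_match_accuracy(toy_scores, toy_best)
print(f"Top-match accuracy: {accuracy:.1%}")
```

In this toy case the model picks the right substrate for two of the three enzymes. The published study may define its accuracy metric differently, so the sketch should be read only as an illustration of the general idea.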
Zhao said he cannot claim the tool works for every enzyme, but for some the team has shown that EZSpecificity performs exceptionally well. The researchers also created a user-friendly interface: other researchers can enter a substrate and a protein sequence, and the tool predicts how well they fit.
The next step for the team is to extend their AI systems to study enzyme selectivity — determining whether an enzyme favors a specific site on a substrate — to minimize off-target reactions. They also intend to further improve EZSpecificity by incorporating additional experimental data.