Health & Medicine, Canada (Commonwealth Union) – Artificial Intelligence (AI) techniques, such as machine learning and deep learning, can be used to build computational models that simulate biological systems. These models help researchers understand the behavior of biological entities, such as cells, proteins, or entire organisms, by capturing their complex interactions. AI algorithms can analyze large biological datasets to uncover patterns, identify relationships, and make predictions about biological processes.
Researchers at the University of Toronto (U of T) have developed an AI system that uses generative diffusion to create proteins not found in nature.
The system is expected to advance the field of generative biology, which researchers believe will speed up drug development by making the design and testing of entirely new therapeutic proteins more efficient and flexible.
Professor Philip M. Kim of the Donnelly Centre for Cellular and Biomolecular Research at U of T's Temerty Faculty of Medicine said the model learns from image representations to generate entirely new proteins at a rapid rate. He added that all of the generated proteins appear to be biophysically real, meaning they fold into configurations that enable them to carry out specific functions inside cells.
The results appeared in the journal Nature Computational Science and are the first of their kind to be published in a peer-reviewed journal.
Proteins are made of chains of amino acids that fold into 3D shapes, which dictate the protein's function. These shapes are believed to have evolved over billions of years and are varied and complex, but also limited in number.
Now, with a better understanding of how existing proteins fold, researchers have begun to design folding patterns not produced in nature.
A key obstacle, according to Kim, has been imagining folds that are both feasible and functional.
“It’s been very hard to predict which folds will be real and work in a protein structure,” added Kim, who is a professor in the department of molecular genetics in the Temerty Faculty of Medicine and in the department of computer science in the Faculty of Arts & Science. He said that combining biophysics-based representations of protein structure with diffusion methods from the image generation field makes it possible to tackle the problem.
The new system, which the scientists call ProteinSGM, draws on a large set of image-like representations of existing proteins that accurately encode their structure. The scientists fed these images into a generative diffusion model, which gradually adds noise until each image becomes pure noise. The model tracks how the images become noisier and then runs the process in reverse, learning how to turn random pixels into clear images that correspond to entirely novel proteins.
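The noising half of this process can be illustrated with a minimal sketch. It assumes a generic variance-preserving schedule of the kind used in many image diffusion models; the schedule values, function names, and the random 16x16 "image" are illustrative assumptions, not details of ProteinSGM. The sketch also shows why learning to estimate the added noise is enough to run the process backwards: if the noise is known, the clean image can be recovered exactly.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for a linear noise schedule (assumed values)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def add_noise(x0, t, alpha_bar, rng):
    """Jump straight to noise level t: shrink the signal, add Gaussian noise."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

def recover_clean(xt, eps, t, alpha_bar):
    """Invert add_noise when the noise is known.

    A trained model would have to estimate eps; here the true value is used
    only to show that the forward step is reversible in principle.
    """
    return (xt - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])

rng = np.random.default_rng(0)
image = rng.uniform(-1.0, 1.0, size=(16, 16))  # stand-in for a structure image
alpha_bar = make_alpha_bar()

noisy_early, _ = add_noise(image, 10, alpha_bar, rng)          # still mostly signal
noisy_late, eps_late = add_noise(image, 999, alpha_bar, rng)   # almost pure noise
restored = recover_clean(noisy_late, eps_late, 999, alpha_bar)
```

In a real system, a neural network is trained to predict the noise (or a closely related quantity) from the noisy image alone, and generation starts from random pixels and applies many small denoising steps.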
Jin Sub (Michael) Lee, a doctoral student in the Kim lab and first author on the paper, said that optimizing the early stage of this image generation process was a major hurdle in creating ProteinSGM.
“A key idea was the proper image-like representation of protein structure, such that the diffusion model can learn how to generate novel proteins accurately,” said Lee.
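The article does not spell out the representation ProteinSGM uses. As a rough illustration of what an image-like encoding of structure can look like, the sketch below assumes a pairwise distance map between alpha-carbon atoms, a common choice in structural biology. The helper names and the idealized helix are hypothetical and serve only as a toy input.

```python
import numpy as np

def distance_map(ca_coords):
    """Return an (N, N) matrix of distances between every pair of residues."""
    coords = np.asarray(ca_coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def toy_helix(n_residues, rise=1.5, radius=2.3, turn_deg=100.0):
    """Idealized alpha-helix trace (angstroms), used only as toy input."""
    idx = np.arange(n_residues)
    angles = np.deg2rad(turn_deg) * idx
    return np.stack(
        [radius * np.cos(angles), radius * np.sin(angles), rise * idx], axis=1
    )

coords = toy_helix(20)
dmap = distance_map(coords)  # a 20x20 "image" of the structure

# Rotating and shifting the molecule leaves the map unchanged.
theta = np.deg2rad(37.0)
rotation = np.array(
    [[np.cos(theta), -np.sin(theta), 0.0],
     [np.sin(theta), np.cos(theta), 0.0],
     [0.0, 0.0, 1.0]]
)
moved = coords @ rotation.T + np.array([5.0, -3.0, 12.0])
dmap_moved = distance_map(moved)
```

A map like this is a square grid of numbers, so it can be handled like a single-channel image, and it does not change when the molecule is rotated or shifted in space.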
A further hurdle was validating the proteins produced by ProteinSGM. The system generates many structures, which are often unlike anything observed in nature. Almost all of them appear real according to standard metrics, Lee said, but the scientists need further proof.
Next steps include further developing ProteinSGM for antibodies and other proteins with the greatest therapeutic potential, said Kim, who added that this will be an exciting area for both research and entrepreneurship.