Written by Kristine Yang '25
Edited by Wonjin Ko '26
AI-Generated Human Protein [3]
Is biological evolution merely an algorithm optimized for adaptation and survival? Proponents of this idea argue that the fundamental principles of evolution, such as mutation, recombination, genetic drift, and natural selection, can be viewed as mechanisms of an evolutionary algorithm. By this view, the genetic code of an organism can be understood as a set of instructions that guide the process of evolution, leading to the development of new traits and adaptations that enhance the organism's chances of survival and reproduction.
With advancing technology, researchers have been able to create artificial intelligence (AI) systems that can mimic the processes of human and natural design. As these systems have demonstrated remarkable abilities to learn, create, and innovate, researchers have begun to ask whether AI could also be used to model, and even improve upon, the process of evolution itself.
Researchers at the University of California, San Francisco and Salesforce Research designed an AI capable of imitating evolution by arranging the 20 amino acids into sequences that form original proteins [1]. Similar to how words are strung together to form sentences, amino acids are combined to form proteins. The researchers took an approach borrowed from natural language processing: their AI model, ProGen, is a language model, a type of model designed to teach itself the grammar and meaning of words from text input [2]. Instead of text, the team fed the model the amino acid sequences of 280 million different proteins and allowed it to learn the principles of protein assembly, or in other words, the ‘language of amino acids’. The model was then fine-tuned with 56,000 sequences from five lysozyme families and given contextual information about these proteins.
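To make the ‘language of amino acids’ analogy concrete, here is a minimal sketch of how a protein sequence could be turned into tokens and into "predict the next amino acid" training examples. It assumes the next-token setup common to language models; the vocabulary, function names, and example fragment are illustrative assumptions, not ProGen's actual code.

```python
# The 20 standard amino acids, written with their one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical vocabulary: each amino acid gets an integer id,
# much as each word in a sentence would for a text model.
TOKEN_ID = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def tokenize(sequence):
    """Convert a protein sequence into a list of integer token ids."""
    return [TOKEN_ID[aa] for aa in sequence.upper()]


def next_token_pairs(sequence):
    """Pair each prefix of the sequence with the amino acid that follows it.

    A language model trained this way learns which amino acid is likely
    to come next, given everything that came before.
    """
    return [(sequence[:i], sequence[i]) for i in range(1, len(sequence))]


# Made-up fragment for illustration only, not a real protein.
example = "MKALIV"
print(tokenize(example))
for prefix, target in next_token_pairs(example):
    print(f"{prefix} -> {target}")
```

Each printed line shows a context and the amino acid the model would be asked to predict, which is the same task a text model faces when guessing the next word in a sentence.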
After generating one million sequences using the ProGen AI model, researchers selected 100 of the most promising sequences for further testing. The selected sequences were then compared to natural proteins to assess their performance.
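The generate-then-filter step can be sketched as a simple ranking. The article does not say how the researchers judged which sequences were most promising, so the scoring function below is a deliberately arbitrary stand-in; only the overall pattern (score many candidates, keep the best few) reflects the description above.

```python
import heapq


def select_top(candidates, score, k=100):
    """Return the k highest-scoring candidates, best first."""
    return heapq.nlargest(k, candidates, key=score)


def toy_score(sequence):
    """Hypothetical placeholder score: the fraction of lysine (K) residues.

    This has no biological meaning; a real pipeline would use some
    measure of how promising a sequence is.
    """
    return sequence.count("K") / len(sequence)


# Made-up candidate sequences standing in for the one million generated.
candidates = ["MKKLV", "MALIV", "KKKKV", "MKALI"]
best = select_top(candidates, toy_score, k=2)
print(best)
```

Using `heapq.nlargest` avoids sorting the whole candidate pool, which matters when keeping 100 sequences out of a million.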
Remarkably, the ProGen AI model was able to generate artificial proteins that outperformed naturally occurring proteins: 73 percent of the AI-generated proteins were able to function, compared with only 59 percent of natural proteins [1]. This suggests that the AI model not only detected evolutionary patterns but was also able to produce protein sequences that perform well, pointing to potential applications in the field of protein engineering.
Perhaps most significantly, the AI-generated enzymes did not need to closely resemble anything found in nature in order to work. They were able to function even when only 31.4 percent of their amino acid sequence was similar to any known natural protein. This feat underscores not only the power of the AI model but also its potential to create entirely new proteins that are not found in nature.
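A figure like 31.4 percent is a measure of sequence similarity. As a rough illustration, the sketch below computes percent identity between two sequences, assuming they are already aligned and of equal length. Real comparisons first align the sequences, a step omitted here, and the sequences and function names are made up for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two pre-aligned sequences match.

    Assumes both sequences have the same, non-zero length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def max_identity(candidate, known_sequences):
    """Highest percent identity between a candidate and any known sequence."""
    return max(percent_identity(candidate, known) for known in known_sequences)


# Made-up fragments for illustration only, not real proteins.
generated = "MKALIVGG"
natural = ["MKTLLVAG", "AAAAAAAA"]
print(max_identity(generated, natural))
```

A low maximum identity means the generated sequence is far from every known protein, which is what makes a working enzyme at that distance notable.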
This achievement could transform the field of protein engineering. Traditional methods of producing new proteins can take years, yet the language model took only weeks to design novel proteins. The speed and precision with which this AI model can generate proteins could make therapeutics and enzyme-based products much more readily available and accessible. Additionally, because many of these AI-generated proteins outperform those naturally selected by evolution, they could be customized for specific environments or conditions and could lead to increasingly personalized medicine.
References
[1] Orf D. AI has successfully imitated human evolution—and might do it even better [Internet]. Popular Mechanics. 2023 [cited 2023 Apr 10]. Available from: https://www.popularmechanics.com/science/health/a42704749/artificial-intelligence-imitates-evolution/
[2] University of California - San Francisco. AI technology generates original proteins from scratch: Natural language model jumpstarts protein design with creation of active enzymes. Science Daily [Internet]. 2023 Jan 26 [cited 2023 Apr 10]; Available from: https://www.sciencedaily.com/releases/2023/01/230126124330.htm
[3] Haydon I. AI-Generated Proteins [Internet]. University of Washington Institute for Protein Design; Available from: https://www.nytimes.com/2023/01/09/science/artificial-intelligence-proteins.html