DOI: 10.1126/science.370.6521.1144
‘The game has changed.’ AI triumphs at protein folding
Robert F. Service
2020-12-04
Journal: Science
Publication year: 2020
Abstract: Artificial intelligence (AI) has solved one of biology's grand challenges: predicting how proteins fold from a chain of amino acids into 3D shapes that carry out life's tasks. This week, organizers of a protein-folding competition announced the achievement by researchers at DeepMind, a U.K.-based AI company. They say the DeepMind method will have far-reaching effects, among them dramatically speeding the creation of new medications.

“What the DeepMind team has managed to achieve is fantastic and will change the future of structural biology and protein research,” says Janet Thornton, director emeritus of the European Bioinformatics Institute. “This is a 50-year-old problem,” adds John Moult, a structural biologist at the University of Maryland, Shady Grove, and co-founder of the competition, Critical Assessment of Protein Structure Prediction (CASP). “I never thought I'd see this in my lifetime.”

The body uses tens of thousands of different proteins, each a string of dozens to hundreds of amino acids. The order of the amino acids dictates how the myriad pushes and pulls between them give rise to proteins' complex 3D shapes, which, in turn, determine how they function. Knowing those shapes helps researchers devise drugs that can lodge in proteins' crevices. And being able to synthesize proteins with a desired structure could speed development of enzymes to make biofuels and degrade waste plastic.

[Figure: graph by C. Bickel/Science; data from CASP]

For decades, researchers deciphered proteins' structures using experimental techniques such as x-ray crystallography or cryo-electron microscopy (cryo-EM). But such methods can take years and don't always work. Structures have been solved for only about 170,000 of the more than 200 million proteins discovered across life forms. In the 1960s, researchers realized that if they could work out all interactions within a protein's sequence, they could predict its shape.
But the amino acids in any given sequence could interact in so many different ways that the number of possible structures was astronomical. Computational scientists jumped on the problem, but progress was slow.

In 1994, Moult and colleagues launched CASP, which takes place every 2 years. Entrants get amino acid sequences for about 100 proteins whose structures are not known. Some groups compute a structure for each sequence, while others determine it experimentally. The organizers then compare the computational predictions with the lab results and give the predictions a global distance test (GDT) score. Scores above 90 on the 100-point scale are considered on par with experimental methods, Moult says.

Even in 1994, predicted structures for small, simple proteins could match experimental results. But for larger, challenging proteins, computations' GDT scores were about 20, “a complete catastrophe,” says Andrei Lupas, a CASP judge and evolutionary biologist at the Max Planck Institute for Developmental Biology. By 2016, competing groups had reached scores of about 40 for the hardest proteins, mostly by drawing insights from known structures of proteins that were closely related to the CASP targets.

When DeepMind first competed, in 2018, its algorithm, called AlphaFold, relied on this comparative strategy. But AlphaFold also incorporated a computational approach called deep learning, in which the software is trained on vast data troves (in this case, the sequences and structures of known proteins) and learns to spot patterns. DeepMind won handily, beating the competition by an average of 15% on each structure, and winning GDT scores of up to about 60 for the hardest targets. But the predictions were still too coarse, says John Jumper, who heads AlphaFold's development at DeepMind.
“We knew how far we were from biological relevance.” So the team combined deep learning with an “attention algorithm” that mimics the way a person might assemble a jigsaw puzzle: connecting pieces in clumps (in this case clusters of amino acids) and then searching for ways to join the clumps in a larger whole. Working with a computer network built around 128 machine learning processors, they trained the algorithm on all 170,000 or so known protein structures.

And it worked. In this year's CASP, AlphaFold achieved a median GDT score of 92.4. For the most challenging proteins, AlphaFold scored a median of 87, 25 points above the next best predictions. It even excelled at solving structures of proteins that sit wedged in cell membranes, which are central to many human diseases but notoriously difficult to solve with x-ray crystallography. Venki Ramakrishnan, a structural biologist at the Medical Research Council Laboratory of Molecular Biology, calls the result “a stunning advance on the protein folding problem.”

All groups in this year's competition improved, Moult says. But with AlphaFold, Lupas says, “The game has changed.” The organizers even worried DeepMind may have cheated somehow. So Lupas set a special challenge: a membrane protein from a species of archaea, an ancient group of microbes. For 10 years, his team had tried to get its x-ray crystal structure. “We couldn't solve it.” But AlphaFold had no trouble. It returned a detailed image of a three-part protein with two helical arms in the middle. The model enabled Lupas and his team to make sense of their x-ray data; within half an hour, they had fit their experimental results to AlphaFold's predicted structure. “It's almost perfect,” Lupas says. “They could not possibly have cheated on this. I don't know how they do it.”

As a condition of entering CASP, DeepMind, like all groups, agreed to reveal sufficient details about its method for other groups to re-create it.
That will be a boon for experimentalists, who will be able to use structure predictions to make sense of opaque x-ray and cryo-EM data. It could also enable drug designers to work out the structure of every protein in new and dangerous pathogens like SARS-CoV-2, a key step in the hunt for molecules to block them, Moult says.

Still, AlphaFold doesn't do everything well. In CASP, it faltered on one protein, an amalgam of 52 small repeating segments, which distort each other's positions as they assemble. Jumper says the team now wants to train AlphaFold to solve such structures, as well as those of complexes of proteins that work together to carry out key functions in the cell.

Even though one grand challenge has fallen, others will undoubtedly emerge. “This isn't the end of something,” Thornton says. “It's the beginning of many new things.”
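The abstract cites GDT scores on a 100-point scale without showing how such a score is built. As a rough illustration only, the sketch below computes a simplified GDT_TS-style score: the average, over several distance cutoffs, of the percentage of residues whose predicted position lies within that cutoff of the experimental position. The cutoffs of 1, 2, 4, and 8 Å are the conventional GDT_TS thresholds and are not stated in the article; the function name `gdt_ts` is hypothetical. The sketch also assumes the two structures are already superposed, whereas the full test searches over superpositions to maximize the count at each cutoff, so this should not be read as the CASP evaluation itself.

```python
import math


def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score in the range 0-100.

    predicted, reference: equal-length sequences of (x, y, z) coordinates
    in angstroms, one per residue (e.g. C-alpha atoms).
    ASSUMPTION: the two structures are already superposed; the real test
    searches many superpositions for the best count at each cutoff.
    """
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("structures must be non-empty and of equal length")

    # Per-residue distance between predicted and experimental positions.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues within each cutoff, then average the fractions.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(cutoffs)


if __name__ == "__main__":
    # Toy four-residue example with made-up coordinates.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
    print(gdt_ts(predicted, reference))
```

In the toy example, one residue is within 1 Å, two within 2 Å, and three within both 4 Å and 8 Å, so the averaged percentage lands well below the 90-point level that Moult describes as on par with experimental methods.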
Field: Climate Change; Resources and Environment
Document type: Journal article
Item identifier: http://119.78.100.173/C666/handle/2XK7JSWQ/305802
Collection: Climate Change; Resources and Environmental Science
Recommended citation
GB/T 7714: Robert F. Service. ‘The game has changed.’ AI triumphs at protein folding[J]. Science, 2020.
APA: Robert F. Service. (2020). ‘The game has changed.’ AI triumphs at protein folding. Science.
MLA: Robert F. Service. "‘The game has changed.’ AI triumphs at protein folding." Science (2020).