DeepMind open-sources AlphaFold 2 for protein structure predictions
DeepMind this week open-sourced AlphaFold 2, its AI system for predicting protein shapes, alongside the publication of a paper in the journal Nature. With the codebase now available, DeepMind says it hopes to broaden access for researchers and organizations in healthcare and life sciences.
The recipes for proteins (large molecules made up of amino acids that are the basic building blocks of tissue, muscle, hair, enzymes, antibodies, and other essential parts of living organisms) are encoded in DNA. These genetic definitions determine the proteins’ three-dimensional structures, which in turn determine their capabilities. But protein “folding,” as it’s called, is notoriously difficult to work out from the corresponding genetic sequence alone. DNA contains information only about the chains of amino acid residues, not about the final shape those chains take.
In December 2018, DeepMind took on the protein folding challenge with AlphaFold, the product of two years of work. The Alphabet subsidiary said at the time that AlphaFold could predict structures more accurately than previous solutions. Its successor, AlphaFold 2, announced in December 2020, improved on this and outperformed competing protein structure prediction methods for a second time. In the results of the 14th Critical Assessment of Structure Prediction (CASP), AlphaFold 2 had average errors comparable to the width of an atom (about 0.1 nanometers), competitive with the results of experimental methods.
AlphaFold draws inspiration from the fields of biology, physics, and machine learning. It takes advantage of the fact that a folded protein can be thought of as a “spatial graph,” in which amino acid residues (the amino acids contained in a peptide or protein) are nodes and edges connect residues that lie close together. AlphaFold relies on an AI algorithm that attempts to interpret the structure of this graph while reasoning over the implicit graph it is building, using evolutionarily related sequences, multiple sequence alignments, and a representation of amino acid residue pairs.
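To make the “spatial graph” idea concrete, here is a minimal sketch in Python. It is not DeepMind's code and does not reflect how AlphaFold builds or uses its internal representations. It only illustrates the general notion of treating residues as nodes and linking those that sit close together in space. The coordinates are invented toy values, and the 8-angstrom distance cutoff and the function name `build_spatial_graph` are assumptions chosen for illustration.

```python
import math

# Hypothetical toy coordinates (in angstroms), one point per residue.
# The chain runs out along the x-axis and then doubles back, so residues
# that are far apart in sequence end up close together in space.
TOY_COORDS = [
    (0.0, 0.0, 0.0),   # residue 0
    (3.8, 0.0, 0.0),   # residue 1
    (7.6, 0.0, 0.0),   # residue 2
    (11.4, 0.0, 0.0),  # residue 3
    (11.4, 5.0, 0.0),  # residue 4
    (7.6, 5.0, 0.0),   # residue 5
    (3.8, 5.0, 0.0),   # residue 6
    (0.0, 5.0, 0.0),   # residue 7
]


def build_spatial_graph(coords, cutoff=8.0):
    """Return an adjacency map linking residues within `cutoff` angstroms.

    The cutoff value is an illustrative assumption, not a figure taken
    from the article or from AlphaFold.
    """
    graph = {i: set() for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            # Nodes are residues; an edge means "close together in 3D".
            if math.dist(coords[i], coords[j]) <= cutoff:
                graph[i].add(j)
                graph[j].add(i)
    return graph


graph = build_spatial_graph(TOY_COORDS)
```

In this toy chain, residues 0 and 7 are the two ends of the sequence but become neighbors in the graph because the chain folds back on itself, while residues 0 and 3 are closer in sequence yet too far apart in space to share an edge.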
DeepMind says it has significantly streamlined AlphaFold 2 for the open source release. Whereas the system previously took days of computing time to generate structures for some CASP entries, the open source version is about 16 times faster. It can generate structures in minutes to hours, depending on the size of the protein.
Real-world applications
DeepMind argues that AlphaFold, if further refined, could be applied to previously intractable problems in the field of protein folding, including those related to epidemiological efforts. Last year, the company predicted several protein structures of SARS-CoV-2, including ORF3a, whose makeup was formerly a mystery. At CASP14, DeepMind predicted the structure of another coronavirus protein, ORF8, which has since been confirmed by experimentalists.
Beyond aiding the pandemic response, DeepMind expects AlphaFold to be used to explore the hundreds of millions of proteins for which science currently lacks models. Since DNA specifies the amino acid sequences that make up protein structures, advances in genomics have made it possible to read protein sequences from the natural world, with 180 million protein sequences and counting in the publicly available Universal Protein database (UniProt). By contrast, given the experimental work required to go from sequence to structure, only about 170,000 protein structures are in the Protein Data Bank.
DeepMind says it is committed to making AlphaFold available “at scale” and to working with partners to explore new frontiers, such as how multiple proteins form complexes and interact with DNA, RNA, and small molecules. Earlier this year, the company announced a new partnership with the Geneva-based Drugs for Neglected Diseases initiative, a nonprofit pharmaceutical organization that has used AlphaFold to identify fexinidazole as a replacement for the toxic compound melarsoprol in the treatment of sleeping sickness.