Move over, AlphaFold: open source model predicts shape of 1 billion proteins
Briefly

ESM Atlas is a database of more than one billion predicted protein structures and billions of protein sequences. It was created by researchers at CZI-Biohub and generated using ESMFold2, an AI model built on a protein language model trained on billions of proteins from across the tree of life. The training data include metagenomic sequences from environments such as soil and oceans, which are absent from the AlphaFold database of predicted protein structures. ESM Atlas is reported to exceed the AlphaFold Database by more than 800 million entries and a previous ESM Atlas by about 300 million. Its developers say ESMFold2 outperforms existing methods, including AlphaFold3, especially on protein complexes, and the model is fully open source.
"ESMFold2 is based on a 'protein language' model that Rives's team unveiled in 2024, which was trained on billions of proteins from across the tree of life. It includes 'metagenomic' sequences from soil, ocean and other environments, which are absent from the AlphaFold database of predicted protein structures. Rives' team say ESMFold2 outperforms existing methods, including AlphaFold3, at determining the correct structure of complexes of interacting proteins - includi"
"Other scientists are impressed with the results, especially that ESMFold2 is fully open source. But the CZI-Biohub model enters an increasingly crowded field, in which competing open-source and proprietary protein models are making gains at breakneck speed. Antibody predictions"
Read at Nature