Once computationally impossible, AI is now translating protein sequences into structures.
Thanks to advances in DNA sequencing technology, it has become trivial to obtain the sequence of bases that encode a protein and translate it into the sequence of amino acids that make up the protein. But from there, we often get stuck. The actual function of the protein is only indirectly specified by its sequence. Instead, the sequence dictates how the amino acid chain folds and flexes in three-dimensional space to form a specific structure. It is usually this structure that determines the function of the protein, but obtaining it can take years of laboratory work.
When it first announced AlphaFold, DeepMind said it would give everyone the details in a future peer-reviewed paper, which it finally published yesterday. In the meantime, some academic researchers got tired of waiting, took some of DeepMind's insights, and created their own system. A paper describing that effort was also published yesterday.
The dirt on AlphaFold
DeepMind had already described the basic structure of AlphaFold, but the new paper provides far more detail. AlphaFold's structure involves two different algorithms that communicate back and forth about their analyses, allowing each to refine its own output.
One of these algorithms looks for protein sequences that are evolutionary relatives of the one in question and works out how their sequences align, adjusting for small changes or even insertions and deletions. Even if we don't know the structure of any of these relatives, they can still provide important constraints, such as whether certain parts of a protein are always charged.
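To make the idea of an alignment-derived constraint concrete, here is a minimal sketch in Python. It is a toy illustration, not AlphaFold's alignment code: the function name, the example sequences, and the choice to count only Asp, Glu, Lys, and Arg as charged are all assumptions made for this example.

```python
# Toy illustration only; this is not how AlphaFold processes alignments.
# Assumption: D, E, K, and R count as charged; histidine is left out.
CHARGED = set("DEKR")


def always_charged_columns(alignment):
    """Return 0-based column indices where every aligned sequence is charged."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")
    return [
        col
        for col in range(length)
        if all(seq[col] in CHARGED for seq in alignment)
    ]


# Hypothetical aligned relatives; "-" marks a gap from an insertion or deletion.
relatives = [
    "MKT-DELLA",
    "MRTADELIA",
    "MKS-EELLG",
]
print(always_charged_columns(relatives))  # [1, 4, 5]
```

Columns that stay charged across every relative are the kind of hint that can narrow down a structure even when none of the relatives has been solved.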
The AlphaFold team says this part of the system needs about 30 related proteins to function effectively. It typically arrives at a basic alignment quickly and then refines it. These refinements can involve shifting gaps around to put key amino acids in the right place.
The second algorithm, running in parallel, breaks the sequence into smaller chunks and tries to solve the structure of each of these while ensuring that the structure of each chunk is compatible with the larger structure. This is why aligning the protein with its relatives is essential: if key amino acids end up in the wrong chunk, getting the structure right is a real challenge. So the two algorithms communicate with each other, allowing proposed structures to feed back into the alignment.
Structural prediction is the more difficult process, and the algorithm's initial ideas often undergo more significant changes before it settles into refining the final structure.
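The chunking step described above can be pictured with a short Python sketch. This is only an illustration: the window size, the overlap, and the function name are hypothetical, and the paper's actual procedure may differ.

```python
def split_into_chunks(sequence, size, overlap):
    """Split a sequence into windows of `size` residues sharing `overlap` residues.

    Returns a list of (start_index, chunk) pairs so each chunk can be placed
    back into the full-length protein. Hypothetical helper for illustration.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(sequence), step):
        chunks.append((start, sequence[start:start + size]))
        if start + size >= len(sequence):
            break  # this window already reaches the end of the sequence
    return chunks


for start, chunk in split_into_chunks("ABCDEFGHIJ", size=4, overlap=2):
    print(start, chunk)
```

Keeping the start index with each chunk shows why alignment matters here: shifting a gap by a few positions changes which chunk a given residue falls into.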
Perhaps the most interesting new detail in the paper is where DeepMind goes through and disables different parts of the analysis algorithms. These tests show that, of the nine different functions the team defines, all seem to contribute at least a little to the final accuracy, and only one has a dramatic effect on it. That one involves identifying points in the proposed structure that are likely to need changes and flagging them for further attention.
The competition
In an announcement timed to the paper's release, DeepMind CEO Demis Hassabis said, "We pledged to share our methods and provide broad, free access to the scientific community. Today, we take the first step towards delivering on that commitment by sharing AlphaFold's open-source code and publishing the system's full methodology."
But Google had already described the system's underlying architecture, leading some academics to wonder whether they could adapt their existing tools into a system structured more like DeepMind's. With a seven-month lag, the researchers had plenty of time to put the idea into practice.
Using DeepMind's initial description, the researchers identified five features of AlphaFold that they believed differed from most existing methods. They then tried implementing different combinations of these features to see which ones improved on existing methods.
The easiest thing to get working was having two parallel algorithms: one dedicated to sequence alignment and the other to structural predictions. But the team ended up splitting the structural part of the work into two different functions. One of these functions simply estimates the two-dimensional distance between separate parts of the protein, and the other handles the actual location in three-dimensional space. All three exchange information, with each providing hints to the others about what may need further refinement.
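The relationship between the two structural functions is easier to see with a small example. One reading of the "two-dimensional distance" above is a residue-by-residue map of pairwise distances, which can be computed directly from three-dimensional positions. The Python sketch below is an illustration under that assumption, with made-up coordinates; it is not RoseTTAFold code.

```python
import math


def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]


# Hypothetical positions for three residues, in arbitrary units.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
dmap = distance_map(coords)

for row in dmap:
    print([round(d, 1) for d in row])
```

Going from coordinates to distances is straightforward, as shown here. The reverse direction, recovering positions from estimated distances, is the harder problem, which is one way to see why the two representations can usefully inform each other.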
The problem with adding a third track is that it greatly increases the hardware requirements, and academics generally do not have access to computing assets similar to DeepMind's. So while the system, called RoseTTAFold, didn't perform as well as AlphaFold in terms of the accuracy of its predictions, it did better than any previous system the team could test. And given the hardware it was run on, it was also relatively fast, taking about 10 minutes on a protein 400 amino acids long.
Like AlphaFold, RoseTTAFold breaks proteins into smaller pieces and solves them individually before attempting to assemble a complete structure. The research team realized that this might have an additional application. Many proteins form extensive interactions with other proteins in order to function; hemoglobin, for example, exists as a complex of four proteins. If the system works as it should, feeding it two different proteins should allow it to determine both of their structures and where they interact with each other. Tests showed that this actually works.
Both papers seem to describe positive developments. For starters, the DeepMind team deserves full credit for its insights into how to structure its system in the first place. Clearly, organizing things as parallel processes that communicate with each other has produced a huge leap in our ability to estimate protein structures. The academic team, rather than just trying to replicate what DeepMind did, took some of the basic ideas and steered them in new directions. The two systems differ both in the accuracy of their final output and in the time and computing resources that must be dedicated to them. But with both teams seemingly committed to openness, there is a good chance that each can adopt the other's best features.
Whatever the outcome, it is clear that we are in a new place compared to where we were. People have been trying to solve protein structure prediction for decades, and our inability to do so has become even more problematic now that genomes provide us with large numbers of protein sequences that we have little idea how to interpret. Demand for time on these systems is likely to be high, since a large portion of the biomedical research community stands to benefit from the software.
Science, 2021. DOI: 10.1126/science.abj8754
Nature, 2021. DOI: 10.1038/s41586-021-03819-2
Google provides details of its protein folding program, and academics offer another option