Advancements in Protein Structure Prediction and Inference with MindSpore


    Mar 16, 2022

    Evolving research into protein structures is one of the most challenging, fascinating, and potentially impactful areas of development in the Life Sciences sphere. Many scientists have dedicated themselves to looking for breakthroughs.

    Analyzing protein structures with conventional software demands an enormous amount of computation, more than even supercomputers can practically deliver. Conventional structure prediction methods are also often inaccurate, which has made progress slow.

    To address this, researchers have developed AI algorithms that predict a protein’s structure from its amino acid sequence. The breakthrough came in 2020, when Google’s DeepMind team entered AlphaFold2 in the 14th Critical Assessment of Structure Prediction (CASP14). AlphaFold2 was significantly more accurate than other methods in the vast majority of cases and ranked first, and the organizers described the work as “unprecedented progress”. DeepMind subsequently released predicted structures for roughly 350,000 proteins.

    Recently, the MindSpore team, the China Changping Laboratory, Peking University’s Biomedical Pioneering Innovation Centre (BIOPIC), the Chemistry and Molecular Engineering College, and Shenzhen Bay Laboratory’s Gao Yiqin team joined forces to launch a new protein structure prediction and inference tool modelled on AlphaFold2.

    The new MindSpore-based tool uses the heterogeneous Compute Architecture for Neural Networks (CANN) to harness the computing power of the Ascend AI Processor. Its main advantage lies in the multiple sequence alignment (MSA) stage: by adopting MMseqs2 for sequence retrieval, it greatly improves the efficiency of protein structure prediction, with an end-to-end computing speed almost 3 times higher than that of AlphaFold2. It can predict the structure of sequences more than 2,000 amino acids long, which covers over 99% of protein sequences, making it a valuable contribution to protein research.
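
    To make the sequence-length coverage concrete, here is a minimal Python sketch that splits FASTA records into those within a length budget and those beyond it. The 2,000-residue cut-off is borrowed from the figure quoted above and used only for illustration; the tool's actual limit is not stated in this post. The names `parse_fasta` and `split_by_length` and the sample records are hypothetical and are not part of the MindSpore tool.

```python
from typing import Dict, Iterable, List, Tuple

# Illustrative cut-off based on the "2,000 amino acids" figure in the text;
# this is an assumption, not the tool's documented limit.
MAX_RESIDUES = 2000


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token after '>' as the id.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence data may span several lines; concatenate them.
            records[current_id] += line.upper()
    return records


def split_by_length(
    records: Dict[str, str], max_residues: int = MAX_RESIDUES
) -> Tuple[List[str], List[str]]:
    """Return (ids within the length budget, ids beyond it)."""
    within = [rid for rid, seq in records.items() if len(seq) <= max_residues]
    beyond = [rid for rid, seq in records.items() if len(seq) > max_residues]
    return within, beyond


# Hypothetical input: one short record and one artificially long record.
example = [
    ">short_protein hypothetical example",
    "MKTAYIAKQR",
    "QISFVKSHFS",
    ">long_protein hypothetical example",
    "A" * 2500,
]
records = parse_fasta(example)
within, beyond = split_by_length(records)
print("within budget:", within)
print("beyond budget:", beyond)
```

    A pre-check like this only filters by length; it says nothing about prediction quality, which depends on the model and on the alignments retrieved for each sequence.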

    Source: Huawei / An illustration of protein T1079, predicted by MindSpore. Green shows the experimental result and red the predicted result.

    Predicting how a single protein folds is a respectable start, but we have barely scratched the surface of protein structure research. Proteins usually work in pairs or groups to carry out the functions required to sustain life. To continue exploring the science behind proteins, it is important to refine the algorithms and better understand the interactions between multiple proteins. In the future, we may be able to use this research to create more advanced treatments for disease.

    MindSpore’s new protein structure prediction and inference tool is now open sourced in the MindSpore community.

    Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.

