Latest Technology News: New AI research offers a simple but effective structure –

Proteins, the workhorses of the cell, are used in a wide range of applications, particularly in materials and therapeutics. They are made up of chains of amino acids that fold into a specific shape. The development of low-cost sequencing technology has recently led to the discovery of a large number of new protein sequences. Because functional annotation of a new protein sequence is still expensive and time-consuming, accurate and efficient in silico methods for protein function annotation are needed to bridge the gap between sequence and function.

Because many protein functions are determined by how the protein folds, many data-driven approaches rely on learning representations of protein structures. These representations can then be applied to tasks such as protein design, structure classification, model quality assessment, and function prediction.

The number of published protein structures is orders of magnitude lower than the size of datasets in other machine learning application areas, because determining protein structures experimentally is difficult. For example, the Protein Data Bank (PDB) contains about 182,000 experimentally determined structures, compared to 47 million protein sequences in Pfam and 10 million annotated images in ImageNet. To close this gap, several studies have exploited the abundance of unlabeled protein sequence data to learn representations of existing proteins, and many researchers have used self-supervised learning to pretrain protein encoders on millions of sequences.

However, these sequence-based methods neither explicitly capture nor use the protein structural information that is known to determine protein function. Many structure-based protein encoders have been proposed to make better use of this information. Unfortunately, these models have not yet explicitly addressed interactions between edges, which are crucial for modeling protein structure. Additionally, because experimentally determined protein structures are scarce, relatively little work had been done until recently on pretraining techniques that take advantage of unlabeled 3D structures. Recent advances in deep-learning-based protein structure prediction have changed this by making it possible to predict the structures of many protein sequences efficiently and with high confidence.

Inspired by this breakthrough, the researchers develop a protein encoder that is pretrained on as many protein structures as possible and can be applied to a range of property prediction tasks. They propose a simple yet effective structure-based encoder called the GeomEtry-Aware Relational Graph Neural Network (GearNet), which encodes spatial information by adding different types of sequential and structural edges to a protein residue graph and then performs relational message passing on that graph. To improve the encoder further, they propose a sparse edge message passing technique, which they describe as the first attempt to implement edge-level message passing on GNNs for protein structure encoding. The idea was inspired by the triangle attention design in Evoformer.
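To make the idea of relational message passing on a residue graph more concrete, here is a minimal sketch in plain Python. It is not the GearNet implementation: the function names, the edge-type labels, the 10-angstrom radius, and the use of scalar node features with one scalar weight per edge type are all assumptions chosen for brevity. The sketch builds sequential and spatial edges from residue coordinates, then runs one message passing step in which each edge type contributes through its own weight.

```python
import math
from collections import defaultdict


def build_residue_graph(coords, seq_offsets=(-1, 1), radius=10.0):
    """Return (source, target, relation) edges over residues.

    coords: one (x, y, z) position per residue, e.g. alpha-carbon positions.
    Relation names ("seq-1", "seq+1", "radius") are illustrative labels.
    """
    n = len(coords)
    edges = []
    # Sequential edges: residues that are neighbours along the chain.
    for i in range(n):
        for d in seq_offsets:
            j = i + d
            if 0 <= j < n:
                edges.append((j, i, f"seq{d:+d}"))
    # Spatial edges: residues closer than `radius` in 3D space.
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(coords[i], coords[j]) < radius:
                edges.append((j, i, "radius"))
    return edges


def relational_message_passing(features, edges, weights):
    """One step: h_i' = relu(sum over edges (j, i, r) of w_r * h_j).

    Scalar features and scalar per-relation weights are a simplification;
    a real encoder would use vectors and weight matrices.
    """
    aggregated = defaultdict(float)
    for src, dst, rel in edges:
        aggregated[dst] += weights[rel] * features[src]
    return [max(0.0, aggregated[i]) for i in range(len(features))]


# Toy chain of four residues; the last one is far away in space.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
edges = build_residue_graph(coords, radius=5.0)
weights = {"seq-1": 1.0, "seq+1": 1.0, "radius": 0.5}
updated = relational_message_passing([1.0, 2.0, 3.0, 4.0], edges, weights)
```

Keeping the relation as part of each edge is the design point here: the same pair of residues can be connected by both a sequential and a spatial edge, and each contributes through a different weight.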

They also introduce a geometric pretraining approach for the protein structure encoder, based on the well-known contrastive learning framework. They propose novel augmentation functions that find biologically related protein substructures that co-occur in proteins; training then maximizes the similarity between the learned representations of substructures from the same protein while minimizing the similarity between those from different proteins. Alongside this, they propose a set of simple baselines based on self-prediction.
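The contrastive objective described above can be sketched as follows. This is an illustrative example, not the authors' code: the contiguous cropping function is only a stand-in for the augmentation functions (which the article does not detail), the InfoNCE-style loss with cosine similarity is a common choice for contrastive learning rather than a confirmed detail of this work, and the function names and the temperature value are assumptions.

```python
import math
import random


def crop_subsequence(residues, length, rng):
    """Hypothetical augmentation: take a random contiguous stretch of residues.

    Assumes length <= len(residues).
    """
    start = rng.randrange(0, len(residues) - length + 1)
    return residues[start:start + length]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(views_a, views_b, temperature=0.1):
    """InfoNCE-style loss over a batch of paired embeddings.

    views_a[i] and views_b[i] embed two substructures of protein i (a positive
    pair); views_b[j] for j != i come from other proteins (negatives).
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n


rng = random.Random(0)
view = crop_subsequence(list("MKTAYIAKQRQISFVK"), 6, rng)

# Matching pairs give a low loss; mismatched pairs give a high one.
embeddings = [[1.0, 0.0], [0.0, 1.0]]
low = info_nce(embeddings, embeddings)
high = info_nce(embeddings, embeddings[::-1])
```

In a full pipeline, each cropped view would be passed through the structure encoder to produce the embeddings that the loss compares.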

These self-prediction pretraining tasks include masked prediction of various geometric or physicochemical properties, such as residue types, Euclidean distances, and dihedral angles. By comparing their pretraining methods on several downstream property prediction tasks, the researchers establish a solid foundation for pretraining protein structure representations. Extensive experiments on a variety of benchmarks, including Enzyme Commission number prediction, Gene Ontology term prediction, fold classification, and reaction classification, show that GearNet enhanced with edge message passing consistently outperforms existing protein encoders on the majority of tasks in the supervised setting.
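As a rough illustration of what the targets for masked prediction of residue types, distances, and dihedral angles could look like, the sketch below hides a fraction of residue types and computes the geometric quantities from coordinates. The function names, the mask token, and the masking fraction are assumptions made for this example; the article does not specify how the authors construct these targets.

```python
import math
import random


def mask_residue_types(sequence, mask_fraction, rng, mask_token="<mask>"):
    """Hide a random subset of residue types; return masked input and targets."""
    n_masked = max(1, round(len(sequence) * mask_fraction))
    positions = rng.sample(range(len(sequence)), n_masked)
    masked = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]  # what the model would have to recover
        masked[pos] = mask_token
    return masked, targets


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_angle(p0, p1, p2, p3):
    """Signed dihedral angle in radians defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.atan2(_dot(_cross(b1, v), w), _dot(v, w))


masked, targets = mask_residue_types(list("MKTAYIAKQR"), 0.3, random.Random(0))
distance_target = math.dist((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))  # Euclidean distance
angle_target = dihedral_angle((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
```

During pretraining, a model would receive the masked input and be trained to predict the stored targets.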

Moreover, with the proposed pretraining strategy, their model, trained on fewer than a million samples, achieves comparable or even better results than state-of-the-art sequence-based encoders pretrained on million- or billion-scale datasets. The codebase is publicly available on GitHub and is written in PyTorch and TorchDrug.




Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate studies in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He enjoys connecting with people and collaborating on interesting projects.



