Viruses are doing mysterious things everywhere – AI can help researchers understand what they’re up to in the oceans and in your gut

by | May 15, 2024 | Science

Viruses are a mysterious and poorly understood force in microbial ecosystems. Researchers know they can infect, kill and manipulate human and bacterial cells in nearly every environment, from the oceans to your gut. But scientists don’t yet have a full picture of how viruses affect their surrounding environments in large part because of their extraordinary diversity and ability to rapidly evolve.

Communities of microbes are difficult to study in a laboratory setting. Many microbes are challenging to cultivate, and their natural environment has many more features influencing their success or failure than scientists can replicate in a lab.

So systems biologists like me often sequence all the DNA present in a sample – for example, a fecal sample from a patient – separate out the viral DNA sequences, then annotate the sections of the viral genome that code for proteins. These notes on the location, structure and other features of genes help researchers understand the functions viruses might carry out in the environment and help identify different kinds of viruses. Researchers annotate viruses by matching viral sequences in a sample to previously annotated sequences available in public databases of viral genetic sequences.

However, scientists are identifying viral sequences in DNA collected from the environment at a rate that far outpaces our ability to annotate those genes. This means researchers are publishing findings about viruses in microbial ecosystems using unacceptably small fractions of available data.

To improve researchers’ ability to study viruses around the globe, my team and I have developed a novel approach to annotate viral sequences using artificial intelligence. Through protein language models akin to large language models like ChatGPT but specific to proteins, we were able to classify previously unseen viral sequences. This opens the door for researchers to not only learn more about viruses, but also to address biological questions that are difficult to answer with current techniques.

Annotating viruses with AI

Large language models use relationships between words in large datasets of text to provide potential answers to questions they are not explicitly “taught” the answer to. When you ask a …

