Researchers are using A.I. to decipher the human genome.
AlphaGenome represents a significant advance in our understanding of the human genome. However, the intricacies of our DNA remain largely a mystery.
In 2024, two researchers from Google DeepMind were awarded the Nobel Prize in Chemistry for AlphaFold2, an artificial intelligence program. The program tackled a problem scientists had struggled with for decades: how strings of molecular building blocks fold into the complex, three-dimensional structures of proteins.

At Google DeepMind, Demis Hassabis, John Jumper, and their colleagues trained a program to predict those shapes. When AlphaFold2 was released in 2020, it was so good at the task that scientists all over the world adopted it. According to Alex Palazzo, a geneticist at the University of Toronto, “everyone is using AlphaFold.” Scientists used the program to study how proteins normally work and how their failure to work can lead to disease. It helped them build entirely new proteins, some of which will soon be tested in clinical trials.
Now another team of researchers at Google DeepMind is trying to do for DNA what the company did for proteins, with a program called AlphaGenome. On Wednesday, the researchers unveiled AlphaGenome in the journal Nature. They trained their artificial intelligence on a large body of molecular data, which allows it to make predictions about thousands of genes.
For instance, AlphaGenome can predict whether a mutation will shut a gene down or switch it on at the wrong time, an important question for understanding cancer and other diseases.
Peter Koo, a computational biologist at Cold Spring Harbor Laboratory in New York who was not involved in the project, said that AlphaGenome represented an important step forward in applying artificial intelligence to the genome. “It’s an engineering marvel,” he said.
But Dr. Koo and other outside experts warned that it was only the beginning of a long journey. “This is not AlphaFold, and it is not going to win the Nobel Prize,” said Mark Gerstein, a computational biologist at Yale.
Still, AlphaGenome will come in handy. Dr. Gerstein said that he would probably include it in his arsenal for studying DNA, and others anticipate doing the same. But not all scientists trust A.I. programs like AlphaGenome to help them understand the genome.
“I see no value in them at all right now,” said Steven Salzberg, a computational biologist at Johns Hopkins University. “I think there are a lot of smart people wasting their time.”

Before the era of computers, biologists conducted painstaking experiments to uncover the rules that govern our genes. They discovered that genes are spelled out in a four-letter genetic alphabet of units called bases. A cell reads the sequence of bases in a gene, which can be thousands of bases long, to make a protein. But the more scientists studied the human genome, the more complicated and messy it turned out to be.
As cells read a gene, for example, they often skip over sections of its sequence. Through this process, known as splicing, cells can create hundreds of different proteins from a single gene.
Many diseases arise when cells splice their genes incorrectly. But there is no simple signature for the spots in genes where they should be spliced, so scientists have spent decades building up a catalog of them.
How cells choose which genes they use to make proteins is another important question about the genome. Scientists have discovered special molecules that grab hold of DNA and stretch it into intricate loops. In some cases, the loops expose a gene to the cell’s protein-making machinery. In other cases, the gene ends up tucked away in a coil.
To exert control over genes, those molecules must land precisely on specific regions of DNA. And these genetic locks can be hard to find, since they often lie thousands or millions of bases away from the genes they control.
Researchers at Google DeepMind launched the project that became AlphaGenome in 2019. By that time, biologists had accumulated a lot of data, including the human genome’s three billion base pairs and the findings of thousands of experiments measuring the activity of genes in a wide range of cell types. The researchers hoped that by training A.I. on these existing results, they could develop a program that could make accurate predictions about stretches of DNA it had never seen before.
“It was the right target for us,” said Ziga Avsec, a research scientist at Google DeepMind. In 2021, Dr. Avsec and his colleagues unveiled a preliminary A.I. model called Enformer, which they have since expanded into AlphaGenome. They trained the new program on an even greater expanse of biological data. “It’s really an industrial scale,” Dr. Gerstein said.

Many A.I. programs designed to study the genome focus on a single process, such as splicing. But AlphaGenome was trained to make predictions about 11 different processes. In the report on Wednesday, Dr. Avsec and his colleagues noted that AlphaGenome had performed as well as or better than other programs across the board.
“It’s state of the art,” said Katherine Pollard, a data scientist at Gladstone Institutes, a research organization in San Francisco, who was not involved in the study.
Dr. Pollard and other researchers said that AlphaGenome was particularly adept at predicting the effects of mutations, such as shutting down a nearby gene. In one performance test, the researchers added mutations to the stretch of DNA that includes a gene called TAL1.
In healthy people, TAL1 helps immune cells mature until they can fight pathogens. Once the cells have developed, the gene shuts down. But scientists have discovered that mutations 8,000 bases away from TAL1 can lead the gene to switch on permanently. That change can cause immune cells to multiply out of control, leading to leukemia. Dr. Avsec and his colleagues found that AlphaGenome had accurately predicted the impact of these mutations on TAL1. “It has been really exciting to see, when these models work,” he said. “It feels like magic sometimes.”
The AlphaGenome researchers shared their TAL1 predictions with Dr. Marc Mansour, a hematologist at University College London who spent years uncovering the leukemia-driving mutations with lab experiments.
“It was quite mind-blowing,” Dr. Mansour said. “It really demonstrated how potent this is.”
But Dr. Mansour pointed out that AlphaGenome’s ability to predict changes weakens the farther it moves from a particular gene. He is now using AlphaGenome in his cancer research but does not blindly accept its results.
“These prediction tools are still prediction tools,” he said. “We still need to visit the laboratory.”
Dr. Salzberg of Johns Hopkins is less sanguine about AlphaGenome, in part because he thinks its creators put too much trust in the data they trained it on. Scientists who study splice sites don’t agree on which sites are real and which are genetic mirages. As a result, they have created databases that contain different catalogs of splice sites.
“The community has been working for 25 years to try to figure out what are all the splice sites in the human genome, and we’re still not really there,” Dr. Salzberg said. “We don’t have an agreed-upon gold-standard set.”
Dr. Pollard also cautioned that AlphaGenome was a long way from being a tool that doctors could use to scan the genomes of patients for threats to their health. It predicts only the effects of a single mutation on one standard human genome.
In reality, any two people have millions of genetic differences in their DNA. AlphaGenome’s industrial-strength capabilities are far from sufficient to evaluate the effects of all of these variations on a patient’s body. “It is a much, much harder problem — and yet that’s the problem we need to solve if we want to use a model like this for health care,” Dr. Pollard said.