To some biologists, that approach leaves the protein folding problem incomplete. From the earliest days of structural biology, researchers hoped to learn the rules of how an amino acid string folds into a protein. With AlphaFold2, most biologists agree that the structure prediction problem is solved. However, the protein folding problem is not. “Right now, you just have this black box that can somehow tell you the folded states, but not actually how you get there,” Zhong said.
“It’s not solved the way a scientist would solve it,” said Littman, the Brown University computer scientist.
This might sound like “semantic quibbling,” said George Rose, the biophysics professor emeritus at Johns Hopkins. “But of course it isn’t.” AlphaFold2 can recognize patterns in how a given amino acid sequence might fold up based on its analysis of hundreds of thousands of protein structures. But it can’t tell scientists anything about the protein folding process.
“For many people, you don’t need to know. They don’t care,” Rose said. “But science, at least for the past 500 years or so … has been involved with trying to understand the process by which things occur.” To understand the dynamics, mechanisms, functions and nature of protein-based life, Rose argued, you need the full story — one that deep learning algorithms can’t tell us.
To Moult, it doesn’t matter that the machine does something he doesn’t understand. “We’re all used to machines doing things we can’t. You know, I can’t run as fast as my car,” he said. To molecular biologists who are trying to study a protein and just need to know roughly what it looks like, how they get there doesn’t really matter.
But “until we know really how it works, we’re never going to have a 100% reliable predictor,” Porter said. “We have to understand the fundamental physics to be able to make the most informed predictions we can.”
“We keep moving the goalpost,” AlQuraishi said. “I do think that core problem has been solved, so now it’s very much about what comes next.”
Even as biologists continue to debate these topics, others are looking forward to a field that’s undeniably changed — and backward toward its recent past.
Sometimes Perrakis is hit by a wave of nostalgia for the old ways of doing things. In 2022, his team described an enzyme involved in modifying microtubules (giant, rod-shaped molecules that provide structure to cells) whose structure they had determined using X-ray crystallography. “I realized that I’m never going to do that [again],” he said. “Having the first structure appearing after months of work was a very particular satisfaction.”
AlphaFold2 hasn’t made those experiments obsolete. On the contrary, it’s illuminated just how necessary they are. It has stitched together two historically disparate disciplines, launching a new and stimulating conversation.
The New World
Seventy years ago, proteins were thought to be a gelatinous substance, Porter said. “Now look at what we can see”: structure after structure of a vast world of proteins, whether they exist in nature or were designed.
The field of protein biology is “more exciting right now than it was before AlphaFold,” Perrakis said. The excitement comes from the promise of reviving structure-based drug discovery, the acceleration in creating hypotheses and the hope of understanding complex interactions happening within cells.
“It [feels] like the genomics revolution,” AlQuraishi said. There is so much data, and biologists, whether in their wet labs or in front of their computers, are just starting to figure out what to do with it all.
But like other artificial intelligence breakthroughs sparking across the world, this one might have a ceiling.
AlphaFold2’s success was founded on the availability of training data — hundreds of thousands of protein structures meticulously determined by the hands of patient experimentalists. While AlphaFold3 and related algorithms have shown some success in determining the structures of molecular compounds, their accuracy lags behind that of their single-protein predecessors. That’s in part because there is significantly less training data available.
The protein folding problem was “almost a perfect example for an AI solution,” Thornton said, because the algorithm could train on hundreds of thousands of protein structures collected in a uniform way. However, the Protein Data Bank may be an unusual example of organized data sharing in biology. Without high-quality data to train algorithms, they won’t make accurate predictions.
“We got lucky,” Jumper said. “We met the problem at the time it was ready to be solved.”
No one knows if deep learning’s success at addressing the protein folding problem will carry over to other fields of science, or even other areas of biology. But some, like AlQuraishi, are optimistic. “Protein folding is really just the tip of the iceberg,” he said. Chemists, for example, need to perform computationally expensive calculations. With deep learning, these calculations are already being performed up to a million times faster than before, AlQuraishi said.
Artificial intelligence can clearly advance specific kinds of scientific questions. But it may get scientists only so far in advancing knowledge. “Historically, science has been about understanding nature,” AlQuraishi said — the processes that underlie life and the universe. If science moves forward with deep learning tools that reveal solutions and no process, is it really science?
“If you can cure cancer, do you care about how it really works?” AlQuraishi said. “It is a question that we’re going to wrestle with for years to come.”
If many researchers decide to give up on understanding nature’s processes, then artificial intelligence will not just have changed science — it will have changed the scientists too.
Meanwhile, the CASP organizers are wrestling with a different question: how to continue their competition and conference. AlphaFold2 is a product of CASP, and it solved the main problem the conference was organized to address. “It was a big shock for us in terms of: Just what is CASP anymore?” Moult said.
In 2022, the CASP meeting was held in Antalya, Turkey. Google DeepMind didn’t enter, but the team’s presence was felt. “It was more or less just people using AlphaFold,” Jones said. In that sense, he said, Google won anyway.
Some researchers are now less keen on attending. “Once I saw that result, I switched my research,” Xu said. Others continue to hone their algorithms. Jones still dabbles in structure prediction, but it’s more of a hobby for him now. Still others, like AlQuraishi and Baker, keep developing new algorithms for structure prediction and design, undaunted by the prospect of competing against a multibillion-dollar company.
Moult and the conference organizers are trying to evolve. The next round of CASP opened for entries in May. He is hoping that deep learning will conquer more areas of structural biology, like RNA or biomolecular complexes. “This method worked on this one problem,” Moult said. “There are lots of other related problems in structural biology.”
The next meeting will be held in December 2024 by the aqua waters of the Caribbean Sea. The winds are cordial, as the conversation will probably be. The stamping has long since died down — at least out loud. What this year’s competition will look like is anyone’s guess. But if the past few CASPs are any indication, Moult knows to expect only one thing: “surprises.”
Source: How AI Revolutionized Protein Science, but Didn’t End It | Quanta Magazine