What Blocks Human Evolution
The biology world just exploded. A problem that had stumped molecular (structural) biology for over 50 years was solved by the Google DeepMind team in just a few years.
In the 20-plus years since CASP was founded, protein structure prediction accuracy had never exceeded 50%. In 2018, the DeepMind team entered their first-generation AlphaFold in CASP (Critical Assessment of protein Structure Prediction) and won the competition outright with a prediction accuracy of about 60%, shattering the historical record. Just two years later (the competition is held biennially), DeepMind returned with AlphaFold 2, which smashed the record again, this time improving on the original AlphaFold by nearly 50%. Its median accuracy across all targets reached 92.4%. Strictly speaking, these figures are scores on CASP’s GDT (Global Distance Test) scale, which runs from 0 to 100 and measures how closely a prediction matches the experimental structure. Even for the hardest category of targets, the median score hit 87.0, an absolute blowout of every other competing team. A score of around 90 is informally considered competitive with experimentally determined results. So what does this mean?
Protein Structure
Before understanding what CASP is, you first need to understand protein structure. Proteins are composed of amino acids, just as DNA and RNA are composed of nucleotides. There are 22 amino acids known to be built into proteins: the 20 of the standard genetic code plus selenocysteine and pyrrolysine. DNA is made of just 4 types of nucleotides and is already extremely complex, so imagine how much more complex proteins must be with 22 types of building blocks.
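To put rough numbers on that comparison, here is a minimal sketch that counts how many distinct linear sequences each alphabet allows. The chain length of 100 and the helper name `sequence_space` are assumptions made purely for illustration. The count says nothing about 3D structure, only about how quickly the possibilities grow with the size of the alphabet.

```python
# Illustrative arithmetic only: this counts linear sequences, not 3D structures.

DNA_ALPHABET = 4        # nucleotide types, from the text
PROTEIN_ALPHABET = 22   # amino acid types, from the text
CHAIN_LENGTH = 100      # assumption: an arbitrary chain length for comparison


def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of `length` symbols drawn from the alphabet."""
    return alphabet_size ** length


dna_count = sequence_space(DNA_ALPHABET, CHAIN_LENGTH)
protein_count = sequence_space(PROTEIN_ALPHABET, CHAIN_LENGTH)

# Python integers have arbitrary precision, so the digit counts are exact.
print("DNA sequences:     a number with", len(str(dna_count)), "digits")
print("Protein sequences: a number with", len(str(protein_count)), "digits")
```

Under these assumptions the DNA count has 61 digits and the protein count has 135, so at the same chain length the protein alphabet allows vastly more sequences.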
The Protein Folding Problem
As macromolecular polymers, proteins matter not only in biochemistry and molecular biology (enzymes) but also in the study of infectious diseases (viruses, antibiotics) and other illnesses. Although their basic building blocks are simple, proteins differ from DNA in a crucial way: DNA’s one-dimensional base sequence directly determines its function, but a protein’s one-dimensional amino acid sequence does not. What truly determines a protein’s function is its three-dimensional structure. Going from a one-dimensional polypeptide chain to a three-dimensional protein requires local coiling (alpha helices) and pleating (beta sheets) to form secondary structure, then further folding to form tertiary structure, the complete protein molecule. Beyond that, several folded chains can assemble into quaternary structure, the protein complex. So to understand a protein’s function, you need to analyze its tertiary and quaternary structure. As a rough estimate, a typical protein has around 10^300 possible conformations. Enumerating them by brute force would take far longer than the known age of the universe.
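The brute-force claim is easy to check with back-of-the-envelope arithmetic (this kind of estimate is often called Levinthal’s paradox). The sketch below takes the 10^300 figure from the text. The sampling rate of 10^13 conformations per second and the rounded age of the universe are assumptions chosen for illustration, and the function name is hypothetical. Everything is done in log10 to keep the numbers readable.

```python
import math

CONFORMATIONS_LOG10 = 300        # from the text: about 10^300 conformations
SAMPLES_PER_SECOND_LOG10 = 13    # assumption: 10^13 conformations tried per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # assumption: roughly 13.8 billion years


def brute_force_years_log10(conformations_log10: float,
                            samples_per_second_log10: float) -> float:
    """log10 of the years needed to try every conformation exactly once."""
    seconds_log10 = conformations_log10 - samples_per_second_log10
    return seconds_log10 - math.log10(SECONDS_PER_YEAR)


years_log10 = brute_force_years_log10(CONFORMATIONS_LOG10,
                                      SAMPLES_PER_SECOND_LOG10)
universe_ages_log10 = years_log10 - math.log10(AGE_OF_UNIVERSE_YEARS)

print(f"Exhaustive search: about 10^{years_log10:.1f} years")
print(f"That is about 10^{universe_ages_log10:.1f} times the age of the universe")
```

Even if the assumed sampling rate were off by many orders of magnitude, the conclusion would not change: exhaustive search is hopeless, which is why prediction methods have to be smarter than enumeration.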
CASP
To advance protein structure research, in 1994 professors John Moult and Krzysztof Fidelis founded CASP, a worldwide protein structure prediction experiment. CASP’s role is twofold: giving competing teams a chance to test their methods against one another, and independently evaluating each team’s protein structure modeling techniques. CASP is the undisputed authority in both industry and academia. Winning CASP is the highest honor in the field, and many teams halt all other research for months just to compete.
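To make the evaluation concrete: CASP’s headline score, GDT_TS, averages the percentage of residues whose predicted position lies within 1, 2, 4 and 8 angstroms of the experimental position. The sketch below is a simplified illustration, not CASP’s actual scoring code. It assumes the two structures are already superimposed and are given as matching lists of (x, y, z) coordinates, one per residue, whereas the real measure also searches for the best superposition. The function name and the toy coordinates are hypothetical.

```python
import math

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # distance cutoffs in angstroms


def gdt_ts(predicted, experimental, cutoffs=GDT_TS_CUTOFFS):
    """Simplified GDT_TS on a 0-100 scale.

    Assumption: both structures are already superimposed and list one
    (x, y, z) coordinate per residue in the same order.
    """
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("structures must be non-empty and of equal length")
    distances = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances)
                 for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy example: four residues, displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (5.5, 0.0, 0.0), (11.0, 0.0, 0.0), (22.0, 0.0, 0.0)]

print(gdt_ts(predicted, experimental))     # 56.25
print(gdt_ts(experimental, experimental))  # 100.0
```

In the toy example, 1, 2, 3 and 3 of the four residues fall within the successive cutoffs, which averages out to 56.25. A perfect prediction scores 100.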
DeepMind
For over 50 years, progress in protein structure research was measured in years. Despite many published papers, there was no qualitative leap in prediction efficiency or accuracy, and accuracy remained below 50%. At the 13th CASP competition, DeepMind’s AI, the first-generation AlphaFold, burst onto the scene and took the crown. The AlphaFold algorithm compressed what used to take years into days. This means that the lifetime achievements of many professors could be surpassed by AlphaFold 2 in mere days. One can only imagine how those professors and their students, still working on protein structures in their labs, must feel.
What makes AlphaFold so powerful? This brings us to Google’s purpose-built AI processor, the TPU (Tensor Processing Unit). Google began using TPUs internally in 2015 and made them available to third parties in 2018. The first-generation AlphaFold used just 5 TPUs to boost prediction accuracy from the ~40% where it had been stuck since CASP’s founding to 60%. AlphaFold 2 then used roughly 128 third-generation TPU (TPU v3) cores, equivalent to about 100-200 GPUs, and after running for just a few weeks achieved over 90% accuracy. At this pace, my guess is that over 90% of structure-determination experiments could eventually be replaced by AI.
The Speed of Evolution
I often wonder: why, across hundreds of thousands of years of human history, did technology barely change – yet in just a few hundred years, we’ve seen such enormous breakthroughs? What exactly held back human progress for all those millennia? Perhaps that’s a topic for another day…
- Blog Link: https://johnsonlee.io/2020/12/01/what-blocks-human-evolution.en/
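- Copyright Declaration: Copyright belongs to the author. For commercial reproduction, please contact the author for authorization; for non-commercial reproduction, please credit the source.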
