Claude yapping

Here are the resources and angles you'll want for a deep, credible review:

The two core papers (you already know these):

  1. AlphaFold 1 (2020, Nature): "Improved protein structure prediction using potentials from deep learning" — Senior et al.
  2. AlphaFold 2 (2021, Nature): "Highly accurate protein structure prediction with AlphaFold" — Jumper et al.

For both, don't skip the supplementary materials — that's where the real architectural and training details live. The AlphaFold 2 supplementary is ~60 pages and essentially a second paper. Most shallow reviews ignore it; yours shouldn't.

Deep technical resources:

  • The AlphaFold 2 supplementary methods — contains the full Evoformer architecture, structure module details, recycling mechanism, loss functions (FAPE loss especially), and training procedure. This is the most important "hidden" resource.
  • OpenFold (github.com/aqlaboratory/openfold) — a faithful open-source reimplementation. Reading their code alongside the paper clarifies ambiguities the paper leaves.
  • Nazim Bouatta & Mohammed AlQuraishi's commentary — AlQuraishi (Columbia) also wrote detailed blog posts dissecting AlphaFold 2's architecture when it came out. Search "AlQuraishi AlphaFold 2 blog." He's one of the sharpest critics/analysts in the field.
  • Yannic Kilcher's video breakdown of the AlphaFold 2 paper goes deeper than the Google doc and walks through the architecture diagram by diagram.

Impact / downstream research to cite:

  • AlphaFold Protein Structure Database paper (Varadi et al., 2022, Nucleic Acids Research) — documents the release of ~200M predicted structures, which is the main vector of practical impact.
  • AlphaFold 3 (Abramson et al., 2024, Nature) — "Accurate structure prediction of biomolecular interactions with AlphaFold 3." Extends to ligands, nucleic acids, small molecules. Shows the research trajectory.
  • RoseTTAFold (Baek et al., 2021, Science) — independent work from David Baker's lab that arrived at a similar (three-track) architecture. Good for showing AlphaFold didn't exist in a vacuum and for comparing approaches.
  • ESMFold (Lin et al., 2023, Science) — Meta's language-model-based structure prediction that trades some accuracy for massive speed gains. Shows an alternative paradigm inspired by AlphaFold's success.
  • Drug discovery applications — search for papers citing AlphaFold in the context of target identification. A good specific example: work on neglected tropical diseases where AlphaFold structures enabled research that couldn't afford experimental crystallography.
  • CASP14 and CASP15 results — the competition proceedings document the quantitative impact. AlphaFold 2's GDT scores at CASP14 are the dramatic evidence of the breakthrough.

For your review specifically — things most people miss:

  • The jump from AF1 to AF2 isn't just "better model." AF1 predicts distance distributions between residues and then uses gradient descent to fold a structure. AF2 predicts 3D coordinates directly via the structure module with SE(3)-equivariant updates. That's a fundamental paradigm shift worth emphasizing.
  • AF2's use of multiple sequence alignments (MSAs) is both its strength and its limitation — it struggles with orphan proteins that have few homologs. ESMFold's single-sequence approach addresses exactly this.
  • The recycling mechanism (iterating the whole network ~3 times) is underappreciated. It's essentially giving the network multiple passes to refine, and it matters a lot for accuracy.
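To make the recycling point concrete, here is a toy Python sketch. Everything in it is made up for illustration: the real AF2 feeds back MSA/pair representations and coordinates through the full network, not a one-line update. The only thing it demonstrates is the control flow, i.e. running the same model repeatedly on its own output.

```python
import numpy as np

def toy_network_pass(coords, target):
    # Stand-in for one full forward pass of the model: nudge the
    # current guess halfway toward the correct structure.
    return coords + 0.5 * (target - coords)

def predict_with_recycling(target, n_recycles=3, seed=0):
    # Run the SAME network repeatedly, feeding each output back in
    # as the next input; this is the essence of AF2-style recycling.
    rng = np.random.default_rng(seed)
    coords = rng.normal(size=target.shape)   # random initial guess
    errors = []
    for _ in range(1 + n_recycles):          # first pass + recycles
        coords = toy_network_pass(coords, target)
        errors.append(float(np.linalg.norm(coords - target)))
    return coords, errors

target = np.arange(12, dtype=float).reshape(4, 3)  # fake "true" structure
coords, errors = predict_with_recycling(target)
```

Each extra pass through the same network shrinks the error, which is the intuition for why ~3 recycles buy real accuracy in the actual model.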

Want me to search for any specific papers or dig into a particular technical aspect?

Yapping vol. 2

Rosetta in a nutshell: Rosetta started as a small project in David Baker's lab at the University of Washington, originally written in Fortran and focused on ab initio structure prediction of small proteins. The core idea: sample local conformations observed for short sequence segments in known protein structures (fragment libraries), then search via Monte Carlo for the combination with the lowest overall energy. The scoring function mixes physics-derived terms with knowledge-based statistical potentials — you define an energy function, sample conformations, and try to find the lowest-energy structure. This is the paradigm that dominated CASP for ~20 years before AlphaFold blew it up. Baker entered Rosetta in CASP starting with CASP3 in 1998, and it performed well at predicting how sequences would fold.
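The fragment-assembly loop can be caricatured in a few lines of Python. This is a deliberately toy version (a made-up three-snippet "fragment library" and a made-up quadratic energy, nothing like Rosetta's actual score function), but the control flow is the Rosetta idea: splice a library fragment into a random window of the torsion string, then Metropolis accept/reject on the energy change.

```python
import numpy as np

# Made-up "fragment library" of 3-residue torsion snippets, and a
# made-up energy that is lowest at some fictional native angles.
FRAGMENTS = [np.array(f) for f in
             [(-60.0, -45.0, -60.0),
              (-120.0, 130.0, -120.0),
              (60.0, 45.0, 60.0)]]
NATIVE = np.array([-60.0, -45.0, -60.0, -120.0, 130.0, -120.0])

def energy(torsions):
    return float(np.sum((torsions - NATIVE) ** 2))

def fragment_monte_carlo(n_steps=2000, temp=500.0, seed=0):
    # Metropolis Monte Carlo: splice a random library fragment into a
    # random 3-residue window, accept/reject on the energy change.
    rng = np.random.default_rng(seed)
    torsions = rng.uniform(-180.0, 180.0, size=6)
    e = e_start = energy(torsions)
    best_e = e
    for _ in range(n_steps):
        trial = torsions.copy()
        start = rng.integers(0, 4)           # window start: 0..3
        trial[start:start + 3] = FRAGMENTS[rng.integers(len(FRAGMENTS))]
        e_trial = energy(trial)
        if e_trial < e or rng.random() < np.exp((e - e_trial) / temp):
            torsions, e = trial, e_trial
            best_e = min(best_e, e)
    return torsions, best_e, e_start

torsions, best_e, e_start = fragment_monte_carlo()
```

The occasional uphill acceptance (the `np.exp` term) is what lets the search escape local minima — the same reason real Rosetta runs many independent trajectories and keeps the best-scoring decoys.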

The key difference: Rosetta and similar methods (I-TASSER, MODELLER, etc.) relied on physics-based energy minimization or homology modeling. AlphaFold replaced this with learned potentials (AF1) and then direct coordinate prediction (AF2). That's the paradigm shift your paper should articulate.


Here's a skeleton. I'm structuring it so the two core papers get the bulk of the space (~70-80% combined), as you were told:

1. Introduction (~10-12%)

  • The protein folding problem: why structure matters, Anfinsen's thermodynamic hypothesis (1973), the gap between sequence and structure
  • Brief history of approaches: homology modeling, physics-based methods (Rosetta, molecular dynamics), coevolutionary methods (direct coupling analysis / DCA)
  • CASP as the benchmark — what it is, why it matters, how progress was slow for decades
  • Motivate the transition: what changed with deep learning entering this space

2. AlphaFold 1 — "Improved protein structure prediction using potentials from deep learning" (~30-35%)

  • Problem formulation: predicting inter-residue distance distributions (not just contacts — this was novel)
  • Architecture: deep residual network trained on MSA-derived features → outputs distance and torsion angle distributions
  • Key innovation: using these predicted distributions as a differentiable potential and then optimizing 3D coordinates via gradient descent (this replaced the traditional Rosetta-style fragment assembly + energy minimization)
  • CASP13 results: what the GDT scores were, how it compared to the field
  • Limitations: the two-stage pipeline (predict 2D → fold into 3D) was a bottleneck; the gradient descent folding step was slow and sometimes got stuck in local minima
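A minimal sketch of the "predicted distances as a differentiable potential" idea. Everything here is illustrative, with the neural network replaced by distances read off a known toy 2D shape; AF1's actual potential is built from predicted distance *distributions* plus torsion terms and a reference-state correction. The point is only the mechanic: a smooth function of coordinates whose gradient folds the chain.

```python
import numpy as np

# Pretend "predicted" distances: read them off a known toy 2D shape
# instead of taking them from a neural network (purely illustrative).
true_xy = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [2.0, 1.0]])
target_d = np.linalg.norm(true_xy[:, None] - true_xy[None, :], axis=-1)

def potential(x):
    # Differentiable potential: squared error between the current
    # pairwise distances and the "predicted" ones, plus its gradient.
    diff = x[:, None] - x[None, :]              # shape (n, n, 2)
    d = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(d, 1.0)                    # dodge divide-by-zero
    err = d - target_d
    np.fill_diagonal(err, 0.0)
    loss = float(np.sum(err ** 2))
    grad = 2.0 * ((2.0 * err / d)[:, :, None] * diff).sum(axis=1)
    return loss, grad

rng = np.random.default_rng(1)
x = rng.normal(size=true_xy.shape)              # random starting coords
losses = []
for _ in range(1000):                           # "fold" by gradient descent
    loss, grad = potential(x)
    losses.append(loss)
    x -= 0.005 * grad
final_loss, _ = potential(x)
```

Even this toy version shows the AF1 limitation worth flagging in the review: plain gradient descent on a distance potential can stall in local minima, which is one reason AF1 used multiple restarts.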

3. AlphaFold 2 — "Highly accurate protein structure prediction with AlphaFold" (~40-45%)

  • The paradigm shift from AF1: end-to-end, single differentiable model that directly outputs 3D coordinates
  • Input pipeline: MSAs + template structures → processed into pair and MSA representations
  • Evoformer: the core innovation — interleaved attention between the MSA representation and pair representation (explain the row/column attention + triangular updates — this is what reviewers will look for)
  • Structure module: Invariant Point Attention (IPA), SE(3)-equivariant, operates on rigid body frames per residue → directly outputs atom positions
  • Recycling: the model iterates ~3 times through the whole network, refining predictions
  • Training: FAPE loss (Frame Aligned Point Error — the key loss function), self-distillation, masked MSA prediction as auxiliary loss
  • CASP14 results: median GDT ~92.4, dominated the competition
  • Confidence: pLDDT scores and Predicted Aligned Error (PAE) — these are crucial because they tell users where to trust the prediction
  • What it can't do well: orphan proteins (few homologs), intrinsically disordered regions, conformational dynamics, protein-protein interactions (at the time)
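As a sketch of why FAPE is the natural loss here: expressing every atom in every residue's local frame makes the loss invariant to a global rigid motion of the prediction, so no structural alignment step is needed before comparing to the ground truth. Below is a toy NumPy version, simplified relative to the paper's definition (no length-scale normalization, one atom set, hand-built frames).

```python
import numpy as np

def toy_fape(pred_x, pred_frames, true_x, true_frames, d_clamp=10.0):
    # Express every atom in every residue's local frame, for both the
    # prediction and the ground truth, then average the clamped
    # distances between the two local views.  A frame is (R, t) with
    # local -> global given by x_global = R @ x_local + t.
    errs = []
    for (Rp, tp), (Rt, tt) in zip(pred_frames, true_frames):
        local_p = (pred_x - tp) @ Rp         # global -> local: R^T (x - t)
        local_t = (true_x - tt) @ Rt
        d = np.linalg.norm(local_p - local_t, axis=-1)
        errs.append(np.minimum(d, d_clamp))
    return float(np.mean(errs))

rng = np.random.default_rng(0)
true_x = rng.normal(size=(5, 3))
true_frames = [(np.eye(3), t) for t in true_x]   # identity-rotation frames
pred_x = true_x + 0.1 * rng.normal(size=(5, 3))  # slightly wrong prediction
pred_frames = [(np.eye(3), t) for t in pred_x]
base = toy_fape(pred_x, pred_frames, true_x, true_frames)

# Rigidly rotate + translate the whole prediction (frames included):
# FAPE is unchanged, which is the property that matters.
c, s = np.cos(0.7), np.sin(0.7)
Q = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
shift = np.array([5.0, -3.0, 2.0])
moved_x = pred_x @ Q.T + shift
moved_frames = [(Q @ R, Q @ t + shift) for R, t in pred_frames]
moved = toy_fape(moved_x, moved_frames, true_x, true_frames)
```

The clamp (10 Å in the paper) keeps badly wrong regions from dominating the gradient, so the model still gets useful signal on the parts it has roughly right.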

4. Connections & Impact (~10-12%)

  • AlphaFold Protein Structure Database: 200M+ predicted structures, democratized structural biology
  • Downstream methods: RoseTTAFold (independent validation of the attention-based approach), ESMFold (single-sequence, no MSA needed), AlphaFold-Multimer, AlphaFold 3 (ligands, nucleic acids, diffusion-based)
  • Practical impact: drug discovery, neglected disease research, experimental structure determination (molecular replacement)
  • Open science: OpenFold, ColabFold
  • Limitations that remain: no dynamics, no ensembles, no explicit solvent effects, confidence calibration issues

5. Conclusion (~3-5%)

  • The 2024 Nobel Prize (Baker + Hassabis/Jumper) as validation of both the physics-based and ML-based traditions
  • What's still unsolved: the protein folding problem ≠ the protein structure prediction problem (AF2 doesn't tell us how proteins fold, it tells us what they fold into)

One structural note: don't make section 1 a textbook chapter on protein biology. Your readers (presumably) know what proteins are. Jump quickly to the computational problem and the history of attempts. The intro should create tension — decades of slow progress, then sudden breakthrough — so the AF1/AF2 sections land with weight.

And there's a cool AlphaFold connection: Chang & Perez (Nature Communications, 2026) showed that AlphaFold2's iterative predictions actually reveal folding intermediates and pathways, suggesting the model has implicitly learned folding principles — proteins tend to follow a "local first, global later" folding mechanism in AF2's iterations. That's a neat bridge for your paper — AF2 was designed to predict final structures, but its recycling iterations may accidentally recapitulate the folding process.

This is a great closing point for your review. The distinction I mentioned in the skeleton — "AF2 tells us what proteins fold into, not how they fold" — is now being actively challenged. These 2026 papers are starting to close that gap from both the experimental and computational sides.
