Key Takeaways
- Zero FDA-approved AI-discovered drugs exist as of December 2025 despite a decade of claims; 2026 will either break this streak or reinforce skepticism
- Zasocitinib (Takeda/Schrödinger) achieved >50% PASI 90 in a Phase 3 psoriasis trial vs ~30% for traditional PDE4 inhibitors, with filing expected in 2026
- Rentosertib (Insilico Medicine) is the first drug with both an AI-generated target and an AI-generated molecule to reach Phase 2a, where it showed dose-dependent efficacy improvement in IPF
- GNoME predicted 2.2 million materials and identified 380,000 stable candidates, but only 736 have independent physical validation (about 0.2%), exposing a gap between AI prediction and physical reality
- The Chemistry World critique of A-Lab documents how AI characterization can systematically misidentify known compounds as novel, a warning signal for autonomous closed-loop systems
The Three-Platform Simultaneous Test
The simultaneous clinical testing of three independent AI platforms is analytically significant because it separates platform-specific risk from field-level signal. If zasocitinib, rentosertib, and Recursion's compound all succeed, the convergent result would support the AI drug discovery hypothesis across different methodologies. If all three fail, it would suggest a shared flaw in how AI handles the biological complexity that clinical trials measure.
Zasocitinib (TAK-279) is the most advanced and most conservative case, built on Schrödinger's physics-based structure prediction platform. The Phase 3 data are strong: more than 50% of participants achieved PASI 90 (near-complete skin clearance) and approximately 30% achieved PASI 100 (complete clearance) at week 16 in psoriasis. For context, typical PDE4 inhibitor PASI 90 rates are approximately 30%, making zasocitinib's results meaningfully superior. The binding affinity for the TYK2 JH2 domain (Kd = 0.0038 nM), together with the compound's high selectivity for that domain, suggests genuine structure-based optimization. Takeda plans a regulatory filing in 2026, making this the most likely path to the first FDA-approved AI-involved drug.
Rentosertib (ISM001-055, Insilico Medicine) is the scientifically distinctive case. Insilico used generative AI to identify both the biological target (TNIK, a kinase previously unconnected to IPF) and the therapeutic molecule: the full pipeline, not just one stage. The Phase 2a trial in 71 patients demonstrated safety, tolerability, and dose-dependent improvement in forced vital capacity. Publication in Nature Medicine provides peer review that zasocitinib lacks for its AI component. However, a 71-patient trial conducted in China is insufficient for strong conclusions, and Phase 3 is 1-2 years away, which extends the validation timeline.
Materials Science Parallel: AI Prediction-to-Reality Validation Gap
A-Lab and GNoME provide the closest analog to drug discovery's validation challenge. GNoME predicted 2.2 million crystals, identified 380,000 stable candidates, and expanded the known stable materials universe approximately 10x. A-Lab's physical synthesis produced 41 compounds in 17 days (about 2.4 per day), versus months per compound for human researchers. The AI-to-lab pipeline (GNoME predictions → A-Lab robotic synthesis) created a closed-loop discovery system that is now receiving DOE SciDAC funding ($10M, FORUM-AI) for expansion to compositionally complex materials.
However, the critique of A-Lab reported in Chemistry World is analytically important. It argues that the AI's automated Rietveld refinement misidentified variations of known compounds as genuinely new materials, so the closed loop also functioned as an error propagation loop. Of GNoME's 380,000 stable predictions, only 736 have been independently physically verified (a physical validation rate of about 0.2%). The gap between benchmark performance and production capability (models scoring 90%+ on coding benchmarks yet hallucinating function signatures) has a direct counterpart in the gap between AI scientific predictions and physical validation.
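The two ratios quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text; the final extrapolation is a deliberately naive assumption (one lab, constant pace, every attempt counted) meant only to convey scale, not a forecast.

```python
# Figures quoted in the text above; nothing here is new data.
GNOME_STABLE_PREDICTIONS = 380_000
INDEPENDENTLY_VERIFIED = 736
ALAB_COMPOUNDS = 41
ALAB_DAYS = 17


def validation_rate(verified: int, predicted: int) -> float:
    """Fraction of predicted materials with independent physical validation."""
    return verified / predicted


def synthesis_throughput(compounds: int, days: int) -> float:
    """Average number of compounds synthesized per day."""
    return compounds / days


rate = validation_rate(INDEPENDENTLY_VERIFIED, GNOME_STABLE_PREDICTIONS)
throughput = synthesis_throughput(ALAB_COMPOUNDS, ALAB_DAYS)

# Naive assumption: a single lab working at A-Lab's quoted pace.
remaining = GNOME_STABLE_PREDICTIONS - INDEPENDENTLY_VERIFIED
years_at_alab_pace = remaining / throughput / 365

print(f"validation rate: {rate:.2%}")            # rounds to about 0.2%
print(f"throughput: {throughput:.1f} per day")   # about 2.4 per day
print(f"years to attempt the rest at that pace: {years_at_alab_pace:.0f}")
```

Even at robotic throughput, the backlog is measured in centuries for a single lab, which is why the validation rate, and not the prediction count, is the binding constraint.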
GNN-LLM Hybrid: Architecture for Complex Biological Systems
The GNN-LLM integration trend is directly relevant to both drug discovery and materials science. Hybrid models that pair a graph neural network of the kind GNoME uses with LLM semantic context are reported to achieve up to 25% improvement over GNN-only models in materials property prediction. Medical knowledge graphs represent diseases, proteins, drugs, and symptoms as interconnected nodes: GNNs model multi-hop pathways (drug A affects protein B, which modulates pathway C), while LLMs translate these into clinically meaningful language. This architecture is becoming the technical backbone for the next generation of AI drug discovery platforms, and its maturation represents a structural improvement in AI's ability to model biological complexity rather than just chemical structure.
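To make the "multi-hop pathway" idea concrete, here is a toy sketch in plain Python. It is not a GNN and not any platform's actual implementation: the graph, node names, and relation labels are hypothetical, breadth-first traversal stands in for the graph component, and a string template stands in for the LLM verbalization step.

```python
from collections import deque

# Hypothetical toy knowledge graph: node -> list of (relation, neighbor).
KNOWLEDGE_GRAPH = {
    "drug_A": [("inhibits", "protein_B")],
    "protein_B": [("modulates", "pathway_C")],
    "pathway_C": [("implicated_in", "disease_D")],
    "disease_D": [],
}


def multi_hop_paths(graph, start, max_hops):
    """Enumerate every cycle-free path from `start` with 1..max_hops edges."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if path:
            paths.append(path)
        if len(path) == max_hops:
            continue
        visited = {start} | {dst for _, _, dst in path}
        for relation, neighbor in graph.get(node, []):
            if neighbor not in visited:
                queue.append((neighbor, path + [(node, relation, neighbor)]))
    return paths


def verbalize(path):
    """Template stand-in for the LLM step: turn a path into a sentence."""
    head, relation, tail = path[0]
    parts = [f"{head} {relation} {tail}"]
    parts += [f"which {rel} {dst}" for _, rel, dst in path[1:]]
    return ", ".join(parts)


for p in multi_hop_paths(KNOWLEDGE_GRAPH, "drug_A", max_hops=2):
    print(verbalize(p))
```

In a real hybrid system the traversal would be replaced by learned message passing over node embeddings, and the template by a language model conditioned on the retrieved subgraph.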
The Validation Logic
The key analytical question is whether AI drug discovery is making systematically better predictions than traditional medicinal chemistry, or whether it is making faster predictions that pass through the same clinical failure filters. Zasocitinib's 50%+ PASI 90 rate versus 30% for PDE4 inhibitors suggests a genuine selectivity improvement: the AI-guided design produced a better molecule, not just the same molecule faster. Rentosertib's novel target identification (TNIK for IPF) is a different kind of validation: AI found a mechanism humans had not previously connected to the disease.
The roughly 90% clinical trial failure rate is the relevant baseline. If AI-discovered drugs fail at rates meaningfully lower than 90%, the value proposition is validated. Current evidence is suggestive but pre-statistical: three drugs in Phase 2-3 are too few for a rate comparison, and selection bias (the most promising AI compounds advance first) inflates the apparent success rate relative to the population of all AI-identified candidates.
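How little three outcomes can say depends on which baseline they are measured against. The sketch below computes the binomial tail probability P(X ≥ k) for three drugs under two baselines: the 10% success rate implied by the text's 90% failure figure, and a 50% rate that is purely an illustrative assumption for candidates that have already survived to late-stage trials. It treats the three outcomes as independent, which is itself an assumption.

```python
from math import comb


def binom_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k successes."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_DRUGS = 3
BASELINES = {
    "all candidates (90% failure, from the text)": 0.10,
    "late-stage survivors (illustrative assumption)": 0.50,
}

for label, p in BASELINES.items():
    print(label)
    for successes in range(1, N_DRUGS + 1):
        prob = binom_tail(N_DRUGS, successes, p)
        print(f"  P(at least {successes} of {N_DRUGS} succeed) = {prob:.3f}")
```

Against the all-candidates baseline, two or three successes would look striking; against a baseline that already conditions on reaching late-stage trials, even three of three is unremarkable. That sensitivity to the choice of baseline is the selection-bias problem in numerical form.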
The Contrarian Case
The scientific AI validation thesis could be undermined if: (1) zasocitinib's Phase 3 success is partly attributable to traditional medicinal chemistry optimization rather than the AI discovery component, making it an AI-assisted success rather than an AI-generated one; (2) rentosertib's TNIK target for IPF proves specific to Chinese patient genetics in the Phase 2a population and fails to replicate in a global Phase 3; (3) the A-Lab critique is validated by further independent analysis showing that the AI characterization pipeline systematically misidentifies known compounds as novel ones, undermining the entire autonomous discovery paradigm; or (4) more fundamentally, human disease biology proves too context-dependent and patient-heterogeneous for AI trained on molecular data to predict clinical outcomes more reliably than traditional phenotypic screening.
What This Means for Practitioners
For ML engineers in biotech and pharma: The GNN-LLM integration stack (graph neural networks for molecular structure plus an LLM for semantic biological reasoning) is the production standard emerging from academic research. Implement rigorous human validation for any AI characterization output in closed-loop systems; the A-Lab critique documents what happens when AI both generates and evaluates its own results.
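One way to read the human-validation recommendation is as a routing rule: the system that proposes a result is never the only thing that signs off on a novelty claim. The sketch below is a minimal illustration under that assumption. The class names, the `fit_quality` score, and the 0.9 threshold are all hypothetical and do not describe A-Lab's or any vendor's actual pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Characterization:
    """Hypothetical output of an automated characterization step."""
    sample_id: str
    proposed_phase: str
    claimed_novel: bool
    fit_quality: float  # hypothetical 0-1 score from automated refinement


@dataclass
class ReviewGate:
    """Routes results so that novelty claims always reach a human reviewer."""
    min_fit_quality: float = 0.9  # illustrative threshold, not a standard
    human_queue: list = field(default_factory=list)
    accepted: list = field(default_factory=list)

    def submit(self, result: Characterization) -> str:
        # A novelty claim is never auto-accepted, however good the fit looks.
        if result.claimed_novel or result.fit_quality < self.min_fit_quality:
            self.human_queue.append(result)
            return "human_review"
        self.accepted.append(result)
        return "auto_accepted"


gate = ReviewGate()
routine = Characterization("s-001", "known_phase_X", claimed_novel=False, fit_quality=0.97)
novel = Characterization("s-002", "candidate_phase_Y", claimed_novel=True, fit_quality=0.99)

print(gate.submit(routine))
print(gate.submit(novel))
```

The design choice worth noting is that a high automated fit score does not bypass review for novelty claims, since the critique is precisely that the automated fit can be confidently wrong.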
Signal to watch: The NVIDIA-Eli Lilly partnership, which commits GPU infrastructure to AI drug discovery, signals industry confidence in clinical validation ahead of Phase 3 results. Watch the zasocitinib FDA filing as a leading indicator for field-wide funding and regulatory posture. If the filing succeeds, expect a wave of industry investment in AI-to-clinical pipelines; if it fails, expect investor skepticism to persist despite rentosertib's Phase 2a promise.
Competitive positioning: Schrödinger (physics-based), Insilico Medicine (generative AI full pipeline), and Recursion (phenomic screening) represent three distinct technical approaches, each with a clinical-stage asset under test in 2026. Platform differentiation is moving from "can AI discover drugs" to "which AI methodology produces better clinical outcomes". Labs without clinical-stage assets by 2027 face a lasting credibility gap relative to validated platforms.