AI cracks 18 'impossible' diagnoses 🧬, Small Health System AI Advantage? 🐭, 8x code per quarter 📈

Jun 22, 2026

An LLM Just Solved 18 Rare-Disease Cases That Had Stumped Human Experts for Years

A study published in NEJM AI by Catherine Brownstein, Alan Beggs, and colleagues at Boston Children’s Hospital’s Manton Center for Orphan Disease Research — in collaboration with Harvard and OpenAI — fed 376 unsolved rare-disease cases into OpenAI’s o3 Deep Research model.

These weren’t fresh cases. Every one had already been through multiple commercial and institutional genomic pipelines. Multidisciplinary teams had reviewed them. Many families had waited years.

The model achieved a 4.8% additional diagnostic yield — 18 new diagnoses across four cohorts: neurodevelopmental (10% yield), neuromuscular (6.6%), early psychosis (13.3%), and sudden unexpected death in pediatrics (1%).

The most striking case: Kyra, whose neuromuscular symptoms began at age 9 during karate class. She was ventilator-dependent and in a wheelchair by 13. She endured nearly twenty years without a diagnosis. The model identified an HSPB8 frameshift variant — myofibrillar myopathy — about a week before her 28th birthday.

The model didn’t replace the geneticist. It re-read the literature at a scale no individual could match. The variants were already called. The genomes were already sequenced. What changed was the published evidence connecting rare variants to disease — papers that didn’t exist when the original analysis was done.
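For builders who want to see the shape of that idea, here is a minimal sketch of the "evidence newer than the last review" check. It is not the study's pipeline: the study used an LLM to read the literature, and everything below (the `CalledVariant` and `Publication` types, the `variants_worth_rereading` helper, the gene names, the dates) is hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CalledVariant:
    """A variant that was already called in an unsolved case (hypothetical type)."""
    case_id: str
    gene: str
    last_reviewed: date  # when a human team last analyzed the case


@dataclass(frozen=True)
class Publication:
    """A paper tied to a gene (hypothetical type)."""
    gene: str
    published: date
    title: str


def variants_worth_rereading(variants, publications):
    """Return (variant, newer_papers) pairs where papers postdate the last review."""
    papers_by_gene = {}
    for pub in publications:
        papers_by_gene.setdefault(pub.gene, []).append(pub)

    flagged = []
    for variant in variants:
        newer = [
            pub
            for pub in papers_by_gene.get(variant.gene, [])
            if pub.published > variant.last_reviewed
        ]
        if newer:
            flagged.append((variant, newer))
    return flagged


# Toy data, entirely made up.
variants = [
    CalledVariant("case-001", "GENE_A", date(2019, 3, 1)),
    CalledVariant("case-002", "GENE_B", date(2023, 6, 1)),
]
publications = [
    Publication("GENE_A", date(2022, 5, 10), "Hypothetical paper linking GENE_A to disease"),
    Publication("GENE_B", date(2021, 1, 15), "Hypothetical older paper on GENE_B"),
]

flagged = variants_worth_rereading(variants, publications)
for variant, papers in flagged:
    print(variant.case_id, variant.gene, [p.title for p in papers])
```

Only case-001 is flagged here, because its gene has a paper published after the case was last reviewed. A date filter like this only decides which cases deserve another look. Judging whether the new paper actually explains the patient is the part that needs a model or a geneticist.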

Seven of the 18 diagnoses were “rediscoveries” — diagnoses made elsewhere but absent from the local record. Even finding those saved families from redundant odysseys.
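The headline numbers are easy to check yourself. This short calculation uses only the figures quoted above (376 cases, 18 diagnoses, 7 rediscoveries); the variable names are my own.

```python
# Figures quoted in the write-up above.
total_cases = 376
new_diagnoses = 18
rediscoveries = 7

# Additional diagnostic yield across all four cohorts.
yield_pct = 100 * new_diagnoses / total_cases

# Diagnoses that were not already made elsewhere.
remaining_diagnoses = new_diagnoses - rediscoveries

# Share of the 18 that were rediscoveries.
rediscovery_share_pct = 100 * rediscoveries / new_diagnoses

print(f"Additional yield: {yield_pct:.1f}%")
print(f"Diagnoses that were not rediscoveries: {remaining_diagnoses}")
print(f"Rediscovery share: {rediscovery_share_pct:.1f}%")
```

That works out to a 4.8% yield, with 11 of the 18 diagnoses not previously made anywhere and roughly 39% counted as rediscoveries.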

😤 “4.8% is tiny.” For the 18 families who’ve waited years with no answer, it’s not. And it’s additive — these are cases where every human expert had already said “I don’t know.” The cost per additional diagnosis is orders of magnitude lower than another round of expert reanalysis.

😤 “OpenAI built this. Of course they published a win.” Boston Children’s Hospital (BCH) and Harvard designed the study, controlled the protocol, and validated every diagnosis through CLIA-certified labs using standard ACMG/AMP classification. The model is the commodity. The evaluation framework is the work.

😤 “Rare disease is a niche. This doesn’t generalize.” Rare disease is where AI’s read-more-than-a-human advantage is most visible because the literature is fragmented across thousands of papers. But the pattern generalizes anywhere the evidence base outpaces the practitioner: drug interactions, off-label evidence, guideline synthesis. Find the structured-data pile in your domain.

❓ What other clinical domains have an ever-growing pile of already-structured data waiting to be re-read against new evidence?

🎙️ From the Pods

🎙️ 229 Podcast: “Small Health System’s AI Advantage and the ROI Question Nobody’s Ready For”

Bill Russell and Drex DeFord make the counterintuitive case: smaller health systems are MORE resourceful with AI than large ones. One small system vibe-coded a nurse scheduling app — saved money, nurses love it. Dave Higginson at Phoenix Children’s coded tools himself.

🔇 Speaker Blindspot: Survivorship bias — they profiled the small systems where it worked and didn’t mention the ones that tried vibe coding a clinical tool, shipped something fragile, and got a call from compliance on Monday. The advantage of small is speed. The disadvantage is that nobody catches the mistake before it hits production.

🎙️ Lenny’s Podcast: “Building the Most AI-Pilled Engineering Team” with Fiona Fung (Anthropic)

The manager of the Claude Code and Cowork teams says Anthropic engineers now produce 8x the code per quarter compared to 2025. “Coding is no longer the bottleneck.” High agency is also high accountability: the people who move fastest also own the outcomes.

🔇 Speaker Blindspot: Selection bias — Anthropic’s internal productivity data comes from AI-enthusiastic engineers at a frontier AI company. That’s like measuring the productivity gains of electricity by surveying Tesla employees. The gains are real, but the baseline population isn’t a health system IT department running 15-year-old code on a shared Oracle instance.

📅 This Week in Health AI Events

Free virtual events for clinician-builders — attend live or catch the recording later.

Tue Jun 23 — LEAP 2026 Information Session (ONC)
1:00 PM ET · Virtual · Free
ONC walks through the Leading Edge Acceleration Program and its 2026 emphasis notice — the federal on-ramp if you’re building health-IT tooling and want non-dilutive backing.

Thu Jun 25 — Adoption of AI in Clinical Care: Updates from the HHS RFI (ONC / HHS)
3:00 PM ET · Virtual · Free
HHS leadership shares what came back from the national AI-in-clinical-care RFI. If you build clinical AI, these are the policy signals you’ll be designing around.

Tue Jun 30 — Clinician in the Loop: AI Investment to Real-World Impact (CHIME, LinkedIn Live)
12:00 PM ET · LinkedIn Live · Free
For CMIOs and clinical leaders: how to tell whether your AI is actually working post-deployment.

What are you building this week? Email and tell me (kevin@clinicians.build) — I read every one.

— Kevin

clinicians.build
