Proven AI gathers dust while unproven AI runs the floor 🔥, JAMA wants to license AI like physicians 📋, Starbucks spent $600M learning what clinicians already know ☕
The Paradox of Medical AI: 44 RCTs gather dust while 72% of docs use unproven LLMs.
Eric Topol laid out the implementation paradox on Saturday. Deep learning for medical imaging (mammography, colonoscopy, retinal screening, CT) has more rigorous evidence than almost any technology in medicine. A new retinal foundation model called Reti-Pioneer adds detection for thyroid disease, gout, and osteoporosis at roughly a dollar per scan. Mayo's AI system detects pancreatic cancer 475 days before radiologists. None of it is standard practice. Meanwhile, the AMA's March 2026 survey shows 72% of physicians use generative AI, 35% for direct patient care decisions. The tools with the evidence aren't deployed. The tools without the evidence are everywhere.
🤔 Haters
"This is just an implementation lag; the imaging tools will catch up." It's been a decade for some of these. 44 RCTs for colonoscopy AI. The lag isn't implementation; it's economics. Every imaging AI deployment requires capital equipment integration, radiology workflow redesign, IT procurement, and a reimbursement model that rewards the extra detection. A ChatGPT tab requires nothing. Deployment friction is a structural property, not a temporary delay.
"Physicians using ChatGPT isn't the same as clinical AI deployment." It is functionally the same. When 35% of physicians are using generative AI for treatment and diagnostic decisions (not documentation, not admin, but patient care), that's deployment. It's just informal deployment without institutional oversight, validation, or liability frameworks. Which is worse.
"The evidence will sort it out eventually." Eventually is doing a lot of work. The Nature Medicine editorial is the first institutional attempt to force the issue, but prospective LLM trials are years away. In the meantime, 40 million Americans are using chatbots daily for medical support. The evidence isn't going to sort out what's already happening.
💡 80/20: Your tool's clinical evidence matters, but your tool's deployment friction determines whether the evidence gets a chance to matter.
💡 Builder's Radar
JAMA proposes licensing AI like licensing physicians, and the infrastructure doesn't exist yet.
Bergman, Wachter, and Emanuel published a JAMA Viewpoint mapping autonomous clinical AI regulation onto physician credentialing: standardized exams, supervised deployment, scope of practice, time-limited certification, layered accountability, federal preemption. The framework is intellectually clean. The institutional substrate (exam boards, oversight agencies, verification infrastructure) does not exist. Two failure modes: regulatory capture by incumbents (USMLE/ABMS extending into AI governance) or paper credentialing that looks good but doesn't constrain. The builder move is infrastructure: whoever builds the testing and validation layer builds the picks-and-shovels business underneath every autonomous clinical AI.
🤔 Haters
"We don't need another regulatory framework; we need the existing ones to work." The existing frameworks weren't designed for this. FDA device clearance assumes a fixed product; LLMs update continuously. State medical licensing assumes a human practitioner. The licensure analogy isn't perfect, but the existing regulatory tools are worse.
"This will take a decade to implement." Probably. Which is exactly why building the institutional substrate now (testing infrastructure, validation frameworks, deployment monitoring) is the opportunity. The regulation will eventually need these tools. Whoever has them built and proven when the regulation hardens owns the category.
Starbucks spent $600M learning what clinicians already know: the human is the scarce asset.
Brian Niccol invested $600M to put workers back in stores, calling AI "co-pilot, not replacement." First positive US same-store sales in over a year. UChicago economist Alex Imas published the underlying economics: when AI drives commodity production to zero marginal cost, spending shifts to relationships and exclusivity. In experiments, people paid ~2x for identical items when others would be excluded. AI-generated art got half the exclusivity premium of human-made art. A "relational sector" emerges where the human IS the product: teachers, nurses, therapists.
🤔 Haters
"Starbucks isn't healthcare." The economic mechanism is identical. Automate the visible work, and the invisible relational work becomes the scarce asset. The barista's warmth. The physician's presence. The 30 seconds of eye contact during a terrifying diagnosis. The question is whether health systems deploy AI to give clinicians more time for that, or to fill those 13 saved minutes with two more RVU-generating visits.
"This is just an argument against efficiency." It's an argument for knowing which kind of efficiency you're optimizing. Throughput efficiency (more patients per hour) and relational efficiency (more trust per visit) are different metrics. The Starbucks lesson: optimizing for the first at the expense of the second costs you the customer.
💡 80/20: Build tools that give clinicians more time for the relational work, not tools that replace the relational work. Try: for every feature on your roadmap, ask: does this give 5 minutes back to the patient, or does this give 5 minutes back to the schedule? The answer determines whether you're building for the relational sector or automating it away.
🛠️ Builder's Tip
Invert your build's assumptions before you ship the next feature.
Richard Hamming's 1986 Bell Labs lecture identified a pattern that separates productive careers from busy ones: the willingness to invert constraints. Instead of asking "how do I solve this problem?" ask "what if the opposite were true?"
Here's a copy-paste prompt you can run against any feature on your roadmap. Paste your feature spec (or a paragraph describing it) as [FEATURE_DESCRIPTION]:
You are a clinical informaticist reviewing a product feature for a clinician-built health tech tool. The feature is described below.
For each of these inversions, give me ONE concrete alternative that would be worth testing:
1. INVERT THE USER: If the primary user were the patient instead of the clinician (or vice versa), what would this feature look like?
2. INVERT THE WORKFLOW: If this feature ran BEFORE the visit instead of during/after it, what would change?
3. INVERT THE DATA DIRECTION: If this feature pushed information TO the clinician instead of pulling FROM them (or vice versa), what would be different?
4. INVERT THE AUTOMATION: If the part you're automating were kept manual, and the part that's currently manual were automated, would the tool be more valuable?
5. INVERT THE EVIDENCE: What would you build if you assumed the opposite clinical assumption were true?
Feature description:
[FEATURE_DESCRIPTION]
For each inversion, provide:
- The inverted version in one sentence
- Whether it's worth a 2-hour prototype (yes/no)
- Why or why not (2 sentences max)
Output as a markdown table.
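If you'd rather run this against your whole roadmap at once instead of pasting features one at a time, here's a minimal sketch (stdlib only) that fills the placeholder for each feature. The template below is abbreviated; swap in the full prompt above. Sending the result to a model is left to whatever client you already use, so no API calls are assumed here.

```python
# Minimal sketch: fill the inversion prompt for each feature on a roadmap.
# PROMPT_TEMPLATE is an abbreviated stand-in; paste the full prompt from above.

PROMPT_TEMPLATE = """You are a clinical informaticist reviewing a product feature \
for a clinician-built health tech tool. The feature is described below.

Feature description:
[FEATURE_DESCRIPTION]

For each inversion (user, workflow, data direction, automation, evidence), provide:
- The inverted version in one sentence
- Whether it's worth a 2-hour prototype (yes/no)
- Why or why not (2 sentences max)
Output as a markdown table."""

def build_prompt(feature_description: str) -> str:
    """Substitute the feature spec into the template's placeholder."""
    return PROMPT_TEMPLATE.replace("[FEATURE_DESCRIPTION]", feature_description.strip())

if __name__ == "__main__":
    # Hypothetical roadmap entries, for illustration only.
    roadmap = [
        "Auto-draft the after-visit summary from the visit transcript.",
        "Pre-visit symptom intake form pushed to the patient's phone.",
    ]
    for feature in roadmap:
        print(build_prompt(feature))
        print("-" * 40)
```

One prompt per feature keeps the model's answers comparable across your roadmap; batching several features into one prompt tends to flatten the inversions into generic advice.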
What are you building this week? Reply and tell me; I read every one.
- Kevin


