Clinical evidence goes API-first 🔌, $7.8B in consulting bought nothing - you bring a lot more 📊, Anthropic conference 🛰️
The Clinical Evidence Layer Just Went API-First
Three moves landed in a single week. Perplexity launched Premium Health Sources — plugging NEJM and BMJ journal access directly into its AI search, with Micromedex drug data coming next. VisualDx partnered with Perplexity to bring clinician-validated dermatology imagery across all skin tones into generative AI responses. And ACOG announced a strategic collaboration with OpenEvidence to make its full evidence-based ob-gyn clinical guidance available through AI decision support at point of care.
The pattern: major medical institutions are making their knowledge accessible through AI platforms rather than locked PDFs and paywalled search portals. The evidence layer that any serious clinician-facing tool needs — journal citations, clinical imagery, specialty guidelines — is being assembled by someone else.
😤 Haters
“Perplexity is just repackaging UpToDate with better SEO.” Fair comparison on the surface. But UpToDate is a destination. Perplexity is a layer. The difference is whether evidence lives inside a single product you navigate to, or becomes an infrastructure primitive any builder can surface inside their own workflow tool. The API-first framing matters because it makes the evidence portable.
“Clinicians already have journal access through their institution.” True — and they still don’t use it at point of care because the friction is too high. The median time from clinical question to literature answer is measured in hours, not seconds. The problem was never access. It was integration into the moment the question arises.
“ACOG’s guidelines are already free on their website.” They are. But try building a tool that queries them programmatically, returns the relevant section for a specific clinical scenario, and cites the source. That’s what the OpenEvidence integration does. The raw guideline existing is not the same as the guideline being API-accessible inside a decision support workflow.
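To make “API-accessible” concrete, here’s a rough sketch of the shape of that interaction. Every name below is hypothetical; this is not the OpenEvidence or ACOG interface, just an illustration of guideline text behaving like a queryable resource instead of a static page.

```python
# Hypothetical sketch only: invented names and a toy keyword matcher standing in
# for whatever retrieval the real OpenEvidence/ACOG integration actually uses.
from dataclasses import dataclass

@dataclass
class GuidelineSection:
    society: str    # e.g. "ACOG"
    document: str   # source document title
    heading: str    # section heading within the document
    text: str       # the guideline text itself
    citation: str   # what a decision-support tool surfaces alongside its answer

def relevant_sections(scenario: str, index: list[GuidelineSection]) -> list[GuidelineSection]:
    """Return guideline sections that overlap with the clinical scenario, best match first."""
    terms = set(scenario.lower().split())
    scored = [(len(terms & set(s.text.lower().split())), s) for s in index]
    return [s for overlap, s in sorted(scored, key=lambda pair: -pair[0]) if overlap > 0]
```

The toy keyword matcher is beside the point (the real integration presumably uses far better retrieval); the point is that the guideline becomes something a workflow tool can call, filter, and cite.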
Anthropic Conference Today (Free, virtual)
Code with Claude kicks off today in San Francisco with hands-on workshops on agentic coding, the MCP ecosystem, and production reliability.
Nuff said.
NerdMDs: “There Is No Replacement. There Is a Stack.”
Adam Carewe published a framework that reframes the entire medical AI conversation. The imaging-vs-chatbot debate is wrong — imaging AI, LLMs, and World Models are three layers of the same stack. Imaging AI (mammogram, retina, CT, ECG) is the kernel layer — the evidence is done, but it needs distribution infrastructure. LLMs are the application layer — 40 million Americans already walked around every gate we tried to build, and 72% of physicians did too. The gate failed. What LLMs need is an observatory: monitoring and outcome tracking, not more guardrails. World Models (action-conditioned, counterfactual prediction) are the simulation layer — still years out for clinical use.
😤 Haters
“This is just another framework nobody will use.” Maybe. But the insight that LLMs out-shipped imaging AI because they bypassed the clinical operational stack entirely — distribution ate the model — is the kind of observation that changes where you aim. If you’re building imaging AI, your bottleneck is ops, not accuracy. If you’re building on LLMs, your bottleneck is monitoring, not gatekeeping.
“World Models are science fiction right now.” For clinical use, largely yes. But AMI Labs just raised $1B to start building them. Knowing where the stack is heading helps you decide which layer to build on today.
💡 80/20: The one thing to steal is to ask which layer your tool lives on. If it’s the kernel layer (imaging/diagnostics), your job is distribution. If it’s the application layer (LLM-based), your job is monitoring outcomes. Building for accuracy on a layer that needs distribution is building the wrong thing.
JAMA: US Hospitals Spent $7.8 Billion on Consultants With No Measurable Improvement
A JAMA study examining 14 years of management consulting contracts at US nonprofit hospitals found that $7.8 billion in spending was not associated with meaningful changes in finance, operations, or quality of care.
😤 Haters
“Measuring consulting ROI is notoriously hard — absence of evidence isn’t evidence of absence.” The JAMA authors anticipated this. They measured finance, operations, AND quality over 14 years at hundreds of institutions. If $7.8B can’t move the needle on any of those dimensions, the burden of proof shifts to the consultants to demonstrate value, not to the researchers to find it.
“This isn’t a health tech story.” It absolutely is. The consulting industry’s primary deliverable to hospitals is recommendations, and JAMA just published 14 years of evidence that recommendations without execution don’t move outcomes. Clinician-builders who ship working tools solve the execution problem that $7.8B in recommendations couldn’t. The builder’s counternarrative in one sentence: working software beats a slide deck.
💡 80/20: Hospitals spent $7.8B on advice that didn’t change outcomes. Working software that solves one specific workflow problem — even a small one — demonstrably moves numbers in a way that recommendations alone cannot. Try: Frame your next pitch to a hospital innovation team not as “here’s what you should do” but as “here’s a working prototype that already does it on synthetic data.”
🧰 Builder’s Tip
Workflow Pattern: Map Your Evidence Queries Before Building a CDS Tool
Before you build a clinical decision support tool, spend one weekend mapping the evidence queries it will need to answer. Here’s the workflow:
1. Pick one clinical workflow you want to support (e.g., anticoagulation management for new AF patients).
2. Write out the 10 most common questions a clinician asks during that workflow. Be specific: “Is apixaban safe with CrCl 20?” not “drug interactions.”
3. Run each question through three sources: Perplexity Health, OpenEvidence, and a manual PubMed/UpToDate search.
4. For each, score: (a) Did it find the right guideline? (b) Was the citation accurate? (c) Did it miss a critical nuance a specialist would catch?
The gaps in column (c) are your product. The things the AI evidence layer gets wrong or misses are the features only a clinician-builder would know to add.
This gives you two things: a source-quality audit (which evidence APIs are reliable for your domain) and a feature map (where the evidence layer fails and your clinical expertise becomes the differentiator). All on synthetic scenarios, all from your couch, zero HIPAA exposure.
The 80/20 of CDS: the evidence retrieval is becoming a solved problem. The clinical judgment layer — knowing which guideline applies to THIS patient with THESE comorbidities at THIS time — is your build. Map the gap first.
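If you want to keep that audit organized, here’s a minimal sketch in Python. The questions, scores, and source labels are all synthetic placeholders for queries you ran by hand; nothing below calls a real API.

```python
# Minimal sketch of the evidence-query audit described above.
# Everything here is synthetic: the question, the scores, and the source
# labels ("perplexity_health", "openevidence", "manual_search") are just
# notes on where you ran the query by hand, not API calls.
import csv
from dataclasses import dataclass, asdict, fields

@dataclass
class EvidenceCheck:
    question: str                # specific clinical question, e.g. "Is apixaban safe with CrCl 20?"
    source: str                  # "perplexity_health" | "openevidence" | "manual_search"
    found_right_guideline: bool  # (a) did it surface the correct guideline?
    citation_accurate: bool      # (b) does the cited source actually say what the answer claims?
    missed_nuance: str           # (c) what a specialist would catch that the tool missed

checks = [
    EvidenceCheck(
        question="Is apixaban safe with CrCl 20?",
        source="perplexity_health",
        found_right_guideline=True,
        citation_accurate=True,
        missed_nuance="No dose-adjustment discussion beyond renal function",
    ),
    # ...one row per question per source (10 questions x 3 sources = 30 rows)
]

# Dump to CSV so you can sort by column (c): the non-empty cells are your feature map.
with open("evidence_audit.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=[fld.name for fld in fields(EvidenceCheck)])
    writer.writeheader()
    writer.writerows(asdict(c) for c in checks)
```

Thirty rows of that and you have both deliverables in one file: the source-quality audit and the feature map.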
What are you building this week? Reply and tell me — I read every one.
— Kevin


