AI beats docs in a small journal called Science 🧠, Cigna exits ACA + eviCore 🏃, Waystar bets $100B on agents, Photon Series A

May 01, 2026

An OpenAI model (o1-preview) outperformed physicians on diagnosis. The paper is in Science. The clinician co-author is the one cautioning how to read it.

Researchers including Beth Israel Deaconess internist Adam Rodman, co-senior author and a clinical AI researcher, published a paper in Science (LinkedIn) demonstrating o1-preview outperforming physicians on case-based diagnostic and clinical-reasoning tasks across multiple experiments, including one that drew on real-world data from a Boston emergency department.

“Our findings suggest that LLMs have now eclipsed most benchmarks of clinical reasoning” … and that is with o1-preview. Granted, those benchmarks are NEJM clinicopathological conferences (CPCs) and similar written cases.

According to STAT, Rodman cautions that a paper based on “simulated and historical cases” could be misconstrued as proof of safety and efficacy in patient care. This paper is not that proof.

😤 Haters

“Simulated and historical cases are not real care — this is the same trick every clinical-AI paper plays.” Half right, and it’s the half the senior author himself flagged. Where this paper is structurally different: one of the experiments uses real-world data from a Boston emergency department, the journal of record is Science (not a clinical-AI workshop proceedings), and the framing argues against the 1959 yardstick the field set for itself. If your reaction is “I don’t believe it until I see it on real care decisions in my system,” the paper agrees with you — and that disagreement-as-agreement is the most useful thing it does. It moves the question from “can the model diagnose?” to “what would a paper that proves safety in care actually have to look like?”

“The model wins because the cases were curated for the model.” Possible, and worth checking against the supplementary methods. The harder version of this critique is the one to take seriously: the part of clinical reasoning a written case captures is the part the model is best at — the explicit, structured, verbal version of what was already in the chart. The part the case does not capture is the part the model struggles with — the embodied judgment the doctor brings into the room, the context the chart never recorded, the prior the patient never told anyone before. That is also the part the Rodman caution points at. The case is not the bedside. The benchmark is not the deployment.

📡 Builder’s Radar

Cigna exited the ACA exchanges and put eviCore on the block — same earnings call, same press cycle as the AHIP “voluntary” prior-auth pledge.

Cigna’s Q1 release beat estimates with $1.65B in profit and used the strong earnings as the cover for two strategic moves announced in the same breath: exit the ACA exchanges after this year (~1.4M members) and pursue “strategic alternatives” — including a sale — for eviCore, the prior-authorization subsidiary that operates one of the largest UM workflows used by hospitals and other payers. Cigna joins UnitedHealth and Aetna in the marketplace pullback; cumulatively, three of the four largest national insurers are now out, or on the way out, of the exchanges in 18 months. eviCore on the block changes the buyer landscape for Cohere Health, Janus, Anterior, and the FHIR-based PA pilots — there’s now a $1B+ asset whose customer base is in motion.

😤 Haters

“Cigna is just shedding businesses that were ‘more trouble than they were worth.’ That’s a strategic refocus, not a marketplace story.” The marketplace exit is more than trouble-shedding when it’s the third national insurer pulling back inside 18 months. The structural story is that pure-play, ACA-anchored digital health is looking at customer-base erosion through 2028, regardless of whether each individual exit was rational at the company level. The aggregate tells a different story than the individual filings.

“The eviCore sale is just one large vendor changing hands — the PA workflows don’t change.” The workflows don’t change overnight. What changes is who builds the next version. If a strategic acquirer (a Cohere, an Optum, a private-equity rollup) ends up with eviCore’s customer book, the rebuild path is going to look like the FHIR-based DaVinci PAS pilots inside the new owner’s stack — and that rebuild is exactly where the leverage is for clinician-builders shipping into UM. The interesting question is who ends up holding the asset by the end of 2026.

Waystar called out a $100B revenue-cycle labor pool by name and pivoted the whole company to agentic AI. The public-company framing matters more than the quarter.

Waystar’s Q1 earnings (Cailey Gleeson, Fierce Healthcare): strong growth, and a strategic shift away from task-level automation toward agent-first revenue cycle workflows. CEO Matt Hawkins, on the call: “That shift unlocks a much larger opportunity, the approximately $100 billion in annual revenue cycle labor services performed across the industry today.” That’s the line. A public company just put a number on the addressable market and named human labor cost as the denominator. Waystar joins Arintra, Notable, Hippocratic, and the broader peer set moving from rules + RPA to agent-first architectures.

😤 Haters

“Public-company strategy decks are not deployment — naming $100B is marketing, not capability.” Marketing is also forecasting. When the investor-relations frame for a publicly traded RCM platform shifts from “we automate tasks” to “we replace this category of labor,” the buyer side starts modeling the same denominator. Hospital CFOs read these calls. The 2027 RFP language for RCM contracts is going to look different because of how this quarter’s earnings calls were framed.

“The agentic-RCM category is full. Hawkins is just catching up to where Notable and Hippocratic already were.” Catching up at the public-company layer is structurally different from being there as a startup. The signal a public company sends to buyers (that the agent-first RCM bet is on the official roadmap, with a $100B-scale market in the deck) collapses the procurement-defensibility argument for the hospital CFO who was waiting for cover.

Photon raised $16M for modern e-prescribing. The conformance-heavy rail keeps marking up while the autonomous-action agent keeps hitting state-board friction.

Photon Health, the API-first modern prescribing infrastructure play, raised $16M (Fierce Healthcare) to scale a developer-first alternative in a layer that has historically been Surescripts-and-incumbents. The pitch is the same one good infrastructure pitches have always had: the existing rail has unnecessary transfers, dead-end phone calls, and abandoned fills as symptoms of bad price/availability visibility, and the fix is API-first plumbing developers can build on. What makes the round notable on this date is not the absolute number — $16M is a normal Series A — but the adjacent news cycle: Utah’s medical board recommended suspension of the Doctronic async-AI prescribing pilot last week, and the same week’s funding action goes to the rail underneath, not to the agent on top. The conformance-heavy infrastructure layer keeps clearing funding bars while the no-FHIR autonomous-action layer keeps hitting geographic and licensure friction.

😤 Haters

“Photon vs Surescripts is the pitch every modern-prescribing startup makes, and most of them have margin economics that don’t work.” True historically: the e-prescribing rail is a brutal margin business, and the consumer pharmacy front-door tools (Truepill, Capsule) have struggled. What’s different about the developer-platform framing is that you can be a useful piece of the stack for telehealth, employer benefits, and pharmacy networks without owning the consumer experience. The bet is on margin in the rail, not in the front door. We’ll see in 18 months whether the bet pencils out.

“The Doctronic comparison is unfair — Photon and Doctronic do completely different things.” Different products, same architectural lesson. Photon is identity/routing; Doctronic was async clinical action with cursory clinician sign-off. The market-map signal of the same news cycle is that capital and regulation both prefer the layer where compliance and conformance work has been done over the layer where they have been waved past. Choose your build accordingly.

💡 80/20: Build the rail, not the agent — the layers that pass conformance work get funded and the layers that try to skip it get state-board letters.

What are you building this week? Reply and tell me — I read every one.

— Kevin

clinicians.build
