Clinician-founded AI banks $10M 📚, no-code healthcare builders ship in tandem 🛠️, RAPID cuts the Medicare device-coverage wait to 2 months ⚡
The RAG-for-medicine author just raised $10M to build evidence-grounded clinical AI.
Almanac Health launched this week with a $10M seed led by F-Prime, with General Catalyst and Lightspeed participating. Total raised is close to $12M. The founder is Cyril Zakka, MD, a physician-researcher whose Stanford work produced what NEJM AI called one of the most-cited papers on retrieval-augmented generation for clinical medicine. The product is described as “safe, evidence-grounded AI for clinicians and health systems” — grounded in peer-reviewed evidence, governed by institutional controls, integrated directly into existing EHR systems, and explicitly free of pharmaceutical advertising.
Read this next to yesterday’s ChatGPT for Clinicians launch and the shape of the category becomes clear. OpenAI owns the distribution play — free to every verified US clinician, 99.6% physician-rated-safe across 7,000 pre-release conversations, a benchmark built on 700,000 model responses. Almanac is not competing on distribution. It is competing on the only thing a generalist model cannot fake: the claim that every answer is tethered to a peer-reviewed source, in an EHR integration a health system can govern, built by someone whose paper is already in the citations.
That second position is where clinical domain expertise becomes the moat. You cannot clone “the author of the RAG-for-medicine paper” with more compute. You can clone access, scale, and polish — but the retrieval index, the evaluation against ground truth, the decision about which specialty pathway gets pre-verified first, the negotiation with an AMC IRB over what counts as an acceptable citation — those are all clinician judgment, applied at every layer of the stack.
😤 Haters
“Every clinical AI company says they’re ‘evidence-grounded.’” Most are. Almanac’s differentiator is provenance: the founder authored the mechanism being commercialized. That is a harder claim to pattern-match around than “we use RAG.” Whether the product is actually better is an empirical question — the company says academic medical centers are doing the validation. Until those studies read out, the pitch is the pitch. But the starting credibility is not nothing.
“A $10M seed in 2026 is noise. OpenAI’s launch is the real story.” One is distribution, one is an evidence-first thesis. They co-exist. The more interesting question is whether the free-ChatGPT clinical tier raises or lowers the ceiling for a specialist product like Almanac. My read: it raises it. Free generalist tools calibrate clinicians to expect citations, BAAs, and audited outputs. Almanac sells into a market OpenAI just primed.
“You’re burying the lede — what’s the actual product?” Fair. Public-facing detail is thin; almanac.chat is a marketing page. What we know: EHR-integrated clinical decision support, peer-reviewed retrieval, currently undergoing validation in AMC settings, no pharma ad surface. What we don’t know: workflow integration granularity, BAA posture, pricing, specialty coverage on day one, and which RAG index they’re actually building against. Track these when the product reaches general availability.
💡 80/20: The funded thesis is that clinician-authored evidence retrieval is the defensible layer, not the model. Try: pick one clinical question you answer 20 times a week and write down the three citations you would want a tool to retrieve before it answers. That list is the spec for what a specialist-grade AI has to hit, and it’s a better evaluation rubric than any public benchmark for your actual workflow.
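If it helps to make that rubric concrete, here's a minimal sketch of the idea as data: your per-question citation list becomes a spec, and a tool scores on how much of that spec its retrieved sources actually cover. The question text and citation names below are placeholders, not real sources.

```python
# Hypothetical sketch: the "citations you'd want before it answers" list
# as a spec, plus a coverage check against what a tool retrieved.
# All question and citation strings are illustrative placeholders.

SPEC = {
    "my-high-frequency-question": {
        "canonical specialty guideline",
        "relevant systematic review",
        "landmark RCT",
    },
}

def coverage(question, retrieved_sources):
    """Fraction of your must-have citations the tool actually retrieved."""
    want = SPEC[question]
    return len(want & set(retrieved_sources)) / len(want)

# One of three must-have citations retrieved -> coverage of one third.
print(coverage("my-high-frequency-question",
               ["canonical specialty guideline", "some blog post"]))
```

The point of the exercise is that the spec lives in your head already; writing it down is what turns "feels accurate" into a number you can compare across tools.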
→ Full write-up
📡 Builder’s Radar
Two no-code healthcare AI agent builders launched the same day. Domain operators — not software engineers — are the named user.
Infinitus Systems launched Studio, a natural-language agent builder that the company calls the first healthcare-specific no-code AI agent platform. Reported metrics: 90% faster deployment, 40% higher accuracy than manually built agents, and a patent-pending Agent Response Control (ARC) layer that claims to route sensitive clinical or privacy queries through pre-verified compliant response paths. Infinitus says 44% of Fortune 50 healthcare companies already run its platform, with over 100 million minutes of clinical and administrative conversations behind it. The same day, Gravity Rail launched with a $2.75M seed led by Redesign Health: model-agnostic, HIPAA-compliant with a BAA, zero data retention, natural-language SOP-to-agent translation across voice, SMS, email, and web.
😤 Haters
“No-code always leaves the safety layer as someone else’s problem.” Usually yes. Both launches are explicitly selling the safety layer as a feature — Infinitus with ARC, Gravity Rail with a model-agnostic BAA posture and SOP-as-spec. That pitch has to survive red-team pressure before it’s true. But the framing is correct: the reason ops teams cannot ship agents today is not the LLM, it is the compliance wrapper, and both companies are targeting that wrapper as the product.
“44% of Fortune 50 is a vendor vanity number.” It’s a logo count, not a contract value. Still, if even a fraction of those accounts flip from Infinitus’s existing voice product to Studio-built custom agents, the number of health-plan and provider workflows running no-code agents goes up by orders of magnitude this year.
CMS and FDA launched a new breakthrough-device coverage pathway and paused TCET. Medicare coverage can now land 2 months after market authorization, not a year.
CMS and FDA jointly announced the RAPID pathway — Regulatory Alignment for Predictable and Immediate Device coverage — for certain FDA-designated Class II and Class III Breakthrough Devices. The mechanism: CMS issues a proposed National Coverage Determination the same day an eligible device gets FDA market authorization, triggering the 30-day public comment window. The net effect: Medicare national coverage and payment as soon as two months after authorization, compared to a year or more today. Eligibility requires an IDE study enrolling Medicare beneficiaries with clinical outcomes agreed on by both agencies. The existing TCET pathway is paused for new candidates while RAPID gets stood up.
😤 Haters
“This is a medtech story, not a software story.” Devices increasingly are software — software-as-a-medical-device, AI/ML-enabled devices, closed-loop systems. The coverage-before-evidence problem has been the single biggest reason clinician-built algorithms don’t get reimbursed. RAPID does not fix that problem, but it shows CMS is willing to accept a tighter evidence loop alongside FDA authorization — a precedent the AI/ML device lane will draft off of.
“IDE enrollment of Medicare beneficiaries is an expensive gate.” It is, and it probably keeps RAPID out of reach for early-stage builders. But the structural move is what to watch — same-day NCD proposal, 30-day public comment, simultaneous FDA/CMS review. If RAPID works, a future lane for algorithm-only Breakthrough designations becomes much easier to imagine.
💡 80/20: The old answer to “when does Medicare pay for this?” was “after FDA, eventually, maybe a year, maybe never.” The new answer is “two months from authorization, if you enrolled Medicare beneficiaries in the IDE.” Reframe: if you’re a clinician-builder advising a startup on a Breakthrough device, the question to ask on day one is not “how do we get FDA?” but “does our pivotal study enroll Medicare beneficiaries, and if not, why not?”
The AMA told Congress this week: AI chatbots should be statutorily barred from diagnosing or treating mental health conditions.
In letters to multiple congressional committees Wednesday, the AMA asked for clear statutory boundaries prohibiting AI chatbots from engaging in mental health diagnosis or treatment — no offering anxiety or depression diagnoses, no recommending medications, no presenting as a licensed clinician. Additional asks: mandatory disclosure that users are talking to AI, suicide/self-harm detection and de-escalation language, escalation pathways, post-market monitoring, serious-incident reporting (especially for pediatric use), limits on advertising to minors, and strict data-retention controls. The context is multiple reported cases of young users dying by suicide after confiding in chatbots that appeared to encourage self-harm.
😤 Haters
“This is doctors trying to gatekeep a modality that’s helping people who can’t access care.” It’s both. The gate is real, and the people falling through the current gap are real too. The AMA letter is not asking to ban chatbots — it’s asking to prohibit them from presenting as licensed clinicians and making diagnosis-and-treatment decisions without human oversight. That line is the same one that defines what clinician licensure is for in the first place.
“Regulation will freeze the useful products out of the market.” Regulation written correctly will freeze out the dangerous products while leaving triage, psychoeducation, and referral routing — exactly the places clinician-built tools could ship today — inside the lines. Regulation written poorly will do the opposite. Which version Congress picks is the question.
💡 80/20: If statutory boundaries land, the non-diagnostic-non-treatment layer is where clinician-built mental health tools get to operate — screening workflows, escalation logic, referral routing, after-visit education. Try: if you’re building in this space, write down where in your product the output stops being “information” and starts being “a diagnosis or treatment recommendation.” That line is your regulatory perimeter, and it should exist in the product today, not in a future compliance review.
🧰 Builder’s Tip
Workflow Pattern — Audit your AI’s retrieval before you trust its answer.
Almanac’s whole pitch is that the retrieval layer is the product, not the model. You can run the same audit on any clinical RAG tool you’re evaluating, including free ones, in about an hour.
Pick 20 clinical questions you already know the answer to. Not edge cases — common ones you answer on a normal week. Write down the single best citation for each: guideline, Cochrane review, landmark RCT, UpToDate card, whatever is the canonical answer in your specialty.
Run the 20 questions through the tool. Save every answer and — critically — every citation the tool retrieves.
Score two things separately: answer correctness (did it get to the right conclusion?) and retrieval fidelity (is the citation the right one, the wrong one, fabricated, or merely adjacent?).
Keep the disaggregated numbers. A tool that is 95% correct with 40% retrieval fidelity is a tool that is trained to sound right, not a tool that is grounded. That’s a meaningful distinction, and it’s the one Almanac is monetizing.
Re-run the same 20 questions every 3 months. The retrieval index drifts. The model changes. Your audit is the only thing that catches it.
This is the cheapest, most under-used clinical AI evaluation — most teams skip straight to LLM-as-a-judge benchmarks and never look at what got retrieved. Look at the retrievals first. The answer follows the citation, not the other way around.
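The audit above fits in a few lines of code once you've hand-scored each run. Here's a minimal sketch, assuming you log each of the 20 runs as a dict with two booleans you score yourself against your canonical citations; the field names are hypothetical.

```python
# Minimal sketch of the retrieval audit: score answer correctness and
# retrieval fidelity as separate rates, never as one blended number.
# Field names ("answer_correct", "citation_canonical") are hypothetical;
# you assign them by hand against your own canonical-citation list.

def audit(runs):
    """Return the two disaggregated rates from a list of scored runs."""
    n = len(runs)
    return {
        "answer_correctness": sum(r["answer_correct"] for r in runs) / n,
        "retrieval_fidelity": sum(r["citation_canonical"] for r in runs) / n,
    }

# Example: a 20-question run where the tool sounds right far more
# often than it cites right -- 19/20 correct, but only 8/20 grounded.
runs = (
    [{"answer_correct": True,  "citation_canonical": True}] * 8
    + [{"answer_correct": True,  "citation_canonical": False}] * 11
    + [{"answer_correct": False, "citation_canonical": False}] * 1
)

print(audit(runs))  # answer_correctness 0.95, retrieval_fidelity 0.40
```

That 95/40 split is exactly the "trained to sound right" signature described above, and it only shows up if you keep the two numbers apart.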
What are you building this week? Reply and tell me — I read every one.
— Kevin


