OpenEvidence pulls out of Europe 🇪🇺, AI finds 38 holes in OpenEMR 🕳️, Cursor wipes a prod database in 9 seconds 💀
OpenEvidence pulled out of the UK and Europe rather than comply with the AI Act. The compliance list is the forecast.
OpenEvidence, the $12B clinical AI search tool that roughly 40% of US clinicians use daily, shut down access across the UK and the EU this week. The landing page now displays an apology message in place of the answers it used to return. The framing here comes from Lukas Saari, CEO of Tandem Health, an EU competitor, whose LinkedIn post pulled 515 reactions, 73 comments, and 19 reposts in the 24 hours before this issue went out. HIStalk independently corroborated the withdrawal the same day.
In the LinkedIn comments, Cyril Pineau answered Saari’s post with the seven things AI Act compliance for medical decision-making (high-risk classification under Article 6) actually requires:

1. Appoint a legal representative in the EU.
2. Produce comprehensive technical documentation.
3. Demonstrate effective control of clinical risks.
4. Ensure full traceability of sources, outputs, and logs.
5. Prove a formally defined human oversight mechanism.
6. Implement post-market monitoring.
7. Submit to oversight and audits by national authorities.

OpenEvidence ran the math on that list against the size of the European clinician market and decided the answer was no.
The interesting part is not the regulatory geography. It is that we just got a clean read on the price tag of a written-down clinical AI rule, denominated in a market where a US company already had product-market fit. The seven items above are not exotic. They are what every serious clinical AI deployment on US soil will look like once ONC, FDA, or HHS finishes its current rulemaking, and the comment window for that rulemaking is open right now. Read the Article 6 list as the spec, not as a foreign curiosity. The clinical AI products that survive the next US rule will be the ones whose builders are already auditing themselves against this list and shipping the gaps as roadmap items, not the ones whose marketing decks treat compliance as a Q4 problem.
😤 Haters
“OpenEvidence pulled out because the EU clinical-AI market is too small to justify the work, not because the rule is unreasonable.” Probably true at the margin — the EU clinician digital-tool spend is a fraction of the US, and the cost of seven-item compliance is the same whether you have 100 users or a million. The point cuts the other way, though. A company with traction and capital decided the market was not worth the spec. That tells you the spec is real. A US version, applied to the same product on a much larger user base, is not a “we can’t justify it” decision — it is a “we have to do it” decision. The number of US clinical AI vendors who have a written, audited answer for the seven items today is small.
“This is the EU AI Act being too aggressive — the right policy reaction is to push the US regulator the other direction.” Plausible policy view, and worth arguing in the comment record. But notice the move: a sophisticated buyer would not bet on a regulator stepping back from rule-of-law-style requirements that are already in force in another jurisdiction with a clean medical-device precedent. Hope is not a deployment strategy. Even if the eventual US rule is lighter, the comparison set for the procurement reviewer is now Article 6. You will be asked which ones you meet. Have the answers.
“OpenEvidence is one product — drawing a regulatory forecast off one company’s market exit is overfitting.” Single data point, agreed. The reason it is more than that is the explicit sequencing: the AI Act enters force, MDR comes online for AI as a medical device, and the US clinical-AI category leader itself (free, NPI-verified, daily use) makes the call to leave inside the same compliance window. It is the cleanest single-company demonstration of the tradeoff we have right now, and the tradeoff is what the rule will force here too. Update the model with the next data point. Do not throw out the first one.
→ Full write-up
An AI security tool just found 38 critical vulnerabilities in OpenEMR. Same week, Medtronic disclosed a cyberattack. Treat them as one story.
Aisle ran an agentic security analysis on OpenEMR — the open-source EHR used by more than 100,000 providers, the system-of-record at a meaningful share of FQHCs and small practices — and reported 38 critical vulnerabilities. The same week, Medtronic disclosed a corporate-IT cyberattack traced to phishing, the second major medtech OEM breach in 90 days after Stryker. Two stories, one structural read: agentic AI is now both the cheapest auditor and the cheapest attacker, and the soft underbelly is the legacy stack everyone forgot was load-bearing.
😤 Haters
“Aisle is selling you the audit tool, the 38-vulnerability number is marketing.” Probably some of both. The number is impressive enough to be suspect, and the company is incentivized to publish a big one. Two things are true regardless: the technique — agentic code review at scale on an open codebase — is now table stakes, and the OpenEMR codebase is widely used and not actively hardened by a vendor security team. The procurement question for any FQHC running OpenEMR is no longer “should I trust Aisle’s number” but “have we run any agentic audit at all, including the one bundled into the Anthropic Cyber Verification Program announced last week post-Vercel breach?”
“Medtronic and OpenEMR are different stacks — bundling them is sloppy.” Different stacks, same regulatory tide. Two breaches in a quarter, one inside an OEM that ships connected devices and the other inside the EHR a hundred thousand small practices run, will get the attention of FDA’s premarket cybersecurity review and CISA’s healthcare sector advisories. Any clinician-builder shipping something connected to either stack is now in the path of whatever guidance comes out of the next 60 days.
💡 80/20: Agentic security review is the same shape as agentic clinical review — cheap, fast, and almost certainly running against your stack whether or not you commissioned it. Try: point an agentic code reviewer at your own side-project repo this weekend (Claude Code’s /security-review command, or Anthropic’s healthcare devcontainer). The first 10 critical findings are usually obvious and free.
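If you want the smallest possible version of that technique before committing to any tooling, it is roughly one API call. A minimal sketch using the Anthropic Python SDK, assuming an ANTHROPIC_API_KEY in your environment; the file path is a placeholder and the model string is an assumption, so substitute your own:

```python
# pip install anthropic
import pathlib
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Placeholder path: point this at any file in your own side-project repo.
source = pathlib.Path("app/routes/billing.py").read_text()

message = client.messages.create(
    model="claude-sonnet-4-5",  # assumption: substitute the model you use
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": (
            "Act as a security reviewer. List concrete vulnerabilities in this "
            "file (injection, broken authz, hardcoded secrets, unsafe "
            "deserialization), each with a line reference and a one-line fix:\n\n"
            + source
        ),
    }],
)
print(message.content[0].text)
```

The agentic tools add repo traversal and tool use on top, but this loop is the core of the technique, and on most side-project code it surfaces the obvious findings on the first pass.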
A Cursor agent powered by Claude deleted an entire production database in 9 seconds. The lesson is the missing layer, not the model.
Tom’s Hardware reported that a Cursor agent — running Claude as the underlying model — wiped a production database and the associated backups in nine seconds during what was meant to be a routine task. The same week, Anthropic published its official Claude Code devcontainer, and Trail of Bits shipped a hardened version with stricter network policies, capability dropping, and a fail-closed default. The order of events matters: the failure mode and the mitigation showed up the same week.
😤 Haters
“This is a Cursor problem, not a Claude problem — the model did what it was told.” Mostly true at the technical level: the model was operating inside an agent harness with write access to a production system, and that harness is Cursor’s product surface, not the model’s. The point cuts the other way for clinician-builders, though: if you are building anything where the agent has tool access to a real system (a FHIR sandbox, a billing API, your local Postgres), you are exactly one mis-prompt away from the same nine-second story. The harness is the safety surface. The model is not. Two sketches of what that looks like in practice follow this section.
“The fix is not a devcontainer, the fix is not giving an agent production credentials in the first place.” Agreed. The reason the devcontainer matters is that “do not give an agent production credentials” is policy, and policy without a default-on technical control is what gets you in the next breach report. A hardened devcontainer is the technical control that turns the policy from “trust the developer” into “the network simply will not let the agent reach production.” For a clinician-builder running a side project on synthetic data, that distinction is the difference between a weekend learning experience and a weekend incident report.
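Two sketches, as promised. First, the harness side: a tool wrapper that makes the agent’s database access fail closed. This is a sketch of the idea, not anyone’s shipped implementation; guarded_execute is an invented name, and the real control underneath it is connecting with a read-only database role, with the wrapper as belt and suspenders:

```python
import re

# Statement shapes the agent's SQL tool is allowed to run. Fail closed:
# anything that does not match, including shapes we do not recognize, is refused.
_READ_ONLY = re.compile(r"^\s*(SELECT|EXPLAIN|SHOW)\b", re.IGNORECASE)

def guarded_execute(cursor, sql: str):
    """Run SQL on an agent's behalf, refusing anything but a single plain read."""
    statements = [s for s in sql.split(";") if s.strip()]
    if len(statements) != 1:
        raise PermissionError("agent tool: multi-statement SQL refused")
    if not _READ_ONLY.match(statements[0]):
        raise PermissionError(f"agent tool: non-read statement refused: {sql[:80]!r}")
    # A regex is not a SQL parser: writable CTEs and vendor extensions can slip
    # past string checks, so connect with a read-only role and let the server
    # enforce the same policy when this wrapper misses.
    cursor.execute(statements[0])
    return cursor.fetchall()
```

Hand the agent guarded_execute instead of a raw cursor and a DROP TABLE becomes a PermissionError instead of a nine-second story.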
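Second, the network side: the fail-closed default, reduced to its smallest shape. The hardened devcontainers do this properly at the firewall level with default-deny egress rules; an in-process patch like the one below can be bypassed and is only here to show the shape of the policy. ALLOWED_HOSTS is an invented placeholder:

```python
import socket

# Hosts this process may open connections to. Default deny: production
# hosts are unreachable because they are simply not on the list.
ALLOWED_HOSTS = {"api.anthropic.com", "localhost", "127.0.0.1"}

_real_create_connection = socket.create_connection

def _guarded_create_connection(address, *args, **kwargs):
    host, _port = address
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"egress blocked by sandbox policy: {host!r}")
    return _real_create_connection(address, *args, **kwargs)

# Patch at import time so libraries that route through create_connection
# inherit the policy. Libraries that call socket.socket directly do not,
# which is exactly why the real control belongs at the network layer.
socket.create_connection = _guarded_create_connection
```

The difference between this and a written policy is that nobody has to remember it on a Friday afternoon.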
🛠️ From the Workbench
OpenMed shipped what looks like the first medical fine-tune for PHI redaction.
Maziyar Panahi announced on LinkedIn that OpenMed has released its first fine-tune, a model targeting medical text redaction. Redaction is the foundational primitive for almost every other healthcare AI workflow on real text: it is what gets the PHI out of the prompt before you send anything to a non-BAA model, what makes synthetic-data pipelines defensible, and what gets a side project from “interesting” to “actually safe to demo.” A purpose-built medical redactor tuned to the rhythm of clinical text (the dictation artifacts, the fragmentary problem lists, the carry-forward sentences) is a meaningfully different thing from a general-purpose PHI scrubber.
😤 Haters
“Redaction is a solved problem — Presidio, AWS Comprehend Medical, the Stanford de-identifier all exist.” Solved-ish, and each leaves the same long tail: free-text fragments, the same patient referenced four ways in the same paragraph, the chart artifact you only see in production. The interesting question for OpenMed is not whether the F1 beats Presidio on the i2b2 test set. It is whether the fine-tune is robust to the kind of clinical text your project actually generates, which is something you can only know by running it on your own examples.
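The cheapest version of that test: push a handful of invented fragments in the shape of your project’s actual text through the existing baseline and see what leaks. A minimal sketch against Presidio, which also needs a spaCy English model installed; the sample strings here are made up:

```python
# pip install presidio-analyzer presidio-anonymizer
# python -m spacy download en_core_web_lg   (Presidio's default NLP model)
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

# Invented fragments in the shape clinical text actually takes: dictation
# artifacts, fragmentary problem lists, carry-forward sentences.
samples = [
    "Pt is a 67yo M seen 3/14 w/ SOB, f/u w/ Dr. Okafor next wk.",
    "Mrs. Alvarez (DOB 02/11/1958) reports same sx as last visit per her daughter.",
]

for text in samples:
    results = analyzer.analyze(text=text, language="en")
    print(anonymizer.anonymize(text=text, analyzer_results=results).text)
```

Whatever survives that pass is the long tail the OpenMed fine-tune is claiming to cover. Run the same fragments through both and diff the outputs.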
What are you building this week? Reply and tell me — I read every one.
— Kevin