Chatbot overload is real 🧠, FHIR gets (another) MCP bridge 🌉
The Chatbot Is the Bottleneck, Not the Model
Ethan Mollick's latest analysis cites research showing financial professionals using GPT-4o saw real productivity gains, but the chatbot interface itself created cognitive overload that partially cancelled them out. "Giant walls of text, offers to pursue new topics, and sprawling discussions" hurt less experienced workers most: the exact people who'd benefit most from AI. Mollick argues the next AI leap isn't bigger models but better delivery: specialized interfaces, familiar platforms, and context-specific tools. Claude Dispatch (send a task from your phone, get results later) is his example of "post-chatbot AI."
😤 Haters
"This is obvious. Everyone knows chatbots aren't the final form." Knowing it and quantifying it are different. The research shows interface overhead measurably reduces AI productivity gains. If you're building a clinical AI tool and your delivery mechanism is a chat window, you're leaving value on the table, and the people who need it most (residents, new nurses, rural clinicians with less support) are the ones losing out.
"Purpose-built interfaces are just more expensive to build." They are. But the vibe coding era changes the cost equation. A clinician who understands the workflow can now build a focused interface for their specific use case faster than an enterprise vendor can ship a generic one. That's the entire thesis of clinicians.build.
💡 80/20: If you've built or are building a clinical AI tool delivered through a chat interface, the Mollick research suggests you're underselling your own tool. Reframe: what would a single-purpose interface look like for your most common use case? One input, one output, no menu.
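To make "one input, one output, no menu" concrete, here is a minimal sketch of a single-purpose tool instead of an open-ended chat. Everything here is illustrative: `summarize_discharge` is a hypothetical example task, and `call_model` is a stub standing in for whatever LLM API you actually use.

```python
def call_model(prompt: str) -> str:
    # Stub: a real tool would call your model provider here.
    # Hardcoded return keeps this sketch self-contained and deterministic.
    return "72-year-old admitted for CHF exacerbation; diuresed, discharged stable."


def summarize_discharge(note: str) -> str:
    """One input (a discharge note), one output (a short summary).
    No follow-up prompts, no new-topic offers, no wall of text."""
    prompt = f"Summarize this discharge note in three sentences:\n{note}"
    return call_model(prompt)


print(summarize_discharge("Pt admitted with CHF exacerbation, diuresed, stable."))
```

The design point is the shape, not the model call: the interface exposes exactly one task, so the user never has to steer a conversation.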
AI Products Need Failure Mode Report Cards
Automate Clinic published a concept called AI Failure Mode Literacy: the ability to understand when and how AI tools succeed or fail. Right now, developing this intuition requires what they call an "insane amount" of hands-on use. Their proposal: every AI product should ship with a consistently updated report card documenting its failure modes. Not just accuracy benchmarks, but contextual failure patterns.
😤 Haters
"Vendors will never voluntarily publish their failure modes." Probably not, at least not the incumbents. But imagine a clinical AI marketplace where the tools that publish failure data earn trust faster than the ones that don't. Transparency becomes a competitive advantage. The first vendor to ship real failure documentation in a clinical context will differentiate on something no one else is offering.
"This is just model cards rebranded." Model cards describe training data and performance metrics. Failure mode reports would describe contextual behavior: when does this tool break in practice, under what clinical conditions, with what patient populations? That's operationally useful in a way model cards aren't.
💡 80/20: Next time you evaluate a clinical AI tool, ask the vendor: "What does this tool get wrong, and how often?" If they can't answer, that's your answer. Try: keep a running log of failure modes for every AI tool you use clinically, even just a shared doc. You're building the report card that doesn't exist yet.
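If a shared doc feels too loose, a CSV works just as well. Here is a minimal sketch of that running log; the filename, column set, and example entry are illustrative choices, not any standard.

```python
import csv
import datetime
from pathlib import Path

# Hypothetical log file: the report card the vendor didn't ship.
LOG = Path("ai_failure_log.csv")


def log_failure(tool: str, context: str, failure_mode: str, severity: str) -> None:
    """Append one observed failure; write the header row on first use."""
    is_new = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["date", "tool", "clinical_context", "failure_mode", "severity"])
        writer.writerow(
            [datetime.date.today().isoformat(), tool, context, failure_mode, severity]
        )


# Example entry (invented for illustration).
log_failure("ambient-scribe", "cardiology clinic", "dropped negative findings from HPI", "high")
```

Over a few months, grouping this by `tool` and `clinical_context` gives you exactly the contextual failure patterns the article says benchmarks miss.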
→ Full write-up
🛠️ From the Workbench
LangCare â Open Source MCP Server for FHIR + AI Agents
LangCare (formerly AgentCare) is an open-source FHIR MCP server written in Go that connects AI agents (Claude, ChatGPT, Gemini) to Epic, Cerner, and any FHIR R4 EMR. Ships with 40+ clinical agentic skills (medication management, lab interpretation, clinical decision support) and supports 150+ FHIR R4 resources. MIT licensed. Install via npm: npm install -g @langcare/langcare-mcp-fhir. Shared in the HTN community by creator Hari Kolasani; already past 1,600 installs in under three months.
⚠️ Verify: "HIPAA-compliant" is a vendor claim. Before routing any real patient data through this, confirm BAA availability, review the PHI scrubbing implementation, verify TLS configuration, and understand the zero-persistent-storage architecture yourself. Open source means you can audit it, which is an advantage, but also a responsibility.
😤 Haters
"An npm-installed MCP server handling PHI is a security nightmare waiting to happen." It's a reasonable concern. But the architecture is a stateless proxy: it translates MCP requests to FHIR calls and passes back structured responses. No persistent storage means no data at rest to breach. The real risk surface is in transit: OAuth2 token handling, TLS implementation, and whether the PHI scrubbing actually catches everything. Being open source means you can verify all of this, which is more than you can say for most commercial FHIR middleware.
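To make "stateless proxy" concrete, here is a toy sketch of the translation step: a tool call goes in, a FHIR R4 search request comes out, and nothing is stored in between. The tool names and mapping are invented for illustration; LangCare's actual tool schema will differ. The resource types (`MedicationRequest`, `Observation`, `Condition`) and `patient` search parameter are standard FHIR R4.

```python
# Public HAPI FHIR R4 test server (synthetic data only).
FHIR_BASE = "https://hapi.fhir.org/baseR4"

# Illustrative mapping: agent-facing tool -> (FHIR R4 resource, search parameter).
TOOL_TO_FHIR = {
    "get_medications": ("MedicationRequest", "patient"),
    "get_lab_results": ("Observation", "patient"),
    "get_conditions": ("Condition", "patient"),
}


def to_fhir_url(tool: str, patient_id: str) -> str:
    """Translate one MCP tool call into the FHIR search it stands for.
    No state is kept between calls: that's the whole 'stateless proxy' idea."""
    resource, param = TOOL_TO_FHIR[tool]
    return f"{FHIR_BASE}/{resource}?{param}={patient_id}"


print(to_fhir_url("get_medications", "example"))
# -> https://hapi.fhir.org/baseR4/MedicationRequest?patient=example
```

Because each request is a pure translation, there is no data at rest; the audit work shifts entirely to what happens in transit, exactly as argued above.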
"1,600 installs doesn't mean production-ready." Correct. But 1,600 installs in under three months for a healthcare-specific MCP server means there's real demand for this connector layer. If you're experimenting with AI agents in a sandbox environment with synthetic FHIR data, this is worth testing. Production with real PHI is a different conversation.
💡 80/20: If you've wanted to connect Claude or another LLM to a FHIR sandbox but didn't want to build the plumbing, this is your shortcut. Try: set up LangCare against a HAPI FHIR test server with synthetic data this weekend. Zero patient risk, full agent capability.
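For the weekend experiment above, wiring an MCP server into Claude Desktop follows the standard `mcpServers` pattern in the client config. A sketch, with two loud caveats: the `command` name and the `FHIR_BASE_URL` environment variable are guesses at how the npm package exposes itself; check the LangCare README for the real names before using this.

```json
{
  "mcpServers": {
    "langcare": {
      "command": "langcare-mcp-fhir",
      "env": {
        "FHIR_BASE_URL": "https://hapi.fhir.org/baseR4"
      }
    }
  }
}
```

Pointing it at the public HAPI R4 sandbox (synthetic data only) keeps the experiment zero-risk while you evaluate the agent side.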
→ Full write-up
🎯 Clinician-Builder Tip of the Day
Before you build the feature, build the test. Not a unit test: a clinical scenario test. Write down five specific patient encounters where your tool should help, and five where it should stay out of the way. Run those scenarios against your prototype before you write another line of code. The five "stay out of the way" cases will teach you more about your tool's actual value than the five where it shines. A tool that knows when to be quiet is more trustworthy than one that always has an answer.
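That scenario test can literally be code. A minimal sketch, assuming a hypothetical `should_activate` trigger: the keyword stub below stands in for your prototype's real logic, and the encounters are invented examples.

```python
def should_activate(encounter: str) -> bool:
    # Hypothetical stub trigger: flag only high-risk keywords.
    # Replace with a call into your actual prototype.
    risky = ("anticoagulant", "interaction", "renal", "opioid", "insulin")
    return any(keyword in encounter for keyword in risky)


SCENARIOS = [
    # (encounter description, expected: True = help, False = stay out of the way)
    ("new anticoagulant started, needs dosing check", True),
    ("possible drug interaction flagged on admission", True),
    ("renal dosing adjustment for CKD patient", True),
    ("opioid taper plan after discharge", True),
    ("insulin sliding scale ordered", True),
    ("routine med refill, stable patient", False),
    ("well-child visit, no medications", False),
    ("annual physical, normal labs", False),
    ("suture removal follow-up", False),
    ("vaccination-only visit", False),
]

failures = [e for e, expected in SCENARIOS if should_activate(e) is not expected]
print("failing scenarios:", failures)
```

The False rows are the valuable ones: every time a prototype change makes the tool pipe up during a vaccination-only visit, this list catches it.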
What are you building this week? Reply and tell me; I read every one.
– Kevin


