Your patients are already using ChatGPT 🤖, psychiatry gets an AI copilot 🧠, LLMs shrink 6x 📦
🔬 The Big Thing
One in three Americans used AI for health advice this year. Nearly half never talked to a doctor afterward.
The KFF Tracking Poll on Health Information and Trust, published March 25, found that 32% of American adults have used AI chatbots for physical or mental health information in the past year — matching social media as a health information source. For physical health, 29% have used AI. For mental health, 16%. Among 18-to-29-year-olds, mental health AI use hits 28% — three times the rate of adults over 50. Among uninsured adults, overall usage climbs to 40%, with 30% using AI specifically for mental health guidance.
The satisfaction numbers are high: 92% of physical health users felt satisfied with responses, and 69% trusted the information. But zoom out to the general public and trust craters — only 33% of all adults trust AI for health information, and just 23% trust it for mental health. The people using these tools are self-selected believers. Everyone else is watching skeptically.
Here’s the number that matters most for anyone building clinical tools: 42% of people who used AI for physical health questions did not follow up with a healthcare provider. For mental health, 58% skipped professional consultation entirely. Among 18-to-29-year-olds, the gap is even wider. The top reasons for using AI instead of a provider? Speed (65%), privacy (36%), and cost — 19% said they couldn’t afford healthcare, rising to 29% among young adults.
And 41% of AI health users have uploaded personal medical records — test results, doctor’s notes — into these chatbots. That’s 13% of all American adults handing their PHI to systems with no BAA, no HIPAA obligation, and no clinical accountability.
😤 Haters
“32% is still a minority — most people still see their doctor.” True. But 32% a year into mainstream AI chatbot adoption is not a minority trend, it’s an adoption curve. For context, telemedicine was at roughly 11% pre-pandemic and was considered transformative. And the 40% uninsured rate tells you where this is heading: AI-as-first-contact is becoming the default for people who can’t access the system, not a supplement to it.
“People are just looking up symptoms — that’s what WebMD was for.” The difference is that AI chatbots don’t present a list of possibilities. They answer in the first person with confident, conversational authority. A patient who Googles “chest pain” gets a list of 20 conditions. A patient who asks ChatGPT gets a paragraph that reads like a reassuring friend who went to medical school. The format changes the trust equation.
“If 92% are satisfied, what’s the problem?” Satisfaction measures whether the answer felt helpful, not whether it was correct. A confidently wrong answer about a medication interaction feels just as satisfying as a correct one — until it doesn’t. Schwartz’s vibe physics paper last week showed the same dynamic: Claude’s errors were invisible without domain expertise. Same principle, higher stakes.
💡 80/20: Your patients are arriving with AI-generated priors. That changes the encounter whether you acknowledge it or not. Try: Ask one patient this week, “Did you look this up with ChatGPT or another AI before coming in?” Track what they say. If you’re building patient-facing tools, this data suggests your competition isn’t another app — it’s the free chatbot they already used before they found you.
→ Full write-up
📡 Builder’s Radar
Blossom Health Raises $20M to Put an AI Copilot Alongside Psychiatrists
Blossom Health announced $20 million in seed and Series A funding led by Headline, with participation from Village Global, TA Ventures, Operator Partners, and Correlation Ventures. The New York-based telepsychiatry platform operates across nine states with over 100 psychiatrists treating 10,000+ patients. The AI copilot handles two distinct jobs. Between visits, it runs conversational text check-ins with patients — replacing static PHQ-9 questionnaires with adaptive follow-ups on sleep, mood, and medication adherence. During visits, it automates billing, scheduling, insurance coordination, and pharmacy management. Most patients get appointments within 48 hours, often same-day, with average copays around $22.
😤 Haters
“AI texting mental health patients between visits sounds like a liability nightmare.” The key distinction is that Blossom’s agents aren’t making clinical decisions — they’re gathering structured data and flagging warning signs for the psychiatrist. That said, the line between “monitoring” and “clinical decision support” gets blurry fast when an AI agent decides which patterns to escalate and which to ignore. The regulatory classification will matter.
“$20M to build a telepsych platform with a chatbot? Talkiatry raised $210M.” Scale matters less than the integration model. Talkiatry is a practice that happens to use tech. Blossom is betting that the AI layer — continuous monitoring, automated admin, adaptive check-ins — is what makes a small practice operate like a large one. Different thesis.
💡 80/20: Between-visit continuity is where most mental health care falls apart. The clinical insight here isn’t the AI — it’s that text-based asynchronous check-ins map to how patients actually communicate about mood. Reframe: If you’re building anything in chronic care, the visit is the least important touchpoint. The 29 days between visits is where outcomes are decided.
Google’s TurboQuant Cuts LLM Memory Use by 6x Without Quality Loss
Google published TurboQuant, a compression algorithm that shrinks the key-value (KV) cache in large language models by 6x while maintaining output accuracy. Early testing shows an 8x throughput increase alongside the memory savings. The KV cache stores each token's attention keys and values so they don't have to be recomputed at every generation step; it grows linearly with context length, which makes it the main memory bottleneck for running long-context conversations on limited hardware. TurboQuant compresses those cached values in place, so the same VRAM holds far more context.
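To see why a 6x reduction matters, here's a back-of-envelope sketch of KV cache memory. The config numbers are illustrative (a generic 7B-class model with 32 layers, 32 KV heads, head dimension 128, fp16 values), not TurboQuant's own code or any specific model's exact architecture:

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Memory needed to cache attention keys and values for one sequence."""
    # 2 = one key vector plus one value vector, per layer, per head, per token
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

# Illustrative 7B-class config at a 4,096-token context, fp16 (2 bytes/value)
full = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=4096)
print(f"fp16 KV cache at 4k context: {full / 2**30:.2f} GiB")   # ~2.00 GiB
print(f"with 6x compression:         {full / 6 / 2**30:.2f} GiB")
```

Roughly 2 GiB of cache drops to about a third of a GiB — and because the cache scales linearly with context length, the same hardware can hold roughly 6x more conversation history.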
😤 Haters
“Another compression paper from Google — wake me up when it ships in a product.” Fair. But TurboQuant targets the KV cache specifically, which is the binding constraint for running large models on consumer hardware with limited VRAM. If this ships in GGUF-compatible quantizations through the Ollama/llama.cpp ecosystem, it changes what you can run locally.
“6x compression sounds too good to be true — there’s always a quality tradeoff.” Quantization always trades precision for efficiency. The claim is that TurboQuant maintains accuracy on standard benchmarks, but benchmarks don’t capture every clinical edge case. The right frame: 16-bit to 8-bit quantization carries almost no penalty, and 4-bit performs at roughly 90% of the original. This pushes that frontier further.
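The precision-for-memory tradeoff is easy to demonstrate with a round-trip experiment. This is a toy sketch of plain symmetric uniform quantization — not TurboQuant's actual algorithm — on random weights standing in for a model layer:

```python
import numpy as np

def quantize(x, bits):
    # Symmetric uniform quantization to signed ints in [-(2^(b-1)-1), 2^(b-1)-1]
    levels = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / levels
    return np.round(x / scale).astype(np.int8), scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)  # stand-in for a weight vector

errors = {}
for bits in (8, 4):
    q, scale = quantize(w, bits)
    errors[bits] = float(np.abs(dequantize(q, scale) - w).mean())
    print(f"{bits}-bit mean abs error: {errors[bits]:.4f}")
```

The 8-bit round-trip error is an order of magnitude smaller than the 4-bit error, which is the intuition behind the "16-to-8-bit is nearly free, 4-bit costs you a little" rule of thumb. Production schemes add tricks (per-block scales, outlier handling) to claw back most of the 4-bit loss.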
💡 80/20: If you’re running Ollama or LM Studio locally, compression breakthroughs like this are what will eventually let you run a 70B-parameter model on a MacBook instead of a 7B model. Try: Next time you pull a model from Ollama, compare a Q4 quantized version against Q8 on the same clinical prompt. The quality gap is smaller than most people assume, and the speed difference is dramatic.
What are you building this week? Reply and tell me — I read every one.
— Kevin

