AI Risk Management in Mental Health: Beyond the Checklist
Why algorithmic harm assessment requires clinical judgement, not just technical audit, and how to build governance structures that reflect this.
Seasoned founders building in digital mental health have probably faced this conundrum at least once: build for depth of intervention, or build for a low-friction user experience? Most wanted to find the sweet spot between the two to widen the market opportunity. And I would argue that most who tried failed.
There are cases of success, of course. But the differentiator for those was usually timing and, ultimately, the maturity to pick a direction and follow it through, sometimes with two or more products built in parallel.
The depth versus UX tension is not just a product problem. It is a regulatory and business model problem in disguise. The population you claim shapes your classification. Your classification shapes your compliance obligations. Your compliance obligations shape your funding timeline and your go-to-market strategy. Founders who avoid that decision by building for everyone pay for it later, in time, money, and regulatory exposure.
This is why clinical safety input is an early decision, not a reactive one.
As a product develops, clinical risks shift, sometimes disproportionately. Founders unfamiliar with the regulatory landscape often discover too late that their hero feature, the one that makes the product distinctive, the one that investors love, is the exact feature that triggers a higher classification, a more complex safety case, or a compliance obligation they are not resourced to meet.
By that point, it is expensive to fix. Sometimes it is fatal to the product.
What early clinical safety input actually looks like: practical tips from a clinical perspective.
Population claim and intended purpose
Building something new is a creative process and it requires some freedom for ideas to flow. But for ideas to turn into projects, structure is necessary from the beginning.
And one of the pillars of that structure is the clinical claim, the intended purpose, and the definition of the target population.
The target population cannot be "people with mental health needs". It must name which people, with which presentations, at which stage of their journey. That decision determines everything downstream.
Ask: who is the riskiest person likely to use this product, and what happens if the product fails them?
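To make that concrete, here is a minimal sketch of what a structured intended-purpose record could look like. The field names and example values are hypothetical, for illustration only; this is not an MHRA template.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IntendedPurpose:
    """Illustrative intended-purpose record; all field names are hypothetical."""
    population: str            # which people
    presentations: list[str]   # which presentations are in scope
    exclusions: list[str]      # presentations explicitly out of scope
    journey_stage: str         # at which stage of their journey
    riskiest_user: str         # who is the riskiest person likely to use this?
    failure_consequence: str   # what happens if the product fails them?

example = IntendedPurpose(
    population="adults 18-65 with mild to moderate anxiety",
    presentations=["generalised anxiety", "low mood without suicidality"],
    exclusions=["active suicidality", "psychosis", "OCD"],
    journey_stage="post-assessment, between therapy sessions",
    riskiest_user="a user with undisclosed OCD drawn to the tracking features",
    failure_consequence="compulsive use and delayed help-seeking",
)
```

Writing the population claim down as structured fields, rather than prose, forces the team to answer the riskiest-user question explicitly instead of leaving it implicit.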
Classification review
Once your MVP is scoped, map your features against the MHRA two-gate qualification test. General wellness claims sit in one regulatory space. Products that influence clinical decisions, generate individualised recommendations, or interact with vulnerable populations sit in another.
The features that drive engagement (AI personalisation, adaptive content, behavioural nudges) are often the same features that push classification upward.
Know where you sit before you build, not after.
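As a rough illustration only, a team might run a hypothetical pre-screen like the one below before a formal review. This is not the MHRA two-gate qualification test itself; the attribute names are invented for this sketch, and the function only routes features to a human reviewer, it decides nothing.

```python
# Hypothetical pre-screen, NOT the MHRA two-gate qualification test itself.
# Attribute tags are illustrative; the output only triggers a formal review.
CLASSIFICATION_TRIGGERS = {
    "influences_clinical_decisions",
    "individualised_recommendations",
    "vulnerable_population_interaction",
    "physiological_inference",
    "crisis_detection",
}

def needs_formal_review(feature_attributes: set[str]) -> bool:
    """True if any attribute suggests the feature may exceed general wellness."""
    return bool(feature_attributes & CLASSIFICATION_TRIGGERS)

# The engagement features named above are exactly the ones that trip the screen.
assert needs_formal_review({"adaptive_content", "individualised_recommendations"})
assert not needs_formal_review({"static_psychoeducation"})
```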
Design controls
Design controls are not just technical constraints. They are clinical decisions about how a product should behave with a specific population.
A notification cadence that motivates a mildly anxious user can drive compulsive use in someone with OCD. A mood tracking feature that builds self-awareness in one person becomes a rumination engine in another. These are not edge cases. They are predictable consequences of building at scale without population-specific safeguards.
Document your design decisions. If you chose not to include a feature because of population risk, that reasoning belongs in your hazard log.
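One lightweight way to keep that reasoning auditable is to treat the hazard log as structured data rather than free text. A minimal sketch with hypothetical entries; note that the second entry records a decision not to build a feature, which is exactly the reasoning worth preserving.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HazardLogEntry:
    feature: str
    population: str
    hazard: str
    control: str      # design control applied, or a decision not to build
    rationale: str    # the reasoning that belongs in the hazard log

hazard_log = [
    HazardLogEntry(
        feature="notification cadence",
        population="users with OCD traits",
        hazard="prompt schedule reinforces compulsive checking",
        control="daily prompt cap; no streak mechanics",
        rationale="a cadence that motivates mild anxiety can drive compulsive use",
    ),
    HazardLogEntry(
        feature="open-ended mood journaling",
        population="users prone to rumination",
        hazard="free-text reflection becomes a rumination engine",
        control="feature excluded from this release",
        rationale="population risk outweighed the engagement benefit",
    ),
]
```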
Safety case
A safety case is not a compliance document. It is a structured argument that your product is safe enough for its intended use with its intended population.
The key word is intended. If your intended population is vague, your safety case will be vague. If your intended population is specific, your safety case becomes a genuine governance tool that product teams can actually use.
Post-market surveillance
Most post-market surveillance plans in digital mental health are built around adverse event reporting. Something goes wrong, someone reports it, it gets logged.
The harms that matter most are often not discrete events. They are gradual patterns. Increasing dependency. Worsening avoidance. Delayed help-seeking. Compulsive use that looks like engagement.
Build your surveillance to detect the signals that matter for your specific population, not just the ones that are easy to count.
Getting clinical safety input early is not a regulatory burden. It is the decision that protects your product, your timeline, and your users.
| Feature | Clinical risk to user | Regulatory consequence |
|---|---|---|
| AI chat companion | Dependency, parasocial attachment | AI features under active MHRA scrutiny; higher classification very likely |
| Wearable integration | False reassurance from noisy physiological data | Physiological inference for clinical purposes triggers SaMD classification |
| Personalised content | Wrong intervention for undisclosed presentation | Clinical outcome claims trigger Class IIa; intended purpose is a strategic document |
| Crisis detection | Missed escalation or false positive harm | Highest classification risk; failure mode has direct patient safety consequences |

| Vague population claim | Specific population claim |
|---|---|
| Broad intended use statement | Defined clinical population with known vulnerabilities |
| Generic hazard log | Hazard log mapped to real presentations |
| Design controls that cover no one specifically | Design controls built for actual risk profiles |
| Safety case that satisfies a checkbox | Safety case that guides product decisions |
| Post-market surveillance with no clinical baseline | Surveillance designed to detect meaningful signals |
| Feature | Possible hazard | Design control by population |
|---|---|---|
| AI chat companion | Dependency, delusional reinforcement | Response limits; hard escalation triggers; no open-ended therapeutic framing |
| Wearable integration | False reassurance; missed deterioration | Clinical validation per population; no inference without defined sensitivity thresholds |
| Personalised content | Wrong intervention for undisclosed presentation | Content selection logic auditable; validated per diagnostic group |
| Crisis detection | Missed escalation or false positive harm | Sensitivity and specificity defined per population; mandatory human escalation pathway |
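On the last row, "sensitivity and specificity defined per population" means validating the detector separately for each cohort rather than quoting one aggregate figure. A minimal sketch with hypothetical confusion-matrix counts:

```python
def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical validation counts (tp, fn, tn, fp) per population cohort.
# An aggregate figure can hide a cohort where the detector underperforms.
cohorts = {
    "mild-to-moderate anxiety": (45, 5, 900, 50),
    "severe depression":        (30, 20, 400, 50),
}
for population, counts in cohorts.items():
    sens, spec = sensitivity_specificity(*counts)
    print(f"{population}: sensitivity={sens:.2f}, specificity={spec:.2f}")
# In this made-up example, severe depression comes out at sensitivity 0.60:
# not a deployable crisis detector for that population.
```
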
| Signal | What it might mean |
|---|---|
| Engagement spike outside typical hours | Crisis state or compulsive use |
| Sustained high engagement then sudden drop | Burnout, shame, worsening |
| Repeated crisis feature use without escalation | Clinical care substitution |
| High engagement, no symptom improvement | Engagement and therapeutic benefit have diverged |
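To show how the first two of these signals might be operationalised, here is a minimal sketch assuming a simple per-user event log. The thresholds are placeholders; in practice they should come from a clinical baseline for the intended population, not from code defaults.

```python
from datetime import datetime

NIGHT_HOURS = range(1, 5)  # 01:00-04:59, "outside typical hours" (placeholder)

def night_time_spike(session_times: list[datetime], threshold: int = 5) -> bool:
    """Signal 1: engagement spike outside typical hours.

    May indicate a crisis state or compulsive use; flags for human review.
    """
    return sum(1 for t in session_times if t.hour in NIGHT_HOURS) >= threshold

def engagement_cliff(weekly_sessions: list[int]) -> bool:
    """Signal 2: sustained high engagement followed by a sudden drop.

    May indicate burnout, shame, or worsening rather than benign churn.
    """
    if len(weekly_sessions) < 4:
        return False
    *earlier, last = weekly_sessions[-4:]
    return min(earlier) >= 5 and last <= 1
```

Neither detector diagnoses anything; each routes a pattern to a clinician for interpretation, which is the point of surveillance built around gradual patterns rather than discrete adverse events.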
Dr Hellen von Winckler is a psychiatry doctor and Clinical Safety Officer at F&G Strategy Ltd, a regulatory consultancy specialising in Software as a Medical Device and digital mental health technology.