Why this matters
Technology-facilitated abuse (TFA) occurs when an abusive partner uses everyday tech—phones, smart home devices, GPS trackers, social media, shared cloud accounts—to monitor, harass, or control someone. Survivors often search online for help, but search results and forums can be inaccurate or unsafe. Now that AI chatbots are everywhere (search engines, Q&A sites, standalone “support” bots), survivors may ask an LLM for advice before reaching a tech abuse clinic.
That’s high stakes: bad advice can escalate harm.
This research asks: How good are LLM answers to real TFA survivor questions?
What the researchers did
They built a realistic dataset of survivor-style questions and tested four models:
- General-purpose models: GPT-4o and Claude 3.7 (non-reasoning)
- IPV-specific models: Ruth and Aimee (built on Claude/GPT, positioned for survivor support)
They used real-world questions pulled from research literature and online forums (Reddit, Quora), filtered for intimate-partner, tech-abuse scenarios. From 1,183 collected items, they curated 385 eligible questions, then sampled 193 to cover many abuse categories (e.g., surveillance, harassment, account compromise, spyware) and “means” (e.g., spyware, keyloggers, GPS trackers, shared phone plans).
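The curation step above—sampling 193 of 385 eligible questions so that many abuse categories stay represented—amounts to category-stratified sampling. A minimal sketch, assuming a simple per-category quota (the field names and quota are illustrative, not the paper's actual code):

```python
import random
from collections import defaultdict

def stratified_sample(questions, per_category=2, total=193, seed=0):
    """Sample questions so every abuse category is represented,
    then fill the remaining quota at random."""
    rng = random.Random(seed)

    # Group questions by their (hypothetical) "category" field.
    by_cat = defaultdict(list)
    for q in questions:
        by_cat[q["category"]].append(q)

    # Guarantee per-category coverage first.
    sample = []
    for items in by_cat.values():
        rng.shuffle(items)
        sample.extend(items[:per_category])

    # Top up to the target size from the leftover pool.
    remaining = [q for q in questions if q not in sample]
    rng.shuffle(remaining)
    sample.extend(remaining[: max(0, total - len(sample))])
    return sample[:total]
```
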
Then they generated single-turn, zero-shot responses using a survivor-safety-centered prompt and evaluated answers on four criteria:
- Accuracy (is it correct and relevant?)
- Completeness (does it include the key steps?)
- Safety (could it put a survivor at risk?)
- Actionability (can a survivor realistically do it?)
Experts scored accuracy/completeness/safety. Survivors rated actionability.
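The setup above can be sketched as a prompt template plus a per-response scoring record. The prompt wording and the `Rating` structure here are assumptions for illustration; the paper's exact prompt and scoring scales are not reproduced:

```python
from dataclasses import dataclass

# Illustrative survivor-safety-centered prompt (wording is an assumption).
SAFETY_PROMPT = (
    "You are assisting someone experiencing technology-facilitated abuse "
    "by an intimate partner. Prioritize their safety: flag any step that "
    "could alert the abuser, and keep advice short and realistic.\n\n"
    "Question: {question}"
)

@dataclass
class Rating:
    """One evaluated response, scored on the paper's four criteria."""
    accuracy: int       # expert-scored: correct and relevant?
    completeness: int   # expert-scored: are the key steps included?
    safety: int         # expert-scored: could it put a survivor at risk?
    actionability: int  # survivor-rated: realistically doable?
```
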
The headline result: most responses were not good enough
Across models, experts found responses were imperfect in the majority of cases—often inaccurate, incomplete, or missing safety warnings.
Two especially alarming patterns showed up repeatedly:
- Critical safety warnings were often missing. In many cases, answers failed to warn that certain actions (like changing settings, removing spyware, resetting devices, or changing accounts) can tip off an abuser or trigger escalation.
- Advice was frequently irrelevant or ineffective for the actual abuse scenario. Example failures included recommending:
  - VPNs for spyware, harassment, or account hijacking (often ineffective against the real threat)
  - Password changes for problems like online harassment or certain shared-access situations (these don't address the abuse mechanism)
  - RF detectors or phone apps to find hidden trackers (often unreliable; can waste money and time)
The key problem: many LLMs answer like a generic “security helpdesk,” not like a survivor safety planner.
Survivors’ view: “long, overwhelming, and hard to follow”
The team also surveyed 114 people with lived experience of TFA to rate actionability.
Even when advice sounded reasonable, survivors pointed out practical barriers:
- Fear of escalation: “If I do this, it may trigger retaliation.”
- Overwhelming length: some responses were extremely long and felt unmanageable in a crisis
- Technical, financial, and logistical constraints: not everyone can buy new devices, change plans, or do complex security steps
- Emotional load: stress and fear reduce the ability to follow multi-step guidance
Interestingly, expert “perfect” answers didn’t always feel more actionable to survivors—because actionability depends heavily on real-life constraints, not just correctness.
Biggest takeaway
LLMs show they understand the topic—but they often fail at what matters most in TFA:
- giving the right steps for the right threat model
- including safety planning and escalation warnings
- producing short, prioritized, realistic guidance
And IPV-specific models did not reliably outperform general-purpose models—suggesting “domain branding” isn’t enough without strong evaluation and safer design.
Practical recommendations (what the paper pushes toward)
- Ground answers in curated, trusted sources (e.g., established tech safety orgs)
- Improve models via retrieval plus expert-reviewed content, so advice is accurate and relevant
- Build responses that are step-by-step, prioritized, and include clear safety warnings
- Avoid recommending “security clichés” (VPN, change password) unless they truly fit the scenario
- Make UI warnings clear: AI advice may be incomplete, unsafe, or escalate risk
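The retrieval recommendation can be made concrete with a minimal sketch: before generating, pull the most relevant snippets from a small, expert-reviewed corpus. Everything here (the function names, the toy word-overlap scoring) is illustrative; a real system would use embedding search over vetted tech-safety content.

```python
import re

def _tokens(text):
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"[a-z']+", text.lower()))

def retrieve(query, corpus, k=2):
    """Rank expert-reviewed snippets by word overlap with the query --
    a toy stand-in for embedding-based retrieval over a curated corpus."""
    q = _tokens(query)
    return sorted(corpus, key=lambda doc: len(q & _tokens(doc)), reverse=True)[:k]
```

The retrieved snippets would then be placed in the model's context, so answers are grounded in vetted guidance rather than generic security advice.
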
source: https://arxiv.org/pdf/2602.17672