Mila x Bell x Kids Help Phone / Jeunesse, J'ecoute March 16--23, 2026
Team 404HarmNotFound
| Name | Role | Background |
|---|---|---|
| Shuchita Singh | ML / NLP Lead | AI Researcher, LLM fine-tuning, RAG, NLP, Responsible AI |
| Amanda Wu | UX / Presentation | Principal Product Designer (TD), 10 yrs FinTech UX, HCI Master (UCL) |
| Priyanka Naga | Clinical / Strategy | Business Leader (Ottawa Hospital), Innovation Lead (CHEO), MSc Data Science, PMP |
| Dominic D'Apice | Infra / Code / ML | Dev II AI Infra Azure, 25+ yrs Linux/DevOps, Data Science cert (TELUQ), Kaggle |
- Executive Summary
- Problem Statement
- Solution Overview
- Track 1: Adversarial Stress-Testing
- Track 2: Logic Hardening
- Track 3: Synthetic Data Augmentation
- Architecture
- Results and Metrics
- User Experience
- Canadian Multicultural Context
- Competitive Analysis
- Limitations and Future Work
- Conclusion
- References and Resources
CEDD (Conversational Emotional Drift Detection) is a real-time safety layer designed to sit alongside a youth mental health chatbot serving Canadians aged 16--22. Unlike systems that evaluate each message in isolation, CEDD monitors the trajectory of emotional deterioration across an entire conversation, detecting when messages grow shorter or more negative, lose questions and hope words, or drift semantically toward crisis language.
Key results:
- 90.0% cross-validated accuracy (+-1.6%, stratified k=4), up from a 66.7% baseline
- 67 engineered features combining lexical analysis, multilingual sentence embeddings, and behavioral coherence signals
- 36/36 adversarial test cases passing across 20 attack categories, with 0 critical misses
- 600 bilingual synthetic conversations (French + English) for training, including 120 adversarial scenarios
- 5-step warm handoff to a simulated Kids Help Phone counselor at crisis level
- Fully bilingual (Quebecois French + Canadian English) with culturally sensitive detection for somatization, identity conflict, and withdrawal patterns
CEDD processes emotional drift in approximately 0ms (feature extraction + ML classification), calling an LLM only to generate the adaptive response. This makes it fundamentally more efficient and explainable than multi-agent approaches that require multiple GPT-4-class calls per message.
Youth mental health chatbots face a critical safety challenge: detecting when a conversation is drifting toward crisis. Current approaches have significant blind spots:
- Per-message analysis misses the trajectory. A message saying "I'm fine" after six increasingly distressed messages is not fine---but a single-message classifier cannot know this.
- Keyword-only detection is trivially defeated by circumlocution, sarcasm, code-switching, or cultural expression patterns that do not use Western English crisis vocabulary.
- LLM-based detection (e.g., EmoAgent) requires multiple expensive API calls per message, creating latency, cost, and availability concerns for a 24/7 crisis service.
- Monolingual systems systematically fail Francophone, Indigenous, and culturally diverse Canadian youth who express distress differently.
- 46% of youth supported identify as 2SLGBTQ+
- 10% identify as Indigenous (2x population proportion)
- 75% share something they have never told anyone else
- Suicide contacts among youth 13 and under doubled in 4 years
- 71% of youth prefer non-verbal communication (JMIR, May 2025)
- 44% of 988 callers abandon before connecting (GSA/OES)
- 20% seek suicide help via text vs 5% via phone in Ontario (CBC)
These numbers demand a system that can detect distress across languages, cultures, and communication styles---and that hands off to human support seamlessly rather than displaying a phone number and ending the conversation.
CEDD operates as a lightweight, explainable safety layer with two phases:
generate_synthetic_data.py --> 600 bilingual conversations (Claude Haiku API)
|
train.py (cross-validate, fit, save)
|
models/cedd_model.joblib
User message
|
v
Feature Extractor --> 10 features/message --> 67 trajectory features
|
v
Classifier (6-gate safety logic + GradientBoosting)
|
v
Alert Level: Green / Yellow / Orange / Red
|
v
Response Modulator --> Adaptive system prompt --> LLM response
| \
v --> Warm handoff at Red (simulated KHP counselor "Alex")
Session Tracker --> SQLite (cross-session longitudinal tracking)
Design philosophy: Asymmetric errors. Over-alerting (false positives) is always preferable to missing a crisis (false negatives). The 6-gate safety logic enforces this: safety keyword rules override ML predictions by design, and low-confidence predictions default to elevated alert levels.
We built a comprehensive adversarial test suite (tests/adversarial_suite.py) with 36 test cases across 20 attack categories, designed to probe CEDD's robustness against the types of inputs that would cause a deployed youth mental health system to fail dangerously.
| # | Category | What it tests | Language |
|---|---|---|---|
| 1 | false_positive_physical | Physical pain without emotional distress (back pain, nausea) | FR |
| 2 | sarcasm | Sarcastic crisis language ("dying of laughter") | FR |
| 3 | negation | Negated distress ("I'm NOT suicidal") | EN |
| 4 | code_switching | Mid-conversation language switching (FR/EN) | Mixed |
| 5 | quebecois_slang | Quebecois expressions ("je suis brule", "j'ai la misere") | FR |
| 6 | gradual_drift_no_keywords | Slow deterioration without explicit crisis words | EN |
| 7 | direct_crisis | Explicit suicidal ideation | FR/EN |
| 8 | hidden_intent | Crisis masked in casual language | EN |
| 9 | manipulation_downplay | "I was just joking about wanting to die" | EN |
| 10 | somatization | Emotional distress expressed as physical symptoms | FR |
| 11 | identity_conflict | 2SLGBTQ+ rejection and identity distress | EN |
| 12 | sudden_escalation | Rapid shift from casual to crisis | FR |
| 13 | active_bypass | Deliberate attempts to bypass safety detection | EN |
| 14 | rapid_recovery_manipulation | Fake recovery after crisis signals | EN |
| 15 | cultural_false_positive | Cultural expressions that look like distress | FR |
| 16 | neurodivergent_pattern | Literal/flat communication that is not distress | EN |
| 17 | emoji_only | Emoji-only messages (no text to analyze) | -- |
| 18 | repeated_word | Repeated single word ("ok ok ok ok") | EN |
| 19 | short_recovery | Brief message after recovery | EN |
| 20 | long_message | Very long messages with buried crisis signals | EN |
| 21 | neutral_personne_fr | French "personne" used as noun, not crisis keyword | FR |
| 22 | emoji_crisis | Crisis expressed through emoji sequences | -- |
| Milestone | Tests | Passed | Critical Misses | Key Fix |
|---|---|---|---|---|
| Baseline (March 10) | 10 | 7 | 0 | -- |
| Post data expansion | 10 | 9 | 0 | 320 training conversations |
| Post keyword fix | 10 | 10 | 0 | Crisis keyword expansion |
| Post features (67D) | 13 | 13 | 0 | Embeddings + coherence |
| Post 480 convos | 13 | 13 | 0 | Data augmentation |
| Post 600 convos | 30 | 30 | 0 | Adversarial archetypes |
| Current (word-boundary fix) | 36 | 36 | 0 | Regex \b, context-aware "personne", feminine forms |
The test suite enforces a strict safety contract:
- Exit code 0: All tests pass
- Exit code 1: Non-critical failures (over-alerting within tolerance)
- Exit code 2: Critical miss --- a crisis conversation was classified as Green or Yellow. This blocks any merge to main.
A critical miss means a youth in crisis would receive a generic supportive response instead of crisis intervention and resources. This is the worst possible outcome for a safety-critical application.
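The exit-code contract above can be expressed as a small function. The sketch below is illustrative only and is not taken from tests/adversarial_suite.py: the CaseResult record, its fields, and the suite_exit_code name are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CaseResult:
    """Hypothetical record for one adversarial test case."""
    name: str
    is_crisis: bool   # ground truth: the conversation depicts a crisis
    predicted: str    # "green", "yellow", "orange", or "red"
    passed: bool      # prediction fell inside the case's accepted range


def suite_exit_code(results: Iterable[CaseResult]) -> int:
    """Map suite results to the 0 / 1 / 2 contract described above."""
    results = list(results)
    # Exit code 2: a crisis conversation was classified as Green or Yellow.
    if any(r.is_crisis and r.predicted in ("green", "yellow") for r in results):
        return 2
    # Exit code 1: at least one case failed, but none was a critical miss.
    if any(not r.passed for r in results):
        return 1
    # Exit code 0: every case passed.
    return 0
```

In a CI job, the runner would typically end with sys.exit(suite_exit_code(results)), so that a return value of 2 fails the pipeline and blocks the merge.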
A significant Track 1 finding was that substring-based keyword matching caused both false positives and false negatives:
- "morte de rire" (dying of laughter) matched "mort" (death) --- false positive
- "une personne formidable" (a wonderful person) matched "personne" (nobody) --- false positive
- "mortel" (awesome, slang) matched "mort" (death) --- false positive
We replaced all substring checks (if word in text) with regex word-boundary matching (\b) and added context-aware handling for ambiguous words like "personne" (which means "nobody" in isolation but "a person" when preceded by an article). We also added feminine forms to all lexicons (e.g., "epuisee", "abandonnee", "desesperee").
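A minimal sketch of the difference between the two matching strategies, using the examples listed above. The function names are hypothetical, and the article list only contains the three articles named in this report ("une", "la", "cette"); the real lexicon handling may differ.

```python
import re

# Articles that mark "personne" as the noun "a person" (list taken from this
# report; a production lexicon would likely be longer).
ARTICLES = {"une", "la", "cette"}

# Optionally capture the word immediately before "personne".
_PERSONNE = re.compile(r"(?:\b(\w+)\s+)?\bpersonne\b", re.IGNORECASE)


def substring_match(word: str, text: str) -> bool:
    """Old behaviour: plain substring test."""
    return word in text.lower()


def boundary_match(word: str, text: str) -> bool:
    """New behaviour: the word must stand alone between word boundaries."""
    return re.search(rf"\b{re.escape(word)}\b", text, re.IGNORECASE) is not None


def personne_means_nobody(text: str) -> bool:
    """True if "personne" appears without a preceding article."""
    for match in _PERSONNE.finditer(text):
        previous_word = (match.group(1) or "").lower()
        if previous_word not in ARTICLES:
            return True
    return False
```

With these helpers, substring_match("mort", "morte de rire") is true while boundary_match("mort", "morte de rire") is false, and personne_means_nobody("une personne formidable") is false.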
The classifier uses a 6-gate decision pipeline that layers ML predictions with deterministic safety rules:
Gate 1: < 3 user messages? --> Return Green (insufficient data)
Gate 2: Safety keyword floor --> Crisis words force Orange/Red minimum
Gate 3: ML prediction --> GradientBoosting on 67D feature vector
Gate 4: Low confidence (< 45%)? --> Default to Yellow (cautious)
Gate 5: Short conversation (< 6)? --> Cap at Orange maximum
Gate 6: Safety floor enforcement --> ML cannot go below keyword level
Rationale: Gates 2 and 6 ensure that explicit crisis language always triggers an appropriate response, regardless of what the ML model predicts. Gate 4 ensures that uncertain predictions err on the side of caution. Gate 5 prevents premature Red alerts when there is insufficient conversational context.
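The gate ordering can be sketched as a single function. This is an illustration under stated assumptions, not the project's classifier: the function and parameter names are hypothetical, the keyword floor (Gate 2) and the ML prediction (Gate 3) are assumed to be computed upstream, Gate 5's threshold is assumed to count user messages, and Gate 4 is read literally as "replace the prediction with Yellow".

```python
from enum import IntEnum


class Alert(IntEnum):
    GREEN = 0
    YELLOW = 1
    ORANGE = 2
    RED = 3


MIN_USER_MESSAGES = 3    # Gate 1 threshold
LOW_CONFIDENCE = 0.45    # Gate 4 threshold
SHORT_CONVERSATION = 6   # Gate 5 threshold (assumed to count user messages)


def classify(n_user_messages: int, keyword_floor: Alert,
             ml_level: Alert, ml_confidence: float) -> Alert:
    # Gate 1: too little data to judge a trajectory.
    if n_user_messages < MIN_USER_MESSAGES:
        return Alert.GREEN
    # Gate 2: keyword_floor is the minimum level forced by crisis words.
    # Gate 3: start from the ML prediction.
    level = ml_level
    # Gate 4: an uncertain prediction defaults to Yellow (literal reading).
    if ml_confidence < LOW_CONFIDENCE:
        level = Alert.YELLOW
    # Gate 5: short conversations cannot reach Red on ML evidence alone.
    if n_user_messages < SHORT_CONVERSATION:
        level = min(level, Alert.ORANGE)
    # Gate 6: the final level never drops below the keyword floor.
    return Alert(max(level, keyword_floor))
```

Because Gate 6 runs last, a Red keyword floor still yields Red in a short conversation even though Gate 5 caps the ML prediction at Orange.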
We evolved from 7 basic features to 67 clinically motivated features across four layers:
| Feature | Clinical rationale |
|---|---|
| word_count | Shortening messages indicate withdrawal and disengagement |
| punctuation_ratio | Changes in punctuation reflect emotional state shifts |
| question_presence | Loss of questions indicates loss of engagement and curiosity |
| negative_score | Rising negativity across bilingual (FR+EN) lexicon |
| finality_score | Finality/ending language ("plus rien", "end it all") |
| hope_score | Declining hope words signal deterioration |
| length_delta | Message-to-message length changes track behavioral shifts |
| negation_score | Negated positive states ("can't cope", "ne suis pas bien") |
| identity_conflict_score | 2SLGBTQ+/cultural identity distress signals |
| somatization_score | Physical + emotional word co-occurrence (cultural expression) |
Each of the 10 per-message features is aggregated over the full conversation using 6 statistics: mean, standard deviation, slope, last value, maximum, and minimum. This captures the trajectory --- not just the current state, but the direction and volatility of emotional drift.
The slope statistic is particularly important: finality_score_slope captures whether finality language is increasing over time, while hope_score_slope captures whether hope is declining.
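As a rough illustration of the aggregation step, the sketch below turns one per-message feature series into its six trajectory statistics. The suffixes _mean, _slope, _last, and _max match feature names that appear in this report; the _std and _min suffixes, the use of population standard deviation, and the least-squares slope over message index are assumptions.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def slope(values: Sequence[float]) -> float:
    """Least-squares slope of the values against message index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


def aggregate(name: str, values: Sequence[float]) -> Dict[str, float]:
    """Six trajectory statistics for one per-message feature."""
    return {
        f"{name}_mean": mean(values),
        f"{name}_std": pstdev(values),
        f"{name}_slope": slope(values),
        f"{name}_last": values[-1],
        f"{name}_max": max(values),
        f"{name}_min": min(values),
    }
```

For a conversation whose word counts go 12, 9, 6, 3, aggregate("word_count", [12, 9, 6, 3]) gives a word_count_slope of -3.0, that is, messages shrinking by three words per turn. Applying the same function to all 10 per-message features yields the 60 trajectory features.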
Using paraphrase-multilingual-MiniLM-L12-v2 (a multilingual sentence transformer), we compute:
| Feature | What it measures |
|---|---|
| embedding_drift | Mean cosine distance between consecutive messages --- semantic wandering |
| crisis_similarity | Cosine similarity of the last message to a crisis language centroid |
| embedding_slope | PCA-reduced 1D slope --- directional semantic drift over time |
| embedding_variance | Mean pairwise cosine distance --- overall conversation coherence |
These features catch distress expressed through synonyms, paraphrases, and indirect language that lexical features alone would miss. The multilingual model handles both French and English natively.
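Three of the four embedding features reduce to cosine arithmetic over per-message vectors, which the sketch below shows in plain Python. It assumes the embeddings have already been computed (in CEDD, by the sentence transformer named above) and uses lists of floats as stand-ins; the function names are hypothetical, and embedding_slope is omitted because it needs a PCA step.

```python
import math
from itertools import combinations
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_distance(a: Vector, b: Vector) -> float:
    return 1.0 - cosine_similarity(a, b)


def embedding_drift(embeddings: Sequence[Vector]) -> float:
    """Mean cosine distance between consecutive messages."""
    pairs = list(zip(embeddings, embeddings[1:]))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


def embedding_variance(embeddings: Sequence[Vector]) -> float:
    """Mean cosine distance over all pairs of messages."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(a, b) for a, b in pairs) / len(pairs)


def crisis_similarity(embeddings: Sequence[Vector], centroid: Vector) -> float:
    """Cosine similarity of the last message to a crisis-language centroid."""
    return cosine_similarity(embeddings[-1], centroid)
```

With the toy vectors [[1, 0], [1, 0], [0, 1]], the first two messages are identical and the third is orthogonal to them, so embedding_drift is 0.5.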
| Feature | What it measures |
|---|---|
| short_response_ratio | Fraction of user messages with fewer than 5 words (disengagement) |
| min_topic_coherence | Minimum cosine similarity between consecutive messages (topic jumping) |
| question_response_ratio | Fraction of assistant questions followed by a responsive reply |
These features detect behavioral withdrawal --- the pattern where a distressed user stops engaging meaningfully with the chatbot, giving short or tangential replies.
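Two of these signals can be computed from the transcript alone, as in the sketch below. The 5-word threshold comes from the table above; the definition of a "responsive" reply (a user message of at least 3 words directly after an assistant question) is an assumption made for illustration, as are the function names and the (role, text) turn format.

```python
from typing import List, Sequence, Tuple

Turn = Tuple[str, str]  # (role, text), role is "user" or "assistant"


def short_response_ratio(user_messages: Sequence[str], max_words: int = 5) -> float:
    """Fraction of user messages with fewer than max_words words."""
    if not user_messages:
        return 0.0
    short = sum(1 for message in user_messages if len(message.split()) < max_words)
    return short / len(user_messages)


def question_response_ratio(turns: List[Turn], min_words: int = 3) -> float:
    """Fraction of assistant questions that get a responsive user reply.

    "Responsive" is assumed here to mean a reply of at least min_words words.
    """
    asked = 0
    answered = 0
    for (role, text), (next_role, next_text) in zip(turns, turns[1:]):
        if role == "assistant" and "?" in text:
            asked += 1
            if next_role == "user" and len(next_text.split()) >= min_words:
                answered += 1
    # With no questions asked, nothing was ignored.
    return answered / asked if asked else 1.0
```

A conversation where the user answers "It was pretty rough honestly" to one question and "nothing" to the next scores 0.5 on question_response_ratio.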
A critical logic hardening improvement: detecting negated positive states. "I'm fine" and "I'm not fine" have opposite meanings, but both contain positive words. We added regex-based negation patterns for both French and English:
- French: ne ... pas bien, ne ... plus capable, ne ... jamais
- English: can't cope, not okay, never going to be
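The patterns above might be encoded roughly as follows. This is a sketch, not the project's lexicon: the exact regexes, the choice to allow one word in the "..." gap, the handling of the elided form "n'", and scoring by match count are all assumptions.

```python
import re

# One word is allowed in the "..." gap, e.g. "ne suis pas bien".
# ['\u2019] accepts both straight and typographic apostrophes.
_NEGATION_PATTERNS = [
    # French
    r"\bn(?:e\s+|['\u2019])\w+\s+pas\s+bien\b",
    r"\bn(?:e\s+|['\u2019])\w+\s+plus\s+capable\b",
    r"\bn(?:e\s+|['\u2019])\w+\s+jamais\b",
    # English
    r"\bcan(?:['\u2019]t|not)\s+cope\b",
    r"\bnot\s+ok(?:ay)?\b",
    r"\bnever\s+going\s+to\s+be\b",
]
_COMPILED = [re.compile(p, re.IGNORECASE) for p in _NEGATION_PATTERNS]


def negation_score(text: str) -> int:
    """Count negated-positive-state patterns found in one message."""
    return sum(len(pattern.findall(text)) for pattern in _COMPILED)
```

Under these assumptions, "I'm fine" scores 0 while "I'm not okay and I can't cope" scores 2, which is the distinction a positive-word lexicon alone cannot make.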
Replaced all substring matching (if word in text) with regex word-boundary matching (\b), plus context-aware rules:
- "personne" preceded by a French article ("une", "la", "cette") is not a finality word
- Feminine forms added to all lexicons for gender-inclusive detection
- Prevents false positives from compound words ("mortel", "immortel")
All training data is 100% synthetic --- no real PII, per hackathon rules. We used the Claude Haiku API (generate_synthetic_data.py) to create realistic youth mental health conversations.
- 4 classes: Green (normal), Yellow (concerning), Orange (significant distress), Red (crisis)
- 60 conversations per class per language = 60 x 4 x 2 = 480
- Bilingual: authentic Quebecois French + Canadian English
- 12 user messages + 12 assistant messages per conversation
We identified 6 adversarial patterns that standard training data fails to cover:
| Archetype | What it captures | Why standard data misses it |
|---|---|---|
| physical_only | Distress expressed purely through physical symptoms | Standard data uses emotional vocabulary |
| sarcasm_distress | Sarcastic or ironic crisis language | Standard data is direct |
| adversarial_bypass | Deliberate attempts to evade detection | Standard data is cooperative |
| identity_distress | 2SLGBTQ+ and cultural identity crisis | Standard data uses generic scenarios |
| neurodivergent_flat | Flat affect that is not distress | Standard data has clear emotional arcs |
| crisis_with_deflection | Crisis signals immediately followed by deflection | Standard data is consistent |
Adding these 120 adversarial conversations:
- Improved CV variance from +-4.4% to +-1.5% (more stable model)
- Improved sample:feature ratio from 7.2:1 to 9.0:1
- Enabled 30/30 adversarial test pass rate (up from 13/13 on fewer tests)
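The sample:feature ratios quoted above follow directly from the dataset sizes and the 67-feature vector. The short check below reproduces that arithmetic; the variable and function names are illustrative only.

```python
N_FEATURES = 67

# 60 conversations per class, 4 classes, 2 languages.
standard = 60 * 4 * 2
adversarial = 120
total = standard + adversarial


def sample_feature_ratio(n_conversations: int, n_features: int = N_FEATURES) -> float:
    """Conversations per feature, rounded to one decimal place."""
    return round(n_conversations / n_features, 1)


ratio_before = sample_feature_ratio(standard)  # standard data only
ratio_after = sample_feature_ratio(total)      # with adversarial data

# Conversations needed to reach a 10:1 ratio with the same feature count.
needed_for_10_to_1 = 10 * N_FEATURES
```

By the same arithmetic, reaching a 10:1 ratio with 67 features would take 670 conversations, 70 more than the current 600.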
We built a Claude-based quality annotation pipeline that scored each synthetic conversation on realism, label accuracy, and clinical plausibility. While filtering low-quality conversations hurt accuracy (removing edge cases the model needed to learn from), the annotation insights informed which adversarial archetypes to add. The annotation scripts and intermediate data files were removed after serving their purpose.
Every conversation is generated in its target language with authentic regional expressions:
- French: Quebecois vocabulary, informal register (tutoiement), regional expressions
- English: Canadian context (university, CEGEP references, Canadian cultural elements)
- No translation artifacts: conversations are generated natively in each language, not translated
+---------------------------------------------------------------------+
| CEDD Safety Layer |
| |
| +------------------+ +------------------+ +----------------+ |
| | Feature Extractor| | Classifier | | Response | |
| | |--->| |--->| Modulator | |
| | 10 features/msg | | 6-gate safety | | | |
| | 67 trajectory | | GradientBoosting | | Adaptive | |
| | Bilingual NLP | | StandardScaler | | system prompts | |
| | Embeddings | | | | LLM fallback | |
| +------------------+ +------------------+ +----------------+ |
| | |
| +------------------+ +-------v--------+ |
| | Session Tracker | | LLM Chain | |
| | SQLite | | Cohere | |
| | Cross-session | | Groq | |
| | Withdrawal | | Gemini | |
| | detection | | Claude | |
| +------------------+ | Static text | |
| +----------------+ |
+---------------------------------------------------------------------+
|
+---------v---------+
| Streamlit UI |
| Bilingual |
| FR / EN toggle |
+-------------------+
- Scaler: StandardScaler --- normalizes 67 features to mean=0, std=1
- Model: GradientBoostingClassifier --- 200 trees, max_depth=3, learning_rate=0.1
- Validation: StratifiedKFold with k=4
- Serialization: joblib format (models/cedd_model.joblib)
To ensure 24/7 availability for a crisis service, CEDD uses a 5-step fallback chain:
- Cohere --- primary
- Groq (Llama 3.3 70B Versatile) --- fastest inference
- Google Gemini (2.5 Flash) --- tertiary
- Anthropic Claude (Haiku) --- quaternary
- Static emergency text --- guaranteed availability with crisis resources
If all LLM providers are down, the static fallback still provides Kids Help Phone contact information. No youth in crisis should ever see a blank screen.
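The fallback behaviour can be sketched as a loop over provider callables that ends in a static message. Nothing below calls a real SDK: the provider functions are hypothetical stand-ins with a (prompt) -> reply signature, and the static text is placeholder wording built from the Kids Help Phone contact details listed in this report.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]  # (name, function taking a prompt)

# Placeholder wording; the contact details are those listed in this report.
STATIC_FALLBACK = (
    "If you need help right now, you can reach Kids Help Phone at "
    "1-800-668-6868 or by texting 686868 (24/7, free, confidential)."
)


def generate_with_fallback(prompt: str, providers: Sequence[Provider],
                           static_text: str = STATIC_FALLBACK) -> Tuple[str, str]:
    """Try each provider in order; return (source_name, reply)."""
    for name, call in providers:
        try:
            reply = call(prompt)
        except Exception:
            # Any provider error (timeout, quota, outage) moves to the next one.
            continue
        if reply and reply.strip():
            return name, reply
    # Every provider failed or returned nothing: never show a blank screen.
    return "static", static_text
```

Returning the source name alongside the reply is what would let a UI show which provider answered, as the LLM source badge in the interface does.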
| Level | Color | Description | LLM behavior |
|---|---|---|---|
| 0 | Green | Normal conversation | Supportive, open-ended questions |
| 1 | Yellow | Concerning signs (fatigue, loneliness) | Enhanced emotional validation |
| 2 | Orange | Significant distress | Active support + crisis resources mentioned |
| 3 | Red | Potential crisis | Urgent referral, KHP resources prominently displayed |
At Red alert, CEDD replaces the industry-standard "cold referral" (display a phone number, end conversation) with a 5-step accompanied transition:
- Empathetic acknowledgment --- validate feelings, no hotline number yet
- Permission-based transition --- ask consent, frame as "upgrade" not rejection
- Context bridge --- generate anonymized summary for KHP responder
- Seamless connection --- text-based (686868), same modality, no story repetition
- Background monitoring + follow-up --- CEDD stays active, acknowledges returning users
The simulated counselor "Alex" uses an ASIST-trained (Applied Suicide Intervention Skills Training) persona with short responses (2-4 sentences), one question at a time, and consistent emotional validation before any questions.
SQLite-based longitudinal tracking records:
- Session metadata (user, timestamps, max alert level, message count)
- Alert events (level, confidence, trigger message)
- Handoff events (step reached, user response)
- Last activity timestamps (for withdrawal detection)
Users returning after >24 hours without closing their session trigger a withdrawal risk check and receive a welcome-back message.
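A minimal sketch of the withdrawal check using Python's built-in sqlite3 module. The table layout, column names, and function names are assumptions for illustration and are not CEDD's actual schema; only the 24-hour threshold and the "session not closed" condition come from the description above.

```python
import sqlite3
from datetime import datetime, timedelta

# Hypothetical schema: one row per session.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id INTEGER PRIMARY KEY,
    user_id TEXT NOT NULL,
    last_activity TEXT NOT NULL,          -- ISO 8601 timestamp
    closed INTEGER NOT NULL DEFAULT 0,    -- 1 if the user closed the session
    max_alert_level INTEGER NOT NULL DEFAULT 0,
    message_count INTEGER NOT NULL DEFAULT 0
)
"""


def open_tracker(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_activity(conn: sqlite3.Connection, user_id: str,
                    when: datetime, closed: bool = False) -> None:
    conn.execute(
        "INSERT INTO sessions (user_id, last_activity, closed) VALUES (?, ?, ?)",
        (user_id, when.isoformat(), int(closed)),
    )
    conn.commit()


def needs_withdrawal_check(conn: sqlite3.Connection, user_id: str, now: datetime,
                           threshold: timedelta = timedelta(hours=24)) -> bool:
    """True if the user's latest session was left open more than 24 hours ago."""
    row = conn.execute(
        "SELECT last_activity, closed FROM sessions "
        "WHERE user_id = ? ORDER BY last_activity DESC LIMIT 1",
        (user_id,),
    ).fetchone()
    if row is None:
        return False  # first visit: nothing to compare against
    last_activity, closed = row
    gap = now - datetime.fromisoformat(last_activity)
    return closed == 0 and gap > threshold
```

Storing timestamps as ISO 8601 text keeps them sortable with a plain ORDER BY, provided every row uses the same format.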
| Stage | CV Accuracy | Features | Training Data | Adversarial Tests |
|---|---|---|---|---|
| Baseline (March 10) | 66.7% +-26.4% | 42 (7x6) | 24 conversations | 7/10 |
| Data expansion | 91.2% +-1.5% | 48 (8x6) | 320 conversations | 9/10 |
| +Negation +Embeddings | 92.2% +-1.8% | 52 | 320 conversations | 9/10 |
| +Identity +Somatization +Coherence | 92.5% +-1.5% | 67 | 320 conversations | 13/13 |
| Data expansion to 480 | 91.7% +-4.4% | 67 | 480 conversations | 13/13 |
| Adversarial augmentation to 600 | 90.5% +-1.5% | 67 | 600 conversations | 30/30 |
| Word-boundary fix (current) | 90.0% +-1.6% | 67 | 600 conversations | 36/36 |
| Metric | Value |
|---|---|
| Cross-validated accuracy (k=4, stratified) | 90.0% +-1.6% |
| Feature count | 67 (60 trajectory + 4 embedding + 3 coherence) |
| Training conversations | 600 (480 standard + 120 adversarial) |
| Languages | 2 (French + English, 300 each) |
| Adversarial tests | 36/36 passing, 0 critical misses |
| Adversarial test categories | 20 |
| Sample:feature ratio | 9.0:1 (target: 10:1) |
| Train accuracy | ~100% (expected with 200 trees on 600 samples) |
| Rank | Feature | Importance | What it measures |
|---|---|---|---|
| 1 | word_count_max | 0.192 | Longest message in conversation |
| 2 | word_count_slope | 0.179 | Whether messages are getting shorter over time |
| 3 | word_count_last | 0.138 | Length of the most recent message |
| 4 | length_delta_mean | 0.075 | Average change in message length |
The dominance of word-count-related features aligns with clinical research: progressive shortening of messages is one of the strongest behavioral signals of emotional withdrawal and disengagement. This is a signal that per-message classifiers cannot detect --- only trajectory analysis reveals it.
The accuracy dropped slightly from 92.5% to 90.0% as we added adversarial training data. This is expected and desirable: the model traded a small amount of overall accuracy for significantly better robustness on adversarial edge cases. The CV variance also stabilized dramatically (from +-26.4% to +-1.6%), indicating a much more reliable model.
The Streamlit application supports full French/English toggle, with all UI elements, system prompts, feature labels, and crisis resources available in both languages.
| Feature | Description |
|---|---|
| Welcome card | Branded HTML card with brain emoji, bilingual title/description, call to action |
| Demo profiles | 5 selectable profiles (team members + Guest) with distinct longitudinal trajectories: stable green, gradual improvement, fluctuating, escalating, and fresh start |
| Feature importance chart | Collapsible Plotly horizontal bar chart showing top 5 features by composite score (model importance x scaled value), with 6 color categories and bilingual labels |
| Feature radar chart | Plotly Scatterpolar showing 10 per-message features normalized 0-1, with current message in alert-level color and first message as green ghost overlay |
| Side-by-side compare mode | Split view: raw LLM response (no system prompt) vs CEDD-modulated response. Demonstrates the value of adaptive prompting on crisis messages |
| Demo autopilot | Auto-plays a 9-message scenario (Felix in FR, Alex in EN) showing Green-to-Yellow-to-Orange drift. Judges can observe the full trajectory hands-free |
| Alert transition animation | CSS-animated toast notification when alert level increases |
| Chat timestamps | HH:MM timestamps below each message bubble |
| LLM source badge | Colored badge on each assistant message showing which LLM generated it |
| Alert level badge | Colored dot on each assistant message showing the CEDD classification |
| Export transcript | Download conversation + alert history as JSON |
| Simulated counselor handoff | At Red level, interactive handoff to "Alex" (KHP counselor persona) with blue gradient UI, distinct avatar, and ASIST-trained responses |
| About panel | Collapsible bilingual explanation of CEDD's operation |
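The composite score behind the feature importance chart (model importance x scaled value) could be computed along these lines. This is a hedged sketch: the function name is hypothetical, and taking the absolute value of the scaled feature, so that strongly negative standardized values still rank highly, is an assumption rather than documented behaviour.

```python
from typing import Dict, List, Tuple


def top_features(importances: Dict[str, float], scaled_values: Dict[str, float],
                 k: int = 5) -> List[Tuple[str, float]]:
    """Rank features by importance x |scaled value| and keep the top k."""
    scores = {
        name: importance * abs(scaled_values.get(name, 0.0))
        for name, importance in importances.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:k]


# Importances are the four values reported in the results section; the scaled
# values are invented for illustration.
example_importances = {
    "word_count_max": 0.192,
    "word_count_slope": 0.179,
    "word_count_last": 0.138,
    "length_delta_mean": 0.075,
}
example_scaled = {
    "word_count_max": 0.1,
    "word_count_slope": -2.0,
    "word_count_last": 1.0,
    "length_delta_mean": 0.5,
}
ranked = top_features(example_importances, example_scaled, k=2)
```

In this made-up example, word_count_slope ranks first even though word_count_max has the higher model importance, because its value in the current conversation is far from the training mean.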
Amanda Wu audited 6 platforms (including ChatGPT, Gemini, Character.AI, Wysa, and Woebot) across desirability, usability, and accessibility. No platform currently offers all of:
- Canadian-specific crisis resources (Kids Help Phone) --- CEDD does
- French-language crisis detection --- CEDD does
- Subtle/coded distress detection --- CEDD's trajectory analysis does
- Warm handoff to human responder --- CEDD does (5-step flow)
- Cross-session memory --- CEDD does (SQLite longitudinal tracking)
Canada is the most diverse G7 country. Crisis detection trained only on Western English expressions will systematically fail the most vulnerable youth. CEDD addresses this with culturally informed feature engineering:
| Cultural Group | Expression Pattern | CEDD Detection |
|---|---|---|
| Indigenous | Storytelling, substance references, holistic/spiritual framing | Trajectory features detect drift patterns regardless of vocabulary |
| South Asian | Somatization --- "my chest hurts" = emotional pain | somatization_score detects physical + emotional word co-occurrence |
| East Asian | Withdrawal, silence, minimizing ("I'm fine") | short_response_ratio + question_response_ratio catch disengagement |
| Francophone | French-language distress, Quebecois dialect | Native FR lexicons + bilingual embeddings |
| 2SLGBTQ+ | Identity conflict, coded rejection language | identity_conflict_score with dedicated phrase lists |
| Neurodivergent | Literal language, shutdown, emotional bursts | Temporal trajectory distinguishes from sustained crisis |
With 46% of Kids Help Phone youth identifying as 2SLGBTQ+ and 10% identifying as Indigenous, culturally insensitive detection is not just a technical limitation --- it is a safety failure that disproportionately affects the most at-risk populations.
EmoAgent is the closest academic reference point. It uses a multi-agent architecture with GPT-4o for both detection and response.
| Dimension | CEDD | EmoAgent |
|---|---|---|
| Detection speed | ~0ms (feature extraction + ML) | Multiple GPT-4o calls per message |
| Detection cost | $0 (local computation) | ~$0.04-0.10 per message |
| Explainability | Full (67 named features, composite scores, importance chart) | Black box (LLM reasoning) |
| Bilingual | Native FR + EN (lexicons, embeddings, prompts) | English only |
| Cross-session | SQLite longitudinal tracking + withdrawal detection | Per-conversation only |
| Cultural sensitivity | Somatization, identity conflict, coherence features | None |
| Clinical tools | 4-level alerts + 6-gate safety logic + warm handoff | PHQ-9, PDI, PANSS (validated scales) |
| Availability | 5-step fallback chain (4 LLM providers + static emergency text) | Single provider dependency |
Our positioning: "EmoAgent needs 4 GPT-4o calls per message. CEDD detects in ~0ms with 67 features and GradientBoosting, and only calls the LLM to modulate the response. A 5-step fallback chain (4 LLM providers plus static emergency text) ensures availability. It is the difference between an IDS that deep-inspects all traffic and a lightweight edge firewall --- and ours works in both French and English."
EmoAgent uses clinically validated instruments (PHQ-9, PDI, PANSS) as evaluation metrics. CEDD's thresholds are engineering-derived, not clinically validated. Borrowing simplified validated scales as complementary metrics is a planned improvement.
| Limitation | Impact | Mitigation |
|---|---|---|
| No clinical validation | Alert thresholds are engineering-derived | 6-gate safety logic provides deterministic safety floor |
| ML unreliable for short conversations (< 6 messages) | Cannot build trajectory from insufficient data | Gate 5 caps at Orange maximum |
| Sarcasm and periphrasis remain challenging | "je pese sur tout le monde" not reliably detected | Embeddings complement lexical features but do not fully solve this |
| Sample:feature ratio at 9.0:1 (ideal is 10:1) | Slight risk of overfitting | Adversarial augmentation improved from 7.2:1 |
| Identity conflict detection is phrase-based | May miss coded or indirect identity distress | Embeddings provide partial coverage |
| Over-prediction on short/ambiguous green conversations | Physical-only or neurodivergent patterns may trigger Orange | Documented as acceptable (over-alerting preferred to under-alerting) |
| Train accuracy ~100% | Indicates model memorization | CV accuracy (90.0%) shows generalization is reasonable |
| Improvement | Expected impact | Effort |
|---|---|---|
| LSTM/Transformer sequence model | +10-15% accuracy (learns message ordering natively) | 3-4 hrs |
| Minimization detection | Cross-reference "I'm fine" with behavioral signals | 1-2 hrs |
| Burst vs sustained temporal patterns | Better neurodivergent/ADHD handling | 2-3 hrs |
| Clinical validation study | Validated thresholds from mental health professionals | Weeks |
| Intra-session timing analysis | Detect pauses and response delays as signals | 1-2 hrs |
CEDD demonstrates that effective youth mental health crisis detection does not require expensive multi-agent LLM architectures. By combining trajectory-based feature engineering (67 features across lexical, embedding, and behavioral layers), deterministic safety gates, and adaptive LLM response modulation, CEDD achieves:
- Reliable detection: 90.0% +-1.6% cross-validated accuracy on 600 bilingual conversations
- Adversarial robustness: 36/36 test cases passing across 20 attack categories, 0 critical misses
- Explainability: every alert is traceable to specific named features and their scores
- Cultural sensitivity: somatization, identity conflict, and withdrawal detection for Canada's diverse youth population
- Operational resilience: 5-step fallback chain (4 LLM providers, then guaranteed static emergency text)
- Humane crisis response: 5-step warm handoff that accompanies youth to help rather than abandoning them with a phone number
The improvement trajectory tells the story: from 66.7% accuracy with 42 features and 24 conversations, to 90.0% accuracy with 67 features and 600 conversations. From 7/10 adversarial tests to 36/36. Every iteration was guided by the principle that in a youth mental health application, missing a crisis is the one failure mode that is never acceptable.
CEDD is not a replacement for human counselors. It is the safety layer that ensures no youth in crisis falls through the cracks of an AI conversation --- and that when crisis is detected, the transition to human support is seamless, respectful, and accompanied.
- Kids Help Phone / Jeunesse, J'ecoute: 1-800-668-6868 (24/7, free, confidential) --- text: 686868
- Suicide Crisis Helpline / Ligne de crise suicide: 9-8-8 (988.ca)
- Multi-Ecoute: 514-378-3430 (multiecoute.org)
- Tracom: 514-483-3033 (tracom.ca)
- Emergency services / Services d'urgence: 911
- EmoAgent: Multi-Agent Framework for Emotional Support (Princeton/Michigan, arXiv:2504.09689)
- ASIST (Applied Suicide Intervention Skills Training) --- LivingWorks
- Kids Help Phone Annual Report 2024 --- Service statistics and demographics
- JMIR (May 2025) --- Youth communication preferences in mental health contexts
- paraphrase-multilingual-MiniLM-L12-v2 --- Sentence-Transformers (Reimers & Gurevych, 2019)
- GradientBoostingClassifier --- scikit-learn (Pedregosa et al., 2011)
- Streamlit --- open-source app framework for ML
Report prepared by Team 404HarmNotFound for the Mila x Bell x Kids Help Phone Hackathon, March 2026. All training data is 100% synthetic --- no real PII was used at any stage.