Feedback Generator (Talk2Care)
The Feedback Generator produces structured, formative feedback for a nursing student after a VR conversation. It runs after a session ends and combines LLM feedback with deterministic analysis (conversation phases, Gordon patterns, speech metrics). Its output is consumed by the VR client and optionally sent as a PDF/email to the student.
This module sits at the end of the conversational AI pipeline and translates raw transcripts and metadata into actionable coaching. It targets learning outcomes (communication and anamnesis skills), not grading.
Feedback is grounded in formal conversation techniques used in nursing education:
- Phase-based interview structure (contact → exploration → closing) from phase_detection.py.
- Gordon Functional Health Patterns used as analytical categories and coverage targets: keyword definitions in gordon_patterns.py; turn-level detection in phase_detection.py.
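The keyword-based detection can be sketched as follows. The pattern names and keywords below are illustrative stand-ins; the real definitions live in gordon_patterns.py and phase_detection.py.

```python
# Minimal sketch of keyword-based Gordon pattern detection.
# Pattern names and keywords are invented examples; the real
# definitions live in gordon_patterns.py / phase_detection.py.
GORDON_KEYWORDS = {
    "nutrition": ["eat", "appetite", "diet", "weight"],
    "sleep_rest": ["sleep", "tired", "rest", "insomnia"],
    "activity_exercise": ["walk", "exercise", "mobility", "stairs"],
}

def detect_patterns(turn_text: str) -> set:
    """Return the Gordon patterns whose keywords appear in a turn."""
    tokens = set(turn_text.lower().replace("?", " ").replace(".", " ").split())
    return {
        pattern
        for pattern, keywords in GORDON_KEYWORDS.items()
        if tokens & set(keywords)
    }

def coverage(turns: list) -> float:
    """Fraction of configured patterns touched across all turns."""
    seen = set()
    for turn in turns:
        seen |= detect_patterns(turn)
    return len(seen) / len(GORDON_KEYWORDS)
```

Because matching is purely lexical, semantically equivalent phrasings that avoid the keywords are missed, which is one of the limitations noted later in this document.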
The tone and rules are formative:
- Prompts instruct the LLM to highlight strengths and improvements, not to assign grades (feedback.txt*).
- Deterministic checks enforce concrete evidence (metrics, quotes) and avoid hallucinated claims (feedback_formatter.py).
Trigger & Inputs
- Triggered by /feedback in app.py after a session ends.
- Uses conversation history and speech metadata stored during /generate.
- STT transcript and timestamps from faster-whisper.py.
- Conversation history from SQLite via db_utils.py.
- Metadata: audio duration, word counts, segment timestamps (for pauses).
- Scenario-specific feedback prompt from feedback.txt*.

Outputs
- Human-readable feedback text for UI/TTS.
- Structured JSON with sections, metrics, and evidence for client display.
- Speech metrics and icon states for the VR UI (Unreal Engine).
- Optional email summary with PDF attachment.
Data Handling
- Session history is stored as lines formatted "Speaker: text" in SQLite (db_utils.py).
- read_history returns a single string with newline-separated turns.
- The Feedback Formatter converts the history into structured turns; speaker labels are normalized to "student" or "patient" (conversation_history_to_turns in feedback_formatter.py).
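The conversion from "Speaker: text" lines to structured turns might look roughly like this. The speaker-alias mapping is an assumption; the real normalization lives in conversation_history_to_turns in feedback_formatter.py.

```python
# Sketch: convert newline-separated "Speaker: text" history into turns.
# STUDENT_LABELS is a hypothetical alias set; the real normalization
# lives in conversation_history_to_turns in feedback_formatter.py.
STUDENT_LABELS = {"student", "nurse", "user"}

def history_to_turns(history: str) -> list:
    turns = []
    for line in history.splitlines():
        if ":" not in line:
            continue  # skip malformed lines defensively
        speaker, text = line.split(":", 1)
        role = "student" if speaker.strip().lower() in STUDENT_LABELS else "patient"
        turns.append({"speaker": role, "text": text.strip()})
    return turns
```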
- Speech metadata stored in conversation_audio_metadata for speech analysis:
  - segments: list of {text, start, end}
  - audio_duration
  - word_timestamps (currently None)
- Text preprocessing before analysis:
  - Lowercasing and punctuation stripping for keyword detection.
  - Sentence splitting to detect question types and comprehension cues.
  - Filler detection uses tokenized segment text.

Feedback Logic & Processing Flow
1. Transcript analysis
   - Build structured turns and analyze phases (analyze_conversation_phases).
   - Extract cues: questions, empathy, paraphrases, filler phrases.
2. Pattern detection
   - Gordon pattern coverage via keyword matching (generate_pattern_feedback from gordon_patterns.py).
   - Deterministic detection per turn via detect_gordon_patterns in phase_detection.py.
3. Mapping to feedback rules
   - Build metadata (coverage %, speech metrics, missing patterns).
   - Generate a strict LLM prompt with metrics and allowed quotes (build_llm_prompt).
4. Aggregation & prioritization
   - Sanitize LLM output into the required sections.
   - Append deterministic sections (phases, speech, Gordon, action items).
   - Enforce a critical tone if Gordon coverage is low.
5. Formatting & packaging
   - Combine sections into plain text and structured JSON.
   - Attach speech metrics, icon states, and Gordon summary.
   - Optionally generate email + PDF.
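The stages above can be sketched end to end with toy implementations. Every function here is an illustrative stand-in, not the real code; the actual pipeline lives in feedback_formatter.py and phase_detection.py.

```python
# End-to-end sketch of the feedback flow with toy implementations.
# All helpers are illustrative stand-ins for the real module functions.

def parse_turns(history: str) -> list:
    turns = []
    for line in history.splitlines():
        speaker, _, text = line.partition(":")
        turns.append({"speaker": speaker.strip().lower(), "text": text.strip()})
    return turns

def analyze(turns: list) -> dict:
    student = [t["text"] for t in turns if t["speaker"] == "student"]
    questions = sum(t.count("?") for t in student)
    return {"student_turns": len(student), "questions": questions}

def build_llm_prompt(metrics: dict, quotes: list) -> str:
    # Strict prompt: only pre-extracted quotes may be cited as evidence.
    return (f"Give formative feedback. Metrics: {metrics}. "
            f"Quote ONLY from: {quotes}. Do not assign grades.")

def package(metrics: dict, llm_text: str) -> dict:
    sections = {"summary": llm_text, "analysis": metrics}
    text = "\n\n".join(f"{k}: {v}" for k, v in sections.items())
    return {"text": text, "structured": {"sections": sections}}

history = "Student: How do you sleep?\nPatient: Badly."
turns = parse_turns(history)
metrics = analyze(turns)
prompt = build_llm_prompt(metrics, quotes=["How do you sleep?"])
# The prompt would go to the LLM; here we fake its reply.
result = package(metrics, "You asked open questions early on.")
```

The key design point survives even in this toy form: the LLM only sees pre-computed metrics and pre-approved quotes, and the deterministic sections are appended outside the LLM's control.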
Deterministic analysis provides reliability and safety (no hallucinated quotes), while the LLM provides natural-language coaching. The hybrid approach keeps feedback grounded and explainable.
Metrics & Indicators

Generated metrics include:
- Speech rate (wpm), average pause length, pause distribution.
- Filler counts and ratio (per 100 words).
- Long pause count.
- Gordon coverage (overall and student-initiated).
- Phase coverage and per-phase rubric scores.
These are relative, formative indicators, not absolute scores. Educators should interpret them as signals of technique, not performance ranking.
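Because word-level timestamps are unavailable, pause metrics are derived from segment boundaries. A minimal sketch of that computation, with an illustrative long-pause threshold (real thresholds live in THRESHOLDS in feedback_formatter.py):

```python
# Sketch: speech rate and pause metrics from STT segment boundaries.
# The 1.0 s long-pause threshold is an invented example; real values
# live in THRESHOLDS in feedback_formatter.py.
LONG_PAUSE_S = 1.0

def speech_metrics(segments: list, audio_duration: float) -> dict:
    words = sum(len(s["text"].split()) for s in segments)
    # Gap between the end of one segment and the start of the next.
    pauses = [b["start"] - a["end"] for a, b in zip(segments, segments[1:])]
    pauses = [p for p in pauses if p > 0]
    return {
        "wpm": round(words / (audio_duration / 60), 1) if audio_duration else 0.0,
        "avg_pause_s": round(sum(pauses) / len(pauses), 2) if pauses else 0.0,
        "long_pauses": sum(p >= LONG_PAUSE_S for p in pauses),
    }
```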
Output Formats
- Plain-text feedback: the response or text field returned by /feedback, used for UI and TTS.
- structured.sections contains:
- summary, gespreksvaardigheden, comprehension, phase_feedback, speech, gordon, action_items, closing, plus LLM sections and analysis metadata.
- Used by client UI for structured display or analytics.
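A sketch of what the structured payload might look like. The section keys mirror the list above; every value shown is an invented example, not real output.

```python
# Illustrative shape of the structured feedback payload. Section keys
# mirror the documented list; all values are invented examples.
structured = {
    "sections": {
        "summary": "You built rapport quickly and asked open questions.",
        "gespreksvaardigheden": "Good use of paraphrasing.",
        "comprehension": "You checked understanding twice.",
        "phase_feedback": {"contact": "covered", "exploration": "partial"},
        "speech": {"wpm": 132, "long_pauses": 2},
        "gordon": {"coverage_pct": 60, "missing": ["sleep_rest"]},
        "action_items": ["Ask about sleep and rest."],
        "closing": "Keep practicing explicit summaries.",
    }
}
```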
- PDF report generated from the structured sections in pdf_generator.py.
- Includes summary, skills, comprehension, phases, speech, Gordon, actions, closing.
- Email summary generated by the LLM via email_formatter.py.
- Sent via SMTP in emailsender.py.

Configuration & Extensibility
- Thresholds live in feedback_formatter.py (THRESHOLDS).
- Phase config is DEFAULT_PHASE_CONFIG in phase_detection.py.
- Gordon pattern keywords: gordon_patterns.py and phase_detection.py.
- Pattern follow-up questions: PATTERN_QUESTIONS in feedback_formatter.py.
- Conversation feedback prompts: feedback.txt*.
- LLM prompt template for structured sections: build_llm_prompt.
- Speech analysis on/off via ENABLE_SPEECH_ANALYSIS in config.py.
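The kind of configuration involved might look like this. The names mirror the list above (THRESHOLDS, ENABLE_SPEECH_ANALYSIS); every numeric value is invented for illustration.

```python
# Illustrative configuration values. Names mirror the documented
# settings; the numbers are invented, not the real thresholds.
ENABLE_SPEECH_ANALYSIS = True  # toggle lives in config.py

THRESHOLDS = {                   # real values live in feedback_formatter.py
    "wpm_low": 100,              # below this, speaking rate is flagged slow
    "wpm_high": 170,             # above this, flagged fast
    "long_pause_s": 1.0,         # segment gap counted as a long pause
    "min_gordon_coverage": 0.5,  # below this, a critical tone is enforced
}
```

Per the guidelines at the end of this document, such values should not be changed without educational review.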
Data Retention & Privacy
- Messages and metadata are stored per session in SQLite and cleared after feedback (clear_history in db_utils.py).
- Audio recordings are not stored; only derived metadata and transcript details are kept.
- Feedback is delivered to the student user; no teacher dashboard is implemented in this repo.
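The per-session lifecycle could be sketched as follows. Table and column names are assumptions; the real helpers (read_history, clear_history) live in db_utils.py.

```python
# Sketch of the per-session history store and post-feedback cleanup.
# Table/column names are assumed; real helpers live in db_utils.py.
import sqlite3

def init(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE IF NOT EXISTS history "
                 "(session_id TEXT, line TEXT)")

def append_turn(conn, session_id: str, speaker: str, text: str) -> None:
    conn.execute("INSERT INTO history VALUES (?, ?)",
                 (session_id, f"{speaker}: {text}"))

def read_history(conn, session_id: str) -> str:
    rows = conn.execute("SELECT line FROM history WHERE session_id = ?",
                        (session_id,)).fetchall()
    return "\n".join(r[0] for r in rows)

def clear_history(conn, session_id: str) -> None:
    # Called after feedback is delivered: no transcript is retained.
    conn.execute("DELETE FROM history WHERE session_id = ?", (session_id,))
```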
Safety & Limitations
- The formatter enforces quoting only from extracted transcript snippets to avoid fabricated evidence.
- Gordon patterns are keyword-based; semantic equivalents may be missed.
- STT errors propagate into detection and metrics.
- No word-level timestamps; pause metrics are derived from segment boundaries.
- LLM feedback depends on prompt adherence and may be sparse if transcript is short.
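The quoting constraint might be enforced along these lines: any quoted "evidence" in the LLM output that is not an exact substring of the transcript is dropped. This is a sketch; the real sanitization is part of feedback_formatter.py.

```python
# Sketch: strip any quoted "evidence" the LLM emits that is not an
# exact substring of the transcript. Real logic: feedback_formatter.py.
import re

def sanitize_quotes(llm_text: str, transcript: str) -> str:
    def check(match: re.Match) -> str:
        quote = match.group(1)
        # Keep the quote only if it actually appears in the transcript.
        return match.group(0) if quote in transcript else "[quote removed]"
    return re.sub(r'"([^"]+)"', check, llm_text)
```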
Future Improvements
- Use word-level timestamps for precise timing and pause analysis.
- Replace keyword matching with classifier-based Gordon detection.
- Add multilingual normalization for mixed-language transcripts.
- Add teacher dashboard integration (if allowed) using structured JSON payloads.
- Introduce personalization by scenario difficulty or student history.
Guidelines for Changes
- Do not change THRESHOLDS, phase rubric, or Gordon categories without educational review.
- Keep LLM quoting constraints; they prevent hallucinated evidence.
- Avoid altering structured.sections keys; the PDF and email generators rely on them.
- When modifying prompts, ensure they remain consistent with deterministic sections and metrics.