YuhHearDem3 - Quick Reference

Essential Commands

Transcription

# Basic transcription
python transcribe.py --order-file order.txt

# With custom segment duration
python transcribe.py --order-file order.txt --segment-minutes 30

# From specific start time
python transcribe.py --order-file order.txt --start-minutes 60

# Limit segments for testing
python transcribe.py --order-file order.txt --max-segments 2

Knowledge Graph Extraction

# Extract KG from video
python scripts/kg_extract_from_video.py --youtube-video-id "VIDEO_ID"

# Extract KG from bill excerpts
python scripts/kg_extract_from_bills.py --max-bills 10

# With custom window parameters
python scripts/kg_extract_from_video.py --youtube-video-id "VIDEO_ID" --window-size 15 --stride 10

# Limit windows for testing
python scripts/kg_extract_from_video.py --youtube-video-id "VIDEO_ID" --max-windows 5

# Enable debug mode
python scripts/kg_extract_from_video.py --youtube-video-id "VIDEO_ID" --debug

API Server

# Start chat API
python -m uvicorn api.search_api:app --reload --host 0.0.0.0 --port 8000

# Enable tracing
CHAT_TRACE=1 python -m uvicorn api.search_api:app --reload

Cron Transcription

# Process watchlist
python scripts/cron_transcription.py --process

# List watchlist
python scripts/cron_transcription.py --list

# Add to watchlist
python scripts/cron_transcription.py --add "VIDEO_ID"

# Remove from watchlist
python scripts/cron_transcription.py --remove "VIDEO_ID"

Database Management

# Clear KG tables
python scripts/clear_kg.py --yes

# Migrate chat schema
python scripts/migrate_chat_schema.py

# Backfill speaker roles
python scripts/backfill_speaker_video_roles.py

# Ingest transcript JSON into Postgres
python scripts/ingest_transcript_json.py --transcript-file transcription_output.json --youtube-video-id "VIDEO_ID"

# Ingest bills into Postgres
python scripts/ingest_bills.py --scrape

Order Papers

# Ingest order paper PDF
python scripts/ingest_order_paper_pdf.py --file "order_paper.pdf"

# Match papers to videos
python scripts/match_order_papers_to_videos.py

# Export order paper
python scripts/export_order_paper.py --id "ORDER_ID"

Testing

# Run all tests
python -m pytest tests/ -v

# Run specific test file
python -m pytest tests/test_chat_agent_v2_unit.py -v
python -m pytest tests/test_kg_agent_loop_unit.py -v

# Lint
ruff check .
ruff check . --fix

# Type check
mypy lib/

API Endpoints

Base URL: `http://localhost:8000`

Method	Endpoint	Description
POST	`/search`	Hybrid search
POST	`/search/temporal`	Search with date/speaker/entity filters
GET	`/search/trends`	Trend analysis for entities
GET	`/speakers`	List speakers
GET	`/speakers/{speaker_id}`	Speaker details
GET	`/videos/{youtube_video_id}/speakers/{speaker_id}/roles`	Speaker roles for a video
POST	`/chat/threads`	Create thread
POST	`/chat/threads/{thread_id}/messages`	Send message
GET	`/chat/threads/{thread_id}/messages/stream`	Stream message response
GET	`/health`	Health check
GET	`/api`	API metadata
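
The path templates above can be turned into full URLs with a small helper. This is a minimal sketch using only the Python standard library: `endpoint_url` and `check_health` are hypothetical names introduced for illustration, not part of the project. Request and response bodies are not documented in this reference, so the sketch only builds URLs and reads the status code of `/health`.

```python
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # base URL as listed in this reference


def endpoint_url(template: str, **params: str) -> str:
    """Fill a path template such as '/speakers/{speaker_id}' and prefix BASE_URL."""
    # Percent-encode each value so an ID with special characters stays in one path segment.
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return BASE_URL + template.format(**encoded)


def check_health(timeout: float = 5.0) -> int:
    """Return the HTTP status code of GET /health (needs a running server)."""
    with urlopen(endpoint_url("/health"), timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only builds and prints URLs; no network access happens here.
    print(endpoint_url("/speakers/{speaker_id}", speaker_id="SPEAKER_ID"))
    print(endpoint_url(
        "/videos/{youtube_video_id}/speakers/{speaker_id}/roles",
        youtube_video_id="VIDEO_ID",
        speaker_id="SPEAKER_ID",
    ))
```

Call `check_health()` only after the API server has been started with the uvicorn command from the API Server section.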

Environment Variables

Variable	Description
`GOOGLE_API_KEY`	Google AI API key
`CHAT_TRACE`	Enable tracing (1/true/on)
`ENABLE_THINKING`	Enable model thinking
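
The table lists `1`, `true`, and `on` as values that enable `CHAT_TRACE`. A flag like this could be read as in the sketch below. The helper name `env_flag` is hypothetical, and the case-insensitive, whitespace-tolerant matching is an assumption; the project's own parsing may differ.

```python
import os
from typing import Mapping, Optional

# Values the table lists as enabling CHAT_TRACE.
TRUTHY_VALUES = {"1", "true", "on"}


def env_flag(name: str, environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the variable is set to one of the truthy values."""
    env = os.environ if environ is None else environ
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    return env.get(name, "").strip().lower() in TRUTHY_VALUES


if __name__ == "__main__":
    print("tracing enabled:", env_flag("CHAT_TRACE"))
```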

Key Files

Component	Location
Chat Agent	`lib/chat_agent_v2.py`
KG Agent Loop	`lib/kg_agent_loop.py`
Hybrid Graph-RAG	`lib/kg_hybrid_graph_rag.py`
Search API	`api/search_api.py`
Main Script	`transcribe.py`
KG Extraction	`lib/knowledge_graph/`
Order Papers	`lib/order_papers/`

Documentation

Document	Description
README.md	Project overview
COMPLETE_GUIDE.md	Full implementation guide
CODE_MAP_AND_REVIEW.md	Code structure
CHAT_TRACE.md	Debug tracing

Common Options

transcribe.py

Option	Default	Description
`--order-file`	Required	Path to order file
`--order-paper-id`	None	Order paper ID from database
`--segment-minutes`	30	Segment duration
`--overlap-minutes`	1	Segment overlap
`--start-minutes`	0	Start position
`--max-segments`	None	Limit segments
`--output-file`	Varies	Output file path
`--video`	None	YouTube ID/URL or gs:// URI
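
To see how `--segment-minutes`, `--overlap-minutes`, `--start-minutes`, and `--max-segments` could interact, the sketch below plans segment boundaries for a recording of known length. It is an illustration under stated assumptions, not the logic in `transcribe.py`: `plan_segments` is a hypothetical function, and it assumes each segment starts `segment_minutes - overlap_minutes` after the previous one.

```python
from typing import List, Optional, Tuple


def plan_segments(
    total_minutes: int,
    segment_minutes: int = 30,
    overlap_minutes: int = 1,
    start_minutes: int = 0,
    max_segments: Optional[int] = None,
) -> List[Tuple[int, int]]:
    """Return (start, end) minute pairs covering the recording with overlap."""
    if not 0 <= overlap_minutes < segment_minutes:
        raise ValueError("overlap must be non-negative and shorter than a segment")
    step = segment_minutes - overlap_minutes  # assumed distance between segment starts
    segments: List[Tuple[int, int]] = []
    start = start_minutes
    while start < total_minutes:
        if max_segments is not None and len(segments) >= max_segments:
            break
        end = min(start + segment_minutes, total_minutes)
        segments.append((start, end))
        if end >= total_minutes:
            break  # the last segment reached the end of the recording
        start += step
    return segments


if __name__ == "__main__":
    for start, end in plan_segments(total_minutes=90):
        print(f"{start:>3} -> {end:>3} min")
```

Under these assumptions a 90-minute recording with the defaults yields four segments, each sharing one minute with its predecessor, and `max_segments=2` keeps only the first two.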

kg_extract_from_video.py

Option	Default	Description
`--youtube-video-id`	Required	Video ID
`--window-size`	30	Utterances per window
`--stride`	18	Utterances between windows
`--max-windows`	None	Limit windows
`--model`	gemini-2.5-flash	Model to use
`--debug`	False	Save failed responses
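
The `--window-size`, `--stride`, and `--max-windows` options describe a sliding window over transcript utterances. The sketch below shows one common way such windows are formed. It assumes `--stride` is the offset between consecutive window starts, so the defaults would overlap by 12 utterances; `sliding_windows` is a hypothetical helper, not the code in `kg_extract_from_video.py`.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sliding_windows(
    utterances: Sequence[T],
    window_size: int = 30,
    stride: int = 18,
    max_windows: Optional[int] = None,
) -> List[Sequence[T]]:
    """Split utterances into overlapping windows of at most window_size items."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows: List[Sequence[T]] = []
    for start in range(0, len(utterances), stride):
        if max_windows is not None and len(windows) >= max_windows:
            break
        windows.append(utterances[start:start + window_size])
        if start + window_size >= len(utterances):
            break  # this window already reaches the final utterance
    return windows


if __name__ == "__main__":
    demo = [f"utterance {i}" for i in range(50)]
    for index, window in enumerate(sliding_windows(demo)):
        print(index, window[0], "...", window[-1], f"({len(window)} items)")
```

With a stride smaller than the window size, statements near a window boundary appear in two windows, which gives the extraction model context on both sides at the cost of some repeated processing.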

Database Tables

Chat Schema

chat_threads - Conversation threads
chat_messages - Messages with role/content
chat_thread_state - Persisted state

KG Schema

kg_nodes - Canonical nodes
kg_aliases - Alias index
kg_edges - Edges with provenance

Transcript Schema

paragraphs - Paragraphs with embeddings
sentences - Sentences with provenance
speakers - Speaker information
speaker_video_roles - Speaker roles per video
