Install

python -m pip install -r requirements.txt

Note: make sure you are using Python 3.11.9.

Running

cd ThirdEye
python gaze2/gaze_cursor.py --api

CLI

python agent_action.py \
  --image path.png \
  --doc_id X \
  --aoi_id Y \
  --aoi_type paragraph \
  --state confused
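
The flags above suggest an argument parser along the following lines. This is a minimal sketch, assuming agent_action.py uses argparse; the function name build_parser, the help strings, the defaults, and the restriction of --state to the four reader states named under Behavior are all assumptions, not the script's actual code.

```python
import argparse

# Reader states named in this README; whether the real script restricts
# --state to exactly these values is an assumption.
READER_STATES = ["confused", "interested", "skimming", "revising"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the CLI example."""
    parser = argparse.ArgumentParser(prog="agent_action.py")
    # Not marked required here, because --run_examples is used without them.
    parser.add_argument("--image", help="path to the AOI crop image")
    parser.add_argument("--doc_id", help="document identifier")
    parser.add_argument("--aoi_id", help="area-of-interest identifier")
    parser.add_argument("--aoi_type", default="paragraph", help="e.g. paragraph")
    parser.add_argument("--state", choices=READER_STATES, default="confused")
    return parser


# Parse the same arguments as the CLI example above.
args = build_parser().parse_args(
    ["--image", "path.png", "--doc_id", "X", "--aoi_id", "Y",
     "--aoi_type", "paragraph", "--state", "confused"]
)
print(args.image, args.doc_id, args.aoi_id, args.aoi_type, args.state)
```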

K2 Configuration Placeholders

The file agent_action.py includes these editable placeholders:

DEFAULT_K2_BASE_URL = "https://YOUR_K2_BASE_URL"
DEFAULT_K2_MODEL = "YOUR_K2_MODEL"
DEFAULT_K2_API_KEY_ENV = "K2_API_KEY"

You can either edit those constants or pass values at runtime.
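
One way the two options could fit together is sketched below: a value passed at runtime wins, otherwise the constant is used, and the API key itself is read from the environment variable the config names. The helper resolve_k2_config and the shape of its return value are hypothetical; only the three constants come from this README.

```python
import os
from typing import Optional

# Placeholders as listed above.
DEFAULT_K2_BASE_URL = "https://YOUR_K2_BASE_URL"
DEFAULT_K2_MODEL = "YOUR_K2_MODEL"
DEFAULT_K2_API_KEY_ENV = "K2_API_KEY"


def resolve_k2_config(base_url: Optional[str] = None,
                      model: Optional[str] = None,
                      api_key_env: Optional[str] = None) -> dict:
    """Hypothetical helper: runtime values win over the module constants."""
    env_name = api_key_env or DEFAULT_K2_API_KEY_ENV
    return {
        "k2_base_url": base_url or DEFAULT_K2_BASE_URL,
        "k2_model": model or DEFAULT_K2_MODEL,
        "k2_api_key_env": env_name,
        # Only record whether the key exists; never echo the secret itself.
        "k2_api_key_present": bool(os.environ.get(env_name)),
    }


print(resolve_k2_config(model="my-model"))
```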

Exact Run Steps (with your API key)

Set your API key env var (replace the value):

export K2_API_KEY="PASTE_YOUR_REAL_KEY_HERE"

Run in K2 mode (replace URL/model):

python agent_action.py \
  --image /absolute/path/to/crop.png \
  --doc_id doc-123 \
  --aoi_id aoi-9 \
  --aoi_type paragraph \
  --state confused \
  --llm_mode k2 \
  --k2_base_url "https://YOUR_K2_BASE_URL" \
  --k2_model "YOUR_K2_MODEL" \
  --k2_api_key_env K2_API_KEY

The output JSON includes:

telemetry.llm_config.k2_api_key_present: confirms whether the API key is visible to the process
telemetry.llm_preview: placeholder text showing whether the K2 configuration is complete
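
A quick way to check the first field from a script is sketched here. It assumes the command prints a single JSON object and that the field sits at the nested path named above; the function name k2_ready and the sample string are hypothetical.

```python
import json


def k2_ready(payload_json: str) -> bool:
    """Return True if the payload reports that the K2 API key was visible."""
    payload = json.loads(payload_json)
    llm_config = payload.get("telemetry", {}).get("llm_config", {})
    # A missing field is treated the same as "key not present".
    return bool(llm_config.get("k2_api_key_present", False))


# Hand-written sample, not real output from agent_action.py.
sample = '{"telemetry": {"llm_config": {"k2_api_key_present": true}, "llm_preview": "..."}}'
print(k2_ready(sample))
```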

Behavior:

1. Validates the image path.
2. Acquires text in strict priority order:
   1. AOIEvent.text_hint, if it is non-empty and longer than 20 characters.
   2. doc_text_provider.get_text(doc_id, aoi_id) (stub interface).
   3. OCR fallback via pytesseract.
   4. Image-only heuristics, if the OCR result is poor or empty.
3. Routes by reader state (confused, interested, skimming, revising).
4. Returns 1–3 action cards. Each card carries four required buttons and one optional button:
   - Explain (explain_short)
   - Explain deeper (explain_expanded)
   - Dismiss (dismiss)
   - I already know this (feedback_known)
   - Make flashcards (optional)
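
The text-acquisition priority order described above can be sketched as a simple fallback chain. This is an illustration under stated assumptions, not the code in agent_action.py: the function acquire_text, its callable parameters, and the source labels are hypothetical, and because this README does not define what counts as a "poor" OCR result, the sketch treats only empty output as poor.

```python
from typing import Callable, Optional, Tuple


def acquire_text(text_hint: Optional[str],
                 provider_lookup: Callable[[], Optional[str]],
                 ocr: Callable[[], Optional[str]]) -> Tuple[str, str]:
    """Return (text, source) by trying each source in priority order."""
    # 1. Use the hint only if it is non-empty and longer than 20 characters.
    if text_hint and len(text_hint.strip()) > 20:
        return text_hint.strip(), "text_hint"
    # 2. Ask the document text provider (stands in for get_text(doc_id, aoi_id)).
    provided = provider_lookup()
    if provided and provided.strip():
        return provided.strip(), "doc_text_provider"
    # 3. Fall back to OCR; a real caller would wrap pytesseract inside `ocr`.
    recognized = ocr()
    if recognized and recognized.strip():
        return recognized.strip(), "ocr"
    # 4. Nothing usable: signal the caller to use image-only heuristics.
    return "", "image_only_heuristics"


# The hint is too short and the provider has nothing, so OCR is used.
text, source = acquire_text("short", lambda: None, lambda: "Recognized text")
print(source, "->", text)
```

Passing the provider and OCR steps as callables keeps the later, more expensive sources from running when an earlier one already succeeded.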

Runnable examples (required)

Run all 3 examples:

python agent_action.py --run_examples

This prints JSON payloads for:

paragraph confusion
equation confusion
code confusion

Example output shape:

{
  "aoi_id": "aoi-p-1",
  "doc_id": "doc-paragraph",
  "state": "confused",
  "extracted_text": "Photosynthesis converts light energy into chemical energy...",
  "detected_language": "en",
  "actions": [
    {
      "title": "Direct explanation",
      "body": "Start here: ...",
      "buttons": [
        { "label": "Explain", "action_id": "explain_short" },
        { "label": "Explain deeper", "action_id": "explain_expanded" },
        { "label": "Dismiss", "action_id": "dismiss" },
        { "label": "I already know this", "action_id": "feedback_known" }
      ]
    }
  ],
  "suggested_prompts": [
    "[explain_short|short]\\n...",
    "[explain_short|expanded]\\n...",
    "[explain_expanded|short]\\n...",
    "[explain_expanded|expanded]\\n..."
  ],
  "telemetry": {
    "ocr_used": false,
    "confidence": 0.92,
    "heuristics": {
      "priority_order": [
        "text_hint_if_len_gt_20",
        "doc_text_provider_get_text",
        "ocr_with_pytesseract",
        "image_only_type_heuristics"
      ]
    }
  }
}
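
A consumer of this payload may want to confirm the shape before rendering it. The check below is a minimal sketch based only on the rules stated in this README (1 to 3 action cards, four required button ids); the function validate_payload and the sample payload are hypothetical.

```python
REQUIRED_ACTION_IDS = {"explain_short", "explain_expanded", "dismiss", "feedback_known"}


def validate_payload(payload: dict) -> list:
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = []
    actions = payload.get("actions", [])
    if not 1 <= len(actions) <= 3:
        problems.append(f"expected 1-3 action cards, got {len(actions)}")
    for index, card in enumerate(actions):
        present = {button.get("action_id") for button in card.get("buttons", [])}
        missing = REQUIRED_ACTION_IDS - present
        if missing:
            problems.append(f"card {index} is missing buttons: {sorted(missing)}")
    return problems


# Hand-written payload with one card that lacks the "dismiss" button.
sample_payload = {
    "actions": [
        {
            "title": "Direct explanation",
            "buttons": [
                {"label": "Explain", "action_id": "explain_short"},
                {"label": "Explain deeper", "action_id": "explain_expanded"},
                {"label": "I already know this", "action_id": "feedback_known"},
            ],
        }
    ]
}
print(validate_payload(sample_payload))
```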


ThirdEye

Your third eye: the browser that reads your mind

Features

- Dwell-based context capture: Hover over content for 2 seconds to automatically search for related information
- Centered context window: Captures the text centered on your cursor position (10 lines before and 10 lines after)
- Toggle on/off: Enable or disable the extension without reloading the page
  - Click the extension icon in the toolbar
  - Use the keyboard shortcut: Ctrl+Shift+G (or Cmd+Shift+G on Mac)
  - Click the play/pause button in the overlay
- Gaze tracking support: Optionally use an eye-tracking API for hands-free browsing
- Works on special pages: Google Docs, Google Slides, and PDF.js viewers
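
The centered context window can be illustrated with a few lines of slicing. The extension itself is JavaScript (content.js); this sketch is written in Python only to match the other examples in this README, and the function context_window and its parameters are hypothetical. It assumes the window is simply clamped at the start and end of the text.

```python
def context_window(lines, cursor_index, before=10, after=10):
    """Return the lines around cursor_index, clamped to the text bounds."""
    start = max(0, cursor_index - before)
    end = min(len(lines), cursor_index + after + 1)  # +1 keeps the cursor line
    return lines[start:end]


page = [f"line {i}" for i in range(100)]
window = context_window(page, 50)
print(window[0], "...", window[-1], f"({len(window)} lines)")
```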

Configuration

Edit content.js to customize:

- DWELL_TIME_MS: How long to hover before triggering, in milliseconds (default: 2000)
- CONTEXT_LINES_BEFORE / CONTEXT_LINES_AFTER: Context window size (default: 10 lines each)
- ENABLE_GAZE_MODE: Enable the gaze tracking API (default: false)
- GAZE_API_URL / ANALYZE_API_URL: URLs of the gaze tracking and analysis API endpoints
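
To show what DWELL_TIME_MS controls, here is a small dwell detector. It is a language-agnostic illustration written in Python, not the logic in content.js: the class DwellDetector, its update method, and the string target ids are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from typing import Optional

DWELL_TIME_MS = 2000  # default from the configuration list above


class DwellDetector:
    """Fires once when the pointer stays on the same target for dwell_time_ms."""

    def __init__(self, dwell_time_ms: int = DWELL_TIME_MS) -> None:
        self.dwell_time_ms = dwell_time_ms
        self._target: Optional[str] = None
        self._since_ms = 0
        self._fired = False

    def update(self, target: Optional[str], now_ms: int) -> bool:
        """Feed the current hover target; returns True when a dwell triggers."""
        if target != self._target:
            # Pointer moved to a different element: restart the timer.
            self._target, self._since_ms, self._fired = target, now_ms, False
            return False
        if target is None or self._fired:
            return False
        if now_ms - self._since_ms >= self.dwell_time_ms:
            self._fired = True  # do not trigger again until the target changes
            return True
        return False


detector = DwellDetector()
events = [("p1", 0), ("p1", 1999), ("p1", 2000), ("p1", 3000), ("p2", 3100), ("p2", 5100)]
results = [detector.update(target, now) for target, now in events]
print(results)
```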

