eye-in-hand

This project is about eye-in-hand action understanding. The aim is to infer a person's action from their gaze, hand trajectory, and hand shape, and to use these cues to predict which object the person is likely to pick up.

Pre-trained models are used to identify what is happening in the frame with regard to the person's actions.

The approach is inspired by the NVIDIA Cosmos-Reason1 paper: https://d1qx31qr3h6wln.cloudfront.net/publications/Cosmos_Reason1_Paper.pdf

Stages:

    system starts ->

        check eye gaze direction -> does an object exist there -> set value

        check whether hand movement exists -> plot the hand movement trajectory -> set value

        check the hand shape -> which object matches that hand shape -> set value

        -> use the 3 values to predict the next action

        -> check whether the action has been performed -> if not, repeat the process until the action is done
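The stage loop can be sketched in Python as follows. All helper names here (estimate_gaze_target, track_hand_trajectory, classify_hand_shape, predict_next_action, action_completed) are hypothetical placeholders standing in for this repo's actual modules, not its real API:

```python
# Minimal sketch of the stage loop described above; the helpers below are
# placeholders, not the repo's real implementations.

def estimate_gaze_target(frame):
    """Placeholder: return the object the gaze points at, or None."""
    return None

def track_hand_trajectory(frame):
    """Placeholder: return the hand's movement trajectory so far, or None."""
    return None

def classify_hand_shape(frame):
    """Placeholder: return the object class implied by the grip shape."""
    return None

def predict_next_action(gaze_obj, trajectory, shape_obj):
    """Placeholder: fuse the three cues into one action prediction."""
    return (gaze_obj, trajectory, shape_obj)

def action_completed(frame, prediction):
    """Placeholder: check whether the predicted action has happened."""
    return False

def run_pipeline(frames):
    """Repeat the perception -> prediction loop until the action is done."""
    for frame in frames:
        gaze_obj = estimate_gaze_target(frame)     # stage 1: eye gaze -> object
        trajectory = track_hand_trajectory(frame)  # stage 2: hand movement path
        shape_obj = classify_hand_shape(frame)     # stage 3: grip -> object
        prediction = predict_next_action(gaze_obj, trajectory, shape_obj)
        if action_completed(frame, prediction):
            return prediction  # action observed; stop
    return None  # frames ran out before the action completed
```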

How to run this:

Step 1: Clone the repository:

git clone https://github.com/SuryaViswanath/eye-in-hand.git

Step 2: Install the dependencies:

pip install -r requirements.txt

Step 3: Download and set up Ollama:

https://ollama.com/download

After installing Ollama, pull and run the model:

ollama run deepseek-r1:1.5b

Then start the Ollama server, which provides the LLM inference used for reasoning:

ollama serve
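Once the server is up, the system can query the model through Ollama's local HTTP API (POST /api/generate, port 11434 by default). A minimal sketch; the prompt and function name here are illustrative, not taken from main.py:

```python
# Query the local Ollama server for a reasoning step (illustrative only).
import json
import urllib.request

def ask_model(prompt, model="deepseek-r1:1.5b"):
    """Send a prompt to Ollama's /api/generate endpoint and return the reply."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

print(ask_model("Gaze: cup. Trajectory: toward the table. Hand shape: grip. "
                "Which object will the person pick up next?"))
```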

Step 4: Run the system:

python main.py

System Design: (see the system design diagram in the repository)

Example Outputs: (see the example output screenshots in the repository)

If you like what you see, please star the repo.

To Do:

  • Add object detection
  • Add live action anticipation
