Repository for the project on temporal reasoning for intelligent human-robot collaboration.
This project enables robots to perform temporal reasoning over past observations and carry out human instructions in the present, in a generalized manner via foundation models.
- Download the Whisper model from https://github.com/openai/whisper, and set up the server file.
- In the client file `ros_whisper.py`, set `self.host` and `self.port` to the server's IP address and the port number used by the server.
- Download CogVLM2 from https://github.com/THUDM/CogVLM2, and set up the server file.
- In `config/params.yaml`, set `cogvlm2_host_ip` and `cogvlm2_port` to the server's IP address and the port number used by the server.
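As an illustration, the CogVLM2 entries in `config/params.yaml` could look like the snippet below; the key names come from this README, while the address and port are placeholders for your own server.

```yaml
# CogVLM2 server address in config/params.yaml (placeholder values).
cogvlm2_host_ip: "192.168.1.10"   # IP of the machine running the CogVLM2 server
cogvlm2_port: 6000                # port the CogVLM2 server listens on
```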
- Download Grounded-SAM-2 from https://github.com/IDEA-Research/Grounded-SAM-2, and set up the server file.
- In `config/params.yaml`, set `sam2_host_ip` and `sam2_port` to the server's IP address and the port number used by the server.
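Likewise, a sketch of the Grounded-SAM-2 entries (placeholder values):

```yaml
# Grounded-SAM-2 server address in config/params.yaml (placeholder values).
sam2_host_ip: "192.168.1.11"   # IP of the machine running the Grounded-SAM-2 server
sam2_port: 6001                # port the Grounded-SAM-2 server listens on
```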
- Download the dataset from https://drive.google.com/drive/folders/1c78MIOhFKuIKvPrMw79iLxZvnk47zg0X?usp=sharing.
- Set the `dataset_folder_path` in `config/params.yaml` and in `config/baseline_params.yaml`.
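For example, the dataset entry in both config files might look like this; the path is a placeholder for wherever you extracted the download:

```yaml
# Dataset location in config/params.yaml and config/baseline_params.yaml (placeholder path).
dataset_folder_path: "/home/user/datasets/temporal_reasoning/"
```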
- Download CogVLM from https://github.com/THUDM/CogVLM, and set up the server file.
- In `config/baseline_params.yaml`, set `cogvlm_host_ip` and `cogvlm_port` to the server's IP address and the port number used by the server.
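A corresponding sketch of the CogVLM entries in `config/baseline_params.yaml` (placeholder values):

```yaml
# CogVLM (baseline) server address in config/baseline_params.yaml (placeholder values).
cogvlm_host_ip: "192.168.1.12"   # IP of the machine running the CogVLM server
cogvlm_port: 6002                # port the CogVLM server listens on
```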
- Set up a RealSense camera to record the input video.
- Set `video_length` in `config/params.yaml` to the maximum length of video that you want to record.
- The recorded video is stored in `output/run_output/input_video.mp4`.
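For illustration, the recording entry could look like the following; the value and its unit are assumptions, so check the comments in `config/params.yaml` for the exact semantics:

```yaml
# Maximum recording length in config/params.yaml (assumed to be in seconds; placeholder value).
video_length: 30
```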
- Set the `openai_api_key` and `pipeline_path` variables in `config/params.yaml` (see the example after this list).
- Run `test_run.py` to run the pipeline on the dataset. The output for each datapoint is saved in the `output/dataset/` folder.
- To test with real-time data, record a video using the RealSense camera and convert the input instruction using `ros_whisper.py`.
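A sketch of the remaining `config/params.yaml` entries referenced above; the key names come from this README, the values are placeholders, and the real API key should not be committed to version control:

```yaml
# Pipeline entries in config/params.yaml (placeholder values).
openai_api_key: "sk-your-key-here"               # your OpenAI API key
pipeline_path: "/home/user/temporal_reasoning/"  # assumed to point to this repository's checkout
```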
- Set