Multimodal Fusion for Cow Behavior Prediction

This is a research-backed project exploring multimodal sensor fusion for precise and robust dairy cow behavior recognition using real-world barn data.
Developed at IIT Ropar, this work combines video, sensor, and environmental signals using deep learning to classify behaviors such as lying, standing, feeding, and more.

This project was published at the 3rd International Conference on Agriculture-Centric Computation 2025, held in Guwahati, Assam, India (Paper ID: 93).

Overview

Traditional livestock monitoring methods (manual observation, single-sensor devices) are often inaccurate and labor-intensive, and they lack temporal or spatial resolution.

In this project, we propose two deep learning architectures that fuse data from multiple sources:

  • RGB multi-view barn videos
  • UWB location tracking
  • IMMU motion data
  • Ankle posture and head direction sensors
  • Environmental THI readings

This fusion improves behavior detection across varied lighting conditions, occlusions, and different animals. It lays the groundwork for real-time livestock monitoring systems in commercial dairy farms.

Dataset – MMCows

We use the MMCows dataset, a large-scale multimodal cow behavior dataset collected from a functioning dairy barn.

GitHub Page: Click Here

Official Project Page: https://engineering.purdue.edu/neis/research/projects/mmcows.html

Dataset on Hugging Face: MMCows @ Hugging Face

Dataset Highlights:

  • 16 dairy cows tagged with sensors for 24/7 monitoring
  • Multi-sensor streams: UWB, IMMU, pressure, ankle sensors, vaginal temperature, and RGB video
  • 213,000+ labeled image bounding boxes across 20,000 images
  • 7 behavior classes annotated per second: Walking, Standing, Lying, Feeding (Up/Down), Licking, Drinking
  • Multi-angle camera setup (GoPro HERO11, 4.5K resolution)
  • Synchronized timestamps for fusion
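
As an illustration of how the synchronized timestamps enable fusion, the sketch below aligns a hypothetical UWB stream to video frame timestamps with a nearest-timestamp join. The column names and values are invented for the example and do not reflect the dataset's actual schema.

# Illustrative sketch: align a sensor stream to video frame timestamps.
# "timestamp", "frame_id", and "uwb_x" are hypothetical column names.
import pandas as pd

frames = pd.DataFrame({"timestamp": pd.to_datetime(["2023-07-21 10:00:00",
                                                    "2023-07-21 10:00:01"]),
                       "frame_id": [0, 1]})
uwb = pd.DataFrame({"timestamp": pd.to_datetime(["2023-07-21 10:00:00.4",
                                                 "2023-07-21 10:00:01.1"]),
                    "uwb_x": [3.2, 3.4]})

# Nearest-timestamp join: each video frame picks the closest UWB reading.
aligned = pd.merge_asof(frames.sort_values("timestamp"),
                        uwb.sort_values("timestamp"),
                        on="timestamp", direction="nearest")
print(aligned)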

Methods

We developed and compared two multimodal deep learning models:

1. Fusion 1 – EfficientNet + DNN + Attention

  • EfficientNet-B0 extracts visual features
  • Sensor data is processed through a deep neural network
  • Attention mechanism fuses both streams
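
A minimal PyTorch sketch of this architecture is shown below. It assumes a recent torchvision, a 7-class output head, and one plausible form of attention fusion (a learned softmax weight per modality embedding); the layer sizes, sensor dimension, and exact attention formulation are illustrative assumptions, not the published configuration.

# Hypothetical sketch of Fusion 1: EfficientNet-B0 image branch + sensor DNN + attention fusion.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_b0  # requires a recent torchvision

class Fusion1(nn.Module):
    def __init__(self, sensor_dim=16, embed_dim=256, num_classes=7):
        super().__init__()
        # Visual branch: EfficientNet-B0 backbone; replacing the classifier with
        # Identity yields the pooled 1280-d feature vector.
        self.backbone = efficientnet_b0(weights=None)
        self.backbone.classifier = nn.Identity()
        self.img_proj = nn.Linear(1280, embed_dim)
        # Sensor branch: small DNN over the concatenated sensor vector (dimension assumed).
        self.sensor_net = nn.Sequential(
            nn.Linear(sensor_dim, 128), nn.ReLU(),
            nn.Linear(128, embed_dim), nn.ReLU(),
        )
        # Attention fusion: one learned scalar weight per modality embedding.
        self.attn = nn.Linear(embed_dim, 1)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, image, sensors):
        img = self.img_proj(self.backbone(image))          # (B, embed_dim)
        sen = self.sensor_net(sensors)                      # (B, embed_dim)
        stack = torch.stack([img, sen], dim=1)              # (B, 2, embed_dim)
        weights = torch.softmax(self.attn(stack), dim=1)    # (B, 2, 1)
        fused = (weights * stack).sum(dim=1)                 # (B, embed_dim)
        return self.head(fused)

# Example forward pass with dummy inputs.
logits = Fusion1()(torch.randn(2, 3, 224, 224), torch.randn(2, 16))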

2. Fusion 2 – Vision Transformer + Sensor Token (Best Performing)

  • Sensor data embedded as a token alongside image patches
  • ViT encoder learns global dependencies across modalities
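
A minimal PyTorch sketch of the sensor-token idea follows. The patch size, depth, embedding dimension, and the use of a CLS token for classification are illustrative assumptions rather than the exact published configuration.

# Hypothetical sketch of Fusion 2: ViT-style encoder with the sensor vector embedded
# as an extra token alongside the image patch tokens.
import torch
import torch.nn as nn

class Fusion2(nn.Module):
    def __init__(self, img_size=224, patch=16, dim=256, depth=4, heads=8,
                 sensor_dim=16, num_classes=7):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.sensor_embed = nn.Linear(sensor_dim, dim)           # sensor reading -> one token
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 2, dim))  # cls + patches + sensor
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=4 * dim, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, image, sensors):
        b = image.size(0)
        patches = self.patch_embed(image).flatten(2).transpose(1, 2)  # (B, N, dim)
        tokens = torch.cat([self.cls.expand(b, -1, -1),
                            patches,
                            self.sensor_embed(sensors).unsqueeze(1)], dim=1)
        x = self.encoder(tokens + self.pos)   # global attention across patches and sensor token
        return self.head(x[:, 0])             # classify from the CLS token

# Example forward pass with dummy inputs.
logits = Fusion2()(torch.randn(2, 3, 224, 224), torch.randn(2, 16))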

Split Strategies

  • Object-wise Split: Generalization across different cows
  • Temporal Split: Generalization across environmental and lighting variations
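
The sketch below illustrates the two strategies on a hypothetical per-sample metadata table with cow_id and timestamp columns; the repository's actual split logic may differ.

# Illustrative sketch of the two split strategies (column names are assumptions).
import pandas as pd

def object_wise_split(df: pd.DataFrame, held_out_cows: set):
    """Hold out whole animals so the model must generalize to unseen cows."""
    mask = df["cow_id"].isin(held_out_cows)
    return df[~mask], df[mask]          # train, test

def temporal_split(df: pd.DataFrame, cutoff):
    """Hold out a later time range so the model must generalize across
    lighting and environmental changes."""
    mask = df["timestamp"] >= cutoff
    return df[~mask], df[mask]          # train, test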

Behavior Labels

Behavior         Description
Walking          Locomotion across barn
Standing         Stationary upright posture
Lying            Resting or sleeping state
Feeding ↑ / ↓    Eating behavior with head angle
Licking          Tongue contact with objects
Drinking         Water intake behavior
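
A label-to-index mapping such as the following can be used when training the classifiers; the ordering here is purely an assumption and not the repository's actual encoding.

# Hypothetical label-to-index mapping for the 7 annotated behavior classes.
BEHAVIOR_CLASSES = {
    "walking": 0,
    "standing": 1,
    "lying": 2,
    "feeding_up": 3,
    "feeding_down": 4,
    "licking": 5,
    "drinking": 6,
}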

Key Results

Model               Average F1 Score    Best Behavior Accuracy
UWB only            0.717               Lying (0.961)
RGB (multi-view)    0.632               Lying (0.883)
Fusion 1 (EffNet)   0.810               Lying (0.998)
Fusion 2 (ViT)      0.836               Lying (1.000), Standing (0.972)

See experiments/ for full results and ablation studies.

Project Structure

cow-behavior-fusion/
├── data/ # Dataset EDA and setup
├── models/ # Model architectures (EfficientNet, ViT)
├── preprocessing/ # Image & sensor processing scripts
├── experiments/ # Evaluation, modality ablations
├── modules/ # Cow detection & classification
├── scripts/ # Training, evaluation, configs 
└── README.md

Setup

1. Clone the Repository

git clone https://github.com/your-username/cow-behavior-fusion.git
cd cow-behavior-fusion

2. Install Dependencies

Make sure you have Python 3.8+ installed, along with compatible versions of torch, numpy, opencv-python, and transformers. A CUDA-enabled GPU is recommended for training and faster inference.

3. Download the data

Download the dataset from Hugging Face and place it in a suitable location.
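
One possible way to fetch the data programmatically is via the huggingface_hub client, as sketched below. The repository ID is a placeholder that must be replaced with the actual MMCows dataset ID, and local_dir should point wherever the scripts expect the data.

# Sketch: download the MMCows dataset snapshot from Hugging Face.
# NOTE: "<mmcows-dataset-repo-id>" is a placeholder; substitute the real dataset ID.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="<mmcows-dataset-repo-id>",
    repo_type="dataset",
    local_dir="data/mmcows",
)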

4. Running the Code

Train Model

python scripts/train.py --model fusion2 --split temporal

Evaluate Model

python scripts/evaluate.py --weights saved_models/fusion2_best.pt

Preprocess Data

python preprocessing/srgb_proc.py

Citation

If you use this work in your research or development, please cite it.

Contributors

This project was developed by students and faculty at Indian Institute of Technology Ropar (IIT Ropar):

  1. Ajeet Kumar, Department of CSE, IIT Ropar
  2. Abhinav Upadhyay, Department of CSE, IIT Ropar
  3. Varun Kukreti, Department of CSE, IIT Ropar
  4. Vajja Yashaswini, Department of CSE, IIT Ropar
  5. Dr. Neeraj Goel, Professor, Department of CSE, IIT Ropar
  6. Dr. Mukesh Saini, Professor, Department of CSE, IIT Ropar

Acknowledgements

We would like to thank:

  1. The authors of the original MMCows dataset, which formed the foundation of this research.
  2. IIT Ropar for its continuous guidance, support, and research infrastructure.
  3. The open-source community for tools such as PyTorch, Transformers, and OpenCV, which made this project possible.

Contact

For questions, issues, or collaborations, please feel free to reach out to the contributors.
