Commit 67ca661

update readme
1 parent dddcc03 commit 67ca661

3 files changed

+200
-233
lines changed

README.md

Lines changed: 37 additions & 222 deletions
@@ -1,243 +1,58 @@
1-
# Mask R-CNN for Object Detection and Segmentation
1+
# Vehicle Detection in the WILD
22

3-
This is an implementation of [Mask R-CNN](https://arxiv.org/abs/1703.06870) on Python 3, Keras, and TensorFlow. The model generates bounding boxes and segmentation masks for each instance of an object in the image. It's based on Feature Pyramid Network (FPN) and a ResNet101 backbone.
3+
Object recognition can solve many real-world problems, but current ML research tends
4+
to focus on datasets that are standardised, clear, and clean. In the Indian context, there is a lot of uncertainty that
5+
we encounter because of non-standard practices, which makes understanding the scene, and making good decisions from it, a real challenge.
6+
Example: Imagine a crowded two-lane road in a metropolitan city. You will see many objects with complex relationships to one another in the scene.
7+
All these interlinked relationships make it really hard to make decisions.
48

5-
![Instance Segmentation Sample](assets/street.png)
9+
To better understand the task, I created a dataset from scratch using CVAT and trained a Mask R-CNN model on it. The model achieves 0.54 mAP.
610

7-
The repository includes:
8-
* Source code of Mask R-CNN built on FPN and ResNet101.
9-
* Training code for MS COCO
10-
* Pre-trained weights for MS COCO
11-
* Jupyter notebooks to visualize the detection pipeline at every step
12-
* ParallelModel class for multi-GPU training
13-
* Evaluation on MS COCO metrics (AP)
14-
* Example of training on your own dataset
11+
## Dataset:
12+
1. Recorded 3 videos in the busy city of Bangalore with a hand-held mobile camera while riding a bike. This yielded around 15000 images; after cleaning, 6000 images remained.
13+
2. Loaded the data into CVAT and annotated it using the tool's auto-annotation model and track feature. Annotating 1000 images takes around 1 hour.
14+
3. Exported the annotations in LabelMe format.
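The export step above could be read back for training roughly as follows. This is a minimal sketch, not the project's loader: it assumes the export follows the LabelMe XML layout (`<object>` elements with a `<name>` and a `<polygon>` of `<pt>` points), which may differ from the files actually produced here. The sample annotation, the `car` label, and the helper names `parse_labelme_xml` and `polygon_bbox` are hypothetical. The box order `(y1, x1, y2, x2)` follows the Matterport convention of deriving boxes from masks.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: file name, label, and coordinates are made up.
SAMPLE_XML = """
<annotation>
  <filename>frame_000001.jpg</filename>
  <imagesize><nrows>720</nrows><ncols>1280</ncols></imagesize>
  <object>
    <name>car</name>
    <deleted>0</deleted>
    <polygon>
      <pt><x>100</x><y>200</y></pt>
      <pt><x>300</x><y>200</y></pt>
      <pt><x>300</x><y>350</y></pt>
      <pt><x>100</x><y>350</y></pt>
    </polygon>
  </object>
</annotation>
"""


def parse_labelme_xml(text):
    """Parse one LabelMe-style XML annotation into labelled polygons."""
    root = ET.fromstring(text.strip())
    instances = []
    for obj in root.iter("object"):
        # Skip objects that were deleted in the annotation tool.
        if obj.findtext("deleted", default="0").strip() == "1":
            continue
        polygon = obj.find("polygon")
        if polygon is None:
            continue
        points = [
            (float(pt.findtext("x")), float(pt.findtext("y")))
            for pt in polygon.iter("pt")
        ]
        label = obj.findtext("name", default="").strip()
        instances.append({"label": label, "points": points})
    return {
        "filename": root.findtext("filename"),
        "height": int(root.findtext("imagesize/nrows")),
        "width": int(root.findtext("imagesize/ncols")),
        "instances": instances,
    }


def polygon_bbox(points):
    """Smallest axis-aligned box (y1, x1, y2, x2) enclosing a polygon."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(ys), min(xs), max(ys), max(xs))


annotation = parse_labelme_xml(SAMPLE_XML)
for instance in annotation["instances"]:
    print(instance["label"], polygon_bbox(instance["points"]))
```

A dataset loader would typically call a parser like this once per image and rasterise each polygon into a binary mask.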
1515

16+
## Model:
17+
1. Trained a Mask R-CNN model for object detection and segmentation, using the [Matterport MaskRCNN](https://github.com/matterport/Mask_RCNN) implementation of [Mask R-CNN](https://arxiv.org/abs/1703.06870) on Python 3, Keras, and TensorFlow.
1618

17-
The code is documented and designed to be easy to extend. If you use it in your research, please consider citing this repository (bibtex below). If you work on 3D vision, you might find our recently released [Matterport3D](https://matterport.com/blog/2017/09/20/announcing-matterport3d-research-dataset/) dataset useful as well.
18-
This dataset was created from 3D-reconstructed spaces captured by our customers who agreed to make them publicly available for academic use. You can see more examples [here](https://matterport.com/gallery/).
19+
2. Jupyter Notebook, `Kaggle-training.ipyb`: the model was trained and run for inference in a Kaggle GPU notebook.
1920

20-
# Getting Started
21-
* [demo.ipynb](samples/demo.ipynb) Is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images.
22-
It includes code to run object detection and instance segmentation on arbitrary images.
21+
3. Config File
22+
```python
# define a configuration for the model
23+
class MathikereTrainConfig(Config):
24+
    # define the name of the configuration
25+
    NAME = "mathikere_cfg"
26+
    # number of classes (background + 3 object classes)
27+
    NUM_CLASSES = 1 + 3
28+
29+
    GPU_COUNT = 1
30+
    IMAGES_PER_GPU = 1
31+
32+
    # number of training steps per epoch
33+
    STEPS_PER_EPOCH = 13
34+
```
2335

24-
* [train_shapes.ipynb](samples/shapes/train_shapes.ipynb) shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.
36+
4. Dataset loader: the `MathikereDataset(Dataset)` class
37+
5. Epochs: 5 for the head layers, 20 for the end-to-end model
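The config values and epoch counts listed above can be tied together in a short sketch. It is an illustration under assumptions, not the notebook's code: `StubModel` is a hypothetical stand-in that only records calls so the example runs without TensorFlow, the learning rate of 0.001 is assumed, and the helper names are made up. The `train(...)` call shape mirrors the Matterport `MaskRCNN.train` method, where `epochs` is the total epoch count reached rather than the number added by that call. Reading "E2E-Model: 20" as a total of 20 is therefore an assumption.

```python
class StubModel:
    """Hypothetical stand-in for a Mask R-CNN model; records train() calls."""

    def __init__(self):
        self.calls = []

    def train(self, train_dataset, val_dataset, learning_rate, epochs, layers):
        self.calls.append(
            {"epochs": epochs, "layers": layers, "learning_rate": learning_rate}
        )


def images_per_epoch(gpu_count, images_per_gpu, steps_per_epoch):
    """Images seen in one epoch: batch size times steps per epoch."""
    batch_size = gpu_count * images_per_gpu
    return batch_size * steps_per_epoch


def train_two_stage(model, train_set, val_set, learning_rate,
                    head_epochs=5, total_epochs=20):
    """Train the head layers first, then fine-tune all layers."""
    # Stage 1: only the randomly initialised head layers are updated.
    model.train(train_set, val_set, learning_rate=learning_rate,
                epochs=head_epochs, layers="heads")
    # Stage 2: all layers are updated, continuing up to total_epochs.
    model.train(train_set, val_set, learning_rate=learning_rate,
                epochs=total_epochs, layers="all")


# GPU_COUNT = 1, IMAGES_PER_GPU = 1, STEPS_PER_EPOCH = 13 from the config.
print(images_per_epoch(1, 1, 13))

model = StubModel()
train_two_stage(model, train_set=None, val_set=None, learning_rate=0.001)
for call in model.calls:
    print(call["layers"], call["epochs"])
```

With a batch size of 1, each epoch of 13 steps covers 13 images, so an "epoch" here is much shorter than a full pass over the dataset.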
2538

26-
* ([model.py](mrcnn/model.py), [utils.py](mrcnn/utils.py), [config.py](mrcnn/config.py)): These files contain the main Mask RCNN implementation.
39+
## Inference Video
40+
[![Inference video](https://img.youtube.com/vi/n03m9lC32H0/0.jpg)](https://www.youtube.com/watch?v=n03m9lC32H0)
2741

2842

29-
* [inspect_data.ipynb](samples/coco/inspect_data.ipynb). This notebook visualizes the different pre-processing steps
30-
to prepare the training data.
43+
### CVAT Annotation
3144

32-
* [inspect_model.ipynb](samples/coco/inspect_model.ipynb) This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.
45+
![](assets/vehicleDetection.png)
3346

34-
* [inspect_weights.ipynb](samples/coco/inspect_weights.ipynb)
35-
This notebooks inspects the weights of a trained model and looks for anomalies and odd patterns.
36-
37-
38-
# Step by Step Detection
39-
To help with debugging and understanding the model, there are 3 notebooks
40-
([inspect_data.ipynb](samples/coco/inspect_data.ipynb), [inspect_model.ipynb](samples/coco/inspect_model.ipynb),
41-
[inspect_weights.ipynb](samples/coco/inspect_weights.ipynb)) that provide a lot of visualizations and allow running the model step by step to inspect the output at each point. Here are a few examples:
42-
43-
44-
45-
## 1. Anchor sorting and filtering
46-
Visualizes every step of the first stage Region Proposal Network and displays positive and negative anchors along with anchor box refinement.
47-
![](assets/detection_anchors.png)
48-
49-
## 2. Bounding Box Refinement
50-
This is an example of final detection boxes (dotted lines) and the refinement applied to them (solid lines) in the second stage.
51-
![](assets/detection_refinement.png)
52-
53-
## 3. Mask Generation
54-
Examples of generated masks. These then get scaled and placed on the image in the right location.
55-
56-
![](assets/detection_masks.png)
57-
58-
## 4.Layer activations
59-
Often it's useful to inspect the activations at different layers to look for signs of trouble (all zeros or random noise).
60-
61-
![](assets/detection_activations.png)
62-
63-
## 5. Weight Histograms
64-
Another useful debugging tool is to inspect the weight histograms. These are included in the inspect_weights.ipynb notebook.
65-
66-
![](assets/detection_histograms.png)
67-
68-
## 6. Logging to TensorBoard
69-
TensorBoard is another great debugging and visualization tool. The model is configured to log losses and save weights at the end of every epoch.
70-
71-
![](assets/detection_tensorboard.png)
72-
73-
## 6. Composing the different pieces into a final result
74-
75-
![](assets/detection_final.png)
76-
77-
78-
# Training on MS COCO
79-
We're providing pre-trained weights for MS COCO to make it easier to start. You can
80-
use those weights as a starting point to train your own variation on the network.
81-
Training and evaluation code is in `samples/coco/coco.py`. You can import this
82-
module in Jupyter notebook (see the provided notebooks for examples) or you
83-
can run it directly from the command line as such:
84-
85-
```
86-
# Train a new model starting from pre-trained COCO weights
87-
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco
88-
89-
# Train a new model starting from ImageNet weights
90-
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet
91-
92-
# Continue training a model that you had trained earlier
93-
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5
94-
95-
# Continue training the last model you trained. This will find
96-
# the last trained weights in the model directory.
97-
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last
98-
```
99-
100-
You can also run the COCO evaluation code with:
101-
```
102-
# Run COCO evaluation on the last trained model
103-
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last
104-
```
105-
106-
The training schedule, learning rate, and other parameters should be set in `samples/coco/coco.py`.
107-
108-
109-
# Training on Your Own Dataset
110-
111-
Start by reading this [blog post about the balloon color splash sample](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46). It covers the process starting from annotating images to training to using the results in a sample application.
112-
113-
In summary, to train the model on your own dataset you'll need to extend two classes:
114-
115-
```Config```
116-
This class contains the default configuration. Subclass it and modify the attributes you need to change.
117-
118-
```Dataset```
119-
This class provides a consistent way to work with any dataset.
120-
It allows you to use new datasets for training without having to change
121-
the code of the model. It also supports loading multiple datasets at the
122-
same time, which is useful if the objects you want to detect are not
123-
all available in one dataset.
124-
125-
See examples in `samples/shapes/train_shapes.ipynb`, `samples/coco/coco.py`, `samples/balloon/balloon.py`, and `samples/nucleus/nucleus.py`.
126-
127-
## Differences from the Official Paper
128-
This implementation follows the Mask RCNN paper for the most part, but there are a few cases where we deviated in favor of code simplicity and generalization. These are some of the differences we're aware of. If you encounter other differences, please do let us know.
129-
130-
* **Image Resizing:** To support training multiple images per batch we resize all images to the same size. For example, 1024x1024px on MS COCO. We preserve the aspect ratio, so if an image is not square we pad it with zeros. In the paper the resizing is done such that the smallest side is 800px and the largest is trimmed at 1000px.
131-
* **Bounding Boxes**: Some datasets provide bounding boxes and some provide masks only. To support training on multiple datasets we opted to ignore the bounding boxes that come with the dataset and generate them on the fly instead. We pick the smallest box that encapsulates all the pixels of the mask as the bounding box. This simplifies the implementation and also makes it easy to apply image augmentations that would otherwise be harder to apply to bounding boxes, such as image rotation.
132-
133-
To validate this approach, we compared our computed bounding boxes to those provided by the COCO dataset.
134-
We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more,
135-
and only 0.01% differed by 10px or more.
136-
137-
* **Learning Rate:** The paper uses a learning rate of 0.02, but we found that to be
138-
too high, and often causes the weights to explode, especially when using a small batch
139-
size. It might be related to differences between how Caffe and TensorFlow compute
140-
gradients (sum vs mean across batches and GPUs). Or, maybe the official model uses gradient
141-
clipping to avoid this issue. We do use gradient clipping, but don't set it too aggressively.
142-
We found that smaller learning rates converge faster anyway so we go with that.
143-
144-
## Citation
145-
Use this bibtex to cite this repository:
146-
```
147-
@misc{matterport_maskrcnn_2017,
148-
title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},
149-
author={Waleed Abdulla},
150-
year={2017},
151-
publisher={Github},
152-
journal={GitHub repository},
153-
howpublished={\url{https://github.com/matterport/Mask_RCNN}},
154-
}
155-
```
156-
157-
## Contributing
158-
Contributions to this repository are welcome. Examples of things you can contribute:
159-
* Speed Improvements. Like re-writing some Python code in TensorFlow or Cython.
160-
* Training on other datasets.
161-
* Accuracy Improvements.
162-
* Visualizations and examples.
163-
164-
You can also [join our team](https://matterport.com/careers/) and help us build even more projects like this one.
16547

16648
## Requirements
167-
Python 3.6, TensorFlow 2.0, and other common packages listed in `requirements.txt`.
168-
169-
### MS COCO Requirements:
170-
To train or test on MS COCO, you'll also need:
171-
* pycocotools (installation instructions below)
172-
* [MS COCO Dataset](http://cocodataset.org/#home)
173-
* Download the 5K [minival](https://dl.dropboxusercontent.com/s/o43o90bna78omob/instances_minival2014.json.zip?dl=0)
174-
and the 35K [validation-minus-minival](https://dl.dropboxusercontent.com/s/s3tw5zcg7395368/instances_valminusminival2014.json.zip?dl=0)
175-
subsets. More details in the original [Faster R-CNN implementation](https://github.com/rbgirshick/py-faster-rcnn/blob/master/data/README.md).
176-
177-
If you use Docker, the code has been verified to work on
178-
[this Docker container](https://hub.docker.com/r/waleedka/modern-deep-learning/).
49+
Python 3.7.8, TensorFlow 2.0, and other common packages listed in `requirements.txt`.
17950

18051

18152
## Installation
18253
1. Clone this repository
18354
2. Install dependencies
184-
```bash
185-
pip3 install -r requirements.txt
186-
```
55+
``` pip3 install -r requirements.txt ```
18756
3. Run setup from the repository root directory
188-
```bash
189-
python3 setup.py install
190-
```
191-
3. Download pre-trained COCO weights (mask_rcnn_coco.h5) from the [releases page](https://github.com/matterport/Mask_RCNN/releases).
192-
4. (Optional) To train or test on MS COCO install `pycocotools` from one of these repos. They are forks of the original pycocotools with fixes for Python3 and Windows (the official repo doesn't seem to be active anymore).
193-
194-
* Linux: https://github.com/waleedka/coco
195-
* Windows: https://github.com/philferriere/cocoapi.
196-
You must have the Visual C++ 2015 build tools on your path (see the repo for additional details)
197-
198-
# Projects Using this Model
199-
If you extend this model to other datasets or build projects that use it, we'd love to hear from you.
200-
201-
### [4K Video Demo](https://www.youtube.com/watch?v=OOT3UIXZztE) by Karol Majek.
202-
[![Mask RCNN on 4K Video](assets/4k_video.gif)](https://www.youtube.com/watch?v=OOT3UIXZztE)
203-
204-
### [Images to OSM](https://github.com/jremillard/images-to-osm): Improve OpenStreetMap by adding baseball, soccer, tennis, football, and basketball fields.
205-
206-
![Identify sport fields in satellite images](assets/images_to_osm.png)
207-
208-
### [Splash of Color](https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46). A blog post explaining how to train this model from scratch and use it to implement a color splash effect.
209-
![Balloon Color Splash](assets/balloon_color_splash.gif)
210-
211-
212-
### [Segmenting Nuclei in Microscopy Images](samples/nucleus). Built for the [2018 Data Science Bowl](https://www.kaggle.com/c/data-science-bowl-2018)
213-
Code is in the `samples/nucleus` directory.
214-
215-
![Nucleus Segmentation](assets/nucleus_segmentation.png)
216-
217-
### [Detection and Segmentation for Surgery Robots](https://github.com/SUYEgit/Surgery-Robot-Detection-Segmentation) by the NUS Control & Mechatronics Lab.
218-
![Surgery Robot Detection and Segmentation](https://github.com/SUYEgit/Surgery-Robot-Detection-Segmentation/raw/master/assets/video.gif)
219-
220-
### [Reconstructing 3D buildings from aerial LiDAR](https://medium.com/geoai/reconstructing-3d-buildings-from-aerial-lidar-with-ai-details-6a81cb3079c0)
221-
A proof of concept project by [Esri](https://www.esri.com/), in collaboration with Nvidia and Miami-Dade County. Along with a great write up and code by Dmitry Kudinov, Daniel Hedges, and Omar Maher.
222-
![3D Building Reconstruction](assets/project_3dbuildings.png)
223-
224-
### [Usiigaci: Label-free Cell Tracking in Phase Contrast Microscopy](https://github.com/oist/usiigaci)
225-
A project from Japan to automatically track cells in a microfluidics platform. Paper is pending, but the source code is released.
226-
227-
![](assets/project_usiigaci1.gif) ![](assets/project_usiigaci2.gif)
228-
229-
### [Characterization of Arctic Ice-Wedge Polygons in Very High Spatial Resolution Aerial Imagery](http://www.mdpi.com/2072-4292/10/9/1487)
230-
Research project to understand the complex processes between degradations in the Arctic and climate change. By Weixing Zhang, Chandi Witharana, Anna Liljedahl, and Mikhail Kanevskiy.
231-
![image](assets/project_ice_wedge_polygons.png)
232-
233-
### [Mask-RCNN Shiny](https://github.com/huuuuusy/Mask-RCNN-Shiny)
234-
A computer vision class project by HU Shiyu to apply the color pop effect on people with beautiful results.
235-
![](assets/project_shiny1.jpg)
236-
237-
### [Mapping Challenge](https://github.com/crowdAI/crowdai-mapping-challenge-mask-rcnn): Convert satellite imagery to maps for use by humanitarian organisations.
238-
![Mapping Challenge](assets/mapping_challenge.png)
57+
``` python3 setup.py install ```
23958

240-
### [GRASS GIS Addon](https://github.com/ctu-geoforall-lab/i.ann.maskrcnn) to generate vector masks from geospatial imagery. Based on a [Master's thesis](https://github.com/ctu-geoforall-lab-projects/dp-pesek-2018) by Ondřej Pešek.
241-
![GRASS GIS Image](assets/project_grass_gis.png)
242-
# VehicleDetection
243-
# VehicleDetection

assets/vehicleDetection.png

1.25 MB