Region-based Convolutional Neural Network (RCNN) finetuned for an FPGA CNN-accelerator chip. Since the FPGA chip requires a specific convolutional layer, we finetune a VGG-Net pretrained on ImageNet, optimizing it for the FPGA by freezing all convolutional layers.
Closely modeled on the original RCNN paper. NOTE: My version of RCNN removes the SVM classifier and bounding-box regressor to better demonstrate the finetuning process, at the expense of accuracy.
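In Caffe, freezing layers during finetuning is done by zeroing the per-layer learning-rate multipliers in the train prototxt. A minimal sketch of one frozen convolutional layer (layer name and shape are illustrative; take the actual values from your VGG16 prototxt):

```
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  # lr_mult: 0 keeps this layer's parameters fixed during finetuning
  param { lr_mult: 0  decay_mult: 0 }  # weights
  param { lr_mult: 0  decay_mult: 0 }  # biases
  convolution_param { num_output: 64  pad: 1  kernel_size: 3 }
}
```

The fully connected layers keep nonzero `lr_mult` values so only they are updated.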
- Install Caffe. Refer to the Caffe website for more information.
- Download the image sets (JPEG) and annotation data (XML). In my finetuning, I used the ILSVRC14 DET set. Store these files in `$WORKSPACE/ILSVRC14_DET`, where `$WORKSPACE` is your current working directory.
- Create the necessary WindowData layer files by running

  ```
  ./scripts/fetch_selective_search_data.sh
  ./scripts/create_window_data.py window_data
  ```

  val1 should be used for training and val2 for validation. (You may need to put your own settings in `create_window_data.py`; use the `-h` flag for help.)
- Put your caffemodel into `$WORKSPACE/models/fpga_vgg16`. I used a variant of the VGG16 model with fixed-point precision convolutional layers.
- Train your network. Make sure that the `source` parameter of the data layer in `vgg16_freeze_finetune_trainval_test.prototxt` points to the correct directory of your window_data file(s). Depending on your GPU, the batch size may also need to be decreased. Change the `net` and `snapshot` parameters of `vgg16_freeze_finetune_solver.prototxt` to properly reflect your own working directory. Training should take roughly 10-13 hours on GPU (tested with a GTX 1070 using ~3GB memory).

  ```
  cd $WORKSPACE
  caffe train --solver=./models/fpga_vgg16/vgg16_freeze_finetune_solver.prototxt --weights=./models/fpga_vgg16/<YOUR_PRETRAINED_MODEL>.caffemodel --gpu all
  ```
- Demo your network by running `mydetect.py`. Use `VGG_ILSVRC_16_layers_deploy.prototxt` as the model when demoing. Refer to `mydetect.py -h` for usage.
Training data can be graphed using the included `CaffePlot.py`, which has been slightly modified from the original.
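The training curve is recovered by parsing the solver's console log. A minimal sketch, assuming the standard Caffe log format (`... solver.cpp:NNN] Iteration 100, loss = 0.523`); pass the resulting pairs to matplotlib to plot:

```python
import re

# Matches Caffe solver log lines such as:
#   "I0406 12:00:01.000 1234 solver.cpp:228] Iteration 100, loss = 0.523"
LOSS_RE = re.compile(r"Iteration (\d+), loss = ([0-9.eE+-]+)")

def parse_loss(log_lines):
    """Return (iteration, loss) pairs found in Caffe training-log lines."""
    points = []
    for line in log_lines:
        m = LOSS_RE.search(line)
        if m:
            points.append((int(m.group(1)), float(m.group(2))))
    return points
```

Per-output lines such as `Train net output #0: loss = ...` are deliberately skipped, since only the aggregate iteration loss is wanted for the curve.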