Commit ab110c0

Merge pull request opencv#10979 from dkurt:unite_dnn_samples
2 parents cc06935 + 538fd42 commit ab110c0

45 files changed: 2301 additions & 10810 deletions

doc/tutorials/dnn/dnn_googlenet/dnn_googlenet.markdown

Lines changed: 28 additions & 25 deletions
@@ -13,50 +13,53 @@ We will demonstrate results of this example on the following picture.
 Source Code
 -----------

-We will be using snippets from the example application, that can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/caffe_googlenet.cpp).
+We will be using snippets from the example application, that can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/classification.cpp).

-@include dnn/caffe_googlenet.cpp
+@include dnn/classification.cpp

 Explanation
 -----------

 -# Firstly, download GoogLeNet model files:
-[bvlc_googlenet.prototxt ](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/bvlc_googlenet.prototxt) and
+[bvlc_googlenet.prototxt ](https://github.com/opencv/opencv_extra/blob/master/testdata/dnn/bvlc_googlenet.prototxt) and
 [bvlc_googlenet.caffemodel](http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel)

 Also you need file with names of [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/browse-synsets) classes:
-[synset_words.txt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/synset_words.txt).
+[classification_classes_ILSVRC2012.txt](https://github.com/opencv/opencv/tree/master/samples/dnn/classification_classes_ILSVRC2012.txt).

 Put these files into working dir of this program example.

 -# Read and initialize network using path to .prototxt and .caffemodel files
-@snippet dnn/caffe_googlenet.cpp Read and initialize network
+@snippet dnn/classification.cpp Read and initialize network

--# Check that network was read successfully
-@snippet dnn/caffe_googlenet.cpp Check that network was read successfully
+You can skip an argument `framework` if one of the files `model` or `config` has an
+extension `.caffemodel` or `.prototxt`.
+This way function cv::dnn::readNet can automatically detects a model's format.

 -# Read input image and convert to the blob, acceptable by GoogleNet
-@snippet dnn/caffe_googlenet.cpp Prepare blob
-We convert the image to a 4-dimensional blob (so-called batch) with 1x3x224x224 shape after applying necessary pre-processing like resizing and mean subtraction using cv::dnn::blobFromImage constructor.
+@snippet dnn/classification.cpp Open a video file or an image file or a camera stream

--# Pass the blob to the network
-@snippet dnn/caffe_googlenet.cpp Set input blob
-In bvlc_googlenet.prototxt the network input blob named as "data", therefore this blob labeled as ".data" in opencv_dnn API.
+cv::VideoCapture can load both images and videos.
+
+@snippet dnn/classification.cpp Create a 4D blob from a frame
+We convert the image to a 4-dimensional blob (so-called batch) with `1x3x224x224` shape
+after applying necessary pre-processing like resizing and mean subtraction
+`(-104, -117, -123)` for each blue, green and red channels correspondingly using cv::dnn::blobFromImage function.

-Other blobs labeled as "name_of_layer.name_of_layer_output".
+-# Pass the blob to the network
+@snippet dnn/classification.cpp Set input blob

 -# Make forward pass
-@snippet dnn/caffe_googlenet.cpp Make forward pass
-During the forward pass output of each network layer is computed, but in this example we need output from "prob" layer only.
+@snippet dnn/classification.cpp Make forward pass
+During the forward pass output of each network layer is computed, but in this example we need output from the last layer only.

 -# Determine the best class
-@snippet dnn/caffe_googlenet.cpp Gather output
-We put the output of "prob" layer, which contain probabilities for each of 1000 ILSVRC2012 image classes, to the `prob` blob.
-And find the index of element with maximal value in this one. This index correspond to the class of the image.
-
--# Print results
-@snippet dnn/caffe_googlenet.cpp Print results
-For our image we get:
-> Best class: #812 'space shuttle'
->
-> Probability: 99.6378%
+@snippet dnn/classification.cpp Get a class with a highest score
+We put the output of network, which contain probabilities for each of 1000 ILSVRC2012 image classes, to the `prob` blob.
+And find the index of element with maximal value in this one. This index corresponds to the class of the image.
+
+-# Run an example from command line
+@code
+./example_dnn_classification --model=bvlc_googlenet.caffemodel --config=bvlc_googlenet.prototxt --width=224 --height=224 --classes=classification_classes_ILSVRC2012.txt --input=space_shuttle.jpg --mean="104 117 123"
+@endcode
+For our image we get prediction of class `space shuttle` with more than 99% sureness.
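As a reading aid for the "Get a class with a highest score" step in the tutorial text above, the sketch below shows the arg-max idea on a plain `std::vector<float>`. It uses only the standard library. `bestClassIndex` is a hypothetical name, and the real sample works on the network's output `cv::Mat` rather than on a vector, so treat this as an illustration of the idea and not as the sample's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: index of the largest value in a flat list of class scores.
// When several entries share the maximum, the first one wins.
std::size_t bestClassIndex(const std::vector<float>& scores)
{
    assert(!scores.empty());  // an empty output has no best class
    const auto best = std::max_element(scores.begin(), scores.end());
    return static_cast<std::size_t>(std::distance(scores.begin(), best));
}
```

The returned index is what would then be looked up in the class-names file passed via `--classes`.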

doc/tutorials/dnn/dnn_halide/dnn_halide.markdown

Lines changed: 4 additions & 43 deletions
@@ -74,46 +74,7 @@ When you build OpenCV add the following configuration flags:

 - `HALIDE_ROOT_DIR` - path to Halide build directory

-## Sample
-
-@include dnn/squeezenet_halide.cpp
-
-## Explanation
-Download Caffe model from SqueezeNet repository: [train_val.prototxt](https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/train_val.prototxt) and [squeezenet_v1.1.caffemodel](https://github.com/DeepScale/SqueezeNet/blob/master/SqueezeNet_v1.1/squeezenet_v1.1.caffemodel).
-
-Also you need file with names of [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/browse-synsets) classes:
-[synset_words.txt](https://raw.githubusercontent.com/opencv/opencv/master/samples/data/dnn/synset_words.txt).
-
-Put these files into working dir of this program example.
-
--# Read and initialize network using path to .prototxt and .caffemodel files
-@snippet dnn/squeezenet_halide.cpp Read and initialize network
-
--# Check that network was read successfully
-@snippet dnn/squeezenet_halide.cpp Check that network was read successfully
-
--# Read input image and convert to the 4-dimensional blob, acceptable by SqueezeNet v1.1
-@snippet dnn/squeezenet_halide.cpp Prepare blob
-
--# Pass the blob to the network
-@snippet dnn/squeezenet_halide.cpp Set input blob
-
--# Enable Halide backend for layers where it is implemented
-@snippet dnn/squeezenet_halide.cpp Enable Halide backend
-
--# Make forward pass
-@snippet dnn/squeezenet_halide.cpp Make forward pass
-Remember that the first forward pass after initialization require quite more
-time that the next ones. It's because of runtime compilation of Halide pipelines
-at the first invocation.
-
--# Determine the best class
-@snippet dnn/squeezenet_halide.cpp Determine the best class
-
--# Print results
-@snippet dnn/squeezenet_halide.cpp Print results
-For our image we get:
-
-> Best class: #812 'space shuttle'
->
-> Probability: 97.9812%
+## Set Halide as a preferable backend
+@code
+net.setPreferableBackend(DNN_BACKEND_HALIDE);
+@endcode

doc/tutorials/dnn/dnn_yolo/dnn_yolo.markdown

Lines changed: 8 additions & 22 deletions
@@ -18,40 +18,26 @@ VIDEO DEMO:
 Source Code
 -----------

-The latest version of sample source code can be downloaded [here](https://github.com/opencv/opencv/blob/master/samples/dnn/yolo_object_detection.cpp).
+Use a universal sample for object detection models written
+[in C++](https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.cpp) and
+[in Python](https://github.com/opencv/opencv/blob/master/samples/dnn/object_detection.py) languages

-@include dnn/yolo_object_detection.cpp
-
-How to compile in command line with pkg-config
-----------------------------------------------
-
-@code{.bash}
-
-# g++ `pkg-config --cflags opencv` `pkg-config --libs opencv` yolo_object_detection.cpp -o yolo_object_detection
-
-@endcode
+Usage examples
+--------------

 Execute in webcam:

 @code{.bash}

-$ yolo_object_detection -camera_device=0 -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
-
-@endcode
-
-Execute with image:
-
-@code{.bash}
-
-$ yolo_object_detection -source=[PATH-IMAGE] -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
+$ example_dnn_object_detection --config=[PATH-TO-DARKNET]/cfg/yolo.cfg --model=[PATH-TO-DARKNET]/yolo.weights --classes=object_detection_classes_pascal_voc.txt --width=416 --height=416 --scale=0.00392

 @endcode

-Execute in video file:
+Execute with image or video file:

 @code{.bash}

-$ yolo_object_detection -source=[PATH-TO-VIDEO] -cfg=[PATH-TO-DARKNET]/cfg/yolo.cfg -model=[PATH-TO-DARKNET]/yolo.weights -class_names=[PATH-TO-DARKNET]/data/coco.names
+$ example_dnn_object_detection --config=[PATH-TO-DARKNET]/cfg/yolo.cfg --model=[PATH-TO-DARKNET]/yolo.weights --classes=object_detection_classes_pascal_voc.txt --width=416 --height=416 --scale=0.00392 --input=[PATH-TO-IMAGE-OR-VIDEO-FILE]

 @endcode
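The `--scale=0.00392` value in the commands above looks like 1/255 rounded to three significant digits, which would map 8-bit pixel values into roughly [0, 1]. That reading is an assumption; the diff itself does not explain the number. The standard-library sketch below only checks the arithmetic. `kUnitScale`, `scalePixel` and `isRoundedUnitScale` are hypothetical names and are not part of the sample.

```cpp
#include <cmath>

// 1/255 maps an 8-bit pixel value (0..255) into the range [0, 1].
constexpr double kUnitScale = 1.0 / 255.0;

// Hypothetical helper: apply a scale factor to one 8-bit pixel value.
double scalePixel(unsigned char value, double scale)
{
    return static_cast<double>(value) * scale;
}

// True when a rounded command-line value lies within `tol` of 1/255.
bool isRoundedUnitScale(double scale, double tol = 1e-5)
{
    return std::fabs(scale - kUnitScale) < tol;
}
```

With the rounded value, a pixel of 255 scales to 0.9996 rather than exactly 1, which is close enough for the interpretation above to be plausible.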

modules/core/include/opencv2/core.hpp

Lines changed: 9 additions & 1 deletion
@@ -3159,7 +3159,7 @@ class CV_EXPORTS_W Algorithm

 struct Param {
     enum { INT=0, BOOLEAN=1, REAL=2, STRING=3, MAT=4, MAT_VECTOR=5, ALGORITHM=6, FLOAT=7,
-           UNSIGNED_INT=8, UINT64=9, UCHAR=11 };
+           UNSIGNED_INT=8, UINT64=9, UCHAR=11, SCALAR=12 };
 };


@@ -3252,6 +3252,14 @@ template<> struct ParamType<uchar>
     enum { type = Param::UCHAR };
 };

+template<> struct ParamType<Scalar>
+{
+    typedef const Scalar& const_param_type;
+    typedef Scalar member_type;
+
+    enum { type = Param::SCALAR };
+};
+
 //! @} core_basic

 } //namespace cv
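The `ParamType<Scalar>` specialization above follows a common type-trait pattern: a template is specialized per type so that generic code can recover a runtime tag from a compile-time type. The sketch below reproduces that pattern with stand-in types only. `Scalar4` and `paramTagOf` are hypothetical, and leaving the primary template undefined is a choice made for this sketch, not a statement about how OpenCV declares it.

```cpp
// Stand-in for the Param tags; only the values used here are listed.
struct Param
{
    enum { INT = 0, REAL = 2, SCALAR = 12 };
};

// Stand-in for cv::Scalar.
struct Scalar4
{
    double v[4];
};

// Primary template is only declared in this sketch,
// so asking for an unsupported type fails at compile time.
template <typename T> struct ParamType;

template <> struct ParamType<int>
{
    enum { type = Param::INT };
};

template <> struct ParamType<Scalar4>
{
    typedef const Scalar4& const_param_type;
    typedef Scalar4 member_type;

    enum { type = Param::SCALAR };
};

// Generic code can now ask for the tag of any supported type.
template <typename T> int paramTagOf()
{
    return ParamType<T>::type;
}
```

A call such as `paramTagOf<Scalar4>()` yields the tag that a string-to-value converter can then switch on.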

modules/core/src/command_line_parser.cpp

Lines changed: 6 additions & 0 deletions
@@ -104,6 +104,12 @@ static void from_str(const String& str, int type, void* dst)
         ss >> *(double*)dst;
     else if( type == Param::STRING )
         *(String*)dst = str;
+    else if( type == Param::SCALAR)
+    {
+        Scalar& scalar = *(Scalar*)dst;
+        for (int i = 0; i < 4 && !ss.eof(); ++i)
+            ss >> scalar[i];
+    }
     else
         CV_Error(Error::StsBadArg, "unknown/unsupported parameter type");
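To make the behaviour of the new `Param::SCALAR` branch easier to reason about, here is a standard-library sketch of the same loop. It assumes a zero-initialized four-element array as a stand-in for `cv::Scalar`; `Scalar4` and `parseScalar` are hypothetical names. The loop body is copied from the diff, everything around it is illustration.

```cpp
#include <array>
#include <sstream>
#include <string>

// Stand-in for cv::Scalar: four doubles.
using Scalar4 = std::array<double, 4>;

// Hypothetical helper: read up to four whitespace-separated numbers.
Scalar4 parseScalar(const std::string& text)
{
    Scalar4 scalar{};  // all four components start at 0
    std::istringstream ss(text);
    // Same loop as in the diff: stop after four values or at end of input.
    for (int i = 0; i < 4 && !ss.eof(); ++i)
        ss >> scalar[i];
    return scalar;
}
```

Missing components stay at zero and a fifth value is ignored, which lines up with the expectations for `s1`, `s2` and `s5` in the `CommandLineParser.testScalar` test added by this commit.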

modules/core/test/test_utils.cpp

Lines changed: 22 additions & 0 deletions
@@ -261,4 +261,26 @@ TEST(AutoBuffer, allocate_test)
     EXPECT_EQ(6u, abuf.size());
 }

+TEST(CommandLineParser, testScalar)
+{
+    static const char * const keys3 =
+            "{ s0 | 3 4 5 | default scalar }"
+            "{ s1 | | single value scalar }"
+            "{ s2 | | two values scalar (default with zeros) }"
+            "{ s3 | | three values scalar }"
+            "{ s4 | | four values scalar }"
+            "{ s5 | | five values scalar }";
+
+    const char* argv[] = {"<bin>", "--s1=1.1", "--s3=1.1 2.2 3",
+                          "--s4=-4.2 1 0 3", "--s5=5 -4 3 2 1"};
+    const int argc = 5;
+    CommandLineParser parser(argc, argv, keys3);
+    EXPECT_EQ(parser.get<Scalar>("s0"), Scalar(3, 4, 5));
+    EXPECT_EQ(parser.get<Scalar>("s1"), Scalar(1.1));
+    EXPECT_EQ(parser.get<Scalar>("s2"), Scalar(0));
+    EXPECT_EQ(parser.get<Scalar>("s3"), Scalar(1.1, 2.2, 3));
+    EXPECT_EQ(parser.get<Scalar>("s4"), Scalar(-4.2, 1, 0, 3));
+    EXPECT_EQ(parser.get<Scalar>("s5"), Scalar(5, -4, 3, 2));
+}
+
 }} // namespace

modules/dnn/include/opencv2/dnn/all_layers.hpp

Lines changed: 1 addition & 1 deletion
@@ -153,7 +153,7 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
          */

         int inputNameToIndex(String inputName);
-        int outputNameToIndex(String outputName);
+        int outputNameToIndex(const String& outputName);
     };

     /** @brief Classical recurrent layer

modules/dnn/include/opencv2/dnn/dnn.hpp

Lines changed: 24 additions & 1 deletion
@@ -222,7 +222,7 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
         /** @brief Returns index of output blob in output array.
          * @see inputNameToIndex()
          */
-        virtual int outputNameToIndex(String outputName);
+        CV_WRAP virtual int outputNameToIndex(const String& outputName);

         /**
          * @brief Ask layer if it support specific backend for doing computations.
@@ -683,6 +683,29 @@ CV__DNN_EXPERIMENTAL_NS_BEGIN
      */
     CV_EXPORTS_W Net readNetFromTorch(const String &model, bool isBinary = true);

+    /**
+     * @brief Read deep learning network represented in one of the supported formats.
+     * @param[in] model Binary file contains trained weights. The following file
+     * extensions are expected for models from different frameworks:
+     * * `*.caffemodel` (Caffe, http://caffe.berkeleyvision.org/)
+     * * `*.pb` (TensorFlow, https://www.tensorflow.org/)
+     * * `*.t7` | `*.net` (Torch, http://torch.ch/)
+     * * `*.weights` (Darknet, https://pjreddie.com/darknet/)
+     * @param[in] config Text file contains network configuration. It could be a
+     * file with the following extensions:
+     * * `*.prototxt` (Caffe, http://caffe.berkeleyvision.org/)
+     * * `*.pbtxt` (TensorFlow, https://www.tensorflow.org/)
+     * * `*.cfg` (Darknet, https://pjreddie.com/darknet/)
+     * @param[in] framework Explicit framework name tag to determine a format.
+     * @returns Net object.
+     *
+     * This function automatically detects an origin framework of trained model
+     * and calls an appropriate function such @ref readNetFromCaffe, @ref readNetFromTensorflow,
+     * @ref readNetFromTorch or @ref readNetFromDarknet. An order of @p model and @p config
+     * arguments does not matter.
+     */
+    CV_EXPORTS_W Net readNet(const String& model, const String& config = "", const String& framework = "");
+
     /** @brief Loads blob which was serialized as torch.Tensor object of Torch7 framework.
      * @warning This function has the same limitations as readNetFromTorch().
      */
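The extension lists in the `readNet` doc comment above can be read as a small lookup table. The sketch below writes that table down with the standard library only. `frameworkForExtension` is a hypothetical helper, not an OpenCV function, and the lowercase framework tags are taken from the comparisons in the `readNet` implementation in this commit.

```cpp
#include <map>
#include <string>

// Hypothetical lookup: file extension (without the dot) -> framework tag.
// Returns an empty string for an extension that is not in the doc comment.
std::string frameworkForExtension(const std::string& ext)
{
    static const std::map<std::string, std::string> table = {
        {"caffemodel", "caffe"},      {"prototxt", "caffe"},
        {"pb", "tensorflow"},         {"pbtxt", "tensorflow"},
        {"t7", "torch"},              {"net", "torch"},
        {"weights", "darknet"},       {"cfg", "darknet"},
    };
    const auto it = table.find(ext);
    return it == table.end() ? std::string() : it->second;
}
```

A table makes the supported set easy to audit, while the committed code uses a chain of `if` statements so that each branch can also reorder `model` and `config` for the reader it calls.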

modules/dnn/src/dnn.cpp

Lines changed: 40 additions & 2 deletions
@@ -399,7 +399,7 @@ struct DataLayer : public Layer
     void forward(std::vector<Mat*>&, std::vector<Mat>&, std::vector<Mat> &) {}
     void forward(InputArrayOfArrays inputs, OutputArrayOfArrays outputs, OutputArrayOfArrays internals) {}

-    int outputNameToIndex(String tgtName)
+    int outputNameToIndex(const String& tgtName)
     {
         int idx = (int)(std::find(outNames.begin(), outNames.end(), tgtName) - outNames.begin());
         return (idx < (int)outNames.size()) ? idx : -1;
@@ -2521,7 +2521,7 @@ int Layer::inputNameToIndex(String)
     return -1;
 }

-int Layer::outputNameToIndex(String)
+int Layer::outputNameToIndex(const String&)
 {
     return -1;
 }
@@ -2813,5 +2813,43 @@ BackendWrapper::BackendWrapper(const Ptr<BackendWrapper>& base, const MatShape&

 BackendWrapper::~BackendWrapper() {}

+Net readNet(const String& _model, const String& _config, const String& _framework)
+{
+    String framework = _framework.toLowerCase();
+    String model = _model;
+    String config = _config;
+    const std::string modelExt = model.substr(model.rfind('.') + 1);
+    const std::string configExt = config.substr(config.rfind('.') + 1);
+    if (framework == "caffe" || modelExt == "caffemodel" || configExt == "caffemodel" ||
+        modelExt == "prototxt" || configExt == "prototxt")
+    {
+        if (modelExt == "prototxt" || configExt == "caffemodel")
+            std::swap(model, config);
+        return readNetFromCaffe(config, model);
+    }
+    if (framework == "tensorflow" || modelExt == "pb" || configExt == "pb" ||
+        modelExt == "pbtxt" || configExt == "pbtxt")
+    {
+        if (modelExt == "pbtxt" || configExt == "pb")
+            std::swap(model, config);
+        return readNetFromTensorflow(model, config);
+    }
+    if (framework == "torch" || modelExt == "t7" || modelExt == "net" ||
+        configExt == "t7" || configExt == "net")
+    {
+        return readNetFromTorch(model.empty() ? config : model);
+    }
+    if (framework == "darknet" || modelExt == "weights" || configExt == "weights" ||
+        modelExt == "cfg" || configExt == "cfg")
+    {
+        if (modelExt == "cfg" || configExt == "weights")
+            std::swap(model, config);
+        return readNetFromDarknet(config, model);
+    }
+    CV_Error(Error::StsError, "Cannot determine an origin framework of files: " +
+                              model + (config.empty() ? "" : ", " + config));
+    return Net();
+}
+
 CV__DNN_EXPERIMENTAL_NS_END
 }} // namespace
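Two details of `readNet` above are easy to miss: how the extension is extracted, and how the arguments are swapped so that their order does not matter. The sketch below isolates both with `std::string` in place of `cv::String`. `fileExtension` and `orderCaffeFiles` are hypothetical helpers written for this illustration; only the `substr`/`rfind` expression and the swap condition are taken from the diff.

```cpp
#include <string>
#include <utility>

// Same expression as in the diff: the text after the last '.'.
// If there is no '.', rfind returns npos and npos + 1 wraps to 0,
// so the whole string comes back unchanged.
std::string fileExtension(const std::string& path)
{
    return path.substr(path.rfind('.') + 1);
}

// Hypothetical helper mirroring the Caffe branch: after the call,
// `model` holds the weights file and `config` the .prototxt file.
void orderCaffeFiles(std::string& model, std::string& config)
{
    if (fileExtension(model) == "prototxt" || fileExtension(config) == "caffemodel")
        std::swap(model, config);
}
```

One consequence of this expression is that a dot in a directory name with an extensionless file, for example `dir.v2/model`, yields `v2/model` as the "extension", so no extension-based branch matches unless `framework` is passed explicitly.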

modules/dnn/src/layers/recurrent_layers.cpp

Lines changed: 1 addition & 1 deletion
@@ -355,7 +355,7 @@ int LSTMLayer::inputNameToIndex(String inputName)
     return -1;
 }

-int LSTMLayer::outputNameToIndex(String outputName)
+int LSTMLayer::outputNameToIndex(const String& outputName)
 {
     if (outputName.toLowerCase() == "h")
         return 0;
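The `outputNameToIndex` overrides touched by this commit all map a name to an index, with -1 meaning "not found", and the signature change from `String` to `const String&` avoids copying the argument on each call. The sketch below restates the `std::find` idiom from the `DataLayer` override in `modules/dnn/src/dnn.cpp` using standard-library types. `nameToIndex` is a hypothetical free function, not an OpenCV API.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: position of `target` in `names`, or -1 when absent.
// The name is taken by const reference, so no string copy is made per call.
int nameToIndex(const std::vector<std::string>& names, const std::string& target)
{
    const int idx = static_cast<int>(
        std::find(names.begin(), names.end(), target) - names.begin());
    return idx < static_cast<int>(names.size()) ? idx : -1;
}
```

The comparison here is case-sensitive. The `LSTMLayer` override lowercases the name before comparing, which this sketch does not attempt to reproduce.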
