Commit 6d612c6

johan-hultberg-work, jimmy-rubin-work, and Jimmy Rubin authored
vdo-larod: Read input resolution from model (#424)
Co-authored-by: Jimmy Rubin <jimmy.rubin@axis.com>
Co-authored-by: Jimmy Rubin <jimmyrn@axis.com>
1 parent aa122f7 commit 6d612c6

11 files changed

+600
-545
lines changed

vdo-larod/README.md

Lines changed: 70 additions & 54 deletions
@@ -18,12 +18,17 @@ Together with this README file you should be able to find a directory called app
1818

1919
## Detailed outline of example application
2020

21-
This application opens a client to VDO and starts fetching frames in the YUV format. It tries to find the smallest VDO stream resolution that fits the width and height required by the neural network.
21+
This application opens a client to VDO and starts fetching frames in RGB or YUV format, depending on the platform.
22+
VDO is used to determine whether a format is supported.
23+
The application tries to get a VDO stream with the same resolution as the one requested. The only limits are the minimum and maximum
24+
resolutions received from VDO.
25+
When using this, it is important to also check that the image looks good on the camera in use.
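
The resolution selection described above can be read as a clamp of the requested size to the limits reported by VDO. The following is only a minimal sketch of that reading, not the application's actual code: the function name `clamp_resolution` and the channel limits in the usage lines are hypothetical and chosen for illustration.

```python
def clamp_resolution(requested, min_res, max_res):
    """Clamp a requested (width, height) to the [min_res, max_res] range.

    All three arguments are (width, height) tuples. min_res and max_res
    stand in for the limits a video channel would report (hypothetical).
    """
    req_w, req_h = requested
    min_w, min_h = min_res
    max_w, max_h = max_res
    width = max(min_w, min(req_w, max_w))
    height = max(min_h, min(req_h, max_h))
    return width, height


# A 256x256 request inside assumed limits is returned unchanged.
print(clamp_resolution((256, 256), (160, 90), (1920, 1080)))  # (256, 256)
# A request below the assumed minimum is raised to that minimum.
print(clamp_resolution((64, 64), (160, 90), (1920, 1080)))    # (160, 90)
```

If the request has to be clamped, the stream no longer matches the size the model expects, which is one case where the preprocessing step would still be needed.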
2226

2327
Steps in application:
2428

2529
1. Fetch image data from VDO.
26-
2. Preprocess the images (crop to the size required by the neural network (if needed), scale and color convert) using larod with libyuv backend (depending on platform).
30+
2. If needed, preprocess the images (crop to the size required by the neural network, scale and color convert) using larod with either the cpu-proc (libyuv)
31+
or the ambarella-cvflow-proc backend.
2732
3. Run inferences using the trained model on a specific chip with the preprocessing output as input on a larod backend specified by a command-line argument.
2833
4. Measure the total inference time (preprocessing and inference time) and determine if the framerate of the vdo streams needs to be adjusted.
2934
5. The model's confidence scores for the presence of person and car in the image are printed as the output.
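
Step 4 above can be illustrated with a small calculation. This is a hedged sketch, not the logic used in the application: it assumes the stream framerate should never exceed what the combined preprocessing and inference time allows, and the name `sustainable_framerate` and the 30 fps default (taken from the `'framerate'` value in the example logs) are assumptions.

```python
def sustainable_framerate(preprocess_ms, inference_ms, max_fps=30.0):
    """Return the highest framerate the measured processing time allows.

    The total time per frame is preprocessing plus inference. The result
    is capped at max_fps, the framerate assumed to be configured on the
    stream.
    """
    total_ms = preprocess_ms + inference_ms
    if total_ms <= 0:
        return max_fps
    return min(max_fps, 1000.0 / total_ms)


# 3 ms preprocessing + 14 ms inference: the stream can stay at 30 fps.
print(sustainable_framerate(3, 14))            # 30.0
# 340 ms inference and no preprocessing: roughly 3 frames per second.
print(round(sustainable_framerate(0, 340), 2))  # 2.94
```

With timings like those in the log examples, a DLPU run would keep the full framerate, while a CPU run would call for a much lower one.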
@@ -281,24 +286,22 @@ In previous larod versions, the chip was referred to as a number instead of a st
281286
----- Contents of SYSTEM_LOG for 'vdo_larod' -----
282287

283288
vdo_larod[141742]: Starting /usr/local/packages/vdo_larod/vdo_larod
284-
vdo_larod[141742]: choose_stream_resolution: We select stream w/h=480 x 270 based on VDO channel info.
285-
vdo_larod[141742]: Creating VDO image provider and creating stream 480 x 270
289+
vdo_larod[141742]: Setting up larod connection with chip axis-a8-dlpu-tflite and model file /usr/local/packages/vdo_larod/model/model.tflite
290+
vdo_larod[141742]: Loading the model... This might take up to 5 minutes depending on your device model.
291+
vdo_larod[141742]: Model loaded successfully
292+
vdo_larod[3991067]: Detected model format RGB and input resolution 256x256
293+
vdo_larod[141742]: Created mmaped model output 0 with size 1
294+
vdo_larod[141742]: Created mmaped model output 1 with size 1
295+
vdo_larod[141742]: choose_stream_resolution: We select stream w/h=256 x 256 with format yuv based on VDO channel info.
296+
vdo_larod[141742]: Dump of vdo stream settings map =====
286297
vdo_larod[141742]: 'buffer.count'-----: <uint32 2>
287298
vdo_larod[141742]: 'dynamic.framerate': <true>
288299
vdo_larod[141742]: 'format'-----------: <uint32 3>
289300
vdo_larod[141742]: 'framerate'--------: <30.0>
290-
vdo_larod[141742]: 'height'-----------: <uint32 270>
301+
vdo_larod[141742]: 'height'-----------: <uint32 256>
291302
vdo_larod[141742]: 'input'------------: <uint32 1>
292303
vdo_larod[141742]: 'socket.blocking'--: <false>
293-
vdo_larod[141742]: 'width'------------: <uint32 480>
294-
vdo_larod[141742]: Dump of vdo stream settings map =====
295-
vdo_larod[141742]: Setting up larod connection with chip axis-a8-dlpu-tflite and model file /usr/local/packages/vdo_larod/model/model.tflite
296-
vdo_larod[141742]: Loading the model... This might take up to 5 minutes depending on your device model.
297-
vdo_larod[141742]: Model loaded successfully
298-
vdo_larod[141742]: Calculate crop image
299-
vdo_larod[141742]: Crop input image X=105 Y=0 (270 x 270)
300-
vdo_larod[141742]: Created mmaped model output 0 with size 1
301-
vdo_larod[141742]: Created mmaped model output 1 with size 1
304+
vdo_larod[141742]: 'width'------------: <uint32 256>
302305

303306
vdo_larod[141742]: Ran pre-processing for 3 ms
304307
vdo_larod[141742]: Ran inference for 14 ms
@@ -314,26 +317,23 @@ vdo_larod[141742]: Exit /usr/local/packages/vdo_larod/vdo_larod
314317

315318

316319
vdo_larod[3991067]: Starting /usr/local/packages/vdo_larod/vdo_larod
317-
vdo_larod[3991067]: choose_stream_resolution: We select stream w/h=480 x 360 based on VDO channel info.
318-
vdo_larod[3991067]: Creating VDO image provider and creating stream 480 x 360
319-
vdo_larod[3991067]: 'buffer.count'-----: <uint32 2>
320-
vdo_larod[3991067]: 'dynamic.framerate': <true>
321-
vdo_larod[3991067]: 'format'-----------: <uint32 3>
322-
vdo_larod[3991067]: 'framerate'--------: <30.0>
323-
vdo_larod[3991067]: 'height'-----------: <uint32 360>
324-
vdo_larod[3991067]: 'input'------------: <uint32 1>
325-
vdo_larod[3991067]: 'socket.blocking'--: <false>
326-
vdo_larod[3991067]: 'width'------------: <uint32 480>
327-
vdo_larod[3991067]: Dump of vdo stream settings map =====
328320
vdo_larod[3991067]: Setting up larod connection with chip a9-dlpu-tflite and model file /usr/local/packages/vdo_larod/model/model.tflite
329321
vdo_larod[3991067]: Loading the model... This might take up to 5 minutes depending on your device model.
330322
vdo_larod[3991067]: Model loaded successfully
331-
vdo_larod[3991067]: Calculate crop image
332-
vdo_larod[3991067]: Crop input image X=0 Y=60 (360 x 360)
323+
vdo_larod[3991067]: Detected model format RGB and input resolution 256x256
333324
vdo_larod[3991067]: Created mmaped model output 0 with size 1
334325
vdo_larod[3991067]: Created mmaped model output 1 with size 1
326+
vdo_larod[3991067]: choose_stream_resolution: We select stream w/h=256 x 256 with format rgb interleaved based on VDO channel info.
327+
vdo_larod[3991067]: Dump of vdo stream settings map =====
328+
vdo_larod[3991067]: 'buffer.count'-----: <uint32 2>
329+
vdo_larod[3991067]: 'dynamic.framerate': <true>
330+
vdo_larod[3991067]: 'format'-----------: <uint32 8>
331+
vdo_larod[3991067]: 'framerate'--------: <30.0>
332+
vdo_larod[3991067]: 'height'-----------: <uint32 256>
333+
vdo_larod[3991067]: 'input'------------: <uint32 1>
334+
vdo_larod[3991067]: 'socket.blocking'--: <false>
335+
vdo_larod[3991067]: 'width'------------: <uint32 256>
335336
vdo_larod[3991067]: Start fetching video frames from VDO
336-
vdo_larod[3991067]: Ran pre-processing for 13 ms
337337
vdo_larod[3991067]: Ran inference for 5 ms
338338
vdo_larod[3991067]: Person detected: 100.00% - Car detected: 3.14%
339339

@@ -346,28 +346,25 @@ vdo_larod[3991067]: Exit /usr/local/packages/vdo_larod/vdo_larod
346346
----- Contents of SYSTEM_LOG for 'vdo_larod' -----
347347

348348
vdo_larod[145071]: Starting /usr/local/packages/vdo_larod/vdo_larod
349-
vdo_larod[145071]: choose_stream_resolution: We select stream w/h=480 x 270 based on VDO channel info.
350-
vdo_larod[145071]: Creating VDO image provider and creating stream 480 x 270
351-
vdo_larod[145071]: Dump of vdo stream settings map =====
352-
vdo_larod[145071]: 'buffer.count'-----: <uint32 2>
353-
vdo_larod[145071]: 'dynamic.framerate': <true>
354-
vdo_larod[145071]: 'format'-----------: <uint32 3>
355-
vdo_larod[145071]: 'framerate'--------: <30.0>
356-
vdo_larod[145071]: 'height'-----------: <uint32 270>
357-
vdo_larod[145071]: 'input'------------: <uint32 1>
358-
vdo_larod[145071]: 'socket.blocking'--: <false>
359-
vdo_larod[145071]: 'width'------------: <uint32 480>
360349
vdo_larod[145071]: Setting up larod connection with chip cpu-tflite and model file /usr/local/packages/vdo_larod/model/model.tflite
361-
vdo_larod[145071]: Loading the model... This might take up to 5 minutes depending on your device model.
362-
vdo_larod[145071]: Model loaded successfully
363-
vdo_larod[145071]: Calculate crop image
364-
vdo_larod[145071]: Crop input image X=105 Y=0 (270 x 270)
365-
vdo_larod[145071]: Created mmaped model output 0 with size 1
366-
vdo_larod[145071]: Created mmaped model output 1 with size 1
367-
vdo_larod[145071]: Start fetching video frames from VDO
368-
vdo_larod[145071]: Ran pre-processing for 3 ms
369-
vdo_larod[145071]: Ran inference for 545 ms
370-
vdo_larod[145071]: Person detected: 100.00% - Car detected: 3.14%
350+
vdo_larod[3991067]: Loading the model... This might take up to 5 minutes depending on your device model.
351+
vdo_larod[3991067]: Model loaded successfully
352+
vdo_larod[3991067]: Detected model format RGB and input resolution 256x256
353+
vdo_larod[3991067]: Created mmaped model output 0 with size 1
354+
vdo_larod[3991067]: Created mmaped model output 1 with size 1
355+
vdo_larod[3991067]: choose_stream_resolution: We select stream w/h=256 x 256 with format rgb interleaved based on VDO channel info.
356+
vdo_larod[3991067]: Dump of vdo stream settings map =====
357+
vdo_larod[3991067]: 'buffer.count'-----: <uint32 2>
358+
vdo_larod[3991067]: 'dynamic.framerate': <true>
359+
vdo_larod[3991067]: 'format'-----------: <uint32 8>
360+
vdo_larod[3991067]: 'framerate'--------: <30.0>
361+
vdo_larod[3991067]: 'height'-----------: <uint32 256>
362+
vdo_larod[3991067]: 'input'------------: <uint32 1>
363+
vdo_larod[3991067]: 'socket.blocking'--: <false>
364+
vdo_larod[3991067]: 'width'------------: <uint32 256>
365+
vdo_larod[3991067]: Start fetching video frames from VDO
366+
vdo_larod[3991067]: Ran inference for 340 ms
367+
vdo_larod[3991067]: Person detected: 100.00% - Car detected: 3.14%
371368

372369
vdo_larod[145071]: Exit /usr/local/packages/vdo_larod/vdo_larod
373370
```
@@ -378,13 +375,23 @@ vdo_larod[145071]: Exit /usr/local/packages/vdo_larod/vdo_larod
378375
----- Contents of SYSTEM_LOG for 'vdo_larod' -----
379376

380377
vdo_larod[584171]: Starting /usr/local/packages/vdo_larod/vdo_larod
381-
vdo_larod[584171]: chooseStreamResolution: We select stream w/h=256 x 256 based on VDO channel info.
382-
vdo_larod[584171]: Creating VDO image provider and creating stream 256 x 256
383378
vdo_larod[584171]: Setting up larod connection with chip google-edge-tpu-tflite and model file /usr/local/packages/vdo_larod/model/model.tflite
384379
vdo_larod[584171]: Loading the model... This might take up to 5 minutes depending on your device model.
385380
vdo_larod[584171]: Model loaded successfully
381+
vdo_larod[584171]: Detected model format RGB and input resolution 256x256
386382
vdo_larod[584171]: Created mmaped model output 0 with size 1
387383
vdo_larod[584171]: Created mmaped model output 1 with size 1
384+
vdo_larod[584171]: chooseStreamResolution: We select stream w/h=256 x 256 with format yuv based on VDO channel info.
385+
vdo_larod[3991067]: Dump of vdo stream settings map =====
386+
vdo_larod[3991067]: 'buffer.count'-----: <uint32 2>
387+
vdo_larod[3991067]: 'dynamic.framerate': <true>
388+
vdo_larod[3991067]: 'format'-----------: <uint32 3>
389+
vdo_larod[3991067]: 'framerate'--------: <30.0>
390+
vdo_larod[3991067]: 'height'-----------: <uint32 256>
391+
vdo_larod[3991067]: 'input'------------: <uint32 1>
392+
vdo_larod[3991067]: 'socket.blocking'--: <false>
393+
vdo_larod[3991067]: 'width'------------: <uint32 256>
394+
vdo_larod[584171]: Use preprocessing with input format yuv and output format rgb-interleaved
388395
vdo_larod[584171]: Start fetching video frames from VDO
389396

390397
vdo_larod[584171]: Ran pre-processing for 2 ms
@@ -401,15 +408,24 @@ vdo_larod[4165]: Exit /usr/local/packages/vdo_larod/vdo_larod
401408

402409

403410
vdo_larod[584171]: Starting /usr/local/packages/vdo_larod/vdo_larod
404-
vdo_larod[584171]: chooseStreamResolution: We select stream w/h=256 x 256 based on VDO channel info.
405-
vdo_larod[584171]: Creating VDO image provider and creating stream 256 x 256
406411
vdo_larod[584171]: Setting up larod connection with chip ambarella-cvflow and model file /usr/local/packages/vdo_larod/model/model.bin
407412
vdo_larod[584171]: Loading the model... This might take up to 5 minutes depending on your device model.
408413
vdo_larod[584171]: Model loaded successfully
414+
vdo_larod[584171]: Detected model format PLANAR RGB and input resolution 256x256
409415
vdo_larod[584171]: Created mmaped model output 0 with size 32
410416
vdo_larod[584171]: Created mmaped model output 1 with size 32
417+
vdo_larod[584171]: chooseStreamResolution: We select stream w/h=256 x 256 with format planar rgb based on VDO channel info.
418+
vdo_larod[3991067]: Dump of vdo stream settings map =====
419+
vdo_larod[3991067]: 'buffer.count'-----: <uint32 2>
420+
vdo_larod[3991067]: 'dynamic.framerate': <true>
421+
vdo_larod[3991067]: 'format'-----------: <uint32 9>
422+
vdo_larod[3991067]: 'framerate'--------: <30.0>
423+
vdo_larod[3991067]: 'height'-----------: <uint32 256>
424+
vdo_larod[3991067]: 'input'------------: <uint32 1>
425+
vdo_larod[3991067]: 'socket.blocking'--: <false>
426+
vdo_larod[3991067]: 'width'------------: <uint32 256>
427+
411428
vdo_larod[584171]: Start fetching video frames from VDO
412-
vdo_larod[584171]: Ran pre-processing for 1 ms
413429
vdo_larod[584171]: Ran inference for 50 ms
414430
vdo_larod[584171]: Person detected: 65.14% - Car detected: 11.92%
415431
