Commit 4255590

Author: Anna Grebneva

AC: Refactoring custom evaluators (models part) (#2880)

* Refactor models for custom evaluators
* Align network_info section for cascade models
* Fix pylint errors
* Updated text-recognition configs
* Fixed input/output setting for asr_encoder_prediction_joint_evaluator
* Fixed adapter output_blob setting for asr_encoder_decoder_evaluator

1 parent 5d6a0ea commit 4255590

24 files changed: +1437 -3428 lines changed

models/intel/formula-recognition-medium-scan-0001/accuracy-check.yml

Lines changed: 3 additions & 2 deletions
@@ -3,8 +3,9 @@ evaluations:
 module: custom_evaluators.custom_text_recognition_evaluator.TextRecognitionWithAttentionEvaluator
 module_config:
 network_info:
-
-max_seq_len: '192'
+recognizer_encoder: {}
+recognizer_decoder: {}
+max_seq_len: '192'

 launchers:
 - framework: dlsdk

models/intel/formula-recognition-polynomials-handwritten-0001/accuracy-check.yml

Lines changed: 3 additions & 1 deletion
@@ -3,7 +3,9 @@ evaluations:
 module: custom_evaluators.custom_text_recognition_evaluator.TextRecognitionWithAttentionEvaluator
 module_config:
 network_info:
-max_seq_len: "192"
+recognizer_encoder: {}
+recognizer_decoder: {}
+max_seq_len: "192"

 launchers:
 - framework: dlsdk

models/intel/text-recognition-0015/accuracy-check.yml

Lines changed: 70 additions & 68 deletions
@@ -5,74 +5,76 @@ evaluations:
 model_type: SequentialTextRecognitionModel
 lowercase: true
 network_info:
-max_seq_len: "24"
-custom_label_map:
-0: "<s>"
-1: ""
-2: "</s>"
-3: "?"
-4: "0"
-5: "1"
-6: "2"
-7: "3"
-8: "4"
-9: "5"
-10: "6"
-11: "7"
-12: "8"
-13: "9"
-14: "a"
-15: "b"
-16: "c"
-17: "d"
-18: "e"
-19: "f"
-20: "g"
-21: "h"
-22: "i"
-23: "j"
-24: "k"
-25: "l"
-26: "m"
-27: "n"
-28: "o"
-29: "p"
-30: "q"
-31: "r"
-32: "s"
-33: "t"
-34: "u"
-35: "v"
-36: "w"
-37: "x"
-38: "y"
-39: "z"
-40: "A"
-41: "B"
-42: "C"
-43: "D"
-44: "E"
-45: "F"
-46: "G"
-47: "H"
-48: "I"
-49: "J"
-50: "K"
-51: "L"
-52: "M"
-53: "N"
-54: "O"
-55: "P"
-56: "Q"
-57: "R"
-58: "S"
-59: "T"
-60: "U"
-61: "V"
-62: "W"
-63: "X"
-64: "Y"
-65: "Z"
+recognizer_encoder: {}
+recognizer_decoder: {}
+max_seq_len: "24"
+custom_label_map:
+0: "<s>"
+1: ""
+2: "</s>"
+3: "?"
+4: "0"
+5: "1"
+6: "2"
+7: "3"
+8: "4"
+9: "5"
+10: "6"
+11: "7"
+12: "8"
+13: "9"
+14: "a"
+15: "b"
+16: "c"
+17: "d"
+18: "e"
+19: "f"
+20: "g"
+21: "h"
+22: "i"
+23: "j"
+24: "k"
+25: "l"
+26: "m"
+27: "n"
+28: "o"
+29: "p"
+30: "q"
+31: "r"
+32: "s"
+33: "t"
+34: "u"
+35: "v"
+36: "w"
+37: "x"
+38: "y"
+39: "z"
+40: "A"
+41: "B"
+42: "C"
+43: "D"
+44: "E"
+45: "F"
+46: "G"
+47: "H"
+48: "I"
+49: "J"
+50: "K"
+51: "L"
+52: "M"
+53: "N"
+54: "O"
+55: "P"
+56: "Q"
+57: "R"
+58: "S"
+59: "T"
+60: "U"
+61: "V"
+62: "W"
+63: "X"
+64: "Y"
+65: "Z"

 launchers:
 - framework: dlsdk

models/intel/text-recognition-0016/accuracy-check.yml

Lines changed: 44 additions & 42 deletions
@@ -5,48 +5,50 @@ evaluations:
 model_type: SequentialTextRecognitionModel
 lowercase: true
 network_info:
-max_seq_len: "24"
-custom_label_map:
-0: "<s>"
-1: ""
-2: "</s>"
-3: "?"
-4: "0"
-5: "1"
-6: "2"
-7: "3"
-8: "4"
-9: "5"
-10: "6"
-11: "7"
-12: "8"
-13: "9"
-14: "a"
-15: "b"
-16: "c"
-17: "d"
-18: "e"
-19: "f"
-20: "g"
-21: "h"
-22: "i"
-23: "j"
-24: "k"
-25: "l"
-26: "m"
-27: "n"
-28: "o"
-29: "p"
-30: "q"
-31: "r"
-32: "s"
-33: "t"
-34: "u"
-35: "v"
-36: "w"
-37: "x"
-38: "y"
-39: "z"
+recognizer_encoder: {}
+recognizer_decoder: {}
+max_seq_len: "24"
+custom_label_map:
+0: "<s>"
+1: ""
+2: "</s>"
+3: "?"
+4: "0"
+5: "1"
+6: "2"
+7: "3"
+8: "4"
+9: "5"
+10: "6"
+11: "7"
+12: "8"
+13: "9"
+14: "a"
+15: "b"
+16: "c"
+17: "d"
+18: "e"
+19: "f"
+20: "g"
+21: "h"
+22: "i"
+23: "j"
+24: "k"
+25: "l"
+26: "m"
+27: "n"
+28: "o"
+29: "p"
+30: "q"
+31: "r"
+32: "s"
+33: "t"
+34: "u"
+35: "v"
+36: "w"
+37: "x"
+38: "y"
+39: "z"

 launchers:
 - framework: dlsdk

models/intel/text-spotting-0005/accuracy-check.yml

Lines changed: 21 additions & 25 deletions
@@ -7,31 +7,27 @@ evaluations:

 recognizer_encoder: {}

-recognizer_decoder: {}
-
-recognizer_decoder_inputs:
-prev_symbol: prev_symbol
-prev_hidden: prev_hidden
-encoder_outputs: encoder_outputs
-
-recognizer_decoder_outputs:
-symbols_distribution: output
-cur_hidden: hidden
-
-max_seq_len: '28'
-recognizer_confidence_threshold: '0.45'
-
-alphabet: __abcdefghijklmnopqrstuvwxyz0123456789
-sos_index: '0'
-eos_index: '1'
-
-adapter:
-type: mask_rcnn_with_text
-classes_out: labels
-boxes_out: boxes
-raw_masks_out: masks
-texts_out: texts
-confidence_threshold: 0.65
+recognizer_decoder:
+inputs:
+prev_symbol: prev_symbol
+prev_hidden: prev_hidden
+encoder_outputs: encoder_outputs
+outputs:
+symbols_distribution: output
+cur_hidden: hidden
+
+max_seq_len: '28'
+alphabet: __abcdefghijklmnopqrstuvwxyz0123456789
+sos_index: '0'
+eos_index: '1'
+recognizer_confidence_threshold: '0.45'
+adapter:
+type: mask_rcnn_with_text
+classes_out: labels
+boxes_out: boxes
+raw_masks_out: masks
+texts_out: texts
+confidence_threshold: 0.65

 launchers:
 - framework: dlsdk
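
The text-spotting hunk replaces the separate `recognizer_decoder_inputs` and `recognizer_decoder_outputs` keys with `inputs` and `outputs` entries listed under `recognizer_decoder`. A minimal sketch of that reshaping on an already-parsed config follows. It assumes the config is a plain dict and that the new keys are nested under the model entry; `nest_io_sections` is a hypothetical helper for illustration, not part of Accuracy Checker.

```python
def nest_io_sections(network_info):
    """Return a copy of network_info where '<model>_inputs' and
    '<model>_outputs' keys are moved under '<model>' as 'inputs'/'outputs'.

    Hypothetical helper: the nesting rule is an assumption read off the diff.
    """
    suffixes = ('_inputs', '_outputs')
    nested, flat = {}, []
    for key, value in network_info.items():
        for suffix in suffixes:
            model = key[:-len(suffix)]
            # Only treat the key as a flat I/O section if its model exists.
            if key.endswith(suffix) and model in network_info:
                flat.append((model, suffix[1:], value))
                break
        else:
            # Ordinary key: copy dict values so the input is not mutated.
            nested[key] = dict(value) if isinstance(value, dict) else value
    for model, section, mapping in flat:
        entry = nested.get(model)
        entry = dict(entry) if isinstance(entry, dict) else {}
        entry[section] = dict(mapping)
        nested[model] = entry
    return nested


old_layout = {
    'recognizer_encoder': {},
    'recognizer_decoder': {},
    'recognizer_decoder_inputs': {'prev_symbol': 'prev_symbol'},
    'recognizer_decoder_outputs': {'symbols_distribution': 'output'},
    'max_seq_len': '28',
}
new_layout = nest_io_sections(old_layout)
```

Keys that do not match a known model name, such as `max_seq_len` here, are copied through unchanged, and the original dict is left untouched.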

models/intel/text-to-speech-en-0001/accuracy-check.yml

Lines changed: 13 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -5,20 +5,21 @@ evaluations:
55
network_info:
66
forward_tacotron_duration: {}
77

8-
forward_tacotron_regression: {}
9-
forward_tacotron_regression_inputs:
10-
data: data
11-
data_mask: data_mask
12-
pos_mask: pos_mask
8+
forward_tacotron_regression:
9+
inputs:
10+
data: data
11+
data_mask: data_mask
12+
pos_mask: pos_mask
13+
max_regression_len: '512'
1314

14-
melgan: {}
15-
max_mel_len: '128'
16-
max_regression_len: '512'
17-
pos_mask_window: '4'
15+
melgan:
16+
max_mel_len: '128'
1817

19-
adapter:
20-
type: regression
21-
keep_shape: True
18+
pos_mask_window: '4'
19+
20+
adapter:
21+
type: regression
22+
keep_shape: True
2223

2324
launchers:
2425
- framework: dlsdk

models/intel/text-to-speech-en-multi-0001/accuracy-check.yml

Lines changed: 14 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -5,21 +5,22 @@ evaluations:
55
network_info:
66
forward_tacotron_duration: {}
77

8-
forward_tacotron_regression: {}
9-
forward_tacotron_regression_inputs:
10-
data: data
11-
data_mask: data_mask
12-
pos_mask: pos_mask
13-
speaker_embedding: speaker_embedding
8+
forward_tacotron_regression:
9+
inputs:
10+
data: data
11+
data_mask: data_mask
12+
pos_mask: pos_mask
13+
speaker_embedding: speaker_embedding
14+
max_regression_len: '512'
1415

15-
melgan: {}
16-
max_mel_len: '128'
17-
max_regression_len: '512'
18-
pos_mask_window: '4'
16+
melgan:
17+
max_mel_len: '128'
1918

20-
adapter:
21-
type: regression
22-
keep_shape: True
19+
pos_mask_window: '4'
20+
21+
adapter:
22+
type: regression
23+
keep_shape: True
2324

2425
launchers:
2526
- framework: dlsdk
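
In both text-to-speech configs, `max_regression_len` now follows the `forward_tacotron_regression` entry and `max_mel_len` follows `melgan`, which reads as these limits moving from the shared section into the model they belong to. The sketch below shows one way a reader of such a config could look a parameter up in the per-model section first and fall back to the shared level. It assumes that nesting and a plain parsed dict; `get_model_param` is a hypothetical name, not an Accuracy Checker function.

```python
def get_model_param(network_info, model, name, default=None):
    """Look up `name` under network_info[model], then at the top level.

    Hypothetical helper: the per-model nesting is an assumption read off
    the diff, and the top-level fallback covers the older flat layout.
    """
    section = network_info.get(model) or {}  # tolerate a null/empty entry
    if name in section:
        return section[name]
    return network_info.get(name, default)


new_style = {
    'forward_tacotron_regression': {'max_regression_len': '512'},
    'melgan': {'max_mel_len': '128'},
}
old_style = {
    'forward_tacotron_regression': {},
    'melgan': {},
    'max_mel_len': '128',
    'max_regression_len': '512',
}

# Values are quoted strings in the YAML, so convert at the point of use.
mel_len_new = int(get_model_param(new_style, 'melgan', 'max_mel_len'))
mel_len_old = int(get_model_param(old_style, 'melgan', 'max_mel_len'))
```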

tools/accuracy_checker/openvino/tools/accuracy_checker/adapters/audio_recognition.py

Lines changed: 2 additions & 1 deletion
@@ -745,7 +745,8 @@ def set_alphabet(self):
         if 'vocabulary_file' in self.launcher_config:
             self.alphabet = read_txt(self.get_value_from_config('vocabulary_file'), ignore_space=True)
         else:
-            self.alphabet = self.get_value_from_config('alphabet') or ' ' + string.ascii_lowercase + '\''
+            self.alphabet = (''.join(self.get_value_from_config('alphabet')) if self.get_value_from_config('alphabet')
+                             else ' ' + string.ascii_lowercase + '\'')
         self.alphabet = self.alphabet.encode('ascii').decode('utf-8')

     def process(self, raw, identifiers=None, frame_meta=None):
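
The changed line joins the configured `alphabet` before the following `.encode('ascii')` call, so a value given as a list of characters would no longer fail there, while a plain string passes through unchanged. A standalone sketch of that behavior follows; `resolve_alphabet` is a hypothetical name used only for illustration, and the default mirrors the one in the hunk.

```python
import string

# Same fallback as in the hunk: space, a-z, apostrophe.
DEFAULT_ALPHABET = ' ' + string.ascii_lowercase + '\''


def resolve_alphabet(configured=None):
    """Return the alphabet as one string, whether it was configured as a
    string, as a list of characters, or not configured at all."""
    # ''.join works on both a str (iterates characters) and a list of str.
    alphabet = ''.join(configured) if configured else DEFAULT_ALPHABET
    # Raises UnicodeEncodeError if the alphabet has non-ASCII characters.
    return alphabet.encode('ascii').decode('utf-8')


from_list = resolve_alphabet(['a', 'b', 'c'])
from_string = resolve_alphabet('abc')
fallback = resolve_alphabet(None)
```

With the earlier `or` form, a list value would have reached `.encode` as a list and raised `AttributeError`; joining first avoids that.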
