# Unknown Task Recognition Algorithm Reproduction based on Lifelong Learning of Ianvs
Traditional machine learning performs test-set inference with models trained only on known samples, so its knowledge is limited to those samples. Such a model cannot effectively recognize unknown samples from new classes; they are simply processed as known samples. The recognition and handling of unknown samples, or unknown tasks, will therefore become a major research direction of artificial intelligence. This project aims to reproduce the CVPR 2021 paper "Learning Placeholders for Open-Set Recognition" on semantic segmentation datasets. The paper proposes placeholders that mimic the emergence of new classes, thus helping to transform closed-set training into open-set training.
## Goals
1. The reproduction should be completed based on the Cityscapes Semantic Segmentation dataset and the Synthia dataset.
2. The reproduced code is successfully merged into the lifelong learning module of Ianvs.
3. The recognition accuracy (e.g. f1_score) of unknown classes is greater than 0.9.
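The f1_score target in goal 3 can be read as the binary F1 with "unknown" as the positive class. A minimal sketch, using made-up labels (the helper name and the toy data are illustrative, not part of the project):

```python
# Minimal sketch: F1 score for unknown-sample recognition, treating
# "unknown" as the positive class. The labels below are hypothetical.
def f1_score(y_true, y_pred):
    tp = sum(t and p for t, p in zip(y_true, y_pred))
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# y_true: which samples are truly unknown; y_pred: samples flagged as unknown
y_true = [True, True, True, False, False]
y_pred = [True, True, False, False, True]
print(f1_score(y_true, y_pred))  # tp=2, fp=1, fn=1 -> F1 = 2/3
```

Goal 3 asks this value to exceed 0.9 on the real evaluation data.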
## Proposal
The goal of this Ianvs-based unknown task identification reproduction project is to identify unknown samples and known samples in the inference dataset and categorize them for subsequent task assignment in the lifelong learning system created by Ianvs, after the initial training phase of task definition and model training.
This project needs to complete the task definition part and the unknown task identification part.
The entire network architecture of RFNet is shown in the figure. In the encoder part of the architecture, two independent branches extract features from the RGB and depth images separately, with the RGB branch as the main branch and the Depth branch as the subordinate branch. Both branches use ResNet-18 [30] as the backbone to extract features from the inputs, because ResNet-18 has moderate depth and a residual structure, and its small operation footprint is compatible with real-time operation. After each layer of ResNet-18, the output features from the Depth branch are fused into the RGB branch through the Attention Feature Complementary (AFC) module. The spatial pyramid pooling (SPP) block gathers the fused RGB-D features from the two branches and produces feature maps with multi-scale information. Finally, referring to SwiftNet, efficient upsampling modules restore the resolution of these feature maps with skip connections from the RGB branch.
The figure shows some examples from the validation sets of Cityscapes and Lost and Found, which demonstrate the excellent segmentation accuracy of RFNet in various scenarios with or without small obstacles.
<img src="images/example.png" style="zoom:33%;" />
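The AFC fusion step described above can be illustrated with a toy NumPy sketch. The shapes, the sigmoid gating, and the channel-attention form are illustrative assumptions of this sketch, not the exact RFNet module:

```python
import numpy as np

np.random.seed(0)

# Toy C x H x W feature maps from the two encoder branches
rgb_feat = np.random.rand(64, 16, 16)    # main (RGB) branch features
depth_feat = np.random.rand(64, 16, 16)  # subordinate (Depth) branch features

# Channel attention: global average pool over H, W, squashed to (0, 1)
channel_weights = 1.0 / (1.0 + np.exp(-depth_feat.mean(axis=(1, 2))))  # shape (64,)

# Fuse: attention-weighted depth features are added into the RGB branch
fused = rgb_feat + channel_weights[:, None, None] * depth_feat
print(fused.shape)  # (64, 16, 16)
```

The fused maps would then flow into the SPP block and the upsampling decoder.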
For this project, we use two datasets to train the models separately: Cityscapes and SYNTHIA.
Cityscapes has 5000 images of driving scenes in urban environments (2975 train, 500 val, 1525 test). It is a large-scale dataset containing a set of diverse stereo video sequences of street scenes recorded in 50 different cities.
Below is one example RGB figure from the dataset.
The Cityscapes training set contains 2975 images, including street view images and corresponding labels. The dataset, jointly provided by three German organizations including Daimler, contains stereo vision data from more than 50 cities.
The directory structure of this dataset is as follows:
```
├─disparity
│  ├─test
│  │  ├─berlin
│  │  ├─bielefeld
│  │  ├─bonn
│  │  ├─...
│  │  └─munich
│  ├─train
│  │  ├─aachen
│  │  ├─bochum
│  │  ├─...
│  │  └─zurich
│  └─val
│      ├─frankfurt
│      ├─lindau
│      └─munster
├─gtFine
│  ├─test
│  │  ├─berlin
│  │  ├─bielefeld
│  │  ├─bonn
│  │  ├─...
│  │  └─munich
│  ├─train
│  │  ├─aachen
│  │  ├─bochum
│  │  ├─...
│  │  └─zurich
│  └─val
│      ├─frankfurt
│      ├─lindau
│      └─munster
└─leftImg8bit
    ├─test
    │  ├─berlin
    │  ├─bielefeld
    │  ├─bonn
    │  ├─...
    │  └─munich
    ├─train
    │  ├─aachen
    │  ├─bochum
    │  ├─...
    │  └─zurich
    └─val
        ├─frankfurt
        ├─lindau
        └─munster
```
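Given this layout, a loader typically pairs the RGB image, disparity map, and label by their shared city and file stem. A minimal sketch, where the file-name suffixes follow the standard Cityscapes naming convention and are an assumption of this sketch (no files are touched):

```python
import os

# Sketch: build (image, disparity, label) path triples for one sample,
# following the directory tree above. The suffixes `_leftImg8bit`,
# `_disparity`, and `_gtFine_labelIds` are the usual Cityscapes convention.
def make_sample_paths(root, split, city, stem):
    image = os.path.join(root, "leftImg8bit", split, city, f"{stem}_leftImg8bit.png")
    disparity = os.path.join(root, "disparity", split, city, f"{stem}_disparity.png")
    label = os.path.join(root, "gtFine", split, city, f"{stem}_gtFine_labelIds.png")
    return image, disparity, label

paths = make_sample_paths("cityscapes", "train", "aachen", "aachen_000000_000019")
for p in paths:
    print(p)
```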
##### SYNTHIA-RAND-CITYSCAPES
###### Background
The SYNTHIA dataset consists of a collection of photo-realistic frames rendered from a virtual city and comes with precise pixel-level semantic annotations for 13 classes: misc, sky, building, road, sidewalk, fence, vegetation, pole, car, sign, pedestrian, cyclist, and lane-marking.
Below is one example RGB figure from the dataset.
<img src="images/mn_RGB.png" style="zoom:33%;" />
###### Data Explorer
SYNTHIA-RAND-CITYSCAPES is a new set containing 9,000 random images with labels compatible with the Cityscapes test set. The list of classes is: void, sky, building, road, sidewalk, fence, vegetation, pole, car, traffic sign, pedestrian, bicycle, motorcycle, parking-slot, road-work, traffic light, terrain, rider, truck, bus, train, wall, and lane-marking. These images are generated as random perturbations of the virtual world, so no temporal consistency is provided.
The directory structure of this dataset is as follows:
```
├─Depth
│  ├─0000000.png
│  ├─...
│  └─0009399.png
├─GT
│  ├─COLOR
│  │  ├─0000000.png
│  │  ├─...
│  │  └─0009399.png
│  └─LABELS
│      ├─0000000.png
│      ├─...
│      └─0009399.png
└─RNB
    ├─0000000.png
    ├─...
    └─0009399.png
```
The following is the workflow of the unknown task identification module. When faced with an inference task, the unknown task identification algorithm can give a timely indication of which data are known and which are unknown in the data set.
<img src="images/unknow.png" style="zoom:50%;" />
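The inference-time split in this workflow can be sketched as follows: each sample's posterior covers K known classes plus one virtual (placeholder) class at the last index, and a sample is flagged unknown when the virtual class wins. The posteriors below are made-up toy values:

```python
# Sketch: split inference samples into known and unknown indices,
# assuming the virtual (placeholder) class sits at the last index.
def split_known_unknown(posteriors):
    known, unknown = [], []
    for i, probs in enumerate(posteriors):
        virtual_idx = len(probs) - 1
        if max(range(len(probs)), key=probs.__getitem__) == virtual_idx:
            unknown.append(i)
        else:
            known.append(i)
    return known, unknown

posteriors = [
    [0.7, 0.1, 0.1, 0.1],  # known class 0
    [0.1, 0.2, 0.1, 0.6],  # virtual class wins -> unknown
    [0.2, 0.5, 0.2, 0.1],  # known class 1
]
print(split_known_unknown(posteriors))  # ([0, 2], [1])
```

Known samples proceed to task allocation, while unknown samples are set aside for later processing.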
#### Main Work
Data placeholders and classification placeholders are set up in the paper to handle the unknown-class recognition problem. The purpose of the data placeholders is to mimic the emergence of new classes and to transform closed-set training into open-set training. The purpose of reserving classification placeholders for new classes is to augment the closed-set classifier with a virtual classifier that adaptively outputs class-specific thresholds to distinguish known classes from unknown classes. Specifically, the paper augments the closed-set classifier with an additional classification placeholder that represents a class-specific threshold between known and unknown, and reserves it for open classes to obtain invariant information between target and non-target classes. To efficiently predict the distribution of new classes, the paper uses data placeholders, which mimic open classes at a limited complexity cost, enabling the transformation of closed-set classifiers into open-set classifiers and the adaptive prediction of class-specific thresholds during testing.
Retaining classifier placeholders aims at setting up additional virtual classifiers and optimizing them to represent the threshold between known and unknown classes. Assuming a well-trained closed-set classifier W, the paper first augments the output layer with an additional virtual classifier, as shown in the following equation.
<img src="images/eq1.png" style="zoom:50%;" />
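This augmentation can be illustrated with a small NumPy sketch. The dimensions and the random weights are hypothetical, not values from the paper:

```python
import numpy as np

np.random.seed(0)
K, d = 5, 8                         # known classes, embedding dimension

W = np.random.randn(K, d)           # toy stand-in for the trained closed-set classifier
w_virtual = np.random.randn(1, d)   # the added virtual (placeholder) classifier
W_open = np.vstack([W, w_virtual])  # augmented classifier with K+1 outputs

def softmax(z):
    z = z - z.max()                 # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

x = np.random.randn(d)              # embedding of one sample
probs = softmax(W_open @ x)         # posterior over K known classes + 1 virtual class
print(probs.shape)  # (6,)
```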
The closed-set classifier and the virtual classifier share the same embedding, and only one additional linear layer is created. The augmented outputs are passed through the softmax layer to generate the posterior probabilities. By fine-tuning the model so that the virtual classifier outputs the second-highest probability for a known-class sample, the invariant information between the known-class classifiers and the virtual classifier can be transferred to the detection process. Since the output is enlarged by the virtual classifier, the classification loss can be expressed as:
<img src="images/eq2.png" style="zoom:50%;" />
L denotes cross-entropy or another loss function. The first term in the formula corresponds to optimizing the augmented output, which pushes samples into their corresponding class groups to maintain accurate identification on the closed set. The second term matches each sample to class K+1 so that the virtual classifier outputs the second-highest probability; it tries to anchor the position of the virtual classifier in the embedding space and controls the distance to the virtual classifier to be the second-smallest distance among all class centers. Thus, it seeks a trade-off between correctly classifying closed-set instances and reserving probability mass for new classes via the classifier placeholder. During training, the virtual classifier is positioned between target and non-target classes. For samples of novel classes, its predicted probability will be high, because all known classes are then non-target classes. Therefore, it acts as an instance-dependent threshold that adapts well to each known class.
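A simplified NumPy reading of this two-term loss is sketched below. The second term is implemented here by masking the ground-truth logit and asking the virtual classifier (last index) to win, i.e. to be the "second best" overall; this is an illustrative interpretation, not the paper's exact implementation:

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # numerically stable softmax
    e = np.exp(z)
    return e / e.sum()

def placeholder_loss(logits, y, virtual_idx):
    """Sketch of the two-term loss: `logits` has K+1 entries (the last one
    from the virtual classifier), `y` is the ground-truth known class."""
    # Term 1: ordinary cross-entropy over the K+1 outputs
    l1 = -np.log(softmax(logits)[y])
    # Term 2: mask out the ground-truth logit, then require the virtual
    # classifier to have the highest remaining (i.e. second-highest) probability
    masked = logits.copy()
    masked[y] = -np.inf
    l2 = -np.log(softmax(masked)[virtual_idx])
    return l1 + l2

logits = np.array([4.0, 1.0, 0.5, 2.0])  # 3 known classes + virtual classifier
print(placeholder_loss(logits, y=0, virtual_idx=3))
```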
The purpose of learning data placeholders is to transform closed-set training into open-set training. The synthesized data placeholders should have two main characteristics: the distribution of these samples should look novel, and the generation process should be fast. The paper simulates novel patterns by mixing pairs of samples; Equation 6 in the paper takes two samples from different categories and mixes their representations in a middle layer.
<img src="images/eq3.png" style="zoom:50%;" />
The result of the mixture is passed through the later layers to obtain the new embedding. Considering that the interpolation between two different clusters is usually a low-confidence prediction region, the paper treats the embedding φpost(x̃pre) as an embedding of an open-set class and trains it as a new class.
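The mixing step just described can be sketched as follows. The embedding dimension, the Beta prior on the mixing coefficient, and the variable names are illustrative assumptions of this sketch:

```python
import numpy as np

np.random.seed(0)

# Sketch of the data placeholder: mix the middle-layer embeddings of two
# samples from different classes, then label the result as the novel class.
phi_pre_xi = np.random.randn(128)  # middle-layer embedding of sample i
phi_pre_xj = np.random.randn(128)  # middle-layer embedding of sample j (other class)

lam = np.random.beta(1.0, 1.0)     # mixing coefficient in [0, 1]
x_tilde_pre = lam * phi_pre_xi + (1 - lam) * phi_pre_xj

# The mixed embedding is forwarded through the remaining layers and
# trained with the virtual-class label K (0-based index of class K+1).
K = 5
placeholder_label = K
print(x_tilde_pre.shape, placeholder_label)  # (128,) 5
```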
<img src="images/eq4.png" style="zoom:50%;" />
Clearly, the formulation in the paper incurs no additional time complexity while generating novel cases between multiple decision boundaries. In addition, mixing deeper hidden representations makes better use of interpolation to generate novel patterns in the improved embedding space, which better represents the novel distribution. As illustrated in the figure above, the mixed instances push the decision boundaries in the embedding space towards separate locations of the classes. With the help of the data placeholders, the embeddings of the known classes become tighter, leaving more room for the new classes.
After training, the unknown task algorithm model is placed in the Lifelong Learning Paradigm section of the Test Case Controller module as one of the algorithms for unknown task identification in the lifelong learning paradigm.
After users upload their own lifelong learning algorithms to the local Ianvs, the Ianvs component provides the dataset and testing environment, and offers a built-in unknown task recognition algorithm as an aid for testing. The test results are updated in the local leaderboard.