
Commit 24f1829

fuyiqun authored and JackFu123 committed
prediction: modify technical docs for VectorNet-TNT
Change-Id: Iebda86b469c47bf90d083b752dbf31e559d79b49
1 parent 6c01cf1 commit 24f1829


1 file changed: +10 -9 lines changed

docs/technical_documents/jointly_prediction_planning_evaluator.md

Lines changed: 10 additions & 9 deletions
@@ -2,9 +2,9 @@

The prediction module comprises 4 main functionalities: Container, Scenario, Evaluator and Predictor.

- An Evaluator predicts trajectories and speeds for surrounding obstacles of autonomous vehicle. An evaluator evaluates a path(lane sequence) with a probability by the given model stored in prediction/data/.
+ An Evaluator predicts trajectories and speeds for the obstacles surrounding the autonomous vehicle. It assigns a probability to each path (lane sequence) using the given model stored in the directory `modules/prediction/data/`.

- In Apollo 7.0, a new model named Inter-TNT is introduced to generate short-term trajectories. This model applies VectorNet as encoder and TNT as decoder, and latest planning trajectory of autonomous vehicle is used to interact with surrounding obstacles. Compared with the prediction model based on semantic map released in Apollo 6.0, the performance is increased by more than 20% in terms of minADE and minFDE, and the inference time is reduced from 15 ms to 10 ms.
+ In Apollo 7.0, a new model named Inter-TNT is introduced to generate short-term trajectories. This model uses VectorNet as the encoder and TNT as the decoder, and the latest planning trajectory of the autonomous vehicle is used to model the interaction with surrounding obstacles. Compared with the semantic-map-based prediction model released in Apollo 6.0, performance improves by more than 20% in terms of minADE and minFDE, and inference time is reduced from 15 ms to 10 ms.
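
For readers unfamiliar with the two metrics, the following is a minimal sketch of how minADE and minFDE are commonly computed for a set of K candidate trajectories. The function name `min_ade_fde` and the array layout are assumptions made for illustration and are not taken from the Apollo code base.

```python
import numpy as np

def min_ade_fde(pred, gt):
    """Return (minADE, minFDE) for K candidate trajectories.

    pred: array of shape (K, T, 2), K candidate trajectories of T points.
    gt:   array of shape (T, 2), the ground-truth trajectory.
    (Hypothetical helper; the layout is an assumption for this sketch.)
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Euclidean distance between each predicted point and the ground truth.
    dists = np.linalg.norm(pred - gt[None, :, :], axis=-1)  # shape (K, T)
    ade = dists.mean(axis=1)   # average displacement error per candidate
    fde = dists[:, -1]         # final displacement error per candidate
    return float(ade.min()), float(fde.min())

gt = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
candidates = np.stack([
    gt + np.array([0.0, 1.0]),                       # constant 1 m lateral offset
    np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0]]),  # exact until the last point
])
min_ade, min_fde = min_ade_fde(candidates, gt)
```

Note that the two minima can come from different candidates, as they do in this example.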

![Diagram](images/interaction_model_fig_1.png)

@@ -25,36 +25,37 @@ Please refer [interaction filter](https://github.com/ApolloAuto/apollo/tree/mast
```

# Network Architecture
- The network architecture of the proposed "Inter-TNT" is illustrated as below. The entire network is composed of three modules: an vectorized encoder, a target-driven decoder, and an interaction module. The vectorized trajectories of obstacles and autonomous vehicle (AV), along with HD maps, are first fed into the vectorized encoder to extract features. The target-driven decoder takes the extracted features as input and generate multi-modal trajectories for each obstacle. The main contribution of the proposed network is introducing an interaction mechanism which could measure the interaction between obstacles and autonomous vehicle by re-weighting confidences of multi-modal trajectories.
+ The network architecture of the proposed "Inter-TNT" is illustrated as follows. The network is composed of three modules: a vectorized encoder, a target-driven decoder, and an interaction module. The vectorized trajectories of the obstacles and the autonomous vehicle (AV), along with HD maps, are first fed into the vectorized encoder to extract features. The target-driven decoder takes the extracted features as input and generates multi-modal trajectories for each obstacle. The main contribution of the proposed network is an interaction mechanism, which measures the interaction between the obstacles and the autonomous vehicle by re-weighting the confidences of the multi-modal trajectories.

![Diagram](images/VectorNet-TNT-Interaction.png)

## Encoder
- Basically, the encoder is mainly using an [VectorNet](https://arxiv.org/abs/2005.04259).
+ The encoder is mainly based on [VectorNet](https://arxiv.org/abs/2005.04259).

### Representation
- The trajectories of AV all obstacles are represented as polylines in the form of sequential coordinate points. For each polyline, it contains start point, end point, obstacle length and some other attributes of vector. All points are transformed to the AV coordinate with North as the y-axis and (0, 0) as the position for ADC at time 0.
+ The trajectories of the AV and all obstacles are represented as polylines in the form of sequential coordinate points. Each polyline contains the start point, end point, obstacle length and some other vector attributes. All points are transformed to the AV coordinate frame, with the y-axis as the heading direction and (0, 0) as the position of the AV at time 0.

- After that, map elements are extracted from HDMap files. As elements of lane/road/junction/crosswalk are depicted in points in HD map, they are conveniently processed as polylines.
+ After that, map elements are extracted from HD map files. Because lane, road, junction and crosswalk elements are described by points in the HD map, they can be conveniently processed as polylines.
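
The coordinate transform described in this section can be sketched as follows. This is an illustrative example only: the function names `to_av_frame` and `to_vectors` are hypothetical, the heading is assumed to be measured counter-clockwise from the world x-axis, and the local x-axis is assumed to point to the right of the AV (the document only fixes the y-axis and the origin).

```python
import numpy as np

def to_av_frame(points, av_xy, av_heading):
    """Express world-frame points in the AV frame at time 0.

    points:     (N, 2) world coordinates.
    av_xy:      (2,) AV position at time 0, which becomes the origin.
    av_heading: AV heading in radians, counter-clockwise from world x-axis.
    The local y-axis is the heading direction; the local x-axis is assumed
    to point to the right of the AV (an assumption of this sketch).
    """
    d = np.asarray(points, dtype=float) - np.asarray(av_xy, dtype=float)
    c, s = np.cos(av_heading), np.sin(av_heading)
    x_local = d[:, 0] * s - d[:, 1] * c   # projection on the right-hand axis
    y_local = d[:, 0] * c + d[:, 1] * s   # projection on the heading axis
    return np.stack([x_local, y_local], axis=1)

def to_vectors(polyline_points):
    """Turn (N, 2) sequential points into (N-1, 4) [start, end] vectors."""
    pts = np.asarray(polyline_points, dtype=float)
    return np.hstack([pts[:-1], pts[1:]])

# AV at (10, 5) facing the world +y direction.
world = np.array([[10.0, 7.0], [11.0, 5.0], [10.0, 5.0]])
local = to_av_frame(world, av_xy=(10.0, 5.0), av_heading=np.pi / 2)
vectors = to_vectors(local)
```

Further attributes such as obstacle length would be appended as extra columns to each vector; they are left out here to keep the sketch short.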

### VectorNet
The polyline features are first extracted by a subgraph network and then fed into a global graph network (GCN) to encode contextual information.
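
The two-stage encoding above can be illustrated with a heavily simplified NumPy sketch. It uses one linear layer with max pooling as the subgraph step and a single self-attention layer as the global step, following the general idea of the VectorNet paper. The weights are random and untrained, all names and dimensions are assumptions, and the actual Apollo model may be structured differently.

```python
import numpy as np

def subgraph(polyline, w):
    """Encode one polyline. polyline: (num_vectors, d_in); w: (d_in, d)."""
    h = np.maximum(polyline @ w, 0.0)  # per-vector linear layer + ReLU
    return h.max(axis=0)               # max pooling, independent of vector order

def global_graph(feats, wq, wk, wv):
    """Single-head self-attention over polyline features. feats: (P, d)."""
    q, k, v = feats @ wq, feats @ wk, feats @ wv
    logits = q @ k.T / np.sqrt(k.shape[1])
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over polylines
    return attn @ v                              # context-aware features

rng = np.random.default_rng(0)
d_in, d = 4, 8
# Three polylines with different numbers of vectors (illustrative data).
polylines = [rng.normal(size=(n, d_in)) for n in (5, 3, 7)]
w = rng.normal(size=(d_in, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))

feats = np.stack([subgraph(p, w) for p in polylines])  # one row per polyline
context = global_graph(feats, wq, wk, wv)
```

Max pooling gives every polyline a fixed-size feature regardless of how many vectors it contains, which is what allows trajectories and map elements of different lengths to share one global graph.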

## Decoder
- Our decoder implementation mainly follows the [TNT](https://arxiv.org/abs/2008.08294) paper. There are three steps in TNT. For more details, please refer to the original paper.
+ Our decoder implementation mainly follows the [TNT](https://arxiv.org/abs/2008.08294) paper. TNT has three steps: target prediction, motion estimation, and scoring and selection. For more details, please refer to the original paper.

### Target Prediction
For each obstacle, N points around the AV are uniformly sampled and M points are selected as target points. These target points are considered to be the potential final points of the predicted trajectories.
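
The sampling and selection step can be sketched as follows. The grid extent, the step size and the stand-in score are assumptions chosen for illustration; in TNT the candidate scores come from a learned network, which is not reproduced here.

```python
import numpy as np

def sample_candidates(half_range, step):
    """Uniformly sample N candidate points on a square grid around (0, 0)."""
    axis = np.arange(-half_range, half_range + step / 2, step)
    xx, yy = np.meshgrid(axis, axis)
    return np.stack([xx.ravel(), yy.ravel()], axis=1)  # shape (N, 2)

def select_targets(candidates, scores, m):
    """Keep the M highest-scoring candidates as target points."""
    top = np.argsort(scores)[::-1][:m]
    return candidates[top], top

candidates = sample_candidates(half_range=2.0, step=1.0)  # 5 x 5 grid, N = 25
# Stand-in score: closeness to a hypothetical goal point. A trained model
# would produce these scores in practice.
goal = np.array([1.0, 2.0])
scores = -np.linalg.norm(candidates - goal, axis=1)
targets, indices = select_targets(candidates, scores, m=3)
```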

### Motion Estimation
- After selecting the potential target points, M trajectories are generated for each obstacle with its corresponding feature from encoder as input.
+ After the potential target points are selected, M trajectories are generated for each obstacle, with its corresponding feature from the encoder as input.

### Scoring and Selection
- Finally, a scoring and selection module is performed to generate likelihood scores of the M trajectories for each obstacle, and select a final set of trajectory predictions by likelihood scores.
+ Finally, a scoring and selection module generates likelihood scores for the M trajectories of each obstacle and selects a final set of trajectory predictions according to these scores.

## Interaction with Planning Trajectory
After the TNT decoder, K predicted trajectories are generated for each obstacle. To measure the interaction between the AV and the obstacles, we calculate the position and velocity differences between the latest planning trajectory and the predicted obstacle trajectories. Note that we can also calculate a cost between the ground-truth obstacle trajectory and the AV planning trajectory, which gives the true costs. This is how the loss is calculated in this step.
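
The position and velocity differences mentioned above can be computed along the following lines. This is a sketch under stated assumptions: the function name `interaction_differences`, the array shapes and the finite-difference velocity estimate are illustrative choices, and the way these differences are turned into a cost and used to re-weight confidences is not specified here, so it is not shown.

```python
import numpy as np

def interaction_differences(pred_trajs, plan_traj, dt):
    """Per-step position and velocity differences against the AV plan.

    pred_trajs: (K, T, 2) predicted trajectories of one obstacle.
    plan_traj:  (T, 2) latest AV planning trajectory on the same time grid.
    dt:         time step in seconds.
    Returns pos_diff of shape (K, T) and vel_diff of shape (K, T - 1).
    """
    pred = np.asarray(pred_trajs, dtype=float)
    plan = np.asarray(plan_traj, dtype=float)
    pos_diff = np.linalg.norm(pred - plan[None], axis=-1)
    # Velocities estimated by finite differences (an assumption of this sketch).
    pred_vel = np.diff(pred, axis=1) / dt
    plan_vel = np.diff(plan, axis=0) / dt
    vel_diff = np.linalg.norm(pred_vel - plan_vel[None], axis=-1)
    return pos_diff, vel_diff

plan = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
preds = np.stack([
    plan + np.array([0.0, 3.0]),  # parallel to the plan, 3 m to the side
    np.zeros((3, 2)),             # obstacle standing still at the origin
])
pos_diff, vel_diff = interaction_differences(preds, plan, dt=1.0)
```

The same function applied to the ground-truth obstacle trajectory (as a single-candidate array) would give the differences from which the true costs are derived.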

# References
1. Gao, Jiyang, et al. "VectorNet: Encoding HD maps and agent dynamics from vectorized representation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
2. Zhao, Hang, et al. "TNT: Target-driven trajectory prediction." arXiv preprint arXiv:2008.08294 (2020).
+ 3. Xu, Kecheng, et al. "Data driven prediction architecture for autonomous driving and its application on Apollo platform." 2020 IEEE Intelligent Vehicles Symposium (IV). IEEE, 2020.
