Commit 6204c94

break up wall of text sections
1 parent 943aad6 commit 6204c94

1 file changed

+6
-3
lines changed

content/textbook/audits/staging/yuni-wyx-jt7347.mdx

Lines changed: 6 additions & 3 deletions
@@ -54,7 +54,8 @@ While ViNT does not explicitly set a hard line between when to explore or when t
 Instead, this encoder encodes the difference between current observation and the goal - just stack observations and goal together, pass through EfficientNet, then flatten to get goal tokens, similar to observation encoder.
 Attention forces goal to attend to / compare with window observation sequence.

-2. **Transformer**: $P_{\text{past}}$, $P_{\text{obs}}$ (i.e. current), and $P_{\text{goal}}$ tokens are combined with positional encoding, and passed into decoder-only Transformer backbone (denoted as 'f' in the paper) with 4 multi-headed attention blocks (4 heads, 4 layers), and 2048 hidden units.
+2. **Transformer**: $P_{\text{past}}$, $P_{\text{obs}}$ (i.e. current), and $P_{\text{goal}}$ tokens are combined with positional encoding.
+These are passed into a decoder-only Transformer backbone (denoted as 'f' in the paper) with 4 multi-headed attention blocks (4 heads, 4 layers), and 2048 hidden units.

 a. 6 tokens, model dimension of 512, 4 layers, 4 heads, 2048 feed-forward hidden dim, 128 per attention head (512 / 4).

@@ -216,7 +217,8 @@ NoMaD was evaluated in 6 different real-world and outdoor environments using a L
 The model was compared with 6 different baselines:

 1. **VIB**: Variational Information bottleneck, which models distribution of actions conditioned on observations.
-2. **Masked ViNT**: Essentially ViNT but with goal masking policy. Predicts point estimates of future actions instead of modeling the entire distribution.
+2. **Masked ViNT**: Essentially ViNT but with goal masking policy.
+Predicts point estimates of future actions instead of modeling the entire distribution.
 3. **Autoregressive**: Uses autoregressive prediction over a discrete distribution of actions.
 4. **Subgoal diffusion**: Basically just ViNT (diffusion generation of subgoals with navigation policy).
 5. **Random subgoals**: A variation of ViNT that instead of using diffusion, just randomly samples training data for candidate subgoal.
@@ -231,7 +233,8 @@ In comparison to ViNT, NoMaD presents a new approach to learning exploration and
 The main contribution is architecturally quite simple, but effective.
 The goal-masking effectively turns navigation from a single-task problem into a conditional behavior spectrum, allowing for a more unified end-to-end approach.
 Additionally, the way they used diffusion for only action generation instead of image generation greatly reduces computational costs for running NoMaD.
-Previously in ViNT, exploration vs. exploitation was a behavior encoded within the graph generation and subgoal ranking, but now with NoMaD, the low level collision avoidance and high level planning (exploration vs. subgoal seeking) is defined in one model architecture.
+Previously in ViNT, exploration vs. exploitation was a behavior encoded within the graph generation and subgoal ranking.
+Now with NoMaD, the low level collision avoidance and high level planning (exploration vs. subgoal seeking) is defined in one model architecture.
 Additionally, the probability distribution allows for more fine-tuned assignment of what actions are good and bad in all action space (e.g. high probability on turn left or turn right at a T junction, low probability of going straight and hitting wall).

 Some things which could be expanded upon however, include:
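
Review note on the first hunk: the Transformer settings quoted there (6 tokens, model dimension 512, 4 layers, 4 heads, 2048 feed-forward hidden units) imply the stated 128 dimensions per attention head. The sketch below only checks that arithmetic. The class and field names (`TransformerConfig`, `head_dim`, and so on) are hypothetical and are not taken from the NoMaD or ViNT code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical container for the values quoted in the diff."""

    num_tokens: int = 6    # tokens in the sequence
    d_model: int = 512     # model (embedding) dimension
    num_layers: int = 4    # stacked attention blocks
    num_heads: int = 4     # attention heads per block
    d_ff: int = 2048       # feed-forward hidden dimension

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits d_model evenly across heads,
        # so d_model must be divisible by num_heads.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads


cfg = TransformerConfig()
print(cfg.head_dim)                   # 128, matching "512 / 4" in the text
print((cfg.num_tokens, cfg.d_model))  # shape of the token sequence: (6, 512)
```

Keeping the head dimension as a derived property rather than a stored field means the three numbers cannot drift out of sync if the write-up is later edited.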
