
Commit ac585e2

Update info in pp
1 parent 2f70f0e commit ac585e2

File tree

1 file changed: +11 −3 lines changed


index.html

Lines changed: 11 additions & 3 deletions
@@ -55,12 +55,20 @@ <h1 class="title is-1 publication-title">Scaling Up Parameter Generation: A Recu
   <a href="https://scholar.google.com/citations?user=9lKm_5IAAAAJ" target="_blank">Dongwen Tang</a>, </span>
   <span class="author-block">
   <a href="https://wangbo-zhao.github.io/" target="_blank">Wangbo Zhao</a>, </span>
+  <span class="author-block">
+  <a href="https://kschuerholt.github.io/" target="_blank">Konstantin Schürholt</a>, </span>
+  <span class="author-block">
+  <a href="https://www.vita-group.space/" target="_blank">Zhangyang Wang</a>, </span>
   <span class="author-block">
   <a href="https://www.comp.nus.edu.sg/~youy/" target="_blank">Yang You</a> </span>
   </div>
   <div class="is-size-5 publication-authors">
   <span class="author-block">
   <a href="https://www.nus.edu.sg/" target="_blank">National University of Singapore</a> </span>
+  <span class="author-block">
+  <a href="https://www.unisg.ch/" target="_blank">University of St.Gallen</a> </span>
+  <span class="author-block">
+  <a href="https://www.utexas.edu/" target="_blank">University of Texas at Austin</a> </span>
   <span class="eql-cntrb"><small><br>Kai and Dongwen contributed equally to this work.</small> </span>
   </div>
   <div class="column has-text-centered">
<div class="column has-text-centered">
@@ -137,12 +145,12 @@ <h2 class="title is-3">Abstract</h2>
   a novel framework that generates full neural network parameters—up to <b>hundreds of millions</b>—on a <b>single GPU</b>.
   Our approach first partitions a network’s parameters into non-overlapping ‘tokens’, each corresponding to a distinct portion of the model.
   A recurrent mechanism then learns the inter-token relationships,
-  producing ‘prototypes’ which serve as conditions for a diffusion process that ultimately synthesizes the full parameters.
+  producing ‘prototypes’ which serve as conditions for a diffusion process that ultimately synthesizes the parameters.
   Across a spectrum of architectures and tasks—including ResNets, ConvNeXts and ViTs on ImageNet-1K and COCO,
   and even LoRA-based LLMs—RPG achieves performance on par with fully trained networks while avoiding excessive memory overhead.
   Notably, it generalizes beyond its training set to generate valid parameters for previously unseen tasks,
-  highlighting its flexibility in dynamic and open-ended scenarios. By overcoming the longstanding memory and scalability barriers,
-  RPG serves as a critical advance in ‘AI generating AI’, potentially enabling efficient weight generation at scales previously deemed infeasible.
+  highlighting its flexibility in open-ended scenarios. By overcoming the longstanding memory and scalability barriers,
+  RPG serves as a critical advance in ‘<b>AI generating AI</b>’, potentially enabling efficient weight generation at scales previously deemed infeasible.
   </p>
   </div>
   </div>
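The abstract edited above describes RPG's pipeline: partition a network's flat parameters into non-overlapping tokens, run a recurrent pass over them to produce per-token 'prototypes', then use those prototypes to condition a diffusion model that synthesizes the actual weights. The sketch below illustrates only the first two stages' data flow with a toy recurrence; the token size, padding scheme, and random mixing weights are placeholders of my own, not the paper's architecture, and the diffusion stage is omitted entirely.

```python
import numpy as np

def partition_into_tokens(params, token_size):
    """Split a flat parameter vector into non-overlapping, equal-size
    tokens, zero-padding the tail (padding scheme is an assumption)."""
    pad = (-len(params)) % token_size
    padded = np.concatenate([params, np.zeros(pad)])
    return padded.reshape(-1, token_size)

def recurrent_prototypes(tokens, hidden_dim=8, seed=0):
    """Toy recurrent pass: each step mixes the previous hidden state with
    the current token to yield one 'prototype' per token. The real RPG
    recurrence is far richer; this only shows the sequential data flow."""
    rng = np.random.default_rng(seed)
    W_h = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
    W_x = rng.standard_normal((tokens.shape[1], hidden_dim)) * 0.1
    h = np.zeros(hidden_dim)
    protos = []
    for tok in tokens:
        h = np.tanh(h @ W_h + tok @ W_x)   # carry state across tokens
        protos.append(h.copy())
    return np.stack(protos)               # one prototype per token

params = np.arange(10, dtype=float)       # stand-in for flat weights
tokens = partition_into_tokens(params, token_size=4)
protos = recurrent_prototypes(tokens)
print(tokens.shape, protos.shape)         # (3, 4) (3, 8)
```

In the full method, each prototype would then condition a diffusion sampler that generates that token's share of the weights, which is what keeps per-step memory bounded even for networks with hundreds of millions of parameters.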
