RPG is a novel framework that generates full neural network parameters, up to <b>hundreds of millions</b>, on a <b>single GPU</b>. Our approach first partitions a network’s parameters into non-overlapping ‘tokens’, each corresponding to a distinct portion of the model. A recurrent mechanism then learns the inter-token relationships, producing ‘prototypes’ which serve as conditions for a diffusion process that ultimately synthesizes the parameters. Across a spectrum of architectures and tasks, including ResNets, ConvNeXts, and ViTs on ImageNet-1K and COCO, and even LoRA-based LLMs, RPG achieves performance on par with fully trained networks while avoiding excessive memory overhead. Notably, it generalizes beyond its training set to generate valid parameters for previously unseen tasks, highlighting its flexibility in open-ended scenarios. By overcoming the longstanding memory and scalability barriers, RPG serves as a critical advance in ‘<b>AI generating AI</b>’, potentially enabling efficient weight generation at scales previously deemed infeasible.
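As a rough illustration of the tokenize → recurrent prototypes → conditional diffusion pipeline described above, here is a minimal PyTorch sketch. All names, shapes, and the noising schedule are our own illustrative assumptions, not the repository’s actual API:

```python
# Minimal sketch of an RPG-style pipeline (assumed names/shapes, not the real API).
import torch
import torch.nn as nn

TOKEN_SIZE = 1024   # parameters per token (assumed)
PROTO_DIM = 256     # prototype dimensionality (assumed)

def tokenize(flat_params: torch.Tensor, token_size: int = TOKEN_SIZE) -> torch.Tensor:
    """Partition a flat parameter vector into non-overlapping tokens."""
    pad = (-flat_params.numel()) % token_size          # zero-pad to a multiple of token_size
    flat = torch.cat([flat_params, flat_params.new_zeros(pad)])
    return flat.view(-1, token_size)                   # (num_tokens, token_size)

class PrototypeRNN(nn.Module):
    """Recurrent module mapping the token sequence to per-token prototypes."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(TOKEN_SIZE, PROTO_DIM, batch_first=True)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        protos, _ = self.rnn(tokens.unsqueeze(0))      # (1, num_tokens, PROTO_DIM)
        return protos.squeeze(0)

class ConditionalDenoiser(nn.Module):
    """Toy denoiser: predicts the noise added to a token, conditioned on its prototype."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(TOKEN_SIZE + PROTO_DIM, 512), nn.SiLU(),
            nn.Linear(512, TOKEN_SIZE),
        )

    def forward(self, noisy_tokens, protos):
        return self.net(torch.cat([noisy_tokens, protos], dim=-1))

# One illustrative training step: noise the tokens, then learn to denoise
# them given the recurrently produced prototypes as conditions.
params = torch.randn(1_000_000)            # stand-in for a checkpoint's flat parameters
tokens = tokenize(params)
protos = PrototypeRNN()(tokens)
noise = torch.randn_like(tokens)
t = torch.rand(tokens.size(0), 1)          # per-token noise level in [0, 1]
noisy = (1 - t) * tokens + t * noise       # simple linear noising schedule (assumed)
loss = nn.functional.mse_loss(ConditionalDenoiser()(noisy, protos), noise)
loss.backward()
```

Processing tokens recurrently rather than attending over all of them at once is what would keep per-step memory roughly constant in the number of tokens, which is consistent with the single-GPU, low-memory-overhead claim above.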