Description
Hi, I noticed that the examples in the project each involve only one LLM, so users only need to optimize its instructions and inputs.
However, I wonder what happens in a more complex system where multiple LLMs interact with each other, similar to a GAN with its generator and discriminator.
For example, in a very simple case, one LLM generates test cases, such as math questions, and another LLM solves them. The goal could be to gradually increase the difficulty of the questions until the solver LLM fails to answer correctly, while also improving the solver LLM's ability to answer hard questions as much as possible. This resembles GAN training, but in an LLM setting.
In such an example, the variables to optimize include the instructions of both LLMs, but each LLM is affected by the other. When training the generator in a GAN, we can use the gradients from the discriminator. Can we likewise use the gradients from the other LLM?
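
To make the setup concrete, here is a rough sketch of the alternating loop I have in mind. Everything in it is hypothetical and not part of this project's API: `gen_llm` and `solver_llm` stand for any function that maps an instruction and an input to text, `judge` decides whether an answer is correct, and `update` stands for whatever step rewrites an instruction from textual feedback.

```python
from typing import Callable, Tuple

# Hypothetical types for illustration only (not a real library API):
# an "LLM" maps (instruction, input) -> output text, and an "update"
# maps (instruction, feedback) -> revised instruction.
LLM = Callable[[str, str], str]
Update = Callable[[str, str], str]
Judge = Callable[[str, str], bool]


def adversarial_round(
    gen_llm: LLM,
    solver_llm: LLM,
    judge: Judge,
    update: Update,
    gen_instr: str,
    solver_instr: str,
    seed: str,
) -> Tuple[str, str, bool]:
    """Run one alternating step and return (gen_instr, solver_instr, solved)."""
    question = gen_llm(gen_instr, seed)
    answer = solver_llm(solver_instr, question)
    solved = judge(question, answer)
    if solved:
        # The solver succeeded, so feedback goes to the generator,
        # analogous to the generator step in a GAN.
        feedback = f"The solver answered {question!r} correctly; ask something harder."
        gen_instr = update(gen_instr, feedback)
    else:
        # The solver failed, so feedback goes to the solver,
        # analogous to the discriminator step in a GAN.
        feedback = f"The answer {answer!r} to {question!r} was wrong; be more careful."
        solver_instr = update(solver_instr, feedback)
    return gen_instr, solver_instr, solved
```

In this sketch only one instruction changes per round, and the feedback for the generator is derived from the solver's outcome. My question is whether that feedback can be propagated through the other LLM the way gradients are, rather than hand-written like here.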