Drag-and-Drop LLMs: Zero-Shot Prompt to Weights

A modular implementation of the Drag-and-Drop LLM system that enables zero-shot adaptation of Large Language Models through prompt-to-weights generation using cascaded hyper-convolutional decoders.

Overview

This system implements a novel approach to LLM adaptation that generates LoRA (Low-Rank Adaptation) parameters directly from text prompts, eliminating the need for traditional fine-tuning. The core innovation is a cascaded hyper-convolutional decoder that transforms Sentence-BERT embeddings into model-specific LoRA weights.

Key Features

Zero-shot adaptation: Generate model weights directly from prompts
Modular architecture: Clean separation of models, training, and evaluation
Cascaded hyper-convolutional decoder: Novel parameter generation architecture
LoRA integration: Efficient low-rank adaptation for foundation models
Easy-to-use scripts: Simple command-line interface for training and evaluation
Configurable: YAML-based configuration system
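
To make the "efficient low-rank adaptation" point concrete, the following sketch compares the number of trainable values in a full weight matrix with the number in its LoRA factors. The 1024 x 1024 layer size is an illustrative assumption, not a dimension taken from this repository; the rank of 8 matches the `lora_rank` used in the Quick Start.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) parameter counts for a single weight matrix."""
    full = d_out * d_in
    # LoRA stores two factors: B with shape (d_out, rank) and A with shape (rank, d_in).
    lora = rank * (d_out + d_in)
    return full, lora


# Illustrative layer size (assumption); rank 8 as in the Quick Start.
full, lora = lora_param_counts(d_out=1024, d_in=1024, rank=8)
print(full, lora, full // lora)  # 1048576 16384 64
```

In this example the generator only has to predict 64 times fewer values per layer than it would for a full weight update.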

Installation

git clone https://github.com/sanowl/Drag-and-Drop-LLMs-Zero-Shot-Prompt-to-Weights
cd Drag-and-Drop-LLMs-Zero-Shot-Prompt-to-Weights
pip install -r requirements.txt

Verify Installation

Run the installation test to ensure all components are working:

python test_installation.py

This will verify:

✓ All model components can be imported
✓ Main model can be instantiated
✓ Individual components work correctly
✓ Dataset loading functions properly
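
For readers who want to run the first of these checks by hand, here is a minimal sketch of an import check. It is not the contents of `test_installation.py`; the helper name `check_imports` is hypothetical, and the module paths are taken from the file layout listed under Troubleshooting.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and report which ones succeed."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Covers ModuleNotFoundError, e.g. when the package is not installed.
            results[name] = False
    return results


# Module paths follow the dnd_llm/models/ layout described in this README.
modules = [
    "dnd_llm.models.main_model",
    "dnd_llm.models.encoders",
    "dnd_llm.models.lora",
    "dnd_llm.models.decoders",
]
for name, ok in check_imports(modules).items():
    print(("OK  " if ok else "FAIL") + " " + name)
```

Any line marked `FAIL` points at a module that is missing or whose dependencies are not installed.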

Troubleshooting

If you encounter issues with a missing models folder:

Ensure complete clone:
```
git status
git ls-files dnd_llm/models/
```

Check required model files:

dnd_llm/models/
├── __init__.py           # Package initialization
├── main_model.py         # Main DragAndDropLLM class  
├── encoders.py          # SentenceBERT text encoder
├── lora.py              # LoRA layer implementations
└── decoders.py          # Hyper-convolutional decoders

Test imports:

python -c "from dnd_llm import DragAndDropLLM; print('Success!')"

Quick Start

from dnd_llm import DragAndDropLLM, DnDTrainer, DatasetManager

# Initialize the model
model = DragAndDropLLM(
    foundation_model="Qwen/Qwen2.5-0.5B",
    text_encoder="sentence-transformers/all-MiniLM-L6-v2",
    lora_rank=8,
    lora_alpha=16.0
)

# Load datasets
datasets = DatasetManager.load_common_sense_datasets()

# Train the system
trainer = DnDTrainer(model, device='cuda')
trainer.train(datasets, num_epochs=5000, batch_size=128)

# Generate weights from prompts
test_prompts = [
    "An astronomer observes that a planet rotates faster after a meteorite impact. Which is the most likely effect of this increase in rotation?\nA: Planetary density will decrease.\nB: Planetary years will become longer.\nC: Planetary days will become shorter.\nD: Planetary gravity will become stronger."
]
generated_params = model(test_prompts)

Architecture

The system consists of several key components:

SentenceBERTEncoder: Extracts semantic embeddings from text prompts
CascadedHyperConvolutionalDecoder: Transforms embeddings to parameter space
QwenLoRALayer: LoRA implementation for efficient adaptation
DragAndDropLLM: Main system orchestrating the complete pipeline
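
The data flow between these components can be illustrated with a toy sketch: a prompt embedding goes in, and a pair of LoRA factors comes out. A fixed random linear projection stands in for the cascaded hyper-convolutional decoder here, so this shows only the shapes involved, not the repository's actual architecture. The function name `toy_generate_lora` and all dimensions are assumptions made for illustration.

```python
import random


def toy_generate_lora(embedding, d_out, d_in, rank, seed=0):
    """Map an embedding to LoRA factors A (rank x d_in) and B (d_out x rank).

    Uses a fixed random linear projection as a stand-in for a learned decoder.
    """
    rng = random.Random(seed)
    n_params = rank * (d_in + d_out)
    flat = []
    for _ in range(n_params):
        # One projection row per generated parameter.
        row = [rng.gauss(0.0, 0.02) for _ in embedding]
        flat.append(sum(w * e for w, e in zip(row, embedding)))
    a_flat, b_flat = flat[: rank * d_in], flat[rank * d_in:]
    A = [a_flat[i * d_in:(i + 1) * d_in] for i in range(rank)]
    B = [b_flat[i * rank:(i + 1) * rank] for i in range(d_out)]
    return A, B


# Toy 6-dimensional embedding; all-MiniLM-L6-v2 itself produces 384 dimensions.
embedding = [0.1 * i for i in range(6)]
A, B = toy_generate_lora(embedding, d_out=4, d_in=5, rank=2)
print(len(A), len(A[0]), len(B), len(B[0]))  # 2 5 4 2
```

The point of the sketch is that the generator's output size depends on the target layer shape and the LoRA rank, not on the size of the full weight matrix.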

Training

# Train with default configuration
python scripts/train.py --config configs/default.yaml

# Train with custom output directory
python scripts/train.py --config configs/default.yaml --output-dir ./my_experiment

# Resume training from checkpoint
python scripts/train.py --config configs/default.yaml --resume ./outputs/checkpoint.pth

Evaluation

# Evaluate on all tasks
python scripts/evaluate.py --checkpoint ./outputs/final_model.pth --task all

# Evaluate specific tasks
python scripts/evaluate.py --checkpoint ./outputs/final_model.pth --task common_sense
python scripts/evaluate.py --checkpoint ./outputs/final_model.pth --task coding
python scripts/evaluate.py --checkpoint ./outputs/final_model.pth --task math

# Evaluate specific datasets
python scripts/evaluate.py --checkpoint ./outputs/final_model.pth --datasets ARC-e,PIQA
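
As a rough illustration of how the flags above fit together, here is a sketch of an argument parser that accepts them. It is not the code in `scripts/evaluate.py`; the default values, the allowed `--task` choices (taken from the commands above), and the comma-splitting behavior for `--datasets` are assumptions.

```python
import argparse


def parse_dataset_list(value):
    """Split a comma-separated string such as 'ARC-e,PIQA' into a list of names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical evaluation CLI sketch")
    parser.add_argument("--checkpoint", required=True, help="Path to a trained checkpoint")
    parser.add_argument("--task", choices=["all", "common_sense", "coding", "math"], default="all")
    parser.add_argument("--datasets", type=parse_dataset_list, default=None)
    return parser


args = build_parser().parse_args(
    ["--checkpoint", "./outputs/final_model.pth", "--datasets", "ARC-e,PIQA"]
)
print(args.task, args.datasets)  # all ['ARC-e', 'PIQA']
```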

Inference

# Generate weights from prompts
python scripts/inference.py \
    --checkpoint ./outputs/final_model.pth \
    --prompts "An astronomer observes that a planet rotates faster after a meteorite impact. A: Planetary density will decrease. B: Planetary years will become longer. C: Planetary days will become shorter. D: Planetary gravity will become stronger." "Write a Python function that returns the Fibonacci sequence up to n." \
    --output generated_weights.pth

Usage Examples

Basic Usage

from dnd_llm import DragAndDropLLM

# Initialize the model
model = DragAndDropLLM(
    foundation_model="Qwen/Qwen2.5-0.5B",
    text_encoder="sentence-transformers/all-MiniLM-L6-v2"
)

# Generate weights from prompts
prompts = [
    "What is the next prime number after 47?\nA: 49\nB: 51\nC: 53\nD: 55",
    "Write a Python function that returns the Fibonacci sequence up to n."
]
generated_params = model(prompts)

# Apply generated parameters
model.apply_parameters(generated_params)
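
The internals of `apply_parameters` are not shown in this README. As a sketch of what applying LoRA parameters conventionally means, the code below merges a low-rank update into a base weight matrix as W' = W + (alpha / rank) * B A, the standard LoRA formulation. The helper names and the tiny matrices are illustrative assumptions.

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def merge_lora(W, A, B, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying W."""
    scale = alpha / rank
    delta = matmul(B, A)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


W = [[1.0, 0.0], [0.0, 1.0]]  # base weight, d_out x d_in
B = [[1.0], [2.0]]            # d_out x rank
A = [[0.5, -0.5]]             # rank x d_in
print(merge_lora(W, A, B, alpha=2.0, rank=1))  # [[2.0, -1.0], [2.0, -1.0]]
```

With the Quick Start settings (`lora_rank=8`, `lora_alpha=16.0`), the same formula gives a scaling factor of 2.0.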

Project Structure

Drag-and-Drop-LLMs-Zero-Shot-Prompt-to-Weights/
├── dnd_llm/                    # Main package
│   ├── __init__.py
│   ├── models/                 # Model components
│   │   ├── __init__.py
│   │   ├── encoders.py         # Text encoders
│   │   ├── lora.py            # LoRA implementations
│   │   ├── decoders.py        # Hyper-convolutional decoders
│   │   └── main_model.py      # Main DnD-LLM model
│   ├── training/              # Training components
│   │   ├── __init__.py
│   │   ├── trainer.py         # Training logic
│   │   ├── checkpoint.py      # Checkpoint collection
│   │   └── datasets.py        # Dataset management
│   ├── evaluation/            # Evaluation components
│   │   ├── __init__.py
│   │   ├── evaluator.py       # Evaluation logic
│   │   └── metrics.py         # Evaluation metrics
│   ├── utils/                 # Utilities
│   │   ├── __init__.py
│   │   ├── config.py          # Configuration management
│   │   └── logging.py         # Logging utilities
│   └── data/                  # Data handling
│       ├── __init__.py
│       └── loaders.py         # Data loaders
├── scripts/                   # Execution scripts
│   ├── train.py              # Training script
│   ├── evaluate.py           # Evaluation script
│   └── inference.py          # Inference script
├── configs/                   # Configuration files
│   ├── default.yaml          # Default configuration
│   └── experiments/          # Experiment configs
├── tests/                    # Unit tests
├── docs/                     # Documentation
├── requirements.txt          # Dependencies
├── setup.py                  # Package setup
└── README.md                 # This file

Configuration

The system uses YAML configuration files for customization. See configs/default.yaml for the complete set of configuration options.

Key Configuration Sections

Model: Foundation model settings, LoRA parameters, text encoder
Training: Learning rate, batch size, epochs, optimization settings
Evaluation: Datasets to evaluate on, metrics to compute
System: Device selection, output directories, logging levels
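
To show how these four sections might be combined with per-experiment overrides, here is a sketch that represents a configuration as nested dictionaries and merges overrides into it. The key names are hypothetical and may not match `configs/default.yaml`; the values are borrowed from the Quick Start and command examples in this README.

```python
import copy

# Hypothetical defaults mirroring the four sections listed above.
DEFAULTS = {
    "model": {
        "foundation_model": "Qwen/Qwen2.5-0.5B",
        "text_encoder": "sentence-transformers/all-MiniLM-L6-v2",
        "lora_rank": 8,
        "lora_alpha": 16.0,
    },
    "training": {"num_epochs": 5000, "batch_size": 128},
    "evaluation": {"datasets": ["ARC-e", "PIQA"]},
    "system": {"device": "cuda", "output_dir": "./outputs"},
}


def merge_config(base, overrides):
    """Recursively merge overrides into a copy of base, leaving both inputs unchanged."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


config = merge_config(
    DEFAULTS,
    {"training": {"batch_size": 32}, "system": {"output_dir": "./my_experiment"}},
)
print(config["training"])  # {'num_epochs': 5000, 'batch_size': 32}
```

A recursive merge lets an experiment file override a single value without restating the rest of its section.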

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Citation

If you find this repository useful, please cite the original work:

@misc{liang2025draganddropllmszeroshotprompttoweights,
  title={Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights},
  author={Zhiyuan Liang and Dongwen Tang and Yuhao Zhou and Xuanlei Zhao and Mingjia Shi and Wangbo Zhao and Zekai Li and Peihao Wang and Konstantin Schürholt and Damian Borth and Michael M. Bronstein and Yang You and Zhangyang Wang and Kai Wang},
  year={2025},
  eprint={2506.16406},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2506.16406}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.

Features

Supported Tasks

Common Sense Reasoning: ARC-e, OBQA, PIQA, HellaSwag, BoolQ, WinoGrande
Code Generation: HumanEval and other coding benchmarks
Mathematical Reasoning: GSM8K, MATH, and other math datasets
Cross-domain Transfer: Adaptation across different task domains

Acknowledgments

Built on top of PyTorch and Transformers
Inspired by LoRA and hypernetwork research
Thanks to the open-source AI community
