IPAD: Inverse Prompt for AI Detection

Hugging Face

📘 Overview

Large Language Models (LLMs) have achieved human-level fluency in text generation, making it increasingly difficult to distinguish between human- and AI-authored content.
IPAD (Inverse Prompt for AI Detection) introduces a two-stage detection framework:

  1. Prompt Inverter: predicts the underlying prompts that could have generated an input text.
  2. Distinguisher: evaluates the alignment between the text and its predicted prompts to determine whether it was AI-generated (see the sketch at the end of this section).

All three components, the Prompt Inverter, Distinguisher (RC), and Distinguisher (PTCV), are LoRA fine-tuned versions of
microsoft/Phi-3-medium-128k-instruct,
trained with LLaMA-Factory for robust AI-text detection under diverse and adversarial conditions.

  • 🧩 Distinguisher (RC): optimized for regular, unstructured text inputs (baseline detection).
  • 🔬 Distinguisher (PTCV): specialized for structured, compositional, or out-of-distribution (OOD) text, with enhanced robustness.
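
Concretely, detection chains the two stages: the inverter reconstructs a candidate prompt, and a distinguisher scores how plausibly that prompt regenerates the text. Below is a minimal sketch of the decision rule, assuming hypothetical helpers invert_prompt and alignment_score that wrap the models shown in Quick Usage; the threshold is purely illustrative.

def detect_ai_text(text: str, threshold: float = 0.5) -> bool:
    """Return True if `text` is judged AI-generated."""
    # Stage 1: reconstruct the prompt that could have produced the text (Prompt Inverter)
    prompt = invert_prompt(text)            # hypothetical wrapper, not part of this repo
    # Stage 2: score how well the text aligns with its reconstructed prompt (Distinguisher)
    score = alignment_score(text, prompt)   # hypothetical wrapper returning P('yes')
    return score > threshold                # illustrative cutoff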

🚀 Quick Usage

🧠 Prompt Inverter

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/Prompt_Inverter"

# Load the base model and attach the Prompt Inverter LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)

# Example input: ask the inverter to reconstruct the prompt behind a piece of text
text = "What is the prompt that generates the input text ... ?"
inputs = tokenizer(text, return_tensors="pt").to(model.device)

gen = model.generate(
    **inputs,
    max_new_tokens=128,  # generation budget for the reconstructed prompt (illustrative)
    return_dict_in_generate=True
)

generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
print("Generated:", generated_text)

🧩 Distinguishers

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from torch.nn.functional import softmax

base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/Distinguisher_PTCV"  # or Distinguisher_RC

# Load the base model and attach the selected Distinguisher LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)

# For RC
# text = "Can LLM generate the input text {text_to_detect} through the prompt {prompt_generated_by_PI}?"

# For PTCV
text = "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: ... . Text2: ... ."

inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate a short yes/no answer and keep the scores of the generated tokens
gen = model.generate(
    **inputs,
    max_new_tokens=10,
    output_scores=True,
    return_dict_in_generate=True
)

generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
# gen.scores[0] holds the logits of the first generated token; P(' yes') is the detection score
probs = softmax(gen.scores[0], dim=-1)
yes_token_id = tokenizer(" yes", add_special_tokens=False).input_ids[0]

print("Generated:", generated_text)
print(f"P('yes') = {probs[0, yes_token_id].item():.4f}")

🧰 LLaMA-Factory Inference

You can also run inference directly with LLaMA-Factory:

llamafactory-cli chat examples/inference/distinguisher_ptcv.yaml

Example YAML:
model_name_or_path: microsoft/Phi-3-medium-128k-instruct
adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV
template: phi
infer_backend: vllm
max_new_tokens: 128
temperature: 0.7
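
For programmatic use, the same configuration can be served through LLaMA-Factory's OpenAI-compatible API (llamafactory-cli api) and queried with any OpenAI client. A sketch, assuming the server is running on the default localhost port 8000:

# Start the server first:
#   llamafactory-cli api examples/inference/distinguisher_ptcv.yaml
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
model_name = client.models.list().data[0].id  # ask the server which model it serves

resp = client.chat.completions.create(
    model=model_name,
    messages=[{
        "role": "user",
        "content": "Text2 is generated by LLM, determine whether text1 is also generated by LLM "
                   "with a similar prompt. Text1: ... . Text2: ... .",
    }],
)
print(resp.choices[0].message.content)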
