
Commit 2ea0a4f

Add files via upload

1 parent dd8e1cf commit 2ea0a4f

File tree

3 files changed: +475 -0 lines changed
Lines changed: 172 additions & 0 deletions (examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb)

@@ -0,0 +1,172 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "29d21289",
   "metadata": {
    "id": "29d21289"
   },
   "source": [
    "# DeepSeek R1 Qwen3 (8B) - GRPO Agent Demo"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhivyaBharathy-web/PraisonAI/blob/main/examples/cookbooks/DeepSeek_Qwen3_GRPO.ipynb)\n"
   ],
   "metadata": {
    "id": "yuOEagMH86WV"
   },
   "id": "yuOEagMH86WV"
  },
  {
   "cell_type": "markdown",
   "id": "0f798657",
   "metadata": {
    "id": "0f798657"
   },
   "source": [
    "This notebook demonstrates DeepSeek's R1 Qwen3-8B model, trained with GRPO (Group Relative Policy Optimization), on interactive conversational reasoning tasks.\n",
    "It simulates a lightweight agent-style reasoning capability in an accessible and interpretable way."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "80f3de9e",
   "metadata": {
    "id": "80f3de9e"
   },
   "source": [
    "## Dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8d1c7f6c",
   "metadata": {
    "id": "8d1c7f6c"
   },
   "outputs": [],
   "source": [
    "!pip install -q transformers accelerate"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "78603e7b",
   "metadata": {
    "id": "78603e7b"
   },
   "source": [
    "## Tools"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "88e97fbc",
   "metadata": {
    "id": "88e97fbc"
   },
   "source": [
    "- `transformers`: model loading and inference\n",
    "- `AutoModelForCausalLM`, `AutoTokenizer`: interfaces for DeepSeek's LLM"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "37d9bd54",
   "metadata": {
    "id": "37d9bd54"
   },
   "source": [
    "## YAML Prompt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "adf5cae5",
   "metadata": {
    "id": "adf5cae5"
   },
   "outputs": [],
   "source": [
    "# Keep the YAML spec as a string so this cell is valid Python.\n",
    "yaml_prompt = \"\"\"\n",
    "prompt:\n",
    "  task: \"Reasoning over multi-step instructions\"\n",
    "  context: \"User provides a math problem or logical question.\"\n",
    "  model: \"deepseek-ai/DeepSeek-R1-0528-Qwen3-8B\"\n",
    "\"\"\"\n"
   ]
  },
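  {
   "cell_type": "markdown",
   "id": "b7f2a9e1",
   "metadata": {},
   "source": [
    "The YAML above is a prompt spec, not executable code. A minimal sketch of loading it into a Python dict, assuming PyYAML is available (it is preinstalled on Colab); `yaml_prompt` comes from the previous cell:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c3d8e0f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "import yaml\n",
    "\n",
    "# Parse the spec string defined above into a dict; keys mirror the YAML cell.\n",
    "spec = yaml.safe_load(yaml_prompt)[\"prompt\"]\n",
    "print(\"Task:   \", spec[\"task\"])\n",
    "print(\"Context:\", spec[\"context\"])\n",
    "print(\"Model:  \", spec[\"model\"])"
   ]
  },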
  {
   "cell_type": "markdown",
   "id": "6985f60c",
   "metadata": {
    "id": "6985f60c"
   },
   "source": [
    "## Main"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d74bf686",
   "metadata": {
    "id": "d74bf686"
   },
   "outputs": [],
   "source": [
    "from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline\n",
    "\n",
    "# Load the R1 Qwen3-8B checkpoint named in the title and the YAML spec.\n",
    "model_id = \"deepseek-ai/DeepSeek-R1-0528-Qwen3-8B\"\n",
    "tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
    "model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=\"auto\", device_map=\"auto\")\n",
    "\n",
    "pipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\n",
    "\n",
    "prompt = \"If a train travels 60 miles in 1.5 hours, what is its average speed?\"\n",
    "output = pipe(prompt, max_new_tokens=60)[0]['generated_text']\n",
    "print(\"🧠 Reasoned Output:\", output)\n"
   ]
  },
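  {
   "cell_type": "markdown",
   "id": "e5a1b2c3",
   "metadata": {},
   "source": [
    "Reasoning models of this kind are tuned on chat-formatted input, so the raw prompt above under-uses them. A minimal sketch using the tokenizer's chat template (this assumes the checkpoint ships one, as DeepSeek chat models do):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f6b2c3d4",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Wrap the same question as a chat turn and render it with the model's template.\n",
    "messages = [{\"role\": \"user\", \"content\": prompt}]\n",
    "chat_input = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n",
    "\n",
    "chat_output = pipe(chat_input, max_new_tokens=60)[0]['generated_text']\n",
    "print(\"🧠 Chat-formatted Output:\", chat_output)"
   ]
  },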
  {
   "cell_type": "markdown",
   "id": "c856167f",
   "metadata": {
    "id": "c856167f"
   },
   "source": [
    "## Output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41039ee8",
   "metadata": {
    "id": "41039ee8"
   },
   "source": [
    "### 🖼️ Output Summary\n",
    "\n",
    "Prompt: *\"If a train travels 60 miles in 1.5 hours, what is its average speed?\"*\n",
    "\n",
    "🧠 Output: The model walks through its reasoning, for example:\n",
    "\n",
    "> \"To find the average speed, divide the total distance by total time: 60 / 1.5 = 40 mph.\"\n",
    "\n",
    "💡 This shows the model stepping through logical reasoning of the kind its GRPO training encourages."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
Lines changed: 157 additions & 0 deletions (examples/cookbooks/Meta_LLaMA3_SyntheticData.ipynb)

@@ -0,0 +1,157 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "930bc11c",
   "metadata": {
    "id": "930bc11c"
   },
   "source": [
    "# Meta Synthetic Data Generator (LLaMA 3.2 - 3B)\n",
    "\n",
    "This notebook demonstrates how to use Meta's LLaMA 3.2 3B model to generate synthetic data for AI training or application prototyping."
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DhivyaBharathy-web/PraisonAI/blob/main/examples/cookbooks/Meta_LLaMA3_SyntheticData.ipynb)\n"
   ],
   "metadata": {
    "id": "KPi7FpbV9J2d"
   },
   "id": "KPi7FpbV9J2d"
  },
  {
   "cell_type": "markdown",
   "id": "80f68ecf",
   "metadata": {
    "id": "80f68ecf"
   },
   "source": [
    "## Dependencies"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7eeb508d",
   "metadata": {
    "id": "7eeb508d"
   },
   "outputs": [],
   "source": [
    "!pip install -q transformers accelerate bitsandbytes"
   ]
  },
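  {
   "cell_type": "markdown",
   "id": "a2b3c4d5",
   "metadata": {},
   "source": [
    "`bitsandbytes` is installed above but not otherwise used in this notebook. An optional sketch of what it is for: loading the same checkpoint in 4-bit to cut GPU memory (assumes a CUDA runtime; the load call is left commented so the model is not loaded twice):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b3c4d5e6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "from transformers import AutoModelForCausalLM, BitsAndBytesConfig\n",
    "\n",
    "# Optional: quantize weights to 4-bit at load time via bitsandbytes.\n",
    "bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)\n",
    "# model_4bit = AutoModelForCausalLM.from_pretrained(\n",
    "#     \"meta-llama/Llama-3.2-3B-Instruct\", quantization_config=bnb_config, device_map=\"auto\"\n",
    "# )"
   ]
  },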
  {
   "cell_type": "markdown",
   "id": "eda75a0c",
   "metadata": {
    "id": "eda75a0c"
   },
   "source": [
    "## Tools\n",
    "* `transformers` for model loading and text generation\n",
    "* `pipeline` for simplified inference\n",
    "* `AutoTokenizer`, `AutoModelForCausalLM` for LLaMA 3.2"
   ]
  },
60+
{
61+
"cell_type": "markdown",
62+
"id": "6ee96383",
63+
"metadata": {
64+
"id": "6ee96383"
65+
},
66+
"source": [
67+
"## YAML Prompt"
68+
]
69+
},
70+
{
71+
"cell_type": "code",
72+
"execution_count": null,
73+
"id": "aa4e56ef",
74+
"metadata": {
75+
"id": "aa4e56ef"
76+
},
77+
"outputs": [],
78+
"source": [
79+
"prompt: |\n",
80+
" task: \"Generate a customer complaint email\"\n",
81+
" style: \"Professional\"\n"
82+
]
83+
},
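  {
   "cell_type": "markdown",
   "id": "c4d5e6f7",
   "metadata": {},
   "source": [
    "The YAML is a spec rather than code. A minimal sketch of reading it with PyYAML (assumed available, as on Colab); note that `prompt: |` is a block scalar, so the value parses as one literal text block:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d5e6f7a8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import yaml\n",
    "\n",
    "# The block scalar under `prompt:` parses to a single string.\n",
    "spec_text = yaml.safe_load(yaml_prompt)[\"prompt\"]\n",
    "print(spec_text)"
   ]
  },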
84+
{
85+
"cell_type": "markdown",
86+
"id": "e9fbe400",
87+
"metadata": {
88+
"id": "e9fbe400"
89+
},
90+
"source": [
91+
"## Main"
92+
]
93+
},
94+
{
95+
"cell_type": "code",
96+
"execution_count": null,
97+
"id": "3e35336f",
98+
"metadata": {
99+
"id": "3e35336f"
100+
},
101+
"outputs": [],
102+
"source": [
103+
"from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline\n",
104+
"\n",
105+
"# Load tokenizer and model (LLaMA 3.2 - 3B)\n",
106+
"model_id = \"meta-llama/Meta-Llama-3-3B-Instruct\"\n",
107+
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
108+
"model = AutoModelForCausalLM.from_pretrained(model_id)\n",
109+
"\n",
110+
"# Create a simple text generation pipeline\n",
111+
"generator = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)\n",
112+
"\n",
113+
"# Example synthetic data prompt\n",
114+
"prompt = \"Create a customer support query about a late delivery.\"\n",
115+
"\n",
116+
"# Generate synthetic text\n",
117+
"output = generator(prompt, max_length=60, do_sample=True)[0]['generated_text']\n",
118+
"print(\"📝 Synthetic Output:\", output)"
119+
]
120+
},
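  {
   "cell_type": "markdown",
   "id": "f7a8b9c0",
   "metadata": {},
   "source": [
    "Synthetic data is usually generated in bulk. A minimal sketch that reuses the `generator` above, samples a few variants per prompt, and writes JSONL records (the file name `synthetic.jsonl` is just an example):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a8b9c0d1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Sample several completions per prompt and store them as JSONL records.\n",
    "prompts = [\n",
    "    \"Create a customer support query about a late delivery.\",\n",
    "    \"Create a customer support query about a damaged item.\",\n",
    "]\n",
    "with open(\"synthetic.jsonl\", \"w\") as f:\n",
    "    for p in prompts:\n",
    "        for sample in generator(p, max_new_tokens=60, do_sample=True, num_return_sequences=3):\n",
    "            f.write(json.dumps({\"prompt\": p, \"text\": sample[\"generated_text\"]}) + \"\\n\")"
   ]
  },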
  {
   "cell_type": "markdown",
   "id": "ed28cc2e",
   "metadata": {
    "id": "ed28cc2e"
   },
   "source": [
    "## Output"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6fc8f19e",
   "metadata": {
    "id": "6fc8f19e"
   },
   "source": [
    "🖼️ Output Preview (Text Summary):\n",
    "\n",
    "Prompt: \"Create a customer support query about a late delivery.\"\n",
    "\n",
    "📝 Output: The LLaMA model generates a realistic complaint, such as:\n",
    "\n",
    "\"Dear Support Team, I placed an order two weeks ago and have yet to receive it...\"\n",
    "\n",
    "🎯 This illustrates how the model can generate realistic synthetic data for tasks like training chatbots or support models.\n"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "provenance": []
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
