# 🎮 GameSense: An LLM That Transforms Gaming Conversations into Structured Data
GameSense is a specialized language model that converts unstructured gaming conversations into structured, actionable data. It listens to how gamers talk and extracts valuable information that can power recommendations, support systems, and analytics.

## 🎯 What GameSense Does

**Input**: Gamers' natural language about games from forums, chats, reviews, etc.

**Output**: Structured data with categorized information about games, platforms, preferences, etc.

Here's a concrete example from our training data:
### Input Example (Gaming Conversation)

```
"Dirt: Showdown from 2012 is a sport racing game for the PlayStation, Xbox, PC rated E 10+ (for Everyone 10 and Older). It's not available on Steam, Linux, or Mac."
```

### Output Example (Structured Information)

```
inform(
    name[Dirt: Showdown],
    release_year[2012],
    esrb[E 10+ (for Everyone 10 and Older)],
    genres[driving/racing, sport],
    platforms[PlayStation, Xbox, PC],
    available_on_steam[no],
    has_linux_release[no],
    has_mac_release[no]
)
```

This structured output can be used to:

- Answer specific questions about games ("Is Dirt: Showdown available on Mac?")
- Track trends in gaming discussions
- Power recommendation engines
- Extract user opinions and sentiment
- Build gaming knowledge graphs
- Enhance customer support
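The `inform(...)` output above is plain text, but it is straightforward to consume programmatically. As an illustrative sketch (not part of the GameSense codebase), a few lines of Python can parse a meaning representation into a dictionary for downstream applications:

```python
import re

def parse_meaning_representation(mr: str) -> dict:
    """Parse a structured output like 'inform(name[X], genres[a, b])'
    into a function name plus an attribute dictionary."""
    match = re.match(r"\s*(\w+)\s*\((.*)\)\s*$", mr, re.DOTALL)
    if not match:
        raise ValueError(f"Unrecognized meaning representation: {mr!r}")
    function, body = match.groups()
    # Each attribute looks like `key[value]`; values may contain commas.
    attributes = {k: v for k, v in re.findall(r"(\w+)\[([^\]]*)\]", body)}
    return {"function": function, "attributes": attributes}

mr = ("inform(name[Dirt: Showdown], release_year[2012], "
      "platforms[PlayStation, Xbox, PC], available_on_steam[no])")
parsed = parse_meaning_representation(mr)
print(parsed["attributes"]["platforms"])  # PlayStation, Xbox, PC
```

Once parsed, the attributes can be loaded into a database, a knowledge graph, or any analytics pipeline.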
## 🚀 How GameSense Transforms Gaming Conversations

GameSense listens to gaming chats, forum posts, customer support tickets, social media, and other sources where gamers communicate. As gamers discuss different titles, features, opinions, and issues, GameSense:

1. **Recognizes gaming jargon** across different genres and communities
2. **Extracts key information** about games, platforms, features, and opinions
3. **Structures this information** into a standardized format
4. **Makes it available** for downstream applications

## 💡 Real-World Applications
### Community Analysis

Monitor conversations across Discord, Reddit, and other platforms to track what games are being discussed, what features players care about, and emerging trends.
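As a toy sketch of this idea, once GameSense has extracted game names from monitored messages, tallying trends is a one-liner (the record shape and field names here are hypothetical, for illustration only):

```python
from collections import Counter

# Hypothetical extracted mentions: one record per message where
# GameSense identified the game being discussed.
extracted_mentions = [
    {"channel": "discord", "game": "Dirt: Showdown"},
    {"channel": "reddit", "game": "Dirt: Showdown"},
    {"channel": "discord", "game": "Example Shooter"},
]

# Count how often each title comes up across all channels.
trend = Counter(m["game"] for m in extracted_mentions)
print(trend.most_common(1))  # [('Dirt: Showdown', 2)]
```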
### Intelligent Customer Support

When a player says: "I can't get Dirt: Showdown to run on my Mac," GameSense identifies:

- The specific game (Dirt: Showdown)
- The platform issue (Mac)
- The fact that the game doesn't support Mac (from structured knowledge)

It can then immediately inform the player about the platform incompatibility.
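A minimal sketch of that support flow, assuming a small knowledge base built from GameSense's structured extractions (the record layout and function here are illustrative, not project code):

```python
# Hypothetical knowledge base keyed by lowercased game name, with
# attributes in the same style GameSense extracts.
KNOWLEDGE_BASE = {
    "dirt: showdown": {
        "platforms": ["PlayStation", "Xbox", "PC"],
        "has_mac_release": "no",
        "has_linux_release": "no",
    },
}

def check_platform_support(game: str, platform: str) -> str:
    """Answer a platform-compatibility question from structured data."""
    record = KNOWLEDGE_BASE.get(game.lower())
    if record is None:
        return f"Sorry, I don't have data on {game}."
    flag = record.get(f"has_{platform.lower()}_release")
    if flag == "no":
        return f"{game} has no {platform} release, which explains the issue."
    return f"{game} should run on {platform}; let's troubleshoot further."

print(check_platform_support("Dirt: Showdown", "Mac"))
```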
### Smart Recommendations

When a player has been discussing racing games for PlayStation with family-friendly ratings, GameSense can help power recommendations for similar titles they might enjoy.
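A recommendation layer on top of the structured records could be as simple as filtering a catalog by the preferences GameSense infers. This is an illustrative sketch with made-up catalog entries, not the project's recommendation logic:

```python
# Toy catalog of structured game records, in the attribute style
# GameSense extracts (names and entries here are examples only).
CATALOG = [
    {"name": "Dirt: Showdown", "genres": ["driving/racing", "sport"],
     "platforms": ["PlayStation", "Xbox", "PC"], "esrb": "E 10+"},
    {"name": "Example Shooter", "genres": ["shooter"],
     "platforms": ["PC"], "esrb": "M"},
]

def recommend(catalog, genre, platform, family_friendly=True):
    """Return titles matching a genre, platform, and rating preference."""
    picks = []
    for game in catalog:
        if genre not in game["genres"] or platform not in game["platforms"]:
            continue
        # ESRB "E" ratings cover the family-friendly tiers.
        if family_friendly and not game["esrb"].startswith("E"):
            continue
        picks.append(game["name"])
    return picks

print(recommend(CATALOG, "driving/racing", "PlayStation"))  # ['Dirt: Showdown']
```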
### Automated Content Moderation

By understanding the context of gaming conversations, GameSense can better identify toxic behavior while recognizing harmless gaming slang.

## 🧠 Technical Approach

GameSense uses Parameter-Efficient Fine-Tuning (PEFT) to customize powerful foundation models for understanding gaming language:

1. We start with a base model like Microsoft's Phi-2 or Llama 3.1
2. Fine-tune on the gem/viggo dataset containing structured gaming conversations
Sample rows from the dataset:

| ID | Meaning Representation | Target Sentence | References |
|---|---|---|---|
| viggo-train-0 | inform(name[Dirt: Showdown], release_year[2012], esrb[E 10+ (for Everyone 10 and Older)], genres[driving/racing, sport], platforms[PlayStation, Xbox, PC], available_on_steam[no], has_linux_release[no], has_mac_release[no]) | Dirt: Showdown from 2012 is a sport racing game for the PlayStation, Xbox, PC rated E 10+ (for Everyone 10 and Older). It's not available on Steam, Linux, or Mac. | [Dirt: Showdown from 2012 is a sport racing game for the PlayStation, Xbox, PC rated E 10+ (for Everyone 10 and Older). It's not available on Steam, Linux, or Mac.] |
| viggo-train-1 | inform(name[Dirt: Showdown], release_year[2012], esrb[E 10+...]) | Dirt: Showdown is a sport racing game... | [Dirt: Showdown is a sport racing game...] |

You can also train on your own gaming conversations by formatting them in a similar structure and updating the configuration.
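One simple way to format your own conversations is to write them as JSON Lines with viggo-style fields, which Hugging Face `datasets` can load directly. A minimal sketch, assuming the `target`/`meaning_representation` field names mirror gem/viggo (adjust them to whatever your configuration expects):

```python
import json
import pathlib
import tempfile

# Example records in a viggo-like layout. The field names are an
# assumption based on the gem/viggo dataset, not this project's code.
records = [
    {
        "target": "Dirt: Showdown from 2012 is a sport racing game ...",
        "meaning_representation": "inform(name[Dirt: Showdown], release_year[2012])",
    },
]

# Write one JSON object per line (JSONL).
out_path = pathlib.Path(tempfile.mkdtemp()) / "train.jsonl"
with out_path.open("w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

# Hugging Face `datasets` could then load the file with, for example:
#   load_dataset("json", data_files={"train": str(out_path)})
```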
### Training Acceleration

For faster training on high-end hardware:
GameSense includes built-in evaluation using industry-standard metrics:

- **ROUGE Scores**: Measure how well the model can generate natural language from structured data
- **Gaming-Specific Benchmarks**: Evaluate understanding of gaming terminology
- **Automatic Model Promotion**: Only deploy models that meet quality thresholds
GameSense follows a modular architecture for easy customization.
To fine-tune GameSense on your specific gaming platform's data:

1. **Format your dataset**: Prepare your gaming conversations in a structured format similar to gem/viggo
2. **Update the configuration**: Point to your dataset in the config file
3. **Run the pipeline**: GameSense will automatically process and learn from your data
The [`prepare_data` step](steps/prepare_datasets.py) handles dataset loading and splitting.

For custom data sources, you'll need to prepare the splits in a Hugging Face dataset format. The step returns paths to the stored datasets (`train`, `val`, and `test_raw` splits), with the test set tokenized later during evaluation.
You can structure conversations from:

- Game forums
- Support tickets
- Discord chats
- Streaming chats
- Reviews
- Social media posts
## 📚 Documentation

For learning more about how to use ZenML to build your own MLOps pipelines, refer to our comprehensive [ZenML documentation](https://docs.zenml.io/).
## Running on CPU-only Environment

If you don't have access to a GPU, you can still run this project with the CPU-only configuration. We've made several optimizations to make this project work on CPU, including:

- Smaller batch sizes for reduced memory footprint
- Fewer training steps
- Disabled GPU-specific features (quantization, bf16, etc.)
- Using smaller test datasets for evaluation
- Special handling for Phi-3.5 model caching issues on CPU

To run the project on CPU:

```bash
python run.py --config phi3.5_finetune_cpu.yaml
```
Note that training on CPU will be significantly slower than training on a GPU. The CPU configuration uses:

1. A smaller model (`phi-3.5-mini-instruct`) which is more CPU-friendly
2. Reduced batch size and increased gradient accumulation steps
3. Fewer total training steps (50 instead of 300)
4. Half-precision (float16) where possible to reduce memory usage
5. Smaller dataset subsets (100 training samples, 20 validation samples, 10 test samples)
6. Special compatibility settings for Phi models running on CPU

For best results, we recommend:

- Using a machine with at least 16GB of RAM
- Being patient! LLM training on CPU is much slower than on GPU
- If you still encounter memory issues, try reducing the `max_train_samples` parameter even further in the config file
### Known Issues and Workarounds

Some large language models like Phi-3.5 have caching mechanisms that are optimized for GPU usage and may encounter issues when running on CPU. Our CPU configuration includes several workarounds:

1. Disabling KV caching for model generation
2. Using the `torch.float16` data type to reduce memory usage
3. Disabling flash attention, which isn't needed on CPU
4. Using the standard AdamW optimizer instead of 8-bit optimizers that require GPU

These changes allow the model to run on CPU with less memory and avoid compatibility issues, although at the cost of some performance.
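As a hypothetical sketch of how these workarounds could map onto Hugging Face-style settings: the parameter names below follow common `transformers` conventions (model-loading kwargs and `TrainingArguments` fields) and are assumptions for illustration, not values copied from this project's code.

```python
# Assumed CPU-friendly overrides, named after common transformers
# parameters (not taken from this repository):
cpu_overrides = {
    # 1. Disable KV caching during generation
    "use_cache": False,
    # 2. Half precision to reduce memory usage
    "torch_dtype": "float16",
    # 3. Plain (eager) attention instead of flash attention
    "attn_implementation": "eager",
    # 4. Standard AdamW instead of a GPU-only 8-bit optimizer
    "optim": "adamw_torch",
    # Also disable bf16, which most CPUs don't support
    "bf16": False,
}
```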
The CPU configuration file used above (`phi3.5_finetune_cpu.yaml`) looks like this:

```yaml
# Copyright (c) ZenML GmbH 2024. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#       http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

model:
  name: llm-peft-phi-3.5-mini-instruct-cpu
  description: "Fine-tune Phi-3.5-mini-instruct on CPU."
  tags:
    - llm
    - peft
    - phi-3.5
    - cpu
  version: 100_steps

settings:
  docker:
    parent_image: pytorch/pytorch:2.2.2-runtime
    requirements: requirements.txt
    python_package_installer: uv
    python_package_installer_args:
      system: null
    apt_packages:
      - git
    environment:
      MKL_SERVICE_FORCE_INTEL: "1"
      # Explicitly disable MPS
      PYTORCH_ENABLE_MPS_FALLBACK: "0"
      PYTORCH_MPS_HIGH_WATERMARK_RATIO: "0.0"

parameters:
  # Uses a smaller model for CPU training
  base_model_id: microsoft/Phi-3.5-mini-instruct
  use_fast: False
  load_in_4bit: False
  load_in_8bit: False
  cpu_only: True  # Enable CPU-only mode
  # Extra conservative dataset size for CPU
  max_train_samples: 50
  max_val_samples: 10
  max_test_samples: 5
  system_prompt: |
    Given a target sentence construct the underlying meaning representation of the input sentence as a single function with attributes and attribute values.
    This function should describe the target string accurately and the function must be one of the following ['inform', 'request', 'give_opinion', 'confirm', 'verify_attribute', 'suggest', 'request_explanation', 'recommend', 'request_attribute'].
    The attributes must be one of the following: ['name', 'exp_release_date', 'release_year', 'developer', 'esrb', 'rating', 'genres', 'player_perspective', 'has_multiplayer', 'platforms', 'available_on_steam', 'has_linux_release', 'has_mac_release', 'specifier']

steps:
  prepare_data:
    parameters:
      dataset_name: gem/viggo
      # These settings are now defined at the pipeline level
      # max_train_samples: 100
      # max_val_samples: 20
      # max_test_samples: 10

  finetune:
    parameters:
      max_steps: 25  # Further reduced steps for CPU training
      eval_steps: 5  # More frequent evaluation
      bf16: False  # Disable bf16 for CPU compatibility
      per_device_train_batch_size: 1  # Smallest batch size for CPU
      gradient_accumulation_steps: 2  # Reduced for CPU
      optimizer: "adamw_torch"  # Use standard AdamW rather than 8-bit for CPU
```
0 commit comments