When I started working on this project, the first thing I did was properly explore and understand the activities in Sugarizer. I went through them myself so I could understand how children actually interact with the platform and what would be the best way to build something meaningful on top of it. The main thing I kept in mind throughout was: how will this actually make the experience better for the users?
Since Sugar is mainly used by children aged 5–12 years, I made sure that the model communicates in a tone that is simple, friendly, and suitable for that age group. When I think about my own childhood, I always enjoyed activities that made me think while still being fun. Sugar already does that really well, and adding a reflection agent ensures that children not only play but also actually learn from what they do.
I focused on providing something meaningful to the community, through which children can learn and get the most out of their experience. Note: this is just a sample prototype, meant to show what I am offering as clearly as possible. If you find any bugs, please feel free to create an issue and I will resolve it at the earliest.
The core of this system is the master LLM, which is fine-tuned on high-quality educational resources, including courses from Harvard University, aligned with concepts from Educational Psychology. This ensures that the model behaves like a professional educational psychologist and evaluates children's responses in a meaningful and age-appropriate way.
Additionally:
- Audio and video pipelines are supported by specialized metrics
- These metrics help the model better understand and analyze user responses
Note: The proposal states that the prototype can converse about the Gears and Paint activities, but it actually covers the Gears and 3D Volume activities. If you cannot access the fine-tuning code, please try this link once or download the code and view it locally: https://colab.research.google.com/drive/1FEqDT1HXXjYEHDQ5peddYvmloZ8kB6yg
The list of the books used for fine-tuning is available here: https://docs.google.com/document/d/1crlg5PF2uOPlvGaj117jvGo5SZEBNmHxtSZNr5Kk9IU/edit?usp=drive_link
Video example:
Video.Project.11.1.mp4
1. **Activity Selection**
   - The user selects an activity inside Sugarizer
2. **Data Retrieval**
   - The system retrieves the corresponding `.json` file from the Journal
   - This file contains all user interaction data
3. **Data Processing**
   - The JSON is parsed to extract:
     - User actions
     - Progress
     - Key interaction data
4. **Context Building**
   - Extracted data is sent to the master LLM to build context (a minimal parsing sketch follows this step)
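To make steps 3–4 concrete, here is a minimal sketch of how a Journal export might be parsed and summarised into context for the master LLM. The field names (`actions`, `progress`, `metadata`) are assumptions for illustration only; the real keys depend on each activity's `.json` layout.

```python
import json

def build_activity_context(journal_path: str) -> str:
    """Summarise a Sugarizer Journal export into a compact prompt fragment."""
    with open(journal_path, encoding="utf-8") as f:
        record = json.load(f)

    # Field names below are illustrative; the real keys vary per activity.
    actions = record.get("actions", [])        # e.g. clicks, moves, placements
    progress = record.get("progress", "unknown")
    metadata = record.get("metadata", {})

    summary = [
        f"Activity: {metadata.get('activity', 'unknown')}",
        f"Progress: {progress}",
        f"Number of recorded actions: {len(actions)}",
    ]
    return "\n".join(summary)

# Example usage:
# context = build_activity_context("activity_json/Gears Activity (1).json")
```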
5. **User Interaction (SugarMind UI)**
   - The user clicks the SugarMind icon
   - Chooses a reflection type:
     - Emo Agent
     - Logic Agent
     - General Agent
6. **Input Methods**
   - Text input: evaluated directly by the LLM
   - Video input: uses child-focused engagement and expression metrics
   - Audio input: converted via speech-to-text, then evaluated using tone, confidence, and clarity
7. **Multimodal Processing**
   - Inputs are processed based on format
   - Prepared for unified analysis (see the dispatch sketch after this list)
8. **Response Generation**
   - Results are sent back to the master LLM
   - The model generates:
     - Personalized feedback
     - The next reflective question

This unified approach ensures:
- Consistency
- Better context understanding
- More meaningful interaction
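As a rough illustration of steps 6–7, the sketch below normalises the three input types into one structure before the master LLM sees them. `transcribe_audio` and `score_video_engagement` are hypothetical placeholders standing in for the real pipelines (e.g. `src/video_analysis.py`), not the project's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class UnifiedInput:
    text: str                                      # the child's response as plain text
    metrics: dict = field(default_factory=dict)    # modality-specific signals

def transcribe_audio(path: str) -> tuple[str, dict]:
    # Placeholder: the real pipeline converts speech to text and scores
    # tone, confidence, and clarity. Dummy values for illustration only.
    return "transcribed speech", {"tone": "calm", "confidence": 0.8, "clarity": 0.9}

def score_video_engagement(path: str) -> tuple[str, dict]:
    # Placeholder for the child-focused engagement/expression metrics
    # computed in src/video_analysis.py. Dummy values for illustration only.
    return "", {"engagement": 0.7, "expression": "smiling"}

def normalise_input(kind: str, payload: str) -> UnifiedInput:
    """Convert any input modality into one format for unified analysis."""
    if kind == "text":
        return UnifiedInput(text=payload)
    if kind == "audio":
        text, metrics = transcribe_audio(payload)
        return UnifiedInput(text=text, metrics=metrics)
    if kind == "video":
        text, metrics = score_video_engagement(payload)
        return UnifiedInput(text=text, metrics=metrics)
    raise ValueError(f"Unsupported input type: {kind}")
```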
The system follows a three-agent architecture, inspired by prior work and extended for better learning outcomes.
**Emo Agent**

Reflection is not just logical: emotional growth is equally important, especially for children.

Purpose:
- Encourage emotional expression
- Connect feelings with activities
- Support emotional development

Behavior:
- Asks questions about feelings and experiences
- Helps children reflect beyond outcomes
**Logic Agent**

The Logic Agent acts as a friendly guide based on educational psychology principles.
Focus Areas:
- Critical thinking
- Problem-solving
- Structured reasoning
Behavior:
- Encourages reflection on problem-solving approaches
- Helps children think about improvements
**Gen Agent**

The Gen Agent ensures overall understanding and learning reinforcement.
Focus Areas:
- What the child learned
- Concept reinforcement
- Simple reflective questioning
Behavior:
- Asks broad, easy-to-understand questions
- Confirms conceptual clarity
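A minimal sketch of how the three agents can share one master LLM, each defined by its own system prompt. The prompt wording here is illustrative; the real prompts live in `Agents/Emo_agent.py`, `Agents/logic_agent.py`, and `Agents/gen_agent.py`.

```python
# Illustrative system prompts — the actual ones are defined in Agents/.
AGENT_PROMPTS = {
    "emo": (
        "You are a warm, friendly guide for children aged 5-12. Ask about "
        "how the activity made the child feel and connect those feelings "
        "with what they did."
    ),
    "logic": (
        "You are a friendly guide grounded in educational psychology. Ask "
        "the child how they approached the problem and what they would improve."
    ),
    "gen": (
        "You are a friendly tutor. Ask broad, easy questions that check "
        "what the child learned and reinforce the concept."
    ),
}

def build_agent_messages(agent: str, context: str, child_reply: str) -> list[dict]:
    """Assemble the chat messages for the chosen reflection agent."""
    if agent not in AGENT_PROMPTS:
        raise ValueError(f"Unknown agent: {agent}")
    return [
        {"role": "system", "content": AGENT_PROMPTS[agent]},
        {"role": "system", "content": f"Activity context:\n{context}"},
        {"role": "user", "content": child_reply},
    ]
```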
```
.
├── activity_json/
│   ├── 3D Volume Activity.json      # Activity interaction data from Sugarizer
│   └── Gears Activity (1).json      # User activity logs
│
├── Prompts/
│   ├── Activity_description/
│   ├── __pycache__/
│   ├── Gears.py                     # Prompt logic for Gears activity
│   └── three_d_Volume.py            # Prompt logic for 3D Volume activity
│
├── Agents/
│   ├── Emo_agent.py                 # Emotional reflection agent
│   ├── gen_agent.py                 # General reflection agent
│   └── logic_agent.py               # Logical/critical thinking agent
│
├── src/
│   ├── __pycache__/
│   ├── Create_vector_store.py       # Vector DB creation for context retrieval
│   ├── inference.py                 # Core inference pipeline
│   └── video_analysis.py            # Video input processing & metrics
│
├── Activity_description.lnk         # Shortcut to activity descriptions
├── get_activity_description.py      # Extracts activity-related metadata
├── create_dataset.py                # Dataset preparation script
├── create_prompts_files.py          # Generates prompt templates
├── index.html                       # Frontend entry (if used)
├── new.txt                          # Misc file (can be cleaned)
├── sample_website.py                # Sample UI/demo script
├── server.py                        # Backend server logic
├── requirements.txt                 # Python dependencies
└── README.md                        # Project documentation
```
- `activity_json/` → Stores user interaction data from Sugarizer activities
- `Prompts/` → Contains activity-specific prompt engineering logic
- `Agents/` → Core reflection agents (Emo, Logic, Gen)
- `src/` → Backend processing (inference, embeddings, video analysis)
- Scripts → Dataset creation, prompt generation, and activity parsing
- `server.py` / UI files → Handle application interface and interaction
- `create_dataset.py` (https://github.com/Shekar-77/SugarMind/blob/main/create_dataset.py) was used to extract data from the books
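For orientation, extracting fine-tuning text from books typically looks something like the sketch below. The `pypdf`-based approach is an assumption for illustration, not necessarily what `create_dataset.py` does; see the linked script for the actual logic.

```python
from pypdf import PdfReader  # assumed dependency, for illustration only

def extract_chunks(pdf_path: str, chunk_chars: int = 2000) -> list[str]:
    """Pull raw text out of a book PDF and split it into training-sized chunks."""
    reader = PdfReader(pdf_path)
    full_text = "\n".join(page.extract_text() or "" for page in reader.pages)
    return [full_text[i:i + chunk_chars]
            for i in range(0, len(full_text), chunk_chars)]
```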
Follow these steps to set up and run SugarMind locally:

```
git clone https://github.com/Shekar-77/SugarMind/
cd SugarMind
conda create -n sugarmind python=3.12 -y
conda activate sugarmind
pip install -r requirements.txt
pip install tf-keras
pip install transformers==4.49.0 tokenizers==0.21.0 webcolors==1.11.1
pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu
```

Download the `.tar` model weights from the provided link:

🔗 https://drive.google.com/drive/folders/1zJcU4NIrSMsNS82XiXIjlHKubATrFJAp?usp=drive_link

After downloading, extract the `.tar` file and place the folders in the project root directory as shown below:
```
.
├── final_model/          # Extracted main model weights
├── final_model_gguf/     # (Optional) GGUF / quantized weights
├── activity_json/
├── Prompts/
├── Agents/
├── src/
├── server.py
├── requirements.txt
└── README.md
```
⚠️ Make sure the model folders (`final_model`, `final_model_gguf`) are at the root level of the project (same level as `server.py`).
Once the weights are placed correctly, start the application:
```
python SugarMind.py
```

- A Gradio link will appear in the terminal
- Open it in your browser (usually http://127.0.0.1:7860)
You can also run terminal inference using:
```
python -m src.inference
```
To converse about the Gears activity, uncomment the relevant code in `src/inference.py` and change the activity name to `'gears'`. Set the required folder paths accordingly: for audio and video input, you also have to provide the paths to the folders containing the recordings.
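The exact variable names depend on the code in `src/inference.py`; purely as an illustration of the kind of configuration involved:

```python
# Hypothetical settings — the real names live in src/inference.py.
ACTIVITY_NAME = "gears"            # e.g. switch from "3d_volume" to "gears"
AUDIO_DIR = "recordings/audio"     # folder containing recorded audio input
VIDEO_DIR = "recordings/video"     # folder containing recorded video input
```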
- Ensure the `.tar` file is fully extracted before running
- Folder names must match exactly (`final_model`, etc.)
- If the model is not found, check the paths inside `inference.py` or `server.py`