Welcome to Fredda, a Redditor chain-of-thought bot that generates both answers and its internal reasoning. Fredda is inspired by the classic Fred chatbot and uses a transformer-based model for clear, transparent dialogue processing.
- Clone this repository.
- Create a new Conda environment (Python 3.9 is recommended):
  `conda create -n fredda python=3.9`
  `conda activate fredda`
- Install the required packages:
  `pip install -r requirements.txt`
- Make sure the following files are in place:
- Dataset Artifacts: Download out_tokens.jsonl from this Google Drive folder and place it in the dataset folder. This is required only if you want to train the model yourself, with the same or different parameters.
- Vocab Folder: The vocab folder (containing tokenizer.model and tokenizer.vocab) is included with the repository clone.
- Model Checkpoint: Download rpt1.pth (the checkpoint is over 100 MB) from this Google Drive folder and place it in the checkpoints folder (you will need to create this folder). This is required if you want to interact with the bot.
- To run the command-line inference, type `python inference.py`.
- To run the Discord bot, type `python discord_inference.py`. (Remember to update the Discord bot token in the configuration.)
If you would like to generate your own reasoning dataset with reasoning_data.py, install Ollama and download Llama 3.2 1B in the terminal with `ollama run llama3.2:1b`.
Note: Only the necessary dataset files are published:
- out_tokens.jsonl (download from Google Drive)
- The vocab folder (contains tokenizer.model and tokenizer.vocab)
- The model checkpoint rpt1.pth (download from Google Drive)

All source code is available for full reproducibility.
Table of Contents:

- Introduction
- Project Overview
- Components
- Model Architecture & Training
- Inference Pipelines
- Reasoning Data Generation
- Tokenization & Dataset Preparation
- Datasets
- Training Environment & Logs
- Published Artifacts
- Pipeline Diagram
- Terminal Usage Example
- Collaborators and Acknowledgments
- Summary and Future Work
- License
Fredda is a Redditor chain-of-thought bot that produces both an answer and a hidden reasoning process (chain-of-thought). When you interact with Fredda, it returns a spoiler-wrapped explanation of its internal reasoning followed by the final answer.
Fredda covers the complete workflow from data preprocessing to real-time inference. Its main parts include:
- Model Training: A transformer decoder with 12 layers, 768-dimensional embeddings, and 12 attention heads.
- Inference: Available through a command-line interface and a Discord bot.
- Data Generation: Scripts generate chain-of-thought explanations for Reddit Q&A pairs.
- Tokenization: Uses SentencePiece to ensure consistent tokenization between training and inference.
- Transformer decoder with 12 layers and 768-d embeddings.
- Trained on 341,369 examples and tested on 37,929 examples.
- Sequence length set to 128 (to keep training time reasonable; training took approximately 8 hours using CUDA on an RTX 4070).
- Final model checkpoint saved as "rpt1.pth".
- Final losses: train loss 4.1279, validation loss 4.1200.
- Detailed logs are stored in the "logs" folder.
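For a rough sense of the model's size, here is a back-of-the-envelope parameter count from the figures above. The 4× feed-forward expansion, learned positional embeddings, and tied input/output embeddings are standard GPT-style assumptions, not details confirmed by the repository; biases and LayerNorm weights are ignored.

```python
# Rough parameter-count estimate for a 12-layer, 768-d, 12-head decoder
# with a 16,000-token vocabulary and sequence length 128.
# Assumptions (not confirmed by the repo): 4x FFN expansion, learned
# positional embeddings, tied input/output embeddings, biases ignored.

VOCAB, D_MODEL, N_LAYERS, SEQ_LEN = 16_000, 768, 12, 128
D_FF = 4 * D_MODEL  # assumed feed-forward width

embedding = VOCAB * D_MODEL        # token embedding (tied with the output head)
positional = SEQ_LEN * D_MODEL     # learned positional embedding (assumed)
attention = 4 * D_MODEL * D_MODEL  # Q, K, V, and output projections
feed_forward = 2 * D_MODEL * D_FF  # up- and down-projection
per_layer = attention + feed_forward

total = embedding + positional + N_LAYERS * per_layer
print(f"~{total / 1e6:.1f}M parameters")  # roughly 97M under these assumptions
```

Under these assumptions the model lands just under 100M parameters, which is consistent with the checkpoint being the largest published artifact.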
- CLI Inference: Run `python inference.py` to interact via the terminal.
- Discord Bot: Run `python discord_inference.py`. The bot listens for mentions in specified channels and replies with a spoiler-wrapped chain-of-thought and final answer.
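Discord hides text wrapped in double pipes (`||…||`) behind a spoiler bar. A minimal sketch of how the bot's reply might be assembled (the helper name is hypothetical, not from the repo):

```python
def format_reply(reasoning: str, answer: str) -> str:
    """Wrap the chain-of-thought in Discord spoiler bars (||...||),
    then put the final answer on the next line.
    Hypothetical helper, not the repo's actual function."""
    return f"||{reasoning}||\n{answer}"

# Example:
# format_reply("They probably mean the movie.", "Definitely the 1999 one.")
# -> "||They probably mean the movie.||\nDefinitely the 1999 one."
```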
- Script "reasoning_data.py" processes Reddit Q&A pairs.
- Uses Ollama to generate one-line chain-of-thought explanations.
- Output is saved as "pairs.json".
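The internals of reasoning_data.py are not shown here, but a request to a locally running Ollama instance (via its default `POST /api/generate` endpoint) might look like this sketch; the prompt wording and function names are illustrative, not the repo's actual code:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(question: str, answer: str) -> dict:
    """Build a request for a one-line chain-of-thought explanation.
    The prompt wording is illustrative, not the repo's actual prompt."""
    prompt = (
        "In one line, explain the reasoning that leads from this Reddit "
        f"question to this answer.\nQ: {question}\nA: {answer}"
    )
    return {"model": "llama3.2:1b", "prompt": prompt, "stream": False}

def generate_reasoning(question: str, answer: str) -> str:
    """POST to the local Ollama server and return the generated explanation."""
    data = json.dumps(build_payload(question, answer)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"].strip()
```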
- Script "tokenize_data.py" trains a SentencePiece tokenizer with a 16,000-token vocabulary.
- Generates two key artifacts:
- out_tokens.jsonl – a tokenized version of the dataset (download from Google Drive if replicating training).
- The vocab folder – containing "tokenizer.model" and "tokenizer.vocab".
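out_tokens.jsonl is a JSON Lines file: one JSON record per example, one per line. A sketch of how such a file might be written and read back; the `"ids"` field name and helper names are assumptions, not the repo's actual schema:

```python
import json

def write_token_file(tokenized_examples, path="dataset/out_tokens.jsonl"):
    """Write one JSON record per tokenized example, one per line (JSON Lines).
    The {"ids": [...]} layout is illustrative; the real schema may differ."""
    with open(path, "w", encoding="utf-8") as f:
        for ids in tokenized_examples:
            f.write(json.dumps({"ids": ids}) + "\n")

def read_token_file(path="dataset/out_tokens.jsonl"):
    """Read the records back, one per line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line)["ids"] for line in f]
```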
Fredda was trained on a reformatted AskReddit Q&A dataset from Kaggle: https://www.kaggle.com/datasets/rodmcn/askreddit-questions-and-answers
- Training Examples: 341,369 examples.
- Testing Examples: 37,929 examples.
- The data was preprocessed and tokenized to ensure consistent input to the model.
- Environment: Ubuntu running on Windows Subsystem for Linux (WSL).
- Hardware: NVIDIA GeForce RTX 4070 (CUDA 12.8).
- Mixed Precision: Enabled (FP16 with TF32) to speed up training.
- Logs: Detailed logs are in the "logs" folder:
- training.log
- final_stats.txt
- losses_per_epoch.csv
- losses.png
- ollama_processor.log (for reasoning data generation)
Released files include:
- Dataset Artifacts:
- out_tokens.jsonl (download from Google Drive and place in the dataset folder)
- vocab folder (contains tokenizer.model and tokenizer.vocab; included with repo clone)
- Model Checkpoint:
- rpt1.pth (download from Google Drive and place in the checkpoints folder)
- All source code is available in the repository.
A diagram illustrating the full pipeline

Below is an example of how the terminal interaction might look:

(This is a sample output; your terminal may show additional details.)
- Lead Developer: Robert Senatorov (sole collaborator).
- Inspiration: Inspired by the Fred chatbot.
- Data Source: AskReddit Q&A dataset from Kaggle.
- Built a transformer-based chatbot that outputs both its answer and internal reasoning.
- Integrated both command-line and Discord-based inference pipelines.
- Made all source code and key dataset artifacts publicly available.
- Experiment with larger models and fine-tuning parameters.
- Expand integration to additional platforms.
- Enhance data diversity and augmentation strategies.
- Gather and incorporate user feedback to improve performance.
This project is licensed under the MIT License. See the LICENSE file for details.
