Helping students process and interact with their data.

vitalune/pitt-llamaproject

🤖 LlamaIndex RAG Chatbot Template

[Image: final product demo]

A Retrieval-Augmented Generation (RAG) chatbot template that answers questions based on your company's documents using LlamaIndex and OpenAI.

📘 For Students: This is a template for your project. The main folder contains your workspace, and the examples/ folder shows a complete working version for reference.


💼 Why Choose This High-Tech Option?

This project gives you a competitive edge. By building an AI-powered chatbot with industry-standard tools, you'll gain hands-on experience with technologies that Fortune 500 companies and startups use daily: working with OpenAI's API, deploying to professional platforms like Hugging Face, and managing code with Git and GitHub. These aren't just buzzwords; they're resume-ready skills that distinguish you in any career path, whether you're pursuing roles in business, healthcare, law, marketing, or technology. You'll create a live, public portfolio piece that demonstrates technical problem-solving, modern AI fluency, and the ability to build real-world applications, all capabilities that employers across industries increasingly value. While the low-tech option is perfectly valid, this path turns your class project into a genuine professional asset.



πŸ“ Repository Structure

pitt-llama-project/
β”œβ”€β”€ README.md                    # ← You are here!
β”œβ”€β”€ .env.example                 # Template for your API key
β”œβ”€β”€ .gitignore                   # Protects sensitive files
β”œβ”€β”€ requirements.txt             # Python dependencies
β”œβ”€β”€ app.py                       # YOUR chatbot (work here!)
β”œβ”€β”€ data/                        # YOUR documents go here (currently empty)
β”‚   └── README.md
β”‚
└── examples/                    # πŸ‘€ Reference only
    β”œβ”€β”€ llama_test.ipynb         # Learning notebook for Colab
    β”œβ”€β”€ index.html         # Full website (HTML, CSS, JS) saved locally w/ embedded chatbot script
    β”œβ”€β”€ visuals/         # example images of UI 
    β”‚   β”œβ”€β”€ embeddedui-demo.png  # example website w/ embedded chatbot
    β”‚   └── ui-demo.jpeg         # example ui during local streamlit testing
    β”œβ”€β”€ data/                    # Example documents
    β”‚   β”œβ”€β”€ taylor_swift_biography.html
    β”‚   └── constitution.pdf
    └── storage/                 # Pre-built index for example

🎯 Where to Work

  • app.py - Your main chatbot code (already complete, no edits needed!)
  • data/ - Put YOUR company documents here
  • examples/ - Look here if you get stuck (don't edit this!)

📂 What Gets Created

When you run the app, it will automatically create:

  • storage/ - Cached index of your documents (speeds up loading)

🎯 What This Does

This chatbot uses Retrieval-Augmented Generation (RAG) to answer questions about your documents:

  1. 📖 Reads your documents from the data/ folder
  2. 🔍 Creates a searchable index using AI embeddings
  3. 💬 Answers questions by finding relevant information and generating responses
  4. 🧠 Remembers conversation context within each chat session

Example Use Case: A customer support chatbot that answers questions about your company's products, policies, or services.
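
The retrieval part of the steps above can be sketched without any external service. This is a toy illustration only: it swaps the real AI embeddings for plain word counts and stops before the generation step. The names embed, cosine, and retrieve are hypothetical and are not part of LlamaIndex or app.py.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (real apps use a neural embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]


# Made-up example documents, one chunk each
chunks = [
    "Our return policy allows refunds within 30 days of purchase.",
    "Support is available by email on weekdays from 9am to 5pm.",
    "Shipping is free for orders over 50 dollars.",
]
question = "What is your return policy for refunds?"
context = retrieve(question, chunks)

# In a real RAG pipeline the retrieved context is placed in the LLM prompt.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

The point of the sketch is the shape of the pipeline: turn text into vectors, rank chunks by similarity to the question, then hand the best chunks to the language model.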


✅ Prerequisites

Before starting, make sure you have:

  1. A Google account for Google Colab (Sign up here)
  2. Google Colab Pro (FREE for students!) (Get it here)
    • ✨ Faster execution
    • ⏱️ Longer runtime limits
    • 💾 More storage
    • ⚡ Priority access to GPUs
    • 🎓 100% FREE with your .edu email - verification takes ~2 seconds!
  3. An OpenAI API key
    • 🎓 I, Amir, will provide a shared API key for the class
    • No payment required! Use the key provided by me
    • (Alternative: Use Gemini API within your Google Colab Workspace for free! For more info, see this link)
  4. A LlamaCloud API key (Optional but Recommended)
    • 🆓 Free tier available at cloud.llamaindex.ai
    • Enables advanced parsing of PDFs with tables, charts, and complex layouts
    • Get 1,000 free pages per month
    • Not required but highly recommended for processing complex documents
  5. A GitHub account (Sign up here)
  6. A Hugging Face account (Sign up here) - for deployment

🚀 Setup Instructions

Step 1: Create Your Google Colab Account

  1. Go to Google Colab
  2. Sign in with your Google account
  3. Get Colab Pro for FREE:
    • Go to Colab Pro pricing page
    • Click "Get Colab Pro" and verify with your .edu email
    • Instant approval! No payment required for students 🎉
    • Enjoy faster runtimes and priority access

Step 2: Fork This Repository

  1. Go to the repository on GitHub
  2. Click the "Fork" button in the top right
  3. This creates your own copy of the project

Step 3: Connect Colab to Your GitHub

  1. In Google Colab, click File → Open notebook
  2. Select the GitHub tab
  3. Enter your repository URL or search for your username
  4. Open examples/llama_test.ipynb to start learning!

Step 4: Set Up Your API Keys in Colab

Option A: Using Colab Secrets (Recommended)

  1. In your Colab notebook, click the 🔑 key icon in the left sidebar
  2. Click "Add new secret"
  3. Add OPENAI_API_KEY:
    • Name: OPENAI_API_KEY
    • Value: sk-proj-xxxxxxxxxxxxxxxxxxxxx (your actual key)
    • Toggle "Notebook access" to ON
  4. (Optional) Add LLAMA_CLOUD_API_KEY:
    • Click "Add new secret" again
    • Name: LLAMA_CLOUD_API_KEY
    • Value: llx-xxxxxxxxxxxxxxxxxxxxx (your LlamaCloud key)
    • Toggle "Notebook access" to ON

Option B: Using Code (Less Secure)

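from google.colab import userdata
import os

# Read the keys stored in Colab Secrets and expose them as environment variables
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

# Optional: enable advanced document parsing (skipped if the secret is missing)
try:
    os.environ['LLAMA_CLOUD_API_KEY'] = userdata.get('LLAMA_CLOUD_API_KEY')
except userdata.SecretNotFoundError:
    print("LLAMA_CLOUD_API_KEY not set; continuing without LlamaParse")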

⚠️ Important: Never hardcode your API keys directly in the notebook!
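
One way to follow this rule is to read keys only from the environment and fail with a clear message when one is missing. The sketch below uses only the standard library; require_env and mask are hypothetical helper names, not functions from app.py.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it as a secret or to your .env file")
    return value


def mask(value: str) -> str:
    """Show only a short prefix so the full key never lands in notebook output."""
    return value[:7] + "..." if len(value) > 7 else "***"


try:
    key = require_env("OPENAI_API_KEY")
    print("Key loaded:", mask(key))
except RuntimeError as err:
    print("Setup problem:", err)
```

Because the key never appears in the source file, the notebook can be pushed to GitHub without leaking it.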


📄 Adding Your Data

Supported File Types

Complex Documents (Parsed with LlamaParse - requires LLAMA_CLOUD_API_KEY)

  • PDF documents (.pdf) - with advanced OCR, table extraction, and chart recognition
  • Word documents (.docx, .doc)
  • PowerPoint presentations (.pptx, .ppt)
  • Excel spreadsheets (.xlsx, .xls)

Simple Text Files (Parsed with SimpleDirectoryReader)

  • HTML files (.html)
  • Text files (.txt)
  • Markdown files (.md)
  • CSV files (.csv)
  • JSON files (.json)
  • XML files (.xml)

[Image: LlamaParse demo]

New Feature: The app now uses LlamaParse for advanced document parsing! LlamaParse provides:

  • High-quality OCR for scanned documents
  • Intelligent table extraction (even from images and charts)
  • Multi-column layout handling
  • Chart and graph text extraction
  • Better handling of complex PDFs with mixed content

If LLAMA_CLOUD_API_KEY is not set, the app will fall back to SimpleDirectoryReader for all files.
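
The fallback rule can be pictured as a small routing function. This is a sketch under assumptions: the extension sets are copied from the lists above, while choose_parser and its return labels are hypothetical, and the actual logic in app.py may differ.

```python
import os
from pathlib import Path

# Extension groups taken from the "Supported File Types" lists above
COMPLEX_EXTS = {".pdf", ".docx", ".doc", ".pptx", ".ppt", ".xlsx", ".xls"}
SIMPLE_EXTS = {".html", ".txt", ".md", ".csv", ".json", ".xml"}


def choose_parser(filename: str, has_llama_key: bool) -> str:
    """Pick a parser label for one file based on its extension and key availability."""
    ext = Path(filename).suffix.lower()
    if ext in COMPLEX_EXTS:
        # Complex formats use LlamaParse only when a key is available
        return "llamaparse" if has_llama_key else "simple"
    if ext in SIMPLE_EXTS:
        return "simple"
    return "unsupported"


has_key = bool(os.environ.get("LLAMA_CLOUD_API_KEY"))
for name in ["handbook.pdf", "faq.md", "logo.png"]:  # made-up file names
    print(name, "->", choose_parser(name, has_key))
```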

How to Add Documents to Colab

Method 1: Upload Directly (Quick Testing)

  1. In your Colab notebook, run:
    from google.colab import files
    uploaded = files.upload()
  2. Select your documents to upload
  3. Files will be in the current directory

Method 2: Mount Google Drive (Recommended)

  1. Upload your documents to a folder in Google Drive (e.g., My Drive/chatbot-data/)
  2. In your Colab notebook:
    from google.colab import drive
    drive.mount('/content/drive')
  3. Access files from: /content/drive/MyDrive/chatbot-data/

Method 3: Push to GitHub (For Deployment)

  1. Add your documents to the data/ folder in your repository
  2. Commit and push to GitHub
  3. Pull the repository in Colab or deploy directly to Hugging Face

Tips for Better Results

  • ✅ Use clear, well-formatted documents
  • ✅ Include only relevant company information
  • ✅ Break very large documents into smaller, topic-focused files
  • ❌ Don't include sensitive data (passwords, private info)
  • ❌ Avoid image-only PDFs (text must be selectable)
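
For the tip about breaking up very large documents, a rough starting point is to split plain text on blank lines and regroup the paragraphs into parts under a size limit. This sketch splits by size only, so a manual pass is still needed to make each part topic-focused. The function name split_text and the 2000-character default are assumptions, not settings from this project.

```python
def split_text(text: str, max_chars: int = 2000) -> list[str]:
    """Group paragraphs into parts of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole rather than cut mid-sentence.
    """
    parts: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the limit: close the current part
            parts.append(current)
            current = para
        else:
            current = candidate
    if current:
        parts.append(current)
    return parts


sample = "Returns policy text.\n\nShipping policy text.\n\nWarranty policy text."
for i, part in enumerate(split_text(sample, max_chars=45), start=1):
    print(f"part {i}: {len(part)} characters")
```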

🧪 Development & Testing Options

You have two options for developing and testing your chatbot. Choose the one that works best for you!


🌐 Option 1: Google Colab (Recommended for Beginners)

Pros: No installation needed, works in browser, free GPU access
Cons: Temporary URLs, session expires after inactivity

Use this if: You prefer browser-based development or don't want to install Python locally


💻 Option 2: Local Development (Recommended for Advanced Users)

Pros: Persistent environment, faster development, code editing works offline
Cons: Requires Python installation and setup

Use this if: You're comfortable with terminal/command line and want full control


🌐 Option 1: Testing in Google Colab

Phase 1: Learning with the Example Notebook

The example notebook (examples/llama_test.ipynb) teaches you RAG concepts interactively.

  1. Open the example notebook in Colab:

    • Go to your forked repository
    • Navigate to examples/llama_test.ipynb
    • Click "Open in Colab" badge (or manually open via Colab)
  2. Install dependencies (First cell - run this first!):

    # STEP 1: Install all required packages
    print("📦 Installing dependencies...")
    
    !pip install -q streamlit==1.50.0
    !pip install -q llama-index==0.14.4
    !pip install -q llama-index-core==0.14.4
    !pip install -q llama-index-llms-openai==0.6.4
    !pip install -q llama-index-embeddings-openai==0.5.1
    !pip install -q openai==1.109.1
    !pip install -q python-dotenv==1.1.1
    !pip install -q jedi==0.19.2
    
    print("✅ All dependencies installed!")

    ⏱️ This takes 1-2 minutes. Wait for "✅ All dependencies installed!" before continuing.

  3. Set up your API key (Second cell):

    # STEP 2: Configure OpenAI API Key
    from google.colab import userdata
    import os
    
    # Get API key from Colab secrets (you must add this first!)
    os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
    print("✅ API key loaded")
  4. Load and index documents (Third cell):

    # STEP 3: Load documents and create index
    from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
    from llama_index.llms.openai import OpenAI
    from llama_index.embeddings.openai import OpenAIEmbedding
    
    # Configure models globally so that both indexing and querying use them
    # (passing llm= to from_documents would not change the query engine's model)
    Settings.llm = OpenAI(model="gpt-5-nano-2025-08-07", temperature=0.1)
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
    
    # Load documents from data folder
    documents = SimpleDirectoryReader("data").load_data()
    print(f"📄 Loaded {len(documents)} documents")
    
    # Create searchable index (embeds each document chunk)
    index = VectorStoreIndex.from_documents(documents)
    print("✅ Index created successfully!")
  5. Query the chatbot (Fourth cell):

    # STEP 4: Ask questions!
    query_engine = index.as_query_engine()
    
    # Try your first question
    response = query_engine.query("Your question here")
    print(response)
  6. Test with example data first, then replace with your own documents

Phase 2: Running Your Streamlit App in Colab

Once you understand how RAG works from the notebook, transition to testing your actual app.py Streamlit application.

Why Transition to app.py?

  • 📓 Notebook (llama_test.ipynb): Learning tool, shows RAG step-by-step
  • 🚀 Streamlit app (app.py): Production-ready chatbot with UI, what you'll deploy

Step-by-Step: Running app.py in Colab

  1. Create a new Colab notebook (or add cells to your existing one):

    • File → New notebook
    • Or continue in your existing notebook
  2. Install dependencies (same as before):

    !pip install -q streamlit==1.50.0 llama-index==0.14.4 llama-index-core==0.14.4 llama-index-llms-openai==0.6.4 llama-index-embeddings-openai==0.5.1 openai==1.109.1 python-dotenv==1.1.1 jedi==0.19.2
  3. Clone your repository (if not already in Colab):

    # Clone your forked repository
    !git clone https://github.com/YOUR-USERNAME/YOUR-REPO-NAME.git
    %cd YOUR-REPO-NAME
  4. Set up your API key as environment variable:

    import os
    from google.colab import userdata
    
    # Set API key for the app to use
    os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
  5. Upload your documents (if not already in the repo):

    # Option A: Upload directly to Colab
    from google.colab import files
    uploaded = files.upload()
    # Move uploaded files to data folder
    !mkdir -p data
    !mv *.pdf data/  # Adjust file extensions as needed
    
    # Option B: Mount Google Drive
    from google.colab import drive
    drive.mount('/content/drive')
    !cp -r /content/drive/MyDrive/chatbot-data/* data/
  6. Install localtunnel to expose Streamlit:

    # Install localtunnel for public URL
    !npm install -g localtunnel
  7. Run Streamlit in the background:

    # Run Streamlit app in background
    !streamlit run app.py &>/content/logs.txt &
    
    # Wait for Streamlit to start
    import time
    time.sleep(5)
    
    # Verify it's running
    !curl http://localhost:8501
  8. Expose with localtunnel to get a public URL:

    # Get a public URL using localtunnel
    !npx localtunnel --port 8501 &
    
    # Wait a moment for the URL
    import time
    time.sleep(3)
    
    # The URL will appear in the output above
    # Look for: "your url is: https://xxxxx.loca.lt"
  9. Access your chatbot:

    • Click the URL from localtunnel output (looks like https://xxxxx.loca.lt)
    • Click "Click to Continue" on the localtunnel page (if the page asks for a tunnel password instead, enter the public IP address of your Colab machine)
    • Your Streamlit chatbot interface will appear! 🎉

[Image: chatbot UI demo]

  10. Test your chatbot:
    • Ask questions about your documents
    • Verify responses are accurate
    • Test different types of queries
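
Step 7 waits a fixed five seconds before checking that Streamlit is up. On a slow runtime that may not be enough, so a polling loop is a possible alternative. The sketch below uses only the standard library; wait_until_ready and http_ok are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str) -> bool:
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status < 400
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        return False


def wait_until_ready(probe: Callable[[], bool], timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Call probe() repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False
```

In a Colab cell, after starting Streamlit in the background, the call would look like `wait_until_ready(lambda: http_ok("http://localhost:8501"))`. A False result suggests checking /content/logs.txt for startup errors.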

Important Notes for Running app.py in Colab:

⚠️ Limitations:

  • Localtunnel URLs are temporary (expire when Colab disconnects)
  • Not suitable for permanent hosting
  • Great for testing and development only

✅ When to Use This:

  • Testing your app with real documents before deploying
  • Showing your team the chatbot interface during development
  • Debugging issues before Hugging Face deployment

🚀 For Production:

  • After testing in Colab, deploy to Hugging Face Spaces (permanent hosting)
  • Colab is for development and testing
  • Hugging Face is for production and embedding

Workflow Summary:

Step 1: Learn RAG concepts
└─→ Use llama_test.ipynb notebook

Step 2: Test with your data
└─→ Add your documents to data/
└─→ Run notebook cells to verify indexing works

Step 3: Test the Streamlit UI
└─→ Run app.py in Colab with localtunnel
└─→ Verify chatbot interface works correctly

Step 4: Deploy to production
└─→ Push to GitHub
└─→ Deploy to Hugging Face Spaces
└─→ Embed in your website

Step 5: Publish website
└─→ Enable GitHub Pages
└─→ Share your live URL!

Phase 3: When You're Ready for Production

Once you've tested everything in Colab and your chatbot works well:

  1. ✅ Make sure all your documents are in the data/ folder
  2. ✅ Push your code to GitHub
  3. ✅ Deploy to Hugging Face Spaces (see next section)
  4. ✅ Embed the permanent Hugging Face URL in your website

💻 Option 2: Testing Locally on Your Computer

If you prefer to develop on your local machine, follow these steps.

Prerequisites

  • Python 3.9+ installed
  • Terminal/Command Prompt access
  • Text editor or IDE (VS Code recommended)

Setup Steps

  1. Clone your repository:

    git clone https://github.com/YOUR-USERNAME/YOUR-REPO-NAME.git
    cd YOUR-REPO-NAME
  2. Create a virtual environment:

    On macOS/Linux:

    python3 -m venv venv
    source venv/bin/activate

    On Windows:

    python -m venv venv
    venv\Scripts\activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Set up your API keys:

    # Copy the template
    cp .env.example .env
    
    # Edit .env and add your keys
    # OPENAI_API_KEY=your-provided-key-here
    # LLAMA_CLOUD_API_KEY=llx-your-key-here (optional but recommended)
  5. Add your documents to the data/ folder:

    # Place your PDF, HTML, TXT files in data/
    ls data/
  6. Run the Streamlit app:

    streamlit run app.py

    The app will open at http://localhost:8501 🎉
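
To make the .env step less of a black box, here is a simplified sketch of what loading such a file involves. It is not the python-dotenv implementation (that library, pinned in the install commands, handles many more cases), and parse_env_file is a hypothetical name.

```python
import tempfile
from pathlib import Path


def parse_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demo with a throwaway file that uses the placeholder values from .env.example
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "# demo file\nOPENAI_API_KEY=your-provided-key-here\nLLAMA_CLOUD_API_KEY='llx-your-key-here'\n",
        encoding="utf-8",
    )
    settings = parse_env_file(str(env_path))

print(sorted(settings))
```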

Testing Locally

  1. First run: The app will index your documents (takes 10-30 seconds)
  2. Subsequent runs: Loads from cached storage/ folder (much faster)
  3. To re-index: Delete the storage/ folder and restart
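
The caching behaviour described above follows a common load-or-build pattern. The sketch below imitates it with a JSON file. The file name index.json, the dict standing in for the index, and load_or_build itself are all assumptions for illustration; the real app persists a LlamaIndex index in storage/.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def load_or_build(storage_dir: Path, build: Callable[[], dict]) -> tuple[dict, bool]:
    """Return (index, loaded_from_cache): reuse the cache if present, else build and save."""
    cache_file = storage_dir / "index.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8")), True
    index = build()  # the slow step: reading and embedding documents
    storage_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(index), encoding="utf-8")
    return index, False


def fake_build() -> dict:
    """Stand-in for real indexing."""
    return {"doc1.txt": ["hello", "world"]}


with tempfile.TemporaryDirectory() as tmp:
    storage = Path(tmp) / "storage"
    _, first_cached = load_or_build(storage, fake_build)   # first run: builds
    _, second_cached = load_or_build(storage, fake_build)  # second run: loads cache
    shutil.rmtree(storage)                                 # "delete the storage/ folder"
    _, third_cached = load_or_build(storage, fake_build)   # rebuilds from scratch

print(first_cached, second_cached, third_cached)  # False True False
```

This also shows why new documents are not picked up automatically: as long as the cache exists, the build step is skipped.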

Local Development Tips

✅ Advantages:

  • Faster iteration (no need to reinstall packages each time)
  • Persistent storage (index cache survives between sessions)
  • Code editing works offline once dependencies are installed (indexing and chatting still call the OpenAI API, so they need internet)
  • Better debugging experience

⚠️ Remember:

  • Keep your virtual environment activated when working
  • Never commit .env file to GitHub
  • Test thoroughly before deploying to Hugging Face
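
As an extra safety net before pushing, a short script can scan the project for strings that look like API keys. The pattern below is an assumption based on the sk-... and llx-... placeholder formats shown in this README; it can miss real keys or flag harmless text, so treat it as a rough check only. find_hardcoded_keys is a hypothetical helper.

```python
import re
import tempfile
from pathlib import Path

# Assumed key shapes: "sk-" or "llx-" followed by at least 20 key-like characters
KEY_PATTERN = re.compile(r"\b(?:sk|llx)-[A-Za-z0-9_-]{20,}")


def find_hardcoded_keys(root: Path, suffixes=(".py", ".ipynb", ".html", ".md")) -> list[tuple[str, int]]:
    """Return (file name, line number) pairs for lines that look like they contain a key."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if KEY_PATTERN.search(line):
                hits.append((path.name, lineno))
    return hits


# Demo on throwaway files; the "key" is an obviously fake string of repeated letters
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "good.py").write_text('import os\nkey = os.environ["OPENAI_API_KEY"]\n', encoding="utf-8")
    (root / "bad.py").write_text('import os\nkey = "sk-proj-' + "a" * 24 + '"\n', encoding="utf-8")
    hits = find_hardcoded_keys(root)

print(hits)
```

Running `find_hardcoded_keys(Path("."))` from the repository root before `git push` gives a quick list of lines to double-check.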

Workflow for Local Development

# 1. Activate environment
source venv/bin/activate  # or venv\Scripts\activate on Windows

# 2. Make changes to your code or data/

# 3. Test locally
streamlit run app.py

# 4. When satisfied, push to GitHub
git add .
git commit -m "Update chatbot"
git push

# 5. Deploy to Hugging Face (see next section)

🎯 Which Option Should You Choose?

| Factor | Google Colab | Local Development |
| --- | --- | --- |
| Setup Time | ⚡ Instant | 🕐 10-15 minutes |
| No Installation | ✅ Yes | ❌ Need Python |
| Persistent Environment | ❌ Sessions expire | ✅ Always available |
| Speed | 🐌 Slower | ⚡ Faster |
| Best For | Beginners, quick tests | Serious development |
| Internet Required | ✅ Always | ⚠️ For OpenAI API calls and deployment |

Recommendation: Start with Google Colab to learn, then switch to local development if you want a better experience!

Understanding the Workflow

πŸ“ Colab Notebook β†’ πŸ§ͺ Test RAG Logic β†’ πŸš€ Deploy to Hugging Face β†’ 🌐 Embed in Website
  • Colab: Development and testing environment
  • Hugging Face: Production hosting for your Streamlit app
  • Website: User-facing integration

🧪 Testing with the Example

Option 1: Use the Example Notebook in Colab

  1. Open examples/llama_test.ipynb in Google Colab
  2. Run all cells to see the chatbot in action
  3. Ask questions like:
    • "When did Taylor Swift become a superstar?"
    • "What are the amendments in the Constitution?"

Option 2: Copy Example Data for Testing

If you want to test with the example documents:

  1. Clone the example data to your Google Drive
  2. Or download from GitHub and upload to Colab
  3. Point your code to the example data folder

🌐 Deploying to Hugging Face

Why Deploy?

  • ✨ Makes your chatbot publicly accessible
  • 🆓 Free hosting for public projects
  • 🔗 Easy to share with your team and embed in websites
  • 🎨 Professional Streamlit interface

Deployment Steps

  1. Create a new Space at huggingface.co/new-space

    • Name: your-company-chatbot
    • License: Apache 2.0
    • SDK: Streamlit ⚠️ Important!
    • Hardware: CPU Basic (free)
  2. Upload your files from your GitHub repository:

    • app.py ✅
    • requirements.txt ✅
    • data/ folder with YOUR documents ✅
    • storage/ folder (optional - speeds up first load) ⚠️
  3. Add your API keys as Secrets:

    • Go to Space Settings → Repository secrets
    • Add secret: OPENAI_API_KEY = your-key-here (required)
    • Add secret: LLAMA_CLOUD_API_KEY = llx-your-key-here (optional but recommended for better document parsing)
  4. Wait for build (2-3 minutes)

    • Check the "Logs" tab for any errors
    • Look for: "✅ Index loaded" or "✅ Index created"
    • Once running, your chatbot is live! 🎉

Your chatbot URL will be: https://huggingface.co/spaces/YOUR-USERNAME/your-company-chatbot

💡 Pro Tips for Hugging Face Deployment

  • Upload the storage/ folder to skip indexing on first load (faster startup)
  • Test thoroughly in Colab or locally before deploying
  • Use descriptive Space names (e.g., acme-support-bot not test123)
  • The chatbot uses gpt-5-nano-2025-08-07 for responses and text-embedding-3-small for indexing (configured in app.py)

🌍 Embedding in Your Website

Once deployed to Hugging Face, you can embed your chatbot in your company website HTML page.

Option 1: Floating Chat Widget (Recommended)

See it in action: Check out examples/index.html for a working example (screenshot: examples/visuals/embeddedui-demo.png).

Add this code before the closing </body> tag of your index.html:

<!-- Chatbot Widget Styles -->
<style>
  .chat-widget-container {
    position: fixed;
    bottom: 20px;
    right: 20px;
    z-index: 9999;
    width: 400px;
    height: 600px;
    border-radius: 12px;
    box-shadow: 0 8px 32px rgba(0, 0, 0, 0.2);
    overflow: hidden;
    display: none;
    background: white;
  }

  .chat-widget-container.open {
    display: block;
    animation: slideUp 0.3s ease;
  }

  @keyframes slideUp {
    from {
      opacity: 0;
      transform: translateY(20px);
    }
    to {
      opacity: 1;
      transform: translateY(0);
    }
  }

  .chat-widget-button {
    position: fixed;
    bottom: 20px;
    right: 20px;
    z-index: 10000;
    width: 60px;
    height: 60px;
    border-radius: 50%;
    background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
    border: none;
    color: white;
    font-size: 24px;
    cursor: pointer;
    box-shadow: 0 4px 15px rgba(0, 0, 0, 0.3);
    transition: all 0.3s ease;
  }

  .chat-widget-button:hover {
    transform: scale(1.1);
    box-shadow: 0 6px 20px rgba(0, 0, 0, 0.4);
  }

  @media (max-width: 768px) {
    .chat-widget-container {
      width: calc(100vw - 40px);
      height: calc(100vh - 140px);
      bottom: 10px;
      right: 10px;
    }
  }
</style>

<!-- Chatbot Toggle Button -->
<button class="chat-widget-button" onclick="toggleChat()" aria-label="Open chatbot">💬</button>

<!-- Chatbot Container -->
<div class="chat-widget-container" id="chatWidget">
  <iframe 
    src="https://YOUR-USERNAME-your-company-chatbot.hf.space"
    width="100%" 
    height="100%" 
    frameborder="0"
    title="Company Chatbot">
  </iframe>
</div>

<!-- Toggle Script -->
<script>
  function toggleChat() {
    const widget = document.getElementById('chatWidget');
    const button = document.querySelector('.chat-widget-button');
    
    if (widget.classList.contains('open')) {
      widget.classList.remove('open');
      button.textContent = '💬';
      button.setAttribute('aria-label', 'Open chatbot');
    } else {
      widget.classList.add('open');
      button.textContent = '✕';
      button.setAttribute('aria-label', 'Close chatbot');
    }
  }
</script>

Option 2: Full-Page Embed

<iframe 
  src="https://YOUR-USERNAME-your-company-chatbot.hf.space"
  width="100%" 
  height="600px" 
  frameborder="0"
  title="Company Chatbot">
</iframe>

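⚠️ Important: Replace the placeholder in the iframe src with your own Space's address! For embedding, use the Space's direct URL (of the form https://YOUR-USERNAME-your-company-chatbot.hf.space, shown under "Embed this Space" in the Space menu). The huggingface.co/spaces/... page itself generally refuses to load inside an iframe.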
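
If you prefer to generate the embed snippet rather than edit it by hand, a small helper can build it. This is a sketch under assumptions: it takes the direct Space address to be https://<username>-<space-name>.hf.space in lowercase, which should hold for names made of letters, digits, and hyphens. For anything else, copy the URL from the Space's embed dialog. Both function names are hypothetical.

```python
from html import escape


def space_embed_url(username: str, space_name: str) -> str:
    """Assumed direct URL of a Space (simple names only: letters, digits, hyphens)."""
    return f"https://{username}-{space_name}.hf.space".lower()


def iframe_snippet(src: str, title: str = "Company Chatbot", height: str = "600px") -> str:
    """Build an iframe tag with attribute values safely escaped."""
    return (
        f'<iframe src="{escape(src, quote=True)}" width="100%" '
        f'height="{escape(height, quote=True)}" frameborder="0" '
        f'title="{escape(title, quote=True)}"></iframe>'
    )


print(iframe_snippet(space_embed_url("YOUR-USERNAME", "your-company-chatbot")))
```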

Customization

  • Change colors by editing the CSS background gradients
  • Adjust size with width and height properties
  • Move position with bottom and right values
  • Customize the button emoji (💬, 🤖, 💡, etc.)

🚀 Publishing Your Website to GitHub Pages

Once you have your chatbot embedded, publish your complete website live on GitHub Pages!

Step 1: Prepare Your Repository

Make sure your repository has:

  • ✅ index.html (your main website page with embedded chatbot)
  • ✅ style.css (your website styles)
  • ✅ app.py (your chatbot code)
  • ✅ data/ folder (your company documents)
  • ✅ requirements.txt
  • ✅ README.md

Step 2: Push Everything to GitHub

# Add all files
git add .

# Commit with a descriptive message
git commit -m "Add company website with AI chatbot"

# Push to your repository
git push origin main

Step 3: Enable GitHub Pages

  1. Go to your repository on GitHub
  2. Click Settings → Pages (in the left sidebar)
  3. Under "Source", select:
    • Branch: main
    • Folder: / (root)
  4. Click Save
  5. Wait 1-2 minutes for deployment

Step 4: Access Your Live Website

Your website will be live at:

https://YOUR-USERNAME.github.io/YOUR-REPO-NAME/

🎉 Your chatbot is now embedded in a live website!

What Gets Published

  • ✅ Your index.html website
  • ✅ All CSS, JavaScript, and assets
  • ✅ The embedded Hugging Face chatbot iframe
  • ❌ Backend files (app.py, data/) are not served by GitHub Pages
  • ℹ️ The chatbot itself runs on Hugging Face, not GitHub Pages

Updating Your Live Site

Every time you push to GitHub, your site automatically updates:

# Make changes to your HTML/CSS
git add index.html style.css
git commit -m "Update website design"
git push origin main
# Site updates in 1-2 minutes!

Pro Tips

  • Test your website locally by opening index.html in a browser before pushing
  • Make sure your Hugging Face Space URL in the iframe is correct
  • Use relative paths for CSS/JS files (e.g., ./style.css not /style.css)
  • Add a custom domain in GitHub Pages settings if you have one!
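
The relative-path tip matters because a project site is served under /YOUR-REPO-NAME/, so a path like /style.css points at the domain root instead of your repository. The sketch below scans HTML for such root-relative paths using Python's built-in parser. RootPathFinder and find_root_paths are hypothetical names.

```python
from html.parser import HTMLParser


class RootPathFinder(HTMLParser):
    """Collect href/src values that start with a single '/' (site-root paths)."""

    def __init__(self) -> None:
        super().__init__()
        self.problems: list[tuple[str, str]] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # "//host/..." is a protocol-relative URL, not a root path
                if value.startswith("/") and not value.startswith("//"):
                    self.problems.append((tag, value))


def find_root_paths(html_text: str) -> list[tuple[str, str]]:
    """Return (tag, path) pairs that are likely to break on a GitHub Pages project site."""
    finder = RootPathFinder()
    finder.feed(html_text)
    return finder.problems


sample = (
    '<link rel="stylesheet" href="/style.css">'
    '<script src="./app.js"></script>'
    '<img src="/img/logo.png">'
)
print(find_root_paths(sample))
```

To check a real page, pass in the file contents, for example `find_root_paths(open("index.html", encoding="utf-8").read())`.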

🔧 Troubleshooting

Google Colab Issues

"OPENAI_API_KEY not found"

  • ✅ Colab: Make sure you added the secret (🔑 icon) and toggled "Notebook access" to ON
  • ✅ Local: Check that your .env file exists and contains the instructor-provided key
  • ✅ Hugging Face: Verify the secret is set in Settings → Repository secrets
  • ✅ Make sure the key is exactly as provided by your instructor (no extra spaces)

"Runtime disconnected"

  • ✅ Colab Pro (FREE for students!) has longer runtimes than the free tier
  • ✅ Save your work frequently to GitHub or Google Drive
  • ✅ Consider running critical tasks in shorter sessions

"Module not found" error

  • ✅ Run the install cells at the start of your notebook
  • ✅ Use !pip install (with !) in Colab, not regular pip install
  • ✅ Make sure you ran the entire installation cell and waited for it to complete
  • ✅ If issues persist, restart runtime (Runtime → Restart runtime) and run install cell again

Chatbot Issues

"Please add documents to the 'data' directory"

  • ✅ Make sure you uploaded files to the data folder
  • ✅ Check that files are in supported formats (PDF, HTML, TXT, etc.)
  • 💡 Try the example: upload files from examples/data/

Chatbot gives wrong answers

  • ✅ Make sure your documents contain the relevant information
  • ✅ Try rephrasing your question more specifically
  • ✅ Check if the document text is readable (not corrupted or image-only PDFs)
  • 💡 Test with the example first to verify it's working
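
A quick way to act on the "is the text readable" check is a rough heuristic over the extracted text of a document. The thresholds below are arbitrary assumptions and looks_readable is a hypothetical helper. It only flags obvious problems such as nearly empty output from an image-only PDF or text full of replacement characters.

```python
def looks_readable(text: str, min_chars: int = 200, min_ratio: float = 0.6) -> bool:
    """Heuristic: enough text, and mostly letters, digits, or whitespace."""
    stripped = text.strip()
    if len(stripped) < min_chars:
        # Image-only or empty documents typically yield little or no text
        return False
    good = sum(ch.isalnum() or ch.isspace() for ch in stripped)
    return good / len(stripped) >= min_ratio


print(looks_readable("Our store is open every day. " * 20))  # plain prose
print(looks_readable(""))                                    # nothing extracted
print(looks_readable("\ufffd" * 500))                        # corrupted text
```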

Slow response times in Colab

  • ⏱️ First query after starting is always slower (building index)
  • ⚡ Subsequent queries should be faster (using cached index)
  • 🚀 Get Colab Pro for FREE with your .edu email for better performance

Hugging Face Deployment Issues

Space won't start

  • ✅ Check the "Logs" tab for error messages
  • ✅ Verify OPENAI_API_KEY is set in Repository secrets
  • ✅ Make sure you selected "Streamlit" as the SDK
  • ✅ Confirm you uploaded requirements.txt and app.py

Embedded iframe not showing chatbot

  • ✅ Make sure your Hugging Face Space is running (check the Space URL directly)
  • ✅ Try hard refresh: Ctrl+Shift+R (Windows) or Cmd+Shift+R (Mac)
  • ✅ Check browser console for errors (F12 → Console tab)
  • ✅ Verify the iframe src URL is correct

GitHub Pages Issues

Website not loading

  • ✅ Make sure GitHub Pages is enabled in Settings → Pages
  • ✅ Wait 1-2 minutes after enabling for initial deployment
  • ✅ Check that branch is set to main and folder is / (root)

Chatbot iframe not appearing on live site

  • ✅ Verify your Hugging Face Space URL is correct in the iframe src
  • ✅ Check browser console for CORS or iframe errors
  • ✅ Test the Hugging Face Space URL directly in a browser first

CSS/JavaScript not loading

  • ✅ Use relative paths: ./style.css not /style.css
  • ✅ Check file names match exactly (case-sensitive on GitHub Pages)
  • ✅ Clear browser cache and hard refresh


πŸ“ Project Integration (ENGCMP 0600)

This chatbot template is designed for your company project:

Project Step What to Do Where
Steps 1-5 Plan your company, identify documents needed Team planning
Step 6 Research and gather company documents data/ folder
Steps 7-9 Test and refine your chatbot Google Colab
Step 8 Deploy chatbot to production Hugging Face Spaces
Step 8 Build website and embed chatbot HTML/CSS with iframe
Step 9 Push repository and publish website GitHub β†’ GitHub Pages
Step 10 Present your live website with chatbot Final demo

Deliverables Checklist

  • ✅ Working chatbot with your company's documents
  • ✅ Chatbot deployed to Hugging Face Spaces
  • ✅ Company website with embedded chatbot
  • ✅ Website live on GitHub Pages
  • ✅ Complete repository pushed to GitHub
  • ✅ Documentation (README, etc.)
  • ✅ Google Colab notebook showing development process

🎓 Recommended Workflow

1. 📘 Learn RAG Concepts
   └─→ Open examples/llama_test.ipynb in Google Colab
   └─→ Understand how document indexing and retrieval works

2. 📝 Plan Your Company
   └─→ Identify what documents your chatbot needs
   └─→ Gather company information (products, policies, FAQs)

3. 📄 Prepare Documents
   └─→ Collect and organize documents in supported formats
   └─→ Add to data/ folder

4. 🧪 Choose Development Environment
   └─→ Option A: Google Colab (browser-based, beginner-friendly)
   └─→ Option B: Local development (faster, more control)

5. 🔧 Test Your Chatbot
   └─→ Google Colab: Use localtunnel for temporary testing
   └─→ Local: Run streamlit run app.py for instant feedback
   └─→ Verify answers are accurate and relevant

6. 🚀 Deploy to Production
   └─→ Push code to GitHub repository
   └─→ Deploy to Hugging Face Spaces (permanent hosting)
   └─→ Get your permanent chatbot URL

7. 🌐 Build Company Website
   └─→ Create index.html with company branding
   └─→ Embed Hugging Face chatbot using iframe code
   └─→ Style with CSS

8. 📤 Publish Website
   └─→ Push website files to GitHub
   └─→ Enable GitHub Pages in repository settings
   └─→ Get your live website URL

9. ✅ Verify Everything Works
   └─→ Test chatbot on live website
   └─→ Ask various questions to ensure accuracy
   └─→ Check responsive design on mobile

10. 🎤 Present Your Project
    └─→ Demo your live website with working AI chatbot
    └─→ Explain your company and how the bot helps customers
    └─→ Share both GitHub and live website URLs

🤝 Support

If you run into issues:

  1. ✅ Check the Troubleshooting section above
  2. 🧪 Try running the examples/llama_test.ipynb to verify setup
  3. 📋 Review your code against this README
  4. 🔍 Check Hugging Face Space logs for error messages
  5. 💬 Ask your instructor or TA for help

🎓 Learning Resources

  • examples/llama_test.ipynb - Jupyter notebook explaining RAG concepts (start here!)
  • examples/README.md - How the example chatbot works
  • data/README.md - Tips for adding documents
  • examples/visuals/ - UI demos and screenshots for reference

Good luck building your AI-powered chatbot! 🚀

Remember: Develop in Google Colab, deploy to Hugging Face, embed in your website, publish on GitHub Pages!
