UL-FRI-NLP-Course/ul-fri-nlp-course-project-2024-2025-chatbot-champions

Natural language processing course 2024/2025: Conversational Agent with Retrieval-Augmented Generation

This project combines natural language processing (NLP), web scraping, and retrieval-augmented generation (RAG) to answer questions about current news in real time.

Environment Setup

You can set up the required environment using either pip with requirements.txt or conda with environment.yml. Choose one method.

Option 1: Using pip (Recommended for Python-only dependencies)

  1. (Optional but Recommended) Create and activate a virtual environment:

    python -m venv venv
    # On Windows:
    # venv\Scripts\activate
    # On macOS/Linux:
    source venv/bin/activate
  2. Install the requirements:

    pip install -r requirements.txt

Option 2: Using conda (Recommended for complex dependencies or full environment replication)

Ensure you have Conda or Miniconda installed.

  1. Create the environment from the environment.yml file:

    conda env create -f environment.yml

    (This might take a few minutes)

  2. Activate the newly created environment (the environment name is usually defined inside the .yml file; check it if unsure):

    conda activate <your_environment_name>

You should now have the necessary dependencies installed to run the project.

Additional Setup Steps

After setting up your environment with either pip or conda and installing the requirements, download the Slovene language model for spaCy. Run the following command in your terminal (with your virtual environment activated, if you are using one):

python -m spacy download sl_core_news_sm

This model is necessary for natural language processing tasks in Slovene within the project.
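
To confirm the download succeeded before starting the chatbot, a quick check along the following lines may help. This is a minimal sketch, not part of the project: the function name spacy_model_installed is hypothetical, and the check relies on spaCy models being installed as regular Python packages that the active interpreter can locate.

```python
import importlib.util


def spacy_model_installed(package_name: str = "sl_core_news_sm") -> bool:
    """Return True if the active interpreter can locate the given package.

    Hypothetical helper: it only checks that the package is discoverable,
    not that the model is compatible with the installed spaCy version.
    """
    return importlib.util.find_spec(package_name) is not None


# Report the result for the Slovene model used by this project.
if spacy_model_installed():
    print("sl_core_news_sm is installed.")
else:
    print("sl_core_news_sm is missing; run: python -m spacy download sl_core_news_sm")
```

If the check reports the model as missing even after the download, the command was probably run outside the environment you activated.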

Configuration

  1. Environment Variables: Create a .env file inside the src/ directory (src/.env). This file is used to store sensitive information and configuration settings.

  2. Required Variables: Add the following variables to your src/.env file:

    # Necessary for connecting to the PostgreSQL database (adjust if needed; see src/docker-compose.yml)
    DATABASE_URL=postgresql://test:test@localhost:5432/test
    
    # Optional: Add your OpenAI API key to use the OpenAI LLM provider
    # If omitted, and GEMINI_API_KEY is also omitted, the application will use the local (mocked) provider.
    # OPENAI_API_KEY=sk-...
    
    # Optional: Add your Gemini API key to use the Gemini LLM provider
    # If omitted, and OPENAI_API_KEY is also omitted, the application will use the local (mocked) provider.
    # GEMINI_API_KEY=AIza...
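
The fallback rule in the comments above (no API key means the local, mocked provider) can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: parse_env_text and choose_provider are hypothetical names, the parser handles only simple KEY=value lines, and the choice to prefer OpenAI when both keys are set is an assumption, since the README does not say which provider wins.

```python
import os
from typing import Dict, Mapping, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def choose_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "openai", "gemini", or "local" depending on which keys are set.

    Assumption: OpenAI takes precedence when both keys are present.
    """
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY", "").strip():
        return "openai"
    if env.get("GEMINI_API_KEY", "").strip():
        return "gemini"
    return "local"


# With both API keys left commented out, only DATABASE_URL is parsed
# and the selection falls back to the local provider.
sample = """
DATABASE_URL=postgresql://test:test@localhost:5432/test
# OPENAI_API_KEY=sk-...
# GEMINI_API_KEY=AIza...
"""
print(choose_provider(parse_env_text(sample)))
```

Uncommenting one of the key lines in src/.env and supplying a real key would switch the selection away from the local provider under this rule.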

Running the Application

  1. Navigate to the Source Directory: Open your terminal and change to the src/ directory:

    cd src
  2. Start Services with Docker Compose: Ensure Docker is running. Then, start the database service defined in docker-compose.yml:

    docker compose up -d # The -d flag runs it in detached mode (background)

    Wait a few moments for the database container to initialize.

  3. Run the Main Script: Execute the main Python script to start the chatbot:

    python main.py

    Optional flags:

    • --use-chat-history: Enable conversation history.

    Note: For best results, capitalize all names and write grammatically correct questions.

  4. Stopping Services: When you are finished, stop the Docker Compose services:

    docker compose down
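
Step 2 above asks you to wait a few moments for the database container to initialize. Instead of guessing, you could poll the database port until it accepts connections. The sketch below is one possible approach, not something the project ships: database_address and wait_for_port are hypothetical helpers, and the host and port are read from the DATABASE_URL value shown in the Configuration section. An open port is only a rough readiness signal, since PostgreSQL may need a moment longer before it accepts queries.

```python
import socket
import time
from typing import Tuple
from urllib.parse import urlsplit


def database_address(database_url: str) -> Tuple[str, int]:
    """Extract (host, port) from a URL such as postgresql://user:pw@host:5432/db."""
    parts = urlsplit(database_url)
    # Fall back to common defaults if the URL omits host or port.
    return parts.hostname or "localhost", parts.port or 5432


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Poll until a TCP connection succeeds; return False once timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example usage (run after `docker compose up -d`):
#   host, port = database_address("postgresql://test:test@localhost:5432/test")
#   if not wait_for_port(host, port):
#       raise SystemExit("Database port did not open in time")
```

Running such a check between steps 2 and 3 avoids starting main.py before the container is reachable.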

Authors

Blaž Špacapan

Matevž Jecl

Tilen Ožbot
