This project is a Vector-Based Chatbot designed to retrieve accurate information from a complex knowledge base, such as an office manual or technical documentation. Unlike simple chatbots, it represents the user query and each document as a vector and ranks the documents by their mathematical similarity to the query.
- Semantic Search Logic: Implemented Cosine Similarity to score how closely the user query vector matches each document vector (a higher score means a closer match).
- Feature Extraction: Leveraged TF-IDF (Term Frequency-Inverse Document Frequency) to convert text into weighted term vectors.
- Confidence Scoring: Integrated a threshold system (Confidence < 0.2) so that a question whose best match scores below the threshold is treated as out of scope instead of receiving an unrelated answer.
- Language: Python
- Key Libraries: Scikit-Learn, NumPy, Pandas
- Environment: Optimized for Pydroid 3
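
The retrieval flow described above (TF-IDF vectorization, cosine similarity ranking, and the 0.2 confidence threshold) could be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `KNOWLEDGE_BASE` entries, the `FALLBACK` message, the `answer` function, and the use of English stop-word filtering are all assumptions made for the example. Only the parking sentence and the 0.2 threshold come from this README.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical knowledge base; the real manual content is not part of this README.
KNOWLEDGE_BASE = [
    "Parking is free for employees in the basement level B1 and B2.",
    "Office hours are 9 AM to 6 PM, Monday through Friday.",
    "Leave requests must be submitted to HR one week in advance.",
]

CONFIDENCE_THRESHOLD = 0.2  # threshold value stated in the feature list
FALLBACK = "Sorry, I could not find that in the office manual."  # assumed wording

# Fit TF-IDF on the documents once; stop-word filtering is an assumption.
vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(KNOWLEDGE_BASE)


def answer(query):
    """Return (reply, confidence) for the best-matching document."""
    # Project the query into the same TF-IDF space as the documents.
    query_vector = vectorizer.transform([query])
    # One similarity score per document; higher means a closer match.
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    best = int(scores.argmax())
    confidence = float(scores[best])
    # Below the threshold, treat the question as out of scope.
    if confidence < CONFIDENCE_THRESHOLD:
        return FALLBACK, confidence
    return KNOWLEDGE_BASE[best], confidence


if __name__ == "__main__":
    for question in ["Tell me about parking.", "What is the weather on Mars?"]:
        reply, confidence = answer(question)
        print(f"User: {question}")
        print(f"Bot: {reply} (Confidence: {confidence:.2f})")
```

The confidence values printed by this sketch depend on the corpus and vectorizer settings, so they will not necessarily match the figure in the sample conversation below.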
System: Ask me anything from the office manual.
User: Tell me about parking.
Bot: Parking is free for employees in the basement level B1 and B2. (Confidence: 0.85)