An AI-powered assistant that generates personalized icebreakers and conversation starters based on LinkedIn profiles. Built with IBM watsonx.ai and LlamaIndex, it helps make professional introductions more personal and engaging.
- LinkedIn Profile Analysis: Extract professional data via the ProxyCurl API, or use built-in mock data
- AI-Powered Insights: Generate interesting facts about a person's career and education
- Personalized Q&A: Answer specific questions about the person's background
- Two Interfaces: A command-line tool for quick runs and a web UI for interactive use
- Flexible: Practice with mock data or connect to real LinkedIn profiles
- Python 3.11 or 3.12 (>= 3.11, < 3.13)
- A ProxyCurl API key (optional - mock data available)
- Clone the repository:
```bash
git clone https://github.com/HaileyTQuach/icebreaker.git
cd icebreaker
```

- Create a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- (Optional) Add your ProxyCurl API key to config.py:

```python
PROXYCURL_API_KEY = "your-api-key-here"
```

Run the bot using the terminal:
```bash
# Use mock data (no API key needed)
python main.py --mock

# OR use a real LinkedIn profile
python main.py --url "https://www.linkedin.com/in/username/" --api-key "your-api-key"
```

Launch the web app:

```bash
python app.py
```

Then open your browser to the URL shown in the terminal (typically http://127.0.0.1:7860).
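
For reference, app.py is a thin Gradio layer over the same pipeline. Here is a minimal sketch of that kind of wiring; the `answer` function is a placeholder and the layout is illustrative, so treat the repo's app.py as the source of truth:

```python
# Minimal Gradio wiring sketch (illustrative; see app.py for the real implementation).
import gradio as gr

def answer(linkedin_url: str, question: str) -> str:
    # In the real app this would call into modules/data_extraction.py,
    # modules/data_processing.py, and modules/query_engine.py.
    # Here we return a placeholder so the sketch runs on its own.
    return f"Answer about {linkedin_url}: {question}"

demo = gr.Interface(
    fn=answer,
    inputs=[
        gr.Textbox(label="LinkedIn profile URL"),
        gr.Textbox(label="Question"),
    ],
    outputs=gr.Textbox(label="Response"),
    title="Icebreaker Bot",
)

if __name__ == "__main__":
    demo.launch()  # serves on http://127.0.0.1:7860 by default
```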
The Icebreaker Bot uses a Retrieval-Augmented Generation (RAG) pipeline (sketched in code after this list):
- Data Extraction: LinkedIn profile data is retrieved via ProxyCurl API or mock data
- Text Processing: Profile data is split into manageable chunks
- Vector Embedding: Text chunks are converted to vector embeddings using IBM watsonx
- Storage: Embeddings are stored in a vector database
- Query & Generation: When asked a question, relevant profile sections are retrieved and an IBM watsonx LLM generates contextually accurate responses
```text
icebreaker_bot/
├── requirements.txt        # Dependencies
├── config.py               # Configuration settings
├── modules/
│   ├── __init__.py
│   ├── data_extraction.py  # LinkedIn profile data extraction
│   ├── data_processing.py  # Data splitting and indexing
│   ├── llm_interface.py    # LLM setup and interaction
│   └── query_engine.py     # Query processing and response generation
├── app.py                  # Gradio web interface
└── main.py                 # CLI application
```
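
For context, the extraction step in modules/data_extraction.py amounts to one authenticated request to ProxyCurl when mock mode is off. The sketch below is a rough guess at that call; the endpoint path and the `url` parameter name are assumptions based on ProxyCurl's public person-profile API, so defer to the repo and ProxyCurl's current docs:

```python
# Hypothetical ProxyCurl call (endpoint and parameter names are assumptions;
# the repo's modules/data_extraction.py is the source of truth).
import requests

from config import PROXYCURL_API_KEY

def fetch_profile(linkedin_url: str) -> dict:
    """Fetch a LinkedIn profile as JSON via ProxyCurl."""
    response = requests.get(
        "https://nubela.co/proxycurl/api/v2/linkedin",
        params={"url": linkedin_url},
        headers={"Authorization": f"Bearer {PROXYCURL_API_KEY}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```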
Here are some example questions you can ask:
- "What is this person's current job title?"
- "Where did they get their education?"
- "What skills do they have related to machine learning?"
- "How long have they been working at their current company?"
- "What was their career progression?"
You can switch between available models:
```bash
python main.py --mock --model "meta-llama/llama-3-3-70b-instruct"
```

Or in the web interface, select from the dropdown menu.
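
Under the hood, the --model flag selects which watsonx.ai model id gets handed to the LLM wrapper. A rough sketch with the llama-index-llms-ibm integration is below; the constructor arguments and the assumption that modules/llm_interface.py works this way are mine, so check the repo and the integration's docs:

```python
# Hypothetical model setup via the llama-index-llms-ibm integration
# (argument names are assumptions; verify against modules/llm_interface.py and the package docs).
from llama_index.core import Settings
from llama_index.llms.ibm import WatsonxLLM

def configure_llm(model_id: str = "meta-llama/llama-3-3-70b-instruct") -> None:
    """Point LlamaIndex at a watsonx.ai model selected by id."""
    Settings.llm = WatsonxLLM(
        model_id=model_id,
        url="https://us-south.ml.cloud.ibm.com",  # your watsonx.ai region endpoint
        project_id="your-project-id",
        apikey="your-ibm-cloud-api-key",
        temperature=0.2,
        max_new_tokens=512,
    )
```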
Edit the prompt templates in config.py to change how responses are generated:
```python
INITIAL_FACTS_TEMPLATE = """
You are an AI assistant that provides detailed answers based on the provided context.
...
"""
```
If you're learning to build this project from scratch, check out the 1-start branch:

```bash
git checkout 1-start
```

This branch contains starter files with TODOs and guidance for implementation.
Test individual components:
```bash
# Test data extraction
python -c "from modules.data_extraction import extract_linkedin_profile; print(extract_linkedin_profile('https://www.linkedin.com/in/username/', mock=True))"

# Test the entire pipeline
python main.py --mock --test
```

This project is licensed under the MIT License - see the LICENSE file for details.
- IBM watsonx.ai for providing the LLM and embedding models
- LlamaIndex for the data indexing and retrieval framework
- ProxyCurl for LinkedIn profile data extraction
- Eden Marco for the original tutorial inspiration