An AI-powered interview assistant that provides real-time transcription and intelligent responses during technical interviews, now supporting both OpenAI and the latest Gemini models.
- 🎙️ Real-time Transcription: High-accuracy voice-to-text for both the interviewer and candidate using Azure Cognitive Services.
- 🤖 AI-Powered Insights: Get intelligent suggestions with conversational context awareness, powered by leading models from OpenAI and Google.
- 🖼️ Picture-in-Picture (PiP) Mode: Keep an eye on the AI log in a separate, floating window so you can focus on the interview.
- 💻 Code Formatting: Clear syntax highlighting for technical discussions makes code easy to read and understand.
- ✨ Enhanced UI: A refreshed and more intuitive user interface for a seamless experience.
- 🚀 Latest AI Models: Support for the newest models, including Gemini 2.5 Pro and Gemini 2.5 Flash.
- 📜 Question History: Combine multiple questions from the history to ask the AI for a comprehensive analysis.
- ⏱️ Silence Detection: Automatically submits recognized speech after a configurable period of silence for a smoother workflow.
- ⚙️ Highly Configurable: Tailor AI models, API keys, response length, and system prompts to your exact needs.
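The silence-detection behavior described above amounts to a debounce: each recognized speech fragment resets a timer, and the buffered text is submitted once the timer fires without interruption. The sketch below is illustrative only, not the app's actual implementation; `submit` stands in for whatever handler sends text to the AI.

```javascript
// Illustrative sketch of silence-based auto-submit (not the app's real code).
// Each recognized fragment resets the silence timer; when the timer fires,
// the buffered fragments are joined and handed to `submit`.
function createSilenceSubmitter(submit, silenceMs = 2000) {
  let buffer = [];
  let timer = null;
  return {
    // Call this from the speech recognizer's "recognized" callback.
    onRecognized(text) {
      buffer.push(text);
      if (timer) clearTimeout(timer);
      timer = setTimeout(() => {
        submit(buffer.join(' '));
        buffer = [];
        timer = null;
      }, silenceMs);
    },
  };
}
```

Resetting the timer on every fragment means a long question spoken in pieces is submitted as one unit, rather than sentence by sentence.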
- Frontend: React, Redux, Material-UI
- AI Services: OpenAI GPT, Google Gemini, Azure Cognitive Services (Speech)
- Build Tools: npm
- Other Libraries: React Markdown, Highlight.js, Microsoft Cognitive Services Speech SDK
- Node.js (v18+)
- npm (v9+)
- OpenAI API key: Get your key from OpenAI.
- Gemini API key: Get your key from Google AI Studio.
- Azure Speech Service key: Get a free trial key from Microsoft Azure.
- Clone the repository:

  ```bash
  git clone https://github.com/hariiprasad/interviewcopilot.git
  cd interviewcopilot
  ```

- Install dependencies:

  ```bash
  npm install
  ```

- Run the development server:

  ```bash
  npm run dev
  ```

- Access the application: open your browser to `http://localhost:3000`.
- Open the Settings dialog (⚙️ icon in the header).
- Enter your API credentials:
- OpenAI API Key (for OpenAI models)
- Gemini API Key (for Gemini models)
- Azure Speech Service Key
- Azure Region
- Configure your preferences:
- AI Model (Choose from OpenAI or Gemini models)
- AI System Prompt
- Auto-Submit & Manual modes
- AI Response Length (concise, medium, lengthy)
- Silence Timer Duration
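Taken together, the settings above amount to a small configuration object. The shape below is purely illustrative; the field names are assumptions, not the app's actual state keys.

```javascript
// Hypothetical settings shape; all field names are illustrative only.
const settings = {
  openAiApiKey: '',                 // used when an OpenAI model is selected
  geminiApiKey: '',                 // used when a Gemini model is selected
  azureSpeechKey: '',
  azureRegion: 'eastus',            // example region
  aiModel: 'gemini-2.5-flash',
  systemPrompt: 'You are a helpful interview assistant.',
  autoSubmit: true,                 // false = manual mode
  responseLength: 'concise',        // 'concise' | 'medium' | 'lengthy'
  silenceTimerMs: 2000,
};
```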
- System Audio Panel (Left)
  - Start/Stop system audio capture for the interviewer.
  - View and edit the transcribed questions.
  - Manage and combine questions from history.
- AI Assistant Log (Center)
  - View real-time AI responses.
  - Benefit from code formatting and syntax highlighting.
  - Access all previous response history.
  - Toggle auto-scroll and open the PiP window.
- Your Mic Panel (Right)
  - Start/Stop your microphone for candidate audio.
  - Toggle manual input mode.
  - Manually submit your responses to the AI.
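Combining questions from the history panel boils down to concatenating the selected entries into a single prompt for the AI. A hypothetical helper (the function and its labeling scheme are illustrative, not the app's actual code) might look like:

```javascript
// Hypothetical helper: merge selected history entries into one prompt.
// `history` is an array of transcribed questions; `selectedIndices` are the
// entries the user picked, labeled Q1, Q2, ... in selection order.
function combineQuestions(history, selectedIndices) {
  return selectedIndices
    .map((historyIndex, n) => `Q${n + 1}: ${history[historyIndex]}`)
    .join('\n');
}
```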
Common Issues:
- Audio Permissions: Ensure your browser has microphone access. If permissions were denied, refresh the page and allow access when prompted.
- API Errors:
- Double-check that your API keys in settings are correct.
- Verify your internet connection.
- Ensure the correct API key is provided for the selected AI model (e.g., Gemini key for Gemini models).
- Transcription Issues: For best results, speak clearly with minimal background noise and verify your Azure Speech Service subscription is active.
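The "correct key for the selected model" check can be expressed as a small guard. The `gemini` model-name prefix and the field names below are assumptions for illustration, not the app's actual logic:

```javascript
// Illustrative guard: report which API key a model selection needs but lacks.
// The 'gemini' prefix convention and settings field names are assumptions.
function missingKeyFor(model, settings) {
  const needed = model.toLowerCase().startsWith('gemini')
    ? 'geminiApiKey'
    : 'openAiApiKey';
  return settings[needed] ? null : needed; // null means the key is present
}
```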
We welcome contributions! Please follow these steps:
- Fork the repository.
- Create your feature branch (`git checkout -b feature/AmazingFeature`).
- Commit your changes (`git commit -m 'Add some AmazingFeature'`).
- Push to the branch (`git push origin feature/AmazingFeature`).
- Open a Pull Request.
This project is licensed under the MIT License.
- OpenAI for their GPT models.
- Google for the Gemini models.
- Microsoft Azure for Cognitive Services.
- The Material-UI team and the broader React community for their fantastic tools.