This project demonstrates real-time interaction with OpenAI's GPT-4o model over WebSockets. It consists of a React frontend and a Node.js backend that connects to OpenAI's Realtime API.
Users send audio messages, and the AI responds in both text and audio. WebSockets carry the real-time communication between the frontend and the backend.
- Real-time Communication: Uses WebSockets to maintain a persistent connection between the client and server.
- Audio Processing: Records audio from the user's microphone, processes it, and sends it to the backend.
- AI Interaction: Connects to OpenAI's Realtime API to receive AI-generated responses in text and audio.
- Automatic Audio Playback: Plays audio responses automatically using a hidden audio player.
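The audio processing step can be illustrated with a minimal Node.js sketch. It assumes the recorded audio is available as Float32 samples in the range [-1, 1] and that the backend forwards it as base64-encoded 16-bit little-endian PCM. `floatTo16BitPcmBase64` is a hypothetical helper name and is not taken from this repository.

```javascript
// Hypothetical helper (not from this repository): converts Float32 samples
// in [-1, 1] to base64-encoded 16-bit little-endian PCM.
function floatTo16BitPcmBase64(samples) {
  const buffer = Buffer.alloc(samples.length * 2); // 2 bytes per sample
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range input cannot overflow the 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale to -32768, positive values to 32767.
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    buffer.writeInt16LE(Math.round(value), i * 2);
  }
  return buffer.toString("base64");
}

// Example: silence, full-scale positive, full-scale negative.
const encoded = floatTo16BitPcmBase64(new Float32Array([0, 1, -1]));
console.log(encoded);
```

In the browser, `Buffer` is not available, so the same conversion would need a `DataView` over an `ArrayBuffer` instead.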
- Node.js and npm installed on your machine.
- A valid OpenAI API key.
- Clone the repository:
    git clone https://github.com/yourusername/openai-realtime-api-demo.git
    cd openai-realtime-api-demo
- Install dependencies:
    npm install
- Set up environment variables: Create a `.env` file in the root directory and add your OpenAI API key:
    OPENAI_API_KEY=your_openai_api_key
- Start the backend server:
    node backend/server.js
- Start the frontend:
    npm start
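A missing or placeholder API key is an easy setup mistake to make, so a backend may want to fail fast at startup. The sketch below assumes the contents of `.env` have already been loaded into `process.env` (the source does not say which loader the project uses). `requireApiKey` is a hypothetical helper, not part of this repository.

```javascript
// Hypothetical startup check (not from this repository): returns the API key
// or throws a descriptive error if it is missing or still the placeholder.
function requireApiKey(env = process.env) {
  const key = (env.OPENAI_API_KEY || "").trim();
  if (key === "" || key === "your_openai_api_key") {
    throw new Error("OPENAI_API_KEY is not set. Add it to the .env file in the root directory.");
  }
  return key;
}

// Example with an explicit environment object instead of process.env.
try {
  requireApiKey({});
} catch (err) {
  console.error(err.message);
}
```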
- Open your browser and navigate to http://localhost:3000.
- Click the "Start Recording" button to begin recording your message.
- Click "Stop Recording" to send the message to the AI.
- The AI's response will be displayed in text and played back in audio.
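The last step, showing text and playing audio, depends on collecting the streamed pieces of a response. The sketch below shows one way that could work, assuming the server events carry a `type` and a `delta` field and use the event names `response.text.delta`, `response.audio_transcript.delta`, and `response.audio.delta`. These names are an assumption and should be checked against the current Realtime API documentation. `createResponseAccumulator` is a hypothetical helper, not code from this repository.

```javascript
// Hypothetical accumulator (not from this repository): collects streamed text
// and base64 audio deltas into one transcript and one audio buffer.
function createResponseAccumulator() {
  let text = "";
  const audioChunks = [];
  return {
    handle(event) {
      switch (event.type) {
        // Assumed event names; verify against the current API reference.
        case "response.text.delta":
        case "response.audio_transcript.delta":
          text += event.delta;
          break;
        case "response.audio.delta":
          audioChunks.push(Buffer.from(event.delta, "base64"));
          break;
        default:
          break; // Ignore events this sketch does not model.
      }
    },
    result() {
      return { text, audio: Buffer.concat(audioChunks) };
    },
  };
}

// Example: two transcript deltas and one four-byte audio delta.
const acc = createResponseAccumulator();
acc.handle({ type: "response.audio_transcript.delta", delta: "Hel" });
acc.handle({ type: "response.audio_transcript.delta", delta: "lo" });
acc.handle({ type: "response.audio.delta", delta: Buffer.from([1, 2, 3, 4]).toString("base64") });
console.log(acc.result().text, acc.result().audio.length);
```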
- src/App.js: The main React component handling UI and WebSocket communication.
- backend/server.js: Node.js server that connects to OpenAI's Realtime API and handles WebSocket communication with the client.
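To illustrate what the backend might send upstream for one recorded message, here is a minimal sketch that builds the JSON messages for a single turn. It assumes the Realtime API client events `input_audio_buffer.append`, `input_audio_buffer.commit`, and `response.create`, and that the audio is already base64-encoded. `buildTurnEvents` is a hypothetical helper, and the actual `backend/server.js` may structure this differently.

```javascript
// Hypothetical helper (not from this repository): builds the JSON messages
// for one user turn from a list of base64-encoded audio chunks.
function buildTurnEvents(base64AudioChunks) {
  // One append event per audio chunk, in recording order.
  const events = base64AudioChunks.map((audio) => ({
    type: "input_audio_buffer.append",
    audio,
  }));
  // Mark the end of the user's audio, then ask the model to respond.
  events.push({ type: "input_audio_buffer.commit" });
  events.push({ type: "response.create" });
  // WebSocket text frames carry serialized JSON.
  return events.map((event) => JSON.stringify(event));
}

// Example: two chunks produce four messages.
const messages = buildTurnEvents(["AAAA", "BBBB"]);
console.log(messages.length);
```

Each returned string would then be passed to the `send` method of the WebSocket connected to the API. If server-side turn detection is enabled, the explicit commit may be unnecessary.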
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License. See the LICENSE file for details.