This is a Node.js-based API server that leverages Google's Gemini 1.5 Flash model to generate content from text prompts, images, documents, and audio files. It uses Express for routing and Multer for handling file uploads.
- Text Generation — Generate text from a prompt.
- Image Analysis — Send an image and receive a description or analysis.
- Document Processing — Upload documents (e.g., PDF, DOCX) for analysis.
- Audio Interpretation — Send audio files for transcription or insights.
Generate content from a raw text prompt.
Body (JSON):
{
  "prompt": "Describe the impact of AI on modern society."
}

Upload an image and receive a generated response based on the prompt.
Form-Data:
- prompt: (string, optional)
- image: (file) PNG, JPEG, etc.
Upload a document for Gemini to analyze.
Form-Data:
- prompt: (string, optional)
- document: (file) PDF, DOCX, etc.
Upload an audio file for transcription or analysis.
Form-Data:
- prompt: (string, optional)
- audio: (file) MP3, WAV, etc.
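
The request shapes described above can be sketched from the client side. This is a minimal sketch under stated assumptions, not the project's own client: the route paths are not given in this README, so the caller passes them in, and the `POST` method, the base URL, and a JSON response are all assumptions. It relies on the global `fetch`, `FormData`, and `Blob` available in Node.js 18 and later.

```javascript
// Sketch of client-side request builders for the endpoints described above.
// ASSUMPTIONS (not stated in this README): the HTTP method (POST), the base
// URL, the route paths supplied by the caller, and a JSON response body.

// Build a JSON request for the text endpoint.
function buildTextRequest(baseUrl, route, prompt) {
  return {
    url: new URL(route, baseUrl).toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ prompt }),
    },
  };
}

// Build a multipart request for the image, document, or audio endpoints.
// `field` must match the form field name: "image", "document", or "audio".
function buildUploadRequest(baseUrl, route, field, bytes, filename, prompt) {
  const form = new FormData();
  if (prompt !== undefined) form.append('prompt', prompt); // prompt is optional
  form.append(field, new Blob([bytes]), filename);
  // No Content-Type header here: fetch adds the multipart boundary itself.
  return {
    url: new URL(route, baseUrl).toString(),
    options: { method: 'POST', body: form },
  };
}

// Send a prepared request and parse the reply as JSON (an assumption).
async function send({ url, options }) {
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  return res.json();
}
```

A call might look like `send(buildTextRequest('http://localhost:3000', '/your-text-route', 'Describe the impact of AI on modern society.'))`, where both the port and the route are placeholders to replace with the values defined in `index.js`.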
- Clone the repository:
    git clone https://github.com/alderrr/gemini-ai-api-project.git
    cd gemini-ai-api-project/1.5-gemini-flash
- Install dependencies:
    npm install
- Create a .env file and add your Google API key:
    GOOGLE_API_KEY=your_google_api_key
- Start the server:
    node index.js

- Uploaded files are automatically deleted after processing.
- Make sure your Google API key has access to Gemini models.
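
The two notes above can be illustrated with a short sketch. It is not taken from this project's `index.js`: `requireApiKey` and `processThenDelete` are hypothetical helper names, and how the `.env` file is loaded into the environment (for example with the `dotenv` package) is an assumption. The first helper only checks that a key is present; whether the key has access to Gemini models can only be confirmed by an actual API call.

```javascript
const fs = require('node:fs/promises');

// Fail fast when the key from .env is missing or blank.
// Loading .env into process.env is assumed to happen before this runs.
function requireApiKey(env = process.env) {
  const key = env.GOOGLE_API_KEY;
  if (!key || key.trim() === '') {
    throw new Error('GOOGLE_API_KEY is not set; add it to your .env file.');
  }
  return key;
}

// Run `handler` on an uploaded file, then delete the file whether the
// handler succeeded or threw. `handler` is a hypothetical stand-in for the
// code that sends the file to Gemini.
async function processThenDelete(filePath, handler) {
  try {
    return await handler(filePath);
  } finally {
    // force: true ignores a file that is already gone.
    await fs.rm(filePath, { force: true });
  }
}
```

Putting the deletion in `finally` means a failed request does not leave the upload on disk, which is one way to get the cleanup behavior the note describes.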
This project is licensed under the MIT License.