# QuantaGem

QuantaGem is a high-performance, production-grade WebUI for Google's Gemini AI, built on a modern full-stack architecture. Unlike simpler interfaces, QuantaGem combines Vertex AI, persistent project and file storage, and a multi-service architecture to provide a robust environment for AI-driven workflows.

## 🏗 Architecture & Tech Stack

- **Frontend/Backend:** [Next.js 15+](https://nextjs.org/) (App Router, Server Components, Route Handlers).
- **Language:** [TypeScript](https://www.typescriptlang.org/) with strict type safety.
- **Styling:** [Tailwind CSS 4.0](https://tailwindcss.com/) with Lightning CSS.
- **Database:** [PostgreSQL 18](https://www.postgresql.org/) for session, message, and project persistence.
- **Object Storage:** [MinIO](https://min.io/) (S3-compatible) for chat attachments and project files.
- **Cache/Rate Limiting:** [Redis 8](https://redis.io/) for rate limiting on authentication endpoints.
- **AI Integration:** [Google Vertex AI SDK](https://cloud.google.com/vertex-ai) (Gemini 2.0, 2.5, 3, and newer).
- **Speech-to-Text:** Local Python microservice using [Faster Whisper](https://github.com/SYSTRAN/faster-whisper).
- **Deployment:** [Docker Compose](https://www.docker.com/) with distroless, non-root production images.

## 🚀 Core Features

- **Advanced Chat Interface:** Streaming responses with Markdown, LaTeX math, and syntax highlighting.
- **Project Management:** Organize chats into projects with dedicated system prompts and persistent file attachments.
- **Vertex AI Integration:** Optimized for enterprise-grade Gemini models, including "thinking" models with adjustable thinking budgets.
- **Multimodal Support:** Upload PDFs, images, and entire source code folders (via the Directory Picker API) for context-aware prompting.
- **Search & Grounding:** Toggle Google Search grounding for real-time information retrieval.
- **Voice Intelligence:** Built-in Speech-to-Text (local Whisper) and Text-to-Speech (Gemini TTS).
- **Secure Auth:** JWT-based authentication with bcrypt password hashing and Redis-backed rate limiting.

## 🛠 Installation & Setup

### Prerequisites

- [Docker & Docker Compose](https://docs.docker.com/get-docker/)
- A Google Cloud project with the **Vertex AI API** enabled.
- A Google Cloud service account key (JSON format).

### 1. Clone the Repository

```bash
git clone https://github.com/W4D-cmd/QuantaGem.git
cd QuantaGem
```

### 2. Environment Configuration

Create a `.env.local` file in the root directory. You can use the provided `.env` as a template:

```env
GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
GOOGLE_CLOUD_LOCATION="global"
GOOGLE_GENAI_USE_VERTEXAI="True"

JWT_SECRET="generate-a-32-char-random-string"

POSTGRES_USER=quantagemuser
POSTGRES_PASSWORD=quantagempass
POSTGRES_DB=quantagemdb

MINIO_ROOT_USER=minioadmin
MINIO_ROOT_PASSWORD=minioadminsecret
MINIO_DEFAULT_BUCKET=chat-files
```

`JWT_SECRET` secures user sessions and must be a cryptographically strong random string of at least 32 characters (256 bits). You can generate a suitable value with `node -e "console.log(require('crypto').randomBytes(32).toString('base64'))"`.

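If you prefer a one-step setup, the following sketch generates a 256-bit secret and appends it to `.env.local` (this assumes `openssl` is installed; remove any placeholder `JWT_SECRET` line from the template first so the file does not contain two entries):

```bash
# Generate a random 256-bit secret and append it to .env.local
# (assumes openssl is available on your PATH)
echo "JWT_SECRET=\"$(openssl rand -base64 32)\"" >> .env.local
```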
### 3. GCP Authentication

Create a directory named `secrets` in the project root and place your Google Cloud service account JSON key inside it, renamed to `gcp-key.json`:

```bash
mkdir secrets
# Copy your service account key into place
cp /path/to/your/service-account-key.json secrets/gcp-key.json
```

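A quick sanity check that the key landed in the right place: this hypothetical helper (not part of the project) only verifies that the file parses as JSON and that its `type` field identifies a service-account key.

```bash
# Confirm secrets/gcp-key.json is valid JSON describing a service account
python3 -c 'import json; d = json.load(open("secrets/gcp-key.json")); assert d.get("type") == "service_account", "not a service-account key"' \
  && echo "gcp-key.json looks valid"
```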
### 4. Deploy with Docker

Start the entire stack in production mode:

```bash
docker compose up -d --build
```

The application will be available at `http://localhost:3000`.

## 🎤 Speech-to-Text (STT) Customization

The `stt-service` runs `faster-whisper-large-v3` on CPU by default. To reduce resource usage on weaker hardware, point `MODEL_SIZE` at a smaller model in `stt-service/main.py`:

```python
MODEL_SIZE = "Systran/faster-whisper-medium"  # e.g. "Systran/faster-whisper-small" for faster transcription
COMPUTE_TYPE = "int8"
CPU_THREADS = 4  # Adjust based on your CPU
```

After modifying, rebuild the service:

```bash
docker compose up -d --build stt-service
```

## 🔒 Security Posture

- **Distroless Images:** The production container uses `gcr.io/distroless/nodejs24`, containing only the application and its runtime dependencies.
- **Non-Root Execution:** The application runs as user `65532:65532`.
- **Read-Only Root FS:** The container filesystem is read-only, with `tmpfs` mounts only for required cache directories.
- **Capability Drop:** All Linux capabilities are dropped in the Compose file.
- **Security Headers:** Strict CORS and CSP headers are set on all responses.

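As an illustration, the hardening measures above typically map to Compose settings along these lines (a sketch only; the service name and tmpfs path are assumptions, so see the project's `docker-compose.yml` for the authoritative values):

```yaml
services:
  app:                  # assumed service name
    read_only: true     # read-only root filesystem
    user: "65532:65532" # distroless non-root user
    cap_drop:
      - ALL             # drop all Linux capabilities
    tmpfs:
      - /tmp            # writable scratch space only where required
```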
## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.