Upload → AI Analysis → Structured Output, all in seconds
A polished React + TypeScript frontend that accepts form images and uses Google Gemini (GenAI) to provide intelligent field guidance or structured validation of filled forms, built with production-minded reliability and developer hygiene.
| Drag & drop upload | Structured output |
Two modes: human-readable guidance or machine-usable JSON with confidence scores

---

Tanmay Kunjir • Anshika Mishra
This repository demonstrates full-stack integration of computer-vision input and an LLM (Gemini) for practical form assistance: usable by product teams and demonstrative for technical recruiters.
| | @10anshika |
Features • Tech Stack • Quick Start • Architecture • Gemini Integration • Testing & CI • Security • Contributing
- Why this project matters
- Features
- Tech stack
- Quick start
- Architecture overview
- Gemini integration notes
- Tests & CI
- Security & privacy
- Contributors & contact
- Connects image input to a generative model to produce structured, validated outputs suitable for downstream automation (data entry, audit, corrections).
- Shows production concerns: secret management, runtime schema validation, retries/backoff, file validation/compression, CI and tests.
- Clear technical ownership and design choices that recruiters look for: data flow, failure modes, and developer experience.
| Capability | Description |
|---|---|
| Image Upload | Drag & drop, preview, MIME + size validation |
| Compression | Client-side resizing to reduce latency & cost |
| Field Guidance | Human-readable instructions for correcting entries |
| Validation Mode | Machine-usable JSON with confidence scores |
| Model Safety | Safe JSON parsing + runtime schema validation |
| Dev Hygiene | Env separation, CI-ready, testable architecture |
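
The upload checks and the resize step listed in the table can be sketched as two pure functions. This is a minimal illustration, not the repository's actual code: the names `validateImageFile` and `fitWithin`, the 5 MiB limit, and the 1600 px edge are all assumptions chosen for the example. Only the JPG/PNG restriction comes from this README.

```typescript
// Hypothetical limits: this README does not state the real values.
const ALLOWED_MIME_TYPES: readonly string[] = ["image/jpeg", "image/png"];
const MAX_BYTES = 5 * 1024 * 1024; // assumed 5 MiB ceiling
const MAX_SIDE_PX = 1600; // assumed longest edge after resizing

// Structural subset of the browser File object, so the check runs without a DOM.
interface FileLike {
  type: string;
  size: number;
}

type ValidationResult = { ok: true } | { ok: false; reason: string };

// Reject unsupported MIME types, empty files, and oversized files.
function validateImageFile(file: FileLike): ValidationResult {
  if (ALLOWED_MIME_TYPES.indexOf(file.type) === -1) {
    return { ok: false, reason: `Unsupported type: ${file.type || "unknown"}` };
  }
  if (file.size <= 0) {
    return { ok: false, reason: "File is empty" };
  }
  if (file.size > MAX_BYTES) {
    return { ok: false, reason: "File exceeds size limit" };
  }
  return { ok: true };
}

// Compute target dimensions that fit within maxSide while keeping the aspect ratio.
// Images that are already small enough are returned unchanged (never upscaled).
function fitWithin(
  width: number,
  height: number,
  maxSide: number = MAX_SIDE_PX
): { width: number; height: number } {
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

In a browser, the dimensions returned by `fitWithin` would typically be passed to a canvas draw call before re-encoding the image. That step is omitted here because it depends on DOM APIs.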
| Layer | Technology | Why |
|---|---|---|
| Frontend | React + TypeScript + Vite | Type safety, fast HMR, optimized builds |
| AI / LLM | Google Gemini (@google/genai) | State-of-the-art vision + language |
| Validation | zod / AJV | Runtime schema validation |
| Testing | Jest + React Testing Library | Unit & integration tests |
| CI | GitHub Actions | Automated builds & tests |
- Clone and enter repo

```bash
git clone https://github.com/10anshika/Samadhan-AI.git
cd Samadhan-AI
```

- Install

```bash
npm ci
```

- Create `.env.local` from `.env.example` (do not commit)

```
GEMINI_API_KEY=sk-xxxxxx
VITE_PUBLIC_BASE_URL=http://localhost:5173
GEMINI_MODEL=gemini-3-pro-preview
```

- Run dev server

```bash
npm run dev
```

- Build

```bash
npm run build
```

```
┌──────────────┐    Image (JPG/PNG)     ┌──────────────────────┐
│   Browser    │ ────────────────────▶  │  Image Validation    │
│   (React)    │                        │  + Compression       │
└──────┬───────┘                        └──────────┬───────────┘
       │                                           │
       │                                    Prompt + Image
       │                                           │
       ▼                                           ▼
┌──────────────────┐                    ┌──────────────────────┐
│ UI State Layer   │ ◀── Structured ──  │   Gemini Adapter     │
│ (Guidance / Val) │        JSON        │ (safe parse + schema)│
└──────────────────┘                    └──────────────────────┘
```
Design intent:
- Explicit boundaries between UI, preprocessing, and model adapter.
- No raw model text reaches the UI without schema validation.
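
The second design point can be illustrated with a small gate that turns raw model text into one of exactly two UI states. This is a dependency-free sketch under stated assumptions: `toUiState`, `UiState`, and the hand-written type guards are hypothetical names, and the guards stand in for whatever runtime schema library the project actually uses. The assumed output shape is a `mode` plus per-field values with a confidence in [0, 1].

```typescript
// Assumed output shape: a mode plus per-field values with a confidence in [0, 1].
interface FieldResult {
  value: string;
  confidence: number;
  suggestion?: string;
}
interface ModelOutput {
  mode: "guidance" | "validation";
  fields: Record<string, FieldResult>;
}

// The only two states the UI layer is allowed to receive.
type UiState =
  | { status: "ready"; data: ModelOutput }
  | { status: "error"; message: string };

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function isFieldResult(v: unknown): v is FieldResult {
  if (!isRecord(v)) return false;
  const confidence = v["confidence"];
  const suggestion = v["suggestion"];
  return (
    typeof v["value"] === "string" &&
    typeof confidence === "number" &&
    confidence >= 0 &&
    confidence <= 1 &&
    (suggestion === undefined || typeof suggestion === "string")
  );
}

function isModelOutput(v: unknown): v is ModelOutput {
  if (!isRecord(v)) return false;
  const mode = v["mode"];
  const fields = v["fields"];
  if (mode !== "guidance" && mode !== "validation") return false;
  if (!isRecord(fields)) return false;
  return Object.keys(fields).every((key) => isFieldResult(fields[key]));
}

// Remove a Markdown code fence that a model may wrap around its JSON.
const FENCE = "`".repeat(3);
function stripFences(text: string): string {
  return text
    .trim()
    .replace(new RegExp("^" + FENCE + "(?:json)?\\s*", "i"), "")
    .replace(new RegExp("\\s*" + FENCE + "$"), "");
}

// Parse and validate; anything that fails becomes an error state, never raw text.
function toUiState(rawModelText: string): UiState {
  let parsed: unknown;
  try {
    parsed = JSON.parse(stripFences(rawModelText));
  } catch {
    return { status: "error", message: "Model returned invalid JSON" };
  }
  return isModelOutput(parsed)
    ? { status: "ready", data: parsed }
    : { status: "error", message: "Model output failed schema validation" };
}
```

Because `UiState` is a discriminated union, a component that switches on `status` cannot read `data` without first handling the error branch.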
- Use canonical env var `GEMINI_API_KEY`. Replace any `API_KEY` references.
- Avoid direct `JSON.parse` of model text. Use a cleaning step, safe parse, and a zod schema to validate the final object.
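
For the first note, one way to enforce the canonical variable name is to read configuration through a single function that fails fast when the key is missing. The sketch below is an assumption-heavy illustration: `loadGeminiConfig` and `GeminiConfig` are hypothetical names, the fallback model simply mirrors the quick-start example, and how the `env` record gets populated (build tool, server process, or otherwise) is left to the caller.

```typescript
interface GeminiConfig {
  apiKey: string;
  model: string;
}

// Hypothetical fallback that mirrors the GEMINI_MODEL value in the quick-start example.
const DEFAULT_MODEL = "gemini-3-pro-preview";

// Read Gemini settings from an env-like record; the caller decides where it comes from.
function loadGeminiConfig(env: Record<string, string | undefined>): GeminiConfig {
  const apiKey = (env["GEMINI_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    // Fail fast with an actionable message instead of sending an unauthenticated request.
    throw new Error(
      "GEMINI_API_KEY is not set; copy .env.example to .env.local and fill it in"
    );
  }
  const model = (env["GEMINI_MODEL"] ?? "").trim() || DEFAULT_MODEL;
  return { apiKey, model };
}
```

Keeping the lookup in one place means a leftover `API_KEY` reference elsewhere in the code base has nothing to read from and surfaces quickly.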
Recommended model output schema (conceptual)

```ts
import { z } from 'zod'

const ModelOutputSchema = z.object({
  mode: z.union([z.literal('guidance'), z.literal('validation')]),
  // Keyed by field name; confidence is constrained to [0, 1]
  fields: z.record(z.string(), z.object({
    value: z.string(),
    confidence: z.number().min(0).max(1),
    suggestion: z.string().optional(),
  })),
})
```

Safe parse pattern

```ts
function safeJsonParse(text: string): unknown {
  // Strip Markdown code fences the model may wrap around the JSON
  const cleaned = text.trim().replace(/^```(?:json)?\n?|\n?```$/g, '')
  try {
    return JSON.parse(cleaned)
  } catch {
    throw new Error('Invalid JSON from model')
  }
}
```

- Unit tests for `services/geminiService.ts` that mock `@google/genai` and validate retries/parse behavior.
- Snapshot and interaction tests for `ImageUploader` and result components.
- Integration test (mocked) covering both guidance and validation flows.
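
Retry behavior is easier to unit test when the wait is injectable. The following is a minimal sketch of exponential backoff, not the contents of `services/geminiService.ts`: `withRetry`, `RetryOptions`, and the `sleep` parameter are hypothetical names introduced for illustration.

```typescript
interface RetryOptions {
  retries: number; // retries allowed after the first attempt
  baseDelayMs: number; // delay before the first retry; doubles on each later retry
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waiting
}

const realSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Call an async function, retrying on any rejection with exponential backoff.
// Rethrows the last error once all attempts are used up.
async function withRetry<T>(call: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? realSleep;
  let lastError: unknown;
  for (let attempt = 0; attempt <= opts.retries; attempt++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      if (attempt < opts.retries) {
        await sleep(opts.baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

A test can pass a `sleep` that only records the requested delays, then assert both the number of calls made and the delay sequence without waiting in real time. A production version would likely retry only on transient failures (timeouts, rate limits) rather than on every error.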
```yaml
name: CI
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build
      - run: npm test --if-present
```

- Images may contain sensitive data. Add an explicit privacy notice to both the UI and this README: "Images are sent to a third-party API for processing." Provide a local-only mode or an optional opt-out.
- Add `.env.local` to `.gitignore`.
| Data Notice | Local Option | Env Security |
|---|---|---|
| Images contain sensitive data. They are sent to a third-party API for processing. | Provide local-only mode or opt-out | `.env.local` in `.gitignore`, always! |
Contributing

We ❤️ contributions! Here's how to get started:
| Open an issue with clear steps to reproduce | Describe the problem you're solving, not just the solution |
| Better explanations, examples, typos | Check out |
- Add an animated hero GIF showing the upload → result flow in `assets/` and reference it in this README.
- Host a small demo (GitHub Pages / Vercel) and link it in the top section.
- Add screenshots for both guidance and validation result states.
- Add a `CONTRIBUTING.md` and `CODE_OF_CONDUCT.md` to improve project maturity.
Built with ❤️ by Tanmay Kunjir & Anshika Mishra

Star us on GitHub: it helps others discover the project!

MIT. See LICENSE.