78 changes: 78 additions & 0 deletions SEMANTIC_SEARCH_PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
# Semantic Search Implementation Plan

## Objective
Implement semantic search functionality that allows users to search products using natural language queries (e.g., "comfortable workspace gear" finds ergonomic mouse/keyboard) instead of just keyword matching.

## Technical Approach
- **Vector Embeddings**: Use OpenAI's `text-embedding-3-small` model to generate embeddings for product descriptions
- **Storage**: Store embeddings as JSON in SQLite alongside existing product data
- **Similarity**: Calculate cosine similarity in JavaScript/TypeScript (in-code approach for MVP)
- **API**: Add `/api/search?q=query` endpoint for semantic search
- **Frontend**: Integrate search input into existing Products component
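
As a rough illustration of the in-code similarity step, a cosine similarity helper might look like the sketch below. The function name `cosineSimilarity` and the choice to return 0 for zero-length vectors are assumptions made for this example, not part of the plan.

```typescript
// Minimal sketch (hypothetical helper): cosine similarity of two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Vector length mismatch: ${a.length} vs ${b.length}`);
  }

  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }

  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumption: treat a zero vector as "no similarity" instead of dividing by zero.
  return denominator === 0 ? 0 : dot / denominator;
}
```

Vectors pointing in the same direction score 1, orthogonal vectors score 0, and opposite vectors score -1, so sorting by descending score puts the closest matches first.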

## Implementation Phases

### Phase 1: Foundation & Basic Embedding Generation
**Goal**: Generate and store embeddings for existing products

- [ ] Add OpenAI client to project dependencies
- [ ] Create embedding generation script for existing products
- [ ] Add `embedding` column to products table schema
- [ ] Test embedding generation with current 3 products
- [ ] Verify embeddings are stored correctly in database

**Deliverable**: Products have embeddings stored in database

### Phase 2: Simple Search API
**Goal**: Basic semantic search endpoint working

- [ ] Create `/api/search?q=query` endpoint in routes/api.ts
- [ ] Implement query embedding generation
- [ ] Implement cosine similarity calculation function
- [ ] Return products ranked by similarity score
- [ ] Test with simple queries like "keyboard" or "comfortable mouse"

**Deliverable**: Working API that returns relevant products
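
The ranking step in this phase could be sketched roughly as follows, assuming embeddings are stored as JSON strings (as described under Technical Approach) and the query embedding has already been generated. The names `ProductRow`, `ScoredProduct`, `cosine`, and `rankProducts`, and the default limit of 10, are hypothetical.

```typescript
// Hypothetical row shape: the embedding column holds a JSON-encoded number array or NULL.
interface ProductRow {
  id: number;
  name: string;
  embedding: string | null;
}

interface ScoredProduct {
  id: number;
  name: string;
  score: number;
}

// Compact cosine similarity so this sketch is self-contained.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Score every product that has an embedding and return the best matches first.
function rankProducts(
  queryEmbedding: number[],
  rows: ProductRow[],
  limit = 10,
): ScoredProduct[] {
  return rows
    .filter((row): row is ProductRow & { embedding: string } => row.embedding !== null)
    .map((row) => ({
      id: row.id,
      name: row.name,
      score: cosine(queryEmbedding, JSON.parse(row.embedding) as number[]),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Products without an embedding are skipped here rather than scored as zero. That is a design choice for the sketch, and the endpoint could instead fall back to keyword matching for those rows.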

### Phase 3: Frontend Integration
**Goal**: Users can search from the UI

- [ ] Add search input component to Products.js
- [ ] Connect search input to `/api/search` endpoint
- [ ] Display search results in existing product grid layout
- [ ] Add loading states for search requests
- [ ] Add error handling for failed searches

**Deliverable**: Complete search experience in browser
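
One framework-agnostic way to model the loading and error states listed above is a small state reducer, sketched below. Everything here (`SearchState`, `SearchAction`, `searchReducer`, `statusMessage`, and the message wording) is a hypothetical illustration. How Products.js actually manages state is not specified in this plan.

```typescript
interface SearchHit {
  id: number;
  name: string;
  score: number;
}

type SearchState =
  | { status: "idle" }
  | { status: "loading"; query: string }
  | { status: "success"; query: string; results: SearchHit[] }
  | { status: "error"; query: string; message: string };

type SearchAction =
  | { type: "submit"; query: string }
  | { type: "resolve"; query: string; results: SearchHit[] }
  | { type: "reject"; query: string; message: string }
  | { type: "clear" };

function searchReducer(state: SearchState, action: SearchAction): SearchState {
  switch (action.type) {
    case "submit":
      return { status: "loading", query: action.query };
    case "resolve":
      // Ignore responses for a query the user has already replaced or cleared.
      if (state.status !== "loading" || state.query !== action.query) return state;
      return { status: "success", query: action.query, results: action.results };
    case "reject":
      if (state.status !== "loading" || state.query !== action.query) return state;
      return { status: "error", query: action.query, message: action.message };
    case "clear":
      return { status: "idle" };
  }
}

// Text the UI could show next to the product grid for each state.
function statusMessage(state: SearchState): string {
  switch (state.status) {
    case "idle":
      return "";
    case "loading":
      return `Searching for "${state.query}"...`;
    case "error":
      return `Search failed: ${state.message}`;
    case "success":
      return state.results.length === 0
        ? `No products matched "${state.query}".`
        : `${state.results.length} result(s) for "${state.query}"`;
  }
}
```

Keeping the query inside each state lets the reducer discard stale responses, which matters once users type faster than the API responds.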

### Phase 4: Search Quality & UX
**Goal**: Improve search relevance and user experience

- [ ] Show relevance scores for debugging/transparency
- [ ] Add search result count and messaging
- [ ] Handle empty search results gracefully
- [ ] Test with natural language queries (e.g., "gear for late night coding")
- [ ] Add "clear search" functionality

**Deliverable**: Polished search experience

### Phase 5: Auto-Embedding for New Products
**Goal**: New products automatically get embeddings

- [ ] Hook embedding generation into product creation API
- [ ] Update existing seed script to include embeddings
- [ ] Test that products added via the API receive embeddings
- [ ] Ensure embedding generation doesn't block product creation

**Deliverable**: Self-maintaining embedding system
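
A minimal sketch of non-blocking embedding generation is shown below, assuming the product insert is synchronous and the embedding call returns a promise. The `Deps` interface and `createProduct` function are hypothetical. In this codebase `embed` would presumably wrap `generateEmbedding`, but that wiring is an assumption.

```typescript
interface NewProduct {
  name: string;
  description: string;
  price: number;
}

interface SavedProduct extends NewProduct {
  id: number;
}

// Hypothetical dependencies, injected so the sketch does not touch a real DB or API.
interface Deps {
  insertProduct: (product: NewProduct) => SavedProduct;
  embed: (text: string) => Promise<number[]>;
  saveEmbedding: (id: number, embeddingJson: string) => void;
}

function createProduct(
  input: NewProduct,
  deps: Deps,
): { product: SavedProduct; embeddingDone: Promise<void> } {
  // The product row is written first, so creation succeeds even if embedding fails.
  const product = deps.insertProduct(input);

  // Start the embedding request without awaiting it; failures are logged, not thrown.
  const embeddingDone = deps
    .embed(`${product.name} - ${product.description}`)
    .then((embedding) => deps.saveEmbedding(product.id, JSON.stringify(embedding)))
    .catch((error) => {
      console.error(`Embedding failed for product ${product.id}:`, error);
    });

  return { product, embeddingDone };
}
```

A route handler could respond with `product` immediately and ignore `embeddingDone`, while tests or scripts can await it. Products whose embedding failed keep a NULL embedding, so a later run of the backfill script would pick them up.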

## Current Product Data
- **Coding Mug** - $14.99 - "A perfect mug for your late night coding sessions"
- **Mechanical Keyboard** - $129.99 - "Tactile keys for the ultimate typing experience"
- **Ergonomic Mouse** - $49.99 - "Comfortable mouse for all-day use"

## Notes
- Each phase builds incrementally and can be tested end-to-end
- In-code similarity calculation is suitable for MVP scale (<1000 products)
- Can optimize with SQLite vector extensions later if product catalog grows significantly
- OpenAI API access confirmed and working
744 changes: 39 additions & 705 deletions bun.lock

Large diffs are not rendered by default.

17 changes: 11 additions & 6 deletions package.json
@@ -5,18 +5,23 @@
   "private": true,
   "main": "src/index.js",
   "scripts": {
-    "start": "bun run src/index.js",
-    "dev": "bun run --watch src/index.js",
-    "db:setup": "bun run src/db/setup.js",
-    "db:seed": "bun run src/db/seed.js",
+    "start": "bun run src/index.ts",
+    "dev": "bun run --watch src/index.ts",
+    "build": "tsc",
+    "typecheck": "tsc --noEmit",
+    "db:setup": "bun run src/db/setup.ts",
+    "db:seed": "bun run src/db/seed.ts",
     "test": "bun test"
   },
   "license": "MIT",
   "dependencies": {
     "dotenv": "^16.5.0",
-    "hono": "^4.7.8"
+    "hono": "^4.7.8",
+    "openai": "^4.103.0"
   },
   "devDependencies": {
-    "jest": "^29.7.0"
+    "@types/node": "^22.15.21",
+    "bun-types": "^1.2.13",
+    "typescript": "^5.8.3"
   }
 }
26 changes: 26 additions & 0 deletions src/config.ts
@@ -0,0 +1,26 @@
import { config } from "dotenv";

// Load environment variables
config();

// Define configuration interface
interface AppConfig {
  port: number;
  dbPath: string;
}

// Validate critical environment variables
if (process.env.PORT && isNaN(Number(process.env.PORT))) {
  throw new Error("PORT environment variable must be a number");
}

// Process and export configuration settings
const port = process.env.PORT ? parseInt(process.env.PORT, 10) : 3000;
const dbPath = process.env.DB_PATH || "./data/database.sqlite";

const appConfig: AppConfig = {
  port,
  dbPath,
};

export default appConfig;
22 changes: 22 additions & 0 deletions src/db/db.ts
@@ -0,0 +1,22 @@
import { Database } from "bun:sqlite";
import fs from "fs";
import path from "path";
import config from "../config.ts";

// Resolve to an absolute path to avoid path resolution issues
const dbPath: string = path.resolve(config.dbPath);
console.log("Using database at:", dbPath);

// Ensure the database directory exists
const dbDir: string = path.dirname(dbPath);
if (!fs.existsSync(dbDir)) {
  fs.mkdirSync(dbDir, { recursive: true });
}

// Initialize database connection
const db: Database = new Database(dbPath, { create: true });

// Enable WAL mode for better performance
db.exec("PRAGMA journal_mode=WAL");

export default db;
70 changes: 70 additions & 0 deletions src/db/embeddings.ts
@@ -0,0 +1,70 @@
import OpenAI from "openai";
import db from "./db.ts";
import { Product } from "../types/product.ts";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function generateEmbedding(text: string): Promise<number[]> {
  try {
    const response = await openai.embeddings.create({
      model: "text-embedding-3-small",
      input: text,
    });

    return response.data[0].embedding;
  } catch (error) {
    console.error("Error generating embedding:", error);
    throw error;
  }
}

export async function generateProductEmbedding(product: Product): Promise<number[]> {
  const textToEmbed = `${product.name} - ${product.description}`;
  return generateEmbedding(textToEmbed);
}

export async function updateProductEmbeddings(): Promise<void> {
  console.log("Generating embeddings for existing products...");

  // Get all products without embeddings
  const products = db.query("SELECT * FROM products WHERE embedding IS NULL").all() as Product[];

  console.log(`Found ${products.length} products without embeddings`);

  // Track failures so the final summary does not imply every product succeeded
  let failedCount = 0;

  for (const product of products) {
    try {
      console.log(`Generating embedding for: ${product.name}`);

      const embedding = await generateProductEmbedding(product);
      const embeddingJson = JSON.stringify(embedding);

      // Update the product with the embedding
      db.run(
        "UPDATE products SET embedding = ? WHERE id = ?",
        embeddingJson,
        product.id
      );

      console.log(`✓ Generated embedding for: ${product.name}`);
    } catch (error) {
      failedCount++;
      console.error(`Failed to generate embedding for ${product.name}:`, error);
    }
  }

  console.log(
    `Embedding generation complete: ${products.length - failedCount} succeeded, ${failedCount} failed`
  );
}

// Run if executed directly
if (import.meta.main) {
  console.log("Running embedding generation as main module");
  try {
    await updateProductEmbeddings();
    console.log("Embedding generation completed successfully");
    process.exit(0);
  } catch (err) {
    console.error("Error generating embeddings:", err);
    process.exit(1);
  }
}
141 changes: 141 additions & 0 deletions src/db/seed.ts
@@ -0,0 +1,141 @@
import db from "./db.ts";
import setupDatabase from "./setup.ts";
import { Product } from "../types/product.ts";

// Seed database with initial data
function seedDatabase(): void {
  console.log("Seeding database...");

  // Make sure tables exist
  setupDatabase();

  // Clear existing data
  db.run("DELETE FROM products");

  // Insert sample products from diverse categories
  const products: Product[] = [
    // Tech/Office products
    {
      name: "Coding Mug",
      description: "A perfect mug for your late night coding sessions",
      price: 14.99,
    },
    {
      name: "Mechanical Keyboard",
      description: "Tactile keys for the ultimate typing experience",
      price: 129.99,
    },
    {
      name: "Ergonomic Mouse",
      description: "Comfortable mouse for all-day use",
      price: 49.99,
    },
    {
      name: "Wireless Headphones",
      description: "Premium noise-canceling headphones for music and calls",
      price: 199.99,
    },
    {
      name: "Bluetooth Speaker",
      description: "Portable speaker with rich bass and crystal clear sound",
      price: 89.99,
    },
    // Books
    {
      name: "JavaScript: The Good Parts",
      description: "Essential guide to JavaScript programming best practices",
      price: 29.99,
    },
    {
      name: "The Hobbit",
      description: "Classic fantasy adventure novel by J.R.R. Tolkien",
      price: 12.99,
    },
    {
      name: "Cooking for Beginners",
      description: "Easy recipes and techniques for new home cooks",
      price: 24.99,
    },
    // Toys
    {
      name: "LEGO Space Station",
      description: "Build your own space station with 500+ pieces",
      price: 79.99,
    },
    {
      name: "Remote Control Drone",
      description: "Easy-to-fly drone with HD camera for kids and adults",
      price: 149.99,
    },
    {
      name: "Puzzle 1000 Pieces",
      description: "Beautiful landscape jigsaw puzzle for relaxing afternoons",
      price: 19.99,
    },
    // Sports/Fitness
    {
      name: "Yoga Mat",
      description: "Non-slip exercise mat for yoga, pilates, and stretching",
      price: 34.99,
    },
    {
      name: "Running Shoes",
      description: "Lightweight athletic shoes with superior cushioning",
      price: 119.99,
    },
    // Home/Kitchen
    {
      name: "Coffee Maker",
      description: "Programmable drip coffee maker with thermal carafe",
      price: 69.99,
    },
    {
      name: "Table Lamp",
      description: "Modern desk lamp with adjustable brightness settings",
      price: 45.99,
    },
  ];

  // Prepare the insert statement
  const insertProduct = db.prepare(
    "INSERT INTO products (name, description, price) VALUES (?, ?, ?)",
  );

  // Create a transaction function
  const insertProducts = db.transaction((productsToInsert: Product[]): number => {
    for (const product of productsToInsert) {
      insertProduct.run(product.name, product.description, product.price);
    }
    return productsToInsert.length;
  });

  // Execute the transaction
  const insertedCount = insertProducts(products);
  console.log(`Inserted ${insertedCount} products using transaction`);

  // Verify data was inserted
  const result = db
    .query("SELECT COUNT(*) as count FROM products")
    .get();
  const dbCount = (result as { count: number }).count;
  console.log(`Database seeded with ${dbCount} sample products!`);

  // Show the products
  const productsInDb = db.query("SELECT * FROM products").all();
  console.log("Products in database:", productsInDb);
}

// Run seeder if this file is executed directly
if (import.meta.main) {
  console.log("Running seed as main module");
  try {
    seedDatabase();
    console.log("Database seeding completed successfully");
    process.exit(0);
  } catch (err) {
    console.error("Error seeding database:", err);
    process.exit(1);
  }
}

export default seedDatabase;