freusmes_all

Combination of all 5 repos

Changelog

Note: This README is updated with today's date and a concise summary after every change.

2025-10-11

Resolve merge conflicts per requested rules and document changes.
Files and line changes:
- fresumes/Dockerfile
  - Cleaned conflict markers at lines 46–50; kept CMD ["nginx", "-g", "daemon off;"].
- fresumes_marketing/components/common/Footer.js
  - Lines 31–38: chose incoming label "Resume Database" for /app; removed conflict markers.
- fresumes_marketing/components/index/Header.jsx
  - Lines 1–7: kept simplified imports (framer-motion, next/link), removed DotLottieReact import.
  - Lines 41–59: kept incoming button block; fixed href to /app (not app/).
- fresumes/ads.txt
  - Line 1: kept google.com, pub-8223588809519131, DIRECT, f08c47fec0942fa0; removed conflict markers.
- fresumes_marketing/public/ads.txt
  - Line 1: kept same ads.txt entry; removed conflict markers.
- fresumes/webapp/public/ads.txt
  - Line 1: kept same ads.txt entry; removed conflict markers.
Rationale applied:
- Accepted all incoming frontend changes and pagination.
- Preserved existing search/query logic (no changes detected in this merge span).
- Accepted all fresumes_marketing changes.
Next actions:
- Rebuild containers and restart services.
- Verify /ads.txt serves correctly on marketing and app.
- Validate /_expo/static/* assets load without 404s.

2025-10-02

Fix advanced search fallback for filter-only queries (email/phone).
- Root cause: frontend clears text when email/phone detected, sending empty q to backend; Typesense requires a non-empty q.
- Solution: default q='*' in advanced search within backend/src/services/typesenseService.ts when q is undefined/empty, enabling filter-only searches.
- Logging: backend/src/controllers/resumeController.ts now logs the effective q value (including defaulted '*') for easier debugging of email-based searches.
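
The fallback described above can be sketched as follows. This is a minimal illustration, not the code in typesenseService.ts: the helper name `effectiveQuery` is hypothetical, and trimming whitespace before the emptiness check is an assumption.

```typescript
// Sketch only: `effectiveQuery` is a hypothetical helper name.
function effectiveQuery(q?: string | null): string {
  // Assumption: whitespace-only input is treated the same as an empty q.
  const trimmed = (q ?? "").trim();
  // Typesense requires a non-empty q; "*" matches every document, so the
  // email/phone filters alone narrow the results.
  return trimmed.length > 0 ? trimmed : "*";
}

// Logging the effective value mirrors the debugging aid described above.
console.log(`effective q=${effectiveQuery("")}`);
```

Keeping the default in the backend means the frontend may keep sending an empty q for filter-only searches.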
Investigation summary:
- Frontend query flow: FresumesSearchBar parses natural language to a unified filterState (including email/phone) and triggers changes via handleSearchSubmit/handleSearchChange. ListView.js merges top bar and sidebar filters and calls resumeSearch → ResumeRepository.findAll.
- List loading: InfiniteFlatList requests data via searchCallback(nextPage), marks endOfData on empty results, and renders EmptyInfiniteList when no entries.
- Download tracking: ListViewCard.js increments download counts via v1/resume/${item.key}/increment-download.
Backend email filter handling:
- typesenseService.ts indexes email and uses exact filter matching in searchResumesAdvanced.
- resumeController.ts extracts the email query parameter and passes it through to the Typesense service.
Verification steps:
- Rebuild/restart backend (npm run build && npm start or Docker).
- In the web app, paste an email into the search bar; expect results and backend logs like Found X results for q=*.
- Natural language with email/phone is treated as advanced filters; free-text q is cleared by the UI as intended.
Optional next step (not implemented): Frontend could skip sending empty q when advanced filters are present; backend fallback already supports this.
Files changed:
- backend/src/services/typesenseService.ts — default q='*' for advanced search if empty.
- backend/src/controllers/resumeController.ts — log effective q and pass optional q.

2025-01-27

Fixed Typesense Container Health Check Issue:
- Resolved Typesense container showing "unhealthy" status during docker-compose up
- Root Cause: Original health check used curl -f http://localhost:8108/health but Typesense Docker image doesn't include curl by default
- Solution: Updated health check in docker-compose.yml to use process-based verification:
```
healthcheck:
  test: ["CMD-SHELL", "ps aux | grep '[t]ypesense-server' || exit 1"]
  interval: 15s
  timeout: 5s
  retries: 3
  start_period: 30s
```
- Benefits:
  - Typesense now starts with (healthy) status consistently
  - All services start successfully without dependency failures
  - Improved startup reliability with 30s start period
  - Reduced overhead with 15s health check interval

2025-09-28

Align frontend search sort options with Typesense tokens in fresumes/components/common/FresumesSearchBar.js:
- Most relevant → default relevance
- Most recent → sortBy=recent
- Most viewed → sortBy=viewed
- Most Downloaded → sortBy=downloaded
Remove legacy Elasticsearch parameters (sortField, sortOrder) from search requests.
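
The label-to-token mapping above could look roughly like this. The names `SORT_TOKENS` and `sortQueryParam` are hypothetical and only illustrate the idea; the actual code in FresumesSearchBar.js may be organized differently.

```typescript
type SortToken = "recent" | "viewed" | "downloaded";

// UI label -> sortBy token, as listed above. "Most relevant" maps to no
// token, so the backend falls back to its default relevance ordering.
const SORT_TOKENS: Record<string, SortToken | undefined> = {
  "Most relevant": undefined,
  "Most recent": "recent",
  "Most viewed": "viewed",
  "Most Downloaded": "downloaded",
};

// Hypothetical helper: returns the sort part of the query string.
// The legacy sortField/sortOrder parameters are never emitted.
function sortQueryParam(label: string): string {
  const token = SORT_TOKENS[label];
  return token ? `sortBy=${token}` : "";
}

console.log(sortQueryParam("Most viewed"));
```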
Optimize backend Typesense search in backend/src/services/typesenseService.ts:
- Reduce query_by to name,email,phone,jobTitle,skills,location and add query_by_weights.
- Set num_typos=1 and search_cutoff_ms=50 for faster queries.
- Limit payload with include_fields (core fields) and exclude_fields=content.
- Decrease default per-page results (search: 10, suggestions: 3) for responsiveness.
- Add in-memory caching for search results and suggestions with TTL and size bounds.
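
An in-memory cache with a TTL and a size bound, as mentioned in the last bullet, might be sketched like this. The class name `TtlCache`, the oldest-first eviction policy, and the example TTL and size values are assumptions; the source does not describe the actual implementation.

```typescript
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;

  constructor(ttlMs: number, maxEntries: number) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
  }

  // `now` is injectable so expiry can be exercised without waiting.
  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(key); // drop expired entries lazily on read
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    // Delete first so that re-setting a key makes it the newest entry.
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // Map keeps insertion order, so the first key is the oldest one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// Example: cache suggestion lists for 30 seconds, at most 500 distinct keys.
const suggestionCache = new TtlCache<string[]>(30_000, 500);
suggestionCache.set("q=react", ["React Developer", "React Native Engineer"]);
console.log(suggestionCache.get("q=react"));
```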
Security and configuration:
- Backend uses a server-side Typesense admin key; the frontend does not use any Typesense key.
- Add/update backend/.env with Typesense settings (TYPESENSE_HOST, TYPESENSE_PORT, TYPESENSE_PROTOCOL, TYPESENSE_API_KEY).

Refer to backend/src/services/typesenseService.ts and fresumes/infastructure/ResumeRepository.js for parameter mappings and API usage.
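
For orientation, the search parameters listed in this entry could be assembled along these lines. This is a sketch under assumptions: `buildSearchParams` and `SearchParamsSketch` are hypothetical names, the weights are placeholder values (the real ones are not stated here), and include_fields is omitted because the list of core fields is not given.

```typescript
interface SearchParamsSketch {
  q: string;
  query_by: string;
  query_by_weights: string;
  num_typos: number;
  search_cutoff_ms: number;
  exclude_fields: string;
  per_page: number;
}

const QUERY_FIELDS = ["name", "email", "phone", "jobTitle", "skills", "location"];
// Placeholder weights, one per entry in QUERY_FIELDS (assumed values).
const QUERY_WEIGHTS = [4, 3, 3, 2, 2, 1];

function buildSearchParams(q: string, kind: "search" | "suggestions"): SearchParamsSketch {
  return {
    q,
    query_by: QUERY_FIELDS.join(","),
    query_by_weights: QUERY_WEIGHTS.join(","),
    num_typos: 1,
    search_cutoff_ms: 50,
    exclude_fields: "content", // keep the large content field out of responses
    per_page: kind === "search" ? 10 : 3,
  };
}

console.log(buildSearchParams("react developer", "search"));
```

Typesense expects query_by_weights to have one weight per query_by field, which is why both lists are kept side by side.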

Bulk Seeding from Docker Volume (Backend)

This workflow lets you ingest many PDF resumes from a mounted server directory into MongoDB, generate page images, and optionally defer Typesense indexing to a single chunked reindex afterward.

Prerequisites
- Ensure environment variables are set for backend: MONGO_URL, GOTENBERG_URL, TYPESENSE_HOST, TYPESENSE_PORT, TYPESENSE_PROTOCOL, TYPESENSE_API_KEY.
- Place PDFs on the server in a directory you can mount, e.g., /upload.
- Ensure Docker Compose mounts /upload into the backend container at /app/uploads and persists public/images.
Example docker-compose service (excerpt)
- Mount seed PDFs and images volume on backend:
  - backend service volumes: /upload:/app/uploads (seed PDFs) and ./backend/public/images:/app/public/images (generated images).
Build and start services
- docker compose build
- docker compose up -d

Run the seeding script (reads PDFs from /app/uploads)

Seed without Typesense indexing (recommended for speed):
- docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3 --no-index"
Seed with per-resume Typesense indexing:
- docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3"
- To skip already existing records entirely, add --skip-existing:
  - docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3 --skip-existing"
- To defer Typesense indexing during large imports, add --no-index and run npm run reindex afterward:
  - docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=2 --no-index"
  - docker compose exec backend sh -lc "npm run reindex"
- Performance flags (help with non-English/OCR-heavy PDFs):
  - Fast mode (skip full-OCR, limited page OCR):
    - --parser-fast, for example: docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=2 --parser-fast"
  - Restrict OCR languages (improves speed/accuracy):
    - --ocr-langs=eng+deu (Tesseract language codes)
  - Limit OCR pages processed:
    - --ocr-max-pages=2
  - Disable OCR entirely (only text extraction):
    - --no-ocr
  - Reduce GPT input length to speed parsing:
    - --gpt-text-len=6000
Notes:
- Adjust --concurrency based on CPU/IO; start with 3.
- Deduplication: resumes are skipped if pdfName already exists in MongoDB.
- Images are written under backend/public/images/<resumeId>/... and served by /v1/images/:resumeId/:pageNumber.
Chunked Typesense reindex after seeding
- Recreate schema with infix if mismatched and import in chunks using upsert:
  - docker compose exec backend npm run reindex -- --chunk=500
- Operational guidance:
  - Start with --chunk=500 and increase if Typesense resources allow.
  - A reindex of 10k documents typically completes in minutes to tens of minutes.
  - Upsert action ensures idempotency; safe to retry on failures.
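
The chunking step behind --chunk can be illustrated with a small helper. This is a sketch, not the reindex script itself; `chunk` is a hypothetical name, and each batch would then be sent to Typesense with the upsert action.

```typescript
// Split a list of documents into batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// 1200 placeholder documents with --chunk=500 give batches of 500, 500 and 200.
const docs = Array.from({ length: 1200 }, (_, i) => ({ id: String(i) }));
console.log(chunk(docs, 500).map((batch) => batch.length));
```

Because every batch is an upsert, a failed batch can be retried without creating duplicate documents.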
Troubleshooting
- If images fail to generate, verify Python venv exists in container and poppler-utils and tesseract-ocr are installed (handled by Dockerfile).
- Ensure GOTENBERG_URL points to a reachable Gotenberg container.
- If Typesense import errors occur, confirm TYPESENSE_* envs and collection readiness.

Recommended Three-Stage Seeding (GPT + OCR, skip conversion)

Use GPT for structured extraction and OCR for scanned PDFs; skip conversion (all inputs are PDFs), defer indexing, then backfill images.

Stage 1 — Parse and store to MongoDB (no images, no conversion, defer indexing)
- docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=8 --skip-existing --allow-partial --no-images --no-convert --no-index"
- Notes:
  - Requires OPENAI_API_KEY and MONGO_URL configured in the backend service.
  - Keep OCR enabled by default; optionally tune --ocr-max-pages=2 and --ocr-langs=eng.
  - Adjust --concurrency to your CPU/IO and rate limits.
Stage 2 — Generate page images for saved resumes
- docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run generate-images -- --concurrency=6"
- Writes pageImages and totalPages into MongoDB for records missing images.
Stage 3 — Index into Typesense (chunked)
- docker compose exec backend sh -lc "npm run reindex -- --chunk=1000"
- Ensures the latest resume data (including pdfName) is available to search.
Windows host directory mount example
- Map your host folder (e.g., D:\Resumes\all_pdfs) to container path /app/uploads in docker-compose.yml:
```
services:
  backend:
    volumes:
      - D:\Resumes\all_pdfs:/app/uploads
      - ./backend/public/images:/app/public/images
```
- Then run Stage 1 with --dir=/app/uploads as shown above.

2025-11-06

Optimized fresumes web ad rendering to prevent scroll freezes:
- AdCard now initializes AdSense only when visible (IntersectionObserver) and pushes once.
- Reduced ad height usage in ListViewCard to 280px to avoid large reflows.
Notes: Preview requires starting the Expo web dev server in fresumes/.
Upload reliability improvements (Docker Desktop):
- Backend now returns 400 Bad Request for non-resume uploads instead of generic 500.
- Set NEXT_PUBLIC_API_BASE_URL=http://localhost/api in docker-compose.yml for marketing site to ensure API calls route via Nginx.
- If you still see a 500, check container logs:
  - docker-compose logs -n 200 backend
  - docker-compose logs -n 100 gotenberg
- Added STRICT_RESUME_VALIDATION=false to backend service to accept valid resumes even when parser extracts minimal fields.
- Multi-file uploads no longer abort on a single bad file; response includes { results, rejections }.
- When the parser sets isResume=false, the backend now rejects the upload and deletes temp files (original upload and converted PDF) to avoid storing non-resumes.
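
The multi-file behavior can be pictured as a partition of parsed files into results and rejections. The shapes below are assumptions made for illustration: only the top-level keys `results` and `rejections` and the `isResume` flag come from the notes above, while `ParsedUpload`, `UploadSummary`, `summarizeUploads`, and the `reason` text are hypothetical.

```typescript
interface ParsedUpload {
  fileName: string;
  isResume: boolean; // set by the parser
}

interface UploadSummary {
  results: string[];
  rejections: { fileName: string; reason: string }[];
}

// One bad file no longer aborts the batch: it is recorded as a rejection
// and the loop continues with the remaining files.
function summarizeUploads(parsed: ParsedUpload[]): UploadSummary {
  const summary: UploadSummary = { results: [], rejections: [] };
  for (const item of parsed) {
    if (item.isResume) {
      summary.results.push(item.fileName);
    } else {
      summary.rejections.push({ fileName: item.fileName, reason: "not a resume" });
    }
  }
  return summary;
}

console.log(
  summarizeUploads([
    { fileName: "cv.pdf", isResume: true },
    { fileName: "invoice.pdf", isResume: false },
  ]),
);
```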

2025-11-07

Webapp favicon and title update:
- Updated fresumes/webapp/index.html to use <link rel="icon" type="image/png" href="/logo.png" /> and title HostResumes.
- Adjusted fresumes/Dockerfile to copy assets/logo.png into Nginx site root (/usr/share/nginx/html/logo.png) so the favicon resolves in production.
- Rebuilt and restarted containers: docker-compose build then docker-compose up -d.
- If favicon doesn’t appear, hard-refresh the browser (Ctrl+F5) and clear cache.
- Added favicon link to Expo web index (fresumes/web/index.html), ensuring /app uses /logo.png as its tab icon.
- Gateway routing fix: Updated nginx/nginx.conf to proxy root-level /favicon.ico and /apple-touch-icon.png to the fresumes app service, avoiding the marketing site favicon overriding the app.
- Apply the change with docker-compose restart nginx.
- Rule added: When UI changes don’t show, check and update nginx/nginx.conf for the app and restart Nginx.
- Temporarily disabled ad rendering by turning off ad scheduling in ListViewCard to remove AdCard from the UI while investigating a persistent scroll lock.
- Rebuilt and restarted containers; previewed /app with no console errors.
Search flow verification (frontend ↔ backend):
- Confirmed backend mounts under /v1; available endpoints: /resumes, /resumes/search, /resumes/advanced-search, /skills, /job_titles, /languages, /resume_languages, /locations, download and image routes.
- Verified ResumeRepository.js uses API_BASE_URL from EXPO_PUBLIC_API_ORIGIN or defaults to http://localhost/api and selects the correct endpoint based on filters/text.
- Verified FresumesSearchBar.js natural-language parsing populates advanced filters; when email/phone is detected, UI clears free-text and relies on advanced search.
- Active FresumesFiltersSidebar.js emits filter-only updates (no suggestions fetch); saved variant used /suggestions but backend provides distinct endpoints above.
- Typesense advanced search defaults q intelligently (phone digits/name/jobTitle or *), maps sortBy tokens (viewed→likes:desc, downloaded→downloads:desc), and includes fallbacks for missing infix index and request timeouts.
Next steps:
- If reintroducing suggestions UI, wire to /v1/skills, /v1/job_titles, and /v1/locations.
- Validate end-to-end search with email/phone/name and multi-location filters via the web app; rebuild if gateway routing changes.

2025-11-07

Search refinements in fresumes/components/common/FresumesSearchBar.js:
- Removed auto-clearing of free-text q when email or phone is detected; text queries now work alongside email/phone filters.
- Normalized and deduplicated locations before sending to backend; always passed as a trimmed array.
Impact: Enables combined text + email/phone searches and ensures stable multi-location filtering.
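
A possible shape for the location normalization is sketched below. `normalizeLocations` is a hypothetical name, and treating entries that differ only in letter case as duplicates is an assumption; the source only says the list is normalized, deduplicated, and sent as a trimmed array.

```typescript
function normalizeLocations(input: string | string[] | undefined): string[] {
  // Accept a single string, an array, or nothing, and always return an array.
  const raw = input === undefined ? [] : Array.isArray(input) ? input : [input];
  const seen = new Set<string>();
  const result: string[] = [];
  for (const entry of raw) {
    const trimmed = entry.trim();
    const key = trimmed.toLowerCase(); // assumption: case-insensitive dedupe
    if (trimmed.length === 0 || seen.has(key)) continue;
    seen.add(key);
    result.push(trimmed); // keep the first spelling that was seen
  }
  return result;
}

console.log(normalizeLocations([" Seattle ", "Austin", "seattle", ""]));
```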
How to validate on running stack (docker-compose already up):
- In /app, try queries like react developer in Seattle, Austin and john@example.com Bangalore.
- Confirm results update without clearing the text field and locations filter applies correctly.
Image 404s after search (root cause and fix):
- Cause: Typesense had stale IDs not present on disk/Mongo, leading /api/images/:id/:page to 404.
- Fix: Reindexed Typesense from Mongo (docker-compose exec backend npm run reindex). Added missing-only mode to generateImages script for future checks.
- Validation: Backend image endpoint returns 200 for valid IDs; stale IDs no longer appear in search results.

2025-11-13

Integrated AI-assisted search refinement using GPT-4o Mini in backend advanced search.
Adds optional ai=1 flag to advanced-search requests to enable refinement per-query.
Environment toggle AI_SEARCH_ENABLED=1 enables refinement globally (backend service env).
Refinement outputs structured filters (names, emails, phones, skills, job titles, locations, languages) and a corrected free-text q, merged before Typesense query.
Affected files:
- backend/src/utils/enhancedResumeParser.py — new refine_search_query_with_chatgpt and CLI --refine-search mode.
- backend/src/utils/enhancedResumeParser.ts — wrapper refineSearchQueryWithAI to call Python helper.
- backend/src/controllers/resumeController.ts — merges AI-refined fields into searchResumesAdvanced parameters.
- backend/src/services/typesenseService.ts — already supports multi-value arrays; constructs effective q.
Build & restart backend:
- docker-compose build backend; docker-compose up -d backend
Usage examples:
- GET /api/v1/resumes/advanced-search?q=react developer&skills=react,node&ai=1
- Global enable: set AI_SEARCH_ENABLED=1 in backend env and call advanced search without ai=1.
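
The two ways of enabling refinement combine as a simple OR. The helper below is a sketch with a hypothetical name (`shouldRefineWithAi`); the environment is passed in as a plain object rather than read from the process so the logic stays easy to test.

```typescript
function shouldRefineWithAi(
  query: { ai?: string },
  env: { AI_SEARCH_ENABLED?: string },
): boolean {
  // Per-request flag (ai=1) or global toggle (AI_SEARCH_ENABLED=1).
  return query.ai === "1" || env.AI_SEARCH_ENABLED === "1";
}

console.log(shouldRefineWithAi({ ai: "1" }, {}));
```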

2025-11-13 (Search fixes)

Multi-token q union in basic search:
- Enhanced GET /resumes/search to split q on comma, "and", &, or | and return the union of unique results across tokens.
- Works for locations, job titles, skills, names, and other fields included in query_by.
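
The split-and-union behavior might be implemented roughly as follows. Both helper names are hypothetical, and two details are assumptions: "and" is matched only as a whole word (so Portland is not split), and documents are deduplicated by an `id` field.

```typescript
// Split q on comma, the word "and", & or |, dropping empty tokens.
function splitQueryTokens(q: string): string[] {
  return q
    .split(/\s*(?:,|&|\||\band\b)\s*/i)
    .map((token) => token.trim())
    .filter((token) => token.length > 0);
}

// Union of per-token result sets, keeping the first occurrence of each id.
function unionById<T extends { id: string }>(resultSets: T[][]): T[] {
  const byId = new Map<string, T>();
  for (const resultSet of resultSets) {
    for (const doc of resultSet) {
      if (!byId.has(doc.id)) byId.set(doc.id, doc);
    }
  }
  return Array.from(byId.values());
}

console.log(splitQueryTokens("Mexico, United States"));
```

Because the union is keyed by id, the set of returned documents does not depend on token order; only their ordering can differ.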
Multi-location parameter handling:
- Frontend now serializes locations arrays using | to preserve full strings with commas (e.g., Lille, France|Paris, France).
- Backend advanced search parses locations by | and filters with OR over exact matches.
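
A round trip through the | separator can be sketched like this. `serializeLocations` and `parseLocations` are hypothetical names used for illustration; the real serialization lives in objectToQueryParamArray and the parsing in resumeController.ts.

```typescript
// Frontend side: join with "|" so commas inside a location survive.
function serializeLocations(values: string[]): string {
  return values
    .map((loc) => loc.trim())
    .filter((loc) => loc.length > 0)
    .join("|");
}

// Backend side: split on "|" only, never on commas.
function parseLocations(param: string | undefined): string[] {
  if (!param) return [];
  return param
    .split("|")
    .map((loc) => loc.trim())
    .filter((loc) => loc.length > 0);
}

const locationsParam = serializeLocations(["Lille, France", "Paris, France"]);
console.log(`locations=${encodeURIComponent(locationsParam)}`);
console.log(parseLocations(locationsParam));
```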
Files changed:
- backend/src/controllers/resumeController.ts — multi-token union handling in searchResumes; advanced locations parsing uses |.
- fresumes/infastructure/ResumeRepository.js — objectToQueryParamArray uses | when serializing locations arrays.
Build & restart:
- docker-compose build backend fresumes && docker-compose up -d backend fresumes
Validation:
- GET /api/resumes/search?q=Mexico, United States&limit=5 and GET /api/resumes/search?q=United States, Mexico&limit=5 both return combined results.
- Verified advanced search with locations=Lille,%20France|Paris,%20France returns appropriate OR matches.
