sahibhussain8239/Fresumes

freusmes_all

Combination of all 5 repos

Changelog

Note: This README is updated with today's date and a concise summary after every change.

2025-10-11

  • Resolve merge conflicts per requested rules and document changes.
  • Files and line changes:
    • fresumes/Dockerfile
      • Cleaned conflict markers at lines 46–50; kept CMD ["nginx", "-g", "daemon off;"].
    • fresumes_marketing/components/common/Footer.js
      • Lines 31–38: chose incoming label "Resume Database" for /app; removed conflict markers.
    • fresumes_marketing/components/index/Header.jsx
      • Lines 1–7: kept simplified imports (framer-motion, next/link), removed DotLottieReact import.
      • Lines 41–59: kept incoming button block; fixed href to /app (not app/).
    • fresumes/ads.txt
      • Line 1: kept google.com, pub-8223588809519131, DIRECT, f08c47fec0942fa0; removed conflict markers.
    • fresumes_marketing/public/ads.txt
      • Line 1: kept same ads.txt entry; removed conflict markers.
    • fresumes/webapp/public/ads.txt
      • Line 1: kept same ads.txt entry; removed conflict markers.
  • Rationale applied:
    • Accepted all incoming frontend changes and pagination.
    • Preserved existing search/query logic (no changes detected in this merge span).
    • Accepted all fresumes_marketing changes.
  • Next actions:
    • Rebuild containers and restart services.
    • Verify /ads.txt serves correctly on marketing and app.
    • Validate /_expo/static/* assets load without 404s.

2025-10-02

  • Fix advanced search fallback for filter-only queries (email/phone).
    • Root cause: frontend clears text when email/phone detected, sending empty q to backend; Typesense requires a non-empty q.
    • Solution: default q='*' in advanced search within backend/src/services/typesenseService.ts when q is undefined/empty, enabling filter-only searches.
    • Logging: backend/src/controllers/resumeController.ts now logs the effective q value (including defaulted '*') for easier debugging of email-based searches.
  • Investigation summary:
    • Frontend query flow: FresumesSearchBar parses natural language to a unified filterState (including email/phone) and triggers changes via handleSearchSubmit/handleSearchChange. ListView.js merges top bar and sidebar filters and calls resumeSearchResumeRepository.findAll.
    • List loading: InfiniteFlatList requests data via searchCallback(nextPage), marks endOfData on empty results, and renders EmptyInfiniteList when no entries.
    • Download tracking: ListViewCard.js increments download counts via v1/resume/${item.key}/increment-download.
  • Backend email filter handling:
    • typesenseService.ts indexes email and uses exact filter matching in searchResumesAdvanced.
    • resumeController.ts extracts the email query parameter and passes it through to the Typesense service.
  • Verification steps:
    • Rebuild/restart backend (npm run build && npm start or Docker).
    • In the web app, paste an email into the search bar; expect results and backend logs like Found X results for q=*.
    • Natural language with email/phone is treated as advanced filters; free-text q is cleared by the UI as intended.
  • Optional next step (not implemented): Frontend could skip sending empty q when advanced filters are present; backend fallback already supports this.
  • Files changed:
    • backend/src/services/typesenseService.ts — default q='*' for advanced search if empty.
    • backend/src/controllers/resumeController.ts — log effective q and pass optional q.
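The q fallback described in this entry can be sketched as below. This is a minimal illustration, not the code in typesenseService.ts: the function name effectiveQuery is hypothetical, and treating whitespace-only input as empty is an assumption.

```typescript
// Hypothetical helper: returns the q value to send to Typesense.
// Typesense accepts "*" as a match-all query, which lets filter-only
// searches (for example, email or phone filters) run without free text.
function effectiveQuery(q?: string | null): string {
  const trimmed = (q ?? "").trim();
  // Assumption: whitespace-only input is treated the same as empty input.
  return trimmed.length > 0 ? trimmed : "*";
}
```

Logging the returned value rather than the raw parameter is what makes a defaulted query visible as q=* in the backend logs.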

2025-01-27

  • Fixed Typesense Container Health Check Issue:
    • Resolved the Typesense container showing "unhealthy" status during docker-compose up.
    • Root cause: the original health check used curl -f http://localhost:8108/health, but the Typesense Docker image does not include curl by default.
    • Solution: Updated health check in docker-compose.yml to use process-based verification:
      healthcheck:
        test: ["CMD-SHELL", "ps aux | grep '[t]ypesense-server' || exit 1"]
        interval: 15s
        timeout: 5s
        retries: 3
        start_period: 30s
    • Benefits:
      • Typesense now starts with (healthy) status consistently
      • All services start successfully without dependency failures
      • Improved startup reliability with 30s start period
      • Reduced overhead with 15s health check interval

2025-09-28

  • Align frontend search sort options with Typesense tokens in fresumes/components/common/FresumesSearchBar.js:
    • Most relevant → default relevance
    • Most recent → sortBy=recent
    • Most viewed → sortBy=viewed
    • Most Downloaded → sortBy=downloaded
  • Remove legacy Elasticsearch parameters (sortField, sortOrder) from search requests.
  • Optimize backend Typesense search in backend/src/services/typesenseService.ts:
    • Reduce query_by to name,email,phone,jobTitle,skills,location and add query_by_weights.
    • Set num_typos=1 and search_cutoff_ms=50 for faster queries.
    • Limit payload with include_fields (core fields) and exclude_fields=content.
    • Decrease default per-page results (search: 10, suggestions: 3) for responsiveness.
    • Add in-memory caching for search results and suggestions with TTL and size bounds.
  • Security and configuration:
    • Backend uses a server-side Typesense admin key; the frontend does not use any Typesense key.
    • Add/update backend/.env with Typesense settings (TYPESENSE_HOST, TYPESENSE_PORT, TYPESENSE_PROTOCOL, TYPESENSE_API_KEY).

Refer to backend/src/services/typesenseService.ts and fresumes/infastructure/ResumeRepository.js for parameter mappings and API usage.
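As an illustration of the in-memory caching with TTL and size bounds mentioned in this entry, here is a minimal sketch. The class name TtlCache, its constructor parameters, and the oldest-inserted eviction policy are all assumptions; the actual cache in typesenseService.ts may be structured differently.

```typescript
// Hypothetical in-memory cache with a time-to-live (TTL) and a maximum size.
// Map preserves insertion order, so the first key is always the oldest entry.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Deleting first moves a re-inserted key to the end of the order.
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A cache key could be built from the serialized search parameters, so that identical requests within the TTL skip the Typesense round trip.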

Bulk Seeding from Docker Volume (Backend)

This workflow lets you ingest many PDF resumes from a mounted server directory into MongoDB, generate page images, and optionally defer Typesense indexing to a single chunked reindex afterward.

  • Prerequisites

    • Ensure environment variables are set for backend: MONGO_URL, GOTENBERG_URL, TYPESENSE_HOST, TYPESENSE_PORT, TYPESENSE_PROTOCOL, TYPESENSE_API_KEY.
    • Place PDFs on the server in a directory you can mount, e.g., /upload.
    • Ensure Docker Compose mounts /upload into the backend container at /app/uploads and persists public/images.
  • Example docker-compose service (excerpt)

    • Mount seed PDFs and images volume on backend:
      services:
        backend:
          volumes:
            - /upload:/app/uploads
            - ./backend/public/images:/app/public/images
  • Build and start services

    • docker compose build
    • docker compose up -d

  • Run the seeding script (reads PDFs from /app/uploads)

  • Seed without Typesense indexing (recommended for speed):

    • docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3 --no-index"
  • Seed with per-resume Typesense indexing:

    • docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3"
    • To skip already existing records entirely, add --skip-existing:
      • docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3 --skip-existing"
    • To defer Typesense indexing during large imports, add --no-index and run npm run reindex afterward:
      • docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=2 --no-index"
      • docker compose exec backend sh -lc "npm run reindex"
    • Performance flags (help with non-English/OCR-heavy PDFs):
      • Fast mode (skip full-OCR, limited page OCR):
        • --parser-fast, for example: docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=2 --parser-fast"
      • Restrict OCR languages (improves speed/accuracy):
        • --ocr-langs=eng+deu (Tesseract language codes)
      • Limit OCR pages processed:
        • --ocr-max-pages=2
      • Disable OCR entirely (only text extraction):
        • --no-ocr
      • Reduce GPT input length to speed parsing:
        • --gpt-text-len=6000
  • Notes:

    • Adjust --concurrency based on CPU/IO; start with 3.
    • Deduplication: resumes are skipped if pdfName already exists in MongoDB.
    • Images are written under backend/public/images/<resumeId>/... and served by /v1/images/:resumeId/:pageNumber.
  • Chunked Typesense reindex after seeding

    • Recreate schema with infix if mismatched and import in chunks using upsert:
      • docker compose exec backend npm run reindex -- --chunk=500
    • Operational guidance:
      • Start with --chunk=500 and increase if Typesense resources allow.
      • A reindex of 10k documents typically completes in minutes to tens of minutes.
      • Upsert action ensures idempotency; safe to retry on failures.
  • Troubleshooting

    • If images fail to generate, verify Python venv exists in container and poppler-utils and tesseract-ocr are installed (handled by Dockerfile).
    • Ensure GOTENBERG_URL points to a reachable Gotenberg container.
    • If Typesense import errors occur, confirm TYPESENSE_* envs and collection readiness.
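The chunked reindex described above can be pictured with the sketch below. It only illustrates the batching idea behind --chunk=500; the names chunk and importInChunks are hypothetical, and importBatch stands in for whatever call sends one batch to Typesense with the upsert action.

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical driver: batches are sent one after another, so a failure
// points at a specific batch. Because upsert is idempotent, re-running the
// whole import after a failure is safe.
async function importInChunks<T>(
  docs: readonly T[],
  size: number,
  importBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let imported = 0;
  for (const batch of chunk(docs, size)) {
    await importBatch(batch);
    imported += batch.length;
  }
  return imported;
}
```

Larger chunks mean fewer requests but more memory per request, which is why starting at 500 and increasing gradually is a reasonable default.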

Recommended Three-Stage Seeding (GPT + OCR, skip conversion)

Use GPT for structured extraction and OCR for scanned PDFs; skip conversion (all inputs are PDFs), defer indexing, then backfill images.

  • Stage 1 — Parse and store to MongoDB (no images, no conversion, defer indexing)

    • docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=8 --skip-existing --allow-partial --no-images --no-convert --no-index"
    • Notes:
      • Requires OPENAI_API_KEY and MONGO_URL configured in the backend service.
      • Keep OCR enabled by default; optionally tune --ocr-max-pages=2 and --ocr-langs=eng.
      • Adjust --concurrency to your CPU/IO and rate limits.
  • Stage 2 — Generate page images for saved resumes

    • docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run generate-images -- --concurrency=6"
    • Writes pageImages and totalPages into MongoDB for records missing images.
  • Stage 3 — Index into Typesense (chunked)

    • docker compose exec backend sh -lc "npm run reindex -- --chunk=1000"
    • Ensures the latest resume data (including pdfName) is available to search.
  • Windows host directory mount example

    • Map your host folder (e.g., D:\Resumes\all_pdfs) to container path /app/uploads in docker-compose.yml:
      services:
        backend:
          volumes:
            - D:\Resumes\all_pdfs:/app/uploads
            - ./backend/public/images:/app/public/images
    • Then run Stage 1 with --dir=/app/uploads as shown above.

2025-11-06

  • Optimized fresumes web ad rendering to prevent scroll freezes:

    • AdCard now initializes AdSense only when visible (IntersectionObserver) and pushes once.
    • Reduced ad height usage in ListViewCard to 280px to avoid large reflows.
    • Notes: Preview requires starting the Expo web dev server in fresumes/.

  • Upload reliability improvements (Docker Desktop):

    • Backend now returns 400 Bad Request for non-resume uploads instead of generic 500.
    • Set NEXT_PUBLIC_API_BASE_URL=http://localhost/api in docker-compose.yml for marketing site to ensure API calls route via Nginx.
    • If you still see a 500, check container logs:
      • docker-compose logs -n 200 backend
      • docker-compose logs -n 100 gotenberg
    • Added STRICT_RESUME_VALIDATION=false to backend service to accept valid resumes even when parser extracts minimal fields.
    • Multi-file uploads no longer abort on a single bad file; response includes { results, rejections }.
    • When the parser sets isResume=false, the backend now rejects the upload and deletes temp files (original upload and converted PDF) to avoid storing non-resumes.
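The multi-file behavior above (one bad file no longer aborts the request) might look roughly like the following. This is a sketch under assumptions: the source only states that the response includes { results, rejections }, so the item shapes, the function name processUploads, and the parseOne callback are hypothetical. A real handler would most likely be asynchronous; the synchronous form is used here for brevity.

```typescript
interface UploadResult {
  fileName: string;
  resumeId: string;
}

interface UploadRejection {
  fileName: string;
  reason: string;
}

// Process every file, recording failures instead of letting the first bad
// file abort the whole request.
function processUploads(
  fileNames: readonly string[],
  parseOne: (fileName: string) => UploadResult,
): { results: UploadResult[]; rejections: UploadRejection[] } {
  const results: UploadResult[] = [];
  const rejections: UploadRejection[] = [];
  for (const fileName of fileNames) {
    try {
      results.push(parseOne(fileName));
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      rejections.push({ fileName, reason });
    }
  }
  return { results, rejections };
}
```

A caller can then return 400 Bad Request only when every file was rejected, and a success status with both lists otherwise; that status policy is also an assumption.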

2025-11-07

  • Webapp favicon and title update:

    • Updated fresumes/webapp/index.html to use <link rel="icon" type="image/png" href="/logo.png" /> and title HostResumes.
    • Adjusted fresumes/Dockerfile to copy assets/logo.png into Nginx site root (/usr/share/nginx/html/logo.png) so the favicon resolves in production.
    • Rebuilt and restarted containers: docker-compose build then docker-compose up -d.
    • If favicon doesn’t appear, hard-refresh the browser (Ctrl+F5) and clear cache.
    • Added favicon link to Expo web index (fresumes/web/index.html), ensuring /app uses /logo.png as its tab icon.
    • Gateway routing fix: Updated nginx/nginx.conf to proxy root-level /favicon.ico and /apple-touch-icon.png to the fresumes app service, avoiding the marketing site favicon overriding the app.
    • Apply the change with docker-compose restart nginx.
    • Rule added: When UI changes don’t show, check and update nginx/nginx.conf for the app and restart Nginx.
    • Temporarily disabled ad rendering by turning off ad scheduling in ListViewCard to remove AdCard from the UI while investigating a persistent scroll lock.
    • Rebuilt and restarted containers; previewed /app with no console errors.
  • Search flow verification (frontend ↔ backend):

    • Confirmed backend mounts under /v1; available endpoints: /resumes, /resumes/search, /resumes/advanced-search, /skills, /job_titles, /languages, /resume_languages, /locations, download and image routes.
    • Verified ResumeRepository.js uses API_BASE_URL from EXPO_PUBLIC_API_ORIGIN or defaults to http://localhost/api and selects the correct endpoint based on filters/text.
    • Verified FresumesSearchBar.js natural-language parsing populates advanced filters; when email/phone is detected, UI clears free-text and relies on advanced search.
    • The active FresumesFiltersSidebar.js emits filter-only updates (no suggestions fetch); the saved variant used /suggestions, but the backend provides the distinct endpoints listed above instead.
    • Typesense advanced search defaults q intelligently (phone digits/name/jobTitle or *), maps sortBy tokens (viewed → likes:desc, downloaded → downloads:desc), and includes fallbacks for missing infix index and request timeouts.
  • Next steps:

    • If reintroducing suggestions UI, wire to /v1/skills, /v1/job_titles, and /v1/locations.
    • Validate end-to-end search with email/phone/name and multi-location filters via the web app; rebuild if gateway routing changes.
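The sortBy token mapping verified in this entry can be sketched as a small lookup. Only the two mappings stated in this changelog (viewed to likes:desc, downloaded to downloads:desc) are included; the sort field for recent is not documented here, so this sketch deliberately leaves it unmapped. The names SORT_BY_TOKENS and toTypesenseSort are hypothetical.

```typescript
// Frontend sortBy tokens mapped to Typesense sort_by expressions.
// A Map is used so that strings such as "constructor" cannot match
// inherited object properties.
const SORT_BY_TOKENS = new Map<string, string>([
  ["viewed", "likes:desc"],
  ["downloaded", "downloads:desc"],
]);

// Returns the sort_by expression for a token, or undefined when the token
// is missing or unmapped, so the caller can fall back to default relevance.
function toTypesenseSort(sortBy?: string): string | undefined {
  if (sortBy === undefined) return undefined;
  return SORT_BY_TOKENS.get(sortBy.trim().toLowerCase());
}
```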

2025-11-07

  • Search refinements in fresumes/components/common/FresumesSearchBar.js:
    • Removed auto-clearing of free-text q when email or phone is detected; text queries now work alongside email/phone filters.
    • Normalized and deduplicated locations before sending to backend; always passed as a trimmed array.
  • Impact: Enables combined text + email/phone searches and ensures stable multi-location filtering.
  • How to validate on running stack (docker-compose already up):
    • In /app, try queries like react developer in Seattle, Austin and john@example.com Bangalore.
    • Confirm results update without clearing the text field and locations filter applies correctly.
  • Image 404s after search (root cause and fix):
    • Cause: Typesense had stale IDs not present on disk/Mongo, leading /api/images/:id/:page to 404.
    • Fix: Reindexed Typesense from Mongo (docker-compose exec backend npm run reindex). Added missing-only mode to generateImages script for future checks.
    • Validation: Backend image endpoint returns 200 for valid IDs; stale IDs no longer appear in search results.
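The location normalization described in this entry could be sketched as follows. The function name normalizeLocations is hypothetical, and two details are assumptions rather than documented behavior: duplicates are compared case-insensitively, and runs of internal whitespace are collapsed to a single space.

```typescript
// Accepts a single string, an array, or nothing, and always returns a
// trimmed, de-duplicated array.
function normalizeLocations(input?: string | readonly string[] | null): string[] {
  const raw = input == null ? [] : typeof input === "string" ? [input] : input;
  const seen = new Set<string>();
  const out: string[] = [];
  for (const item of raw) {
    // Collapse internal whitespace runs and trim both ends.
    const cleaned = item.replace(/\s+/g, " ").trim();
    if (cleaned === "") continue;
    // Assumption: case-insensitive comparison; the first spelling seen wins.
    const key = cleaned.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    out.push(cleaned);
  }
  return out;
}
```

Returning an array in every case means the repository layer never has to distinguish between one location and many.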

2025-11-13

  • Integrated AI-assisted search refinement using GPT-4o Mini in backend advanced search.
  • Adds optional ai=1 flag to advanced-search requests to enable refinement per-query.
  • Environment toggle AI_SEARCH_ENABLED=1 enables refinement globally (backend service env).
  • Refinement outputs structured filters (names, emails, phones, skills, job titles, locations, languages) and a corrected free-text q, merged before Typesense query.
  • Affected files:
    • backend/src/utils/enhancedResumeParser.py — new refine_search_query_with_chatgpt and CLI --refine-search mode.
    • backend/src/utils/enhancedResumeParser.ts — wrapper refineSearchQueryWithAI to call Python helper.
    • backend/src/controllers/resumeController.ts — merges AI-refined fields into searchResumesAdvanced parameters.
    • backend/src/services/typesenseService.ts — already supports multi-value arrays; constructs effective q.
  • Build & restart backend:
    • docker-compose build backend; docker-compose up -d backend
  • Usage examples:
    • GET /api/v1/resumes/advanced-search?q=react developer&skills=react,node&ai=1
    • Global enable: set AI_SEARCH_ENABLED=1 in backend env and call advanced search without ai=1.
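The toggle and merge steps above might be sketched like this. The names shouldRefine, mergeRefined, and SearchFilters are hypothetical, and the merge policy (list filters are unioned with user values first; a non-empty refined q replaces the original) is an assumption; the actual logic in resumeController.ts may differ.

```typescript
// Refinement runs when the request carries ai=1 or when the global
// AI_SEARCH_ENABLED=1 environment toggle is set.
function shouldRefine(aiParam: string | undefined, envFlag: string | undefined): boolean {
  return aiParam === "1" || envFlag === "1";
}

interface SearchFilters {
  q?: string;
  skills?: string[];
  locations?: string[];
}

// Merge AI-refined fields into the user's parameters before the Typesense query.
function mergeRefined(user: SearchFilters, refined: SearchFilters): SearchFilters {
  const union = (a: string[] = [], b: string[] = []): string[] =>
    Array.from(new Set([...a, ...b]));
  const merged: SearchFilters = {
    skills: union(user.skills, refined.skills),
    locations: union(user.locations, refined.locations),
  };
  // Prefer the corrected free-text q when the refinement produced one.
  const q = refined.q?.trim() || user.q;
  if (q !== undefined) merged.q = q;
  return merged;
}
```

In a controller, the second argument to shouldRefine would typically be process.env.AI_SEARCH_ENABLED.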

2025-11-13 (Search fixes)

  • Multi-token q union in basic search:
    • Enhanced GET /resumes/search to split q on comma, "and", &, or | and return the union of unique results across tokens.
    • Works for locations, job titles, skills, names, and other fields included in query_by.
  • Multi-location parameter handling:
    • Frontend now serializes locations arrays using | to preserve full strings with commas (e.g., Lille, France|Paris, France).
    • Backend advanced search parses locations by | and filters with OR over exact matches.
  • Files changed:
    • backend/src/controllers/resumeController.ts: multi-token union handling in searchResumes; advanced locations parsing uses |.
    • fresumes/infastructure/ResumeRepository.js: objectToQueryParamArray uses | when serializing locations arrays.
  • Build & restart:
    • docker-compose build backend fresumes && docker-compose up -d backend fresumes
  • Validation:
    • GET /api/resumes/search?q=Mexico, United States&limit=5 and GET /api/resumes/search?q=United States, Mexico&limit=5 both return combined results.
    • Verified advanced search with locations=Lille,%20France|Paris,%20France returns appropriate OR matches.
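The tokenizing and serialization rules in this entry can be sketched as below. The function names are hypothetical, and two details are assumptions: the word "and" only acts as a separator when surrounded by whitespace (so "Poland" stays intact), and duplicate tokens are dropped case-insensitively.

```typescript
// Split a basic-search q into tokens on comma, "&", "|", or the word "and".
function splitQueryTokens(q: string): string[] {
  const seen = new Set<string>();
  const tokens: string[] = [];
  for (const part of q.split(/\s*[,&|]\s*|\s+and\s+/i)) {
    const token = part.trim();
    const key = token.toLowerCase();
    if (token !== "" && !seen.has(key)) {
      seen.add(key);
      tokens.push(token);
    }
  }
  return tokens;
}

// Join locations with "|" so commas inside a location survive the round trip.
function serializeLocations(locations: readonly string[]): string {
  return locations.map((l) => l.trim()).filter((l) => l !== "").join("|");
}

// Inverse of serializeLocations, as the backend would apply it.
function parseLocations(param: string): string[] {
  return param.split("|").map((l) => l.trim()).filter((l) => l !== "");
}
```

Each token from splitQueryTokens would be searched separately and the results unioned by document ID, which is why Mexico, United States and United States, Mexico return the same combined set.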

About

This is the project I have been working on since 4 April 2025; I am only now pushing it to Git, on 8 September.
