Combination of all 5 repos
Note: This README is updated with today's date and a concise summary after every change.
- Resolve merge conflicts per requested rules and document changes.
- Files and line changes:
  - `fresumes/Dockerfile`
    - Cleaned conflict markers at lines 46–50; kept `CMD ["nginx", "-g", "daemon off;"]`.
  - `fresumes_marketing/components/common/Footer.js`
    - Lines 31–38: chose incoming label "Resume Database" for `/app`; removed conflict markers.
  - `fresumes_marketing/components/index/Header.jsx`
    - Lines 1–7: kept simplified imports (`framer-motion`, `next/link`), removed `DotLottieReact` import.
    - Lines 41–59: kept incoming button block; fixed `href` to `/app` (not `app/`).
  - `fresumes/ads.txt`
    - Line 1: kept `google.com, pub-8223588809519131, DIRECT, f08c47fec0942fa0`; removed conflict markers.
  - `fresumes_marketing/public/ads.txt`
    - Line 1: kept same `ads.txt` entry; removed conflict markers.
  - `fresumes/webapp/public/ads.txt`
    - Line 1: kept same `ads.txt` entry; removed conflict markers.
- Rationale applied:
  - Accepted all incoming frontend changes and pagination.
  - Preserved existing search/query logic (no changes detected in this merge span).
  - Accepted all `fresumes_marketing` changes.
- Next actions:
  - Rebuild containers and restart services.
  - Verify `/ads.txt` serves correctly on marketing and app.
  - Validate `/_expo/static/*` assets load without 404s.
- Fix advanced search fallback for filter-only queries (email/phone).
  - Root cause: frontend clears `text` when email/phone is detected, sending an empty `q` to the backend; Typesense requires a non-empty `q`.
  - Solution: default `q='*'` in advanced search within `backend/src/services/typesenseService.ts` when `q` is undefined/empty, enabling filter-only searches.
  - Logging: `backend/src/controllers/resumeController.ts` now logs the effective `q` value (including defaulted `'*'`) for easier debugging of email-based searches.
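The fallback described above can be sketched as follows. This is a minimal illustration, not the code in `typesenseService.ts`: the names `effectiveQuery` and `buildAdvancedQuery` and the input shape are hypothetical, and the only Typesense details relied on are the `*` wildcard for `q` and the `:=` exact-match filter operator.

```typescript
// Hypothetical input shape; the real controller may pass more fields.
interface AdvancedSearchInput {
  q?: string | null;
  email?: string;
}

interface TypesenseQuery {
  q: string;
  filter_by?: string;
}

// Fall back to the match-all wildcard when q is missing or blank,
// because Typesense rejects an empty q.
function effectiveQuery(q?: string | null): string {
  const trimmed = (q ?? "").trim();
  return trimmed.length > 0 ? trimmed : "*";
}

// Combine the effective q with an exact-match email filter, if one is given.
function buildAdvancedQuery(input: AdvancedSearchInput): TypesenseQuery {
  const q = effectiveQuery(input.q);
  const email = input.email?.trim();
  return email ? { q, filter_by: `email:=${email}` } : { q };
}
```

With an empty `q` and an email filter, the sketch produces `q: "*"` plus the filter, which is the filter-only case the fix targets.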
- Investigation summary:
  - Frontend query flow: `FresumesSearchBar` parses natural language to a unified `filterState` (including `email`/`phone`) and triggers changes via `handleSearchSubmit`/`handleSearchChange`. `ListView.js` merges top bar and sidebar filters and calls `resumeSearch` → `ResumeRepository.findAll`.
  - List loading: `InfiniteFlatList` requests data via `searchCallback(nextPage)`, marks `endOfData` on empty results, and renders `EmptyInfiniteList` when there are no entries.
  - Download tracking: `ListViewCard.js` increments download counts via `v1/resume/${item.key}/increment-download`.
- Backend email filter handling: `typesenseService.ts` indexes `email` and uses exact filter matching in `searchResumesAdvanced`. `resumeController.ts` extracts the `email` query parameter and passes it through to the Typesense service.
- Verification steps:
  - Rebuild/restart backend (`npm run build && npm start` or Docker).
  - In the web app, paste an email into the search bar; expect results and backend logs like `Found X results for q=*`.
  - Natural language with email/phone is treated as advanced filters; free-text `q` is cleared by the UI as intended.
- Optional next step (not implemented): Frontend could skip sending empty `q` when advanced filters are present; backend fallback already supports this.
- Files changed:
  - `backend/src/services/typesenseService.ts`: default `q='*'` for advanced search if empty.
  - `backend/src/controllers/resumeController.ts`: log effective `q` and pass optional `q`.
- Fixed Typesense Container Health Check Issue:
  - Resolved Typesense container showing "unhealthy" status during `docker-compose up`.
  - Root Cause: Original health check used `curl -f http://localhost:8108/health`, but the Typesense Docker image doesn't include `curl` by default.
  - Solution: Updated health check in `docker-compose.yml` to use process-based verification:

    ```yaml
    healthcheck:
      test: ["CMD-SHELL", "ps aux | grep '[t]ypesense-server' || exit 1"]
      interval: 15s
      timeout: 5s
      retries: 3
      start_period: 30s
    ```
  - Benefits:
    - Typesense now starts with `(healthy)` status consistently
    - All services start successfully without dependency failures
    - Improved startup reliability with 30s start period
    - Reduced overhead with 15s health check interval
- Align frontend search sort options with Typesense tokens in `fresumes/components/common/FresumesSearchBar.js`:
  - `Most relevant` → default relevance
  - `Most recent` → `sortBy=recent`
  - `Most viewed` → `sortBy=viewed`
  - `Most Downloaded` → `sortBy=downloaded`
- Remove legacy Elasticsearch parameters (`sortField`, `sortOrder`) from search requests.
- Optimize backend Typesense search in `backend/src/services/typesenseService.ts`:
  - Reduce `query_by` to `name,email,phone,jobTitle,skills,location` and add `query_by_weights`.
  - Set `num_typos=1` and `search_cutoff_ms=50` for faster queries.
  - Limit payload with `include_fields` (core fields) and `exclude_fields=content`.
  - Decrease default per-page results (search: 10, suggestions: 3) for responsiveness.
  - Add in-memory caching for search results and suggestions with TTL and size bounds.
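As a rough sketch, the tuned search parameters listed above could be assembled like this. The parameter names are standard Typesense search parameters; the weight values, the `id` entry in `include_fields`, and the helper name `buildSearchParams` are assumptions made for illustration and are not taken from `typesenseService.ts`.

```typescript
interface SearchParams {
  q: string;
  query_by: string;
  query_by_weights: string;
  num_typos: number;
  search_cutoff_ms: number;
  include_fields: string;
  exclude_fields: string;
  per_page: number;
}

// Fields named in the changelog entry above.
const QUERY_FIELDS = ["name", "email", "phone", "jobTitle", "skills", "location"];

// Hypothetical weights (one per query field); the real values are not documented here.
const QUERY_WEIGHTS = [4, 3, 3, 2, 2, 1];

function buildSearchParams(q: string, perPage: number = 10): SearchParams {
  return {
    q: q.trim() || "*",
    query_by: QUERY_FIELDS.join(","),
    query_by_weights: QUERY_WEIGHTS.join(","),
    num_typos: 1,
    search_cutoff_ms: 50,
    // Assumed "core fields": the document id plus the searchable fields.
    include_fields: ["id", ...QUERY_FIELDS].join(","),
    exclude_fields: "content",
    per_page: perPage,
  };
}
```

Calling `buildSearchParams("react developer")` uses the search default of 10 results per page, while `buildSearchParams("rea", 3)` mirrors the smaller suggestions page size.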
- Security and configuration:
  - Backend uses a server-side Typesense admin key; the frontend does not use any Typesense key.
  - Add/update `backend/.env` with Typesense settings (`TYPESENSE_HOST`, `TYPESENSE_PORT`, `TYPESENSE_PROTOCOL`, `TYPESENSE_API_KEY`).
Refer to backend/src/services/typesenseService.ts and fresumes/infastructure/ResumeRepository.js for parameter mappings and API usage.
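For readers unfamiliar with the "in-memory caching with TTL and size bounds" mentioned above, here is a minimal sketch of one way to do it. The class name `TtlCache`, its constructor arguments, and the oldest-first eviction policy are assumptions; the actual cache in the backend may be structured differently.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number;
}

class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly ttlMs: number;
  private readonly maxEntries: number;
  private readonly now: () => number;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(ttlMs: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      // Expired entries are dropped lazily on read.
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Delete first so a rewritten key moves to the end of insertion order.
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    // Enforce the size bound by evicting the oldest inserted keys.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}
```

A cache key would typically be derived from the full set of search parameters, so that different filters or sort orders never share an entry.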
This workflow lets you ingest many PDF resumes from a mounted server directory into MongoDB, generate page images, and optionally defer Typesense indexing to a single chunked reindex afterward.
- Prerequisites
  - Ensure environment variables are set for backend: `MONGO_URL`, `GOTENBERG_URL`, `TYPESENSE_HOST`, `TYPESENSE_PORT`, `TYPESENSE_PROTOCOL`, `TYPESENSE_API_KEY`.
  - Place PDFs on the server in a directory you can mount, e.g., `/upload`.
  - Ensure Docker Compose mounts `/upload` into the backend container at `/app/uploads` and persists `public/images`.
- Example docker-compose service (excerpt)
  - Mount seed PDFs and images volume on backend:
    - `backend` service:

      ```yaml
      volumes:
        - /upload:/app/uploads
        - ./backend/public/images:/app/public/images
      ```
- Build and start services
  - `docker compose build`
  - `docker compose up -d`
- Run the seeding script (reads PDFs from `/app/uploads`)
  - Seed without Typesense indexing (recommended for speed):
    - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3 --no-index"`
  - Seed with per-resume Typesense indexing:
    - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3"`
  - To skip already existing records entirely, add `--skip-existing`:
    - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=3 --skip-existing"`
  - To defer Typesense indexing during large imports, add `--no-index` and run `npm run reindex` afterward:
    - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=2 --no-index"`
    - `docker compose exec backend sh -lc "npm run reindex"`
  - Performance flags (help with non-English/OCR-heavy PDFs):
    - Fast mode (skip full-OCR, limited page OCR): `--parser-fast` (example)
      - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=2 --parser-fast"`
    - Restrict OCR languages (improves speed/accuracy): `--ocr-langs=eng+deu` (Tesseract language codes)
    - Limit OCR pages processed: `--ocr-max-pages=2`
    - Disable OCR entirely (only text extraction): `--no-ocr`
    - Reduce GPT input length to speed parsing: `--gpt-text-len=6000`
- Notes:
  - Adjust `--concurrency` based on CPU/IO; start with `3`.
  - Deduplication: resumes are skipped if `pdfName` already exists in MongoDB.
  - Images are written under `backend/public/images/<resumeId>/...` and served by `/v1/images/:resumeId/:pageNumber`.
- Chunked Typesense reindex after seeding
  - Recreate schema with infix if mismatched and import in chunks using upsert:
    - `docker compose exec backend npm run reindex -- --chunk=500`
  - Operational guidance:
    - Start with `--chunk=500` and increase if Typesense resources allow.
    - A 10k-document reindex typically completes in minutes to tens of minutes.
    - Upsert action ensures idempotency; safe to retry on failures.
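The chunked import above boils down to slicing the document list into fixed-size batches and sending each batch as newline-delimited JSON, which is the body format the Typesense bulk import endpoint accepts (with `action=upsert`). The sketch below shows only that batching step; `chunk` and `toJsonl` are hypothetical helper names, and the HTTP call itself is left out.

```typescript
// Split items into consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("chunk size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Serialize one batch as JSONL: one JSON document per line.
function toJsonl(batch: readonly object[]): string {
  return batch.map((doc) => JSON.stringify(doc)).join("\n");
}
```

Because each batch is upserted, re-sending a batch after a failure overwrites the same documents rather than duplicating them, which is what makes retries safe.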
- Troubleshooting
  - If images fail to generate, verify Python venv exists in container and `poppler-utils` and `tesseract-ocr` are installed (handled by Dockerfile).
  - Ensure `GOTENBERG_URL` points to a reachable Gotenberg container.
  - If Typesense import errors occur, confirm `TYPESENSE_*` envs and collection readiness.
Use GPT for structured extraction and OCR for scanned PDFs; skip conversion (all inputs are PDFs), defer indexing, then backfill images.
- Stage 1: Parse and store to MongoDB (no images, no conversion, defer indexing)
  - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run seed -- --dir=/app/uploads --concurrency=8 --skip-existing --allow-partial --no-images --no-convert --no-index"`
  - Notes:
    - Requires `OPENAI_API_KEY` and `MONGO_URL` configured in the backend service.
    - Keep OCR enabled by default; optionally tune `--ocr-max-pages=2` and `--ocr-langs=eng`.
    - Adjust `--concurrency` to your CPU/IO and rate limits.
- Stage 2: Generate page images for saved resumes
  - `docker compose exec backend sh -lc "PYTHONPATH=/app/venv/lib/python3.11/site-packages npm run generate-images -- --concurrency=6"`
  - Writes `pageImages` and `totalPages` into MongoDB for records missing images.
- Stage 3: Index into Typesense (chunked)
  - `docker compose exec backend sh -lc "npm run reindex -- --chunk=1000"`
  - Ensures the latest resume data (including `pdfName`) is available to search.
- Windows host directory mount example
  - Map your host folder (e.g., `D:\\Resumes\\all_pdfs`) to container path `/app/uploads` in `docker-compose.yml`:

    ```yaml
    services:
      backend:
        volumes:
          - D:\\Resumes\\all_pdfs:/app/uploads
          - ./backend/public/images:/app/public/images
    ```
  - Then run Stage 1 with `--dir=/app/uploads` as shown above.

## 2025-11-06
- Optimized `fresumes` web ad rendering to prevent scroll freezes:
  - AdCard now initializes AdSense only when visible (IntersectionObserver) and pushes once.
  - Reduced ad height usage in ListViewCard to 280px to avoid large reflows.
  - Notes: Preview requires starting the Expo web dev server in `fresumes/`.
- Upload reliability improvements (Docker Desktop):
  - Backend now returns `400 Bad Request` for non-resume uploads instead of generic `500`.
  - Set `NEXT_PUBLIC_API_BASE_URL=http://localhost/api` in `docker-compose.yml` for marketing site to ensure API calls route via Nginx.
  - If you still see a `500`, check container logs:
    - `docker-compose logs -n 200 backend`
    - `docker-compose logs -n 100 gotenberg`
  - Added `STRICT_RESUME_VALIDATION=false` to backend service to accept valid resumes even when parser extracts minimal fields.
  - Multi-file uploads no longer abort on a single bad file; response includes `{ results, rejections }`.
  - When the parser sets `isResume=false`, the backend now rejects the upload and deletes temp files (original upload and converted PDF) to avoid storing non-resumes.
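The multi-file behaviour above (one bad file no longer aborts the batch) can be sketched as a loop that isolates each file's outcome. Everything here is illustrative: `processUploads`, the `ParseOutcome` shape, and the injected `parse` callback are hypothetical stand-ins for the real parser and controller, and temp-file cleanup is omitted.

```typescript
// Hypothetical parser verdict; only the isResume flag is taken from the notes above.
interface ParseOutcome {
  isResume: boolean;
}

interface Accepted {
  fileName: string;
}

interface Rejected {
  fileName: string;
  reason: string;
}

interface UploadSummary {
  results: Accepted[];
  rejections: Rejected[];
}

function processUploads(
  fileNames: readonly string[],
  parse: (fileName: string) => ParseOutcome,
): UploadSummary {
  const summary: UploadSummary = { results: [], rejections: [] };
  for (const fileName of fileNames) {
    try {
      const outcome = parse(fileName);
      if (outcome.isResume) {
        summary.results.push({ fileName });
      } else {
        summary.rejections.push({ fileName, reason: "not a resume" });
      }
    } catch (err) {
      // A parser failure is recorded for this file only; the loop continues.
      const reason = err instanceof Error ? err.message : String(err);
      summary.rejections.push({ fileName, reason });
    }
  }
  return summary;
}
```

Keeping the `try`/`catch` inside the loop is the design point: an exception from one file is converted into a rejection entry instead of propagating and failing the whole request.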
- Webapp favicon and title update:
  - Updated `fresumes/webapp/index.html` to use `<link rel="icon" type="image/png" href="/logo.png" />` and title `HostResumes`.
  - Adjusted `fresumes/Dockerfile` to copy `assets/logo.png` into Nginx site root (`/usr/share/nginx/html/logo.png`) so the favicon resolves in production.
  - Rebuilt and restarted containers: `docker-compose build` then `docker-compose up -d`.
  - If favicon doesn’t appear, hard-refresh the browser (Ctrl+F5) and clear cache.
  - Added favicon link to Expo web index (`fresumes/web/index.html`), ensuring `/app` uses `/logo.png` as its tab icon.
  - Gateway routing fix: Updated `nginx/nginx.conf` to proxy root-level `/favicon.ico` and `/apple-touch-icon.png` to the `fresumes` app service, avoiding the marketing site favicon overriding the app.
  - Apply the change with `docker-compose restart nginx`.
  - Rule added: When UI changes don’t show, check and update `nginx/nginx.conf` for the app and restart Nginx.
  - Temporarily disabled ad rendering by turning off ad scheduling in `ListViewCard` to remove `AdCard` from the UI while investigating a persistent scroll lock.
  - Rebuilt and restarted containers; previewed `/app` with no console errors.
- Search flow verification (frontend ↔ backend):
  - Confirmed backend mounts under `/v1`; available endpoints: `/resumes`, `/resumes/search`, `/resumes/advanced-search`, `/skills`, `/job_titles`, `/languages`, `/resume_languages`, `/locations`, plus download and image routes.
  - Verified `ResumeRepository.js` uses `API_BASE_URL` from `EXPO_PUBLIC_API_ORIGIN` or defaults to `http://localhost/api` and selects the correct endpoint based on filters/text.
  - Verified `FresumesSearchBar.js` natural-language parsing populates advanced filters; when email/phone is detected, UI clears free-text and relies on advanced search.
  - The active `FresumesFiltersSidebar.js` emits filter-only updates (no suggestions fetch); the saved variant used `/suggestions`, but the backend provides the distinct endpoints listed above instead.
  - Typesense advanced search defaults `q` intelligently (phone digits/name/jobTitle or `*`), maps `sortBy` tokens (`viewed` → `likes:desc`, `downloaded` → `downloads:desc`), and includes fallbacks for missing infix index and request timeouts.
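The sort handling can be pictured as two small lookup tables: UI label to `sortBy` token, then token to a Typesense `sort_by` expression. The sketch below is an assumption-laden illustration. The constant and function names are hypothetical, and only the `viewed` and `downloaded` mappings are documented in this README, so `recent` is deliberately left without a `sort_by` value here.

```typescript
type SortToken = "recent" | "viewed" | "downloaded";

// UI labels and the sortBy token each one sends; "Most relevant" sends none.
const SORT_LABEL_TO_TOKEN: Record<string, SortToken | undefined> = {
  "Most relevant": undefined,
  "Most recent": "recent",
  "Most viewed": "viewed",
  "Most Downloaded": "downloaded",
};

// Only the two documented token mappings are listed; the field used for
// "recent" is not stated in this README, so it is omitted from the sketch.
const TOKEN_TO_SORT_BY: Partial<Record<SortToken, string>> = {
  viewed: "likes:desc",
  downloaded: "downloads:desc",
};

// Returns a Typesense sort_by expression, or undefined to keep default relevance.
function resolveSortBy(label: string): string | undefined {
  const token = SORT_LABEL_TO_TOKEN[label];
  return token ? TOKEN_TO_SORT_BY[token] : undefined;
}
```

Returning `undefined` for unknown labels means an unexpected value from the UI falls back to relevance ordering instead of producing an invalid `sort_by`.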
- Next steps:
  - If reintroducing suggestions UI, wire to `/v1/skills`, `/v1/job_titles`, and `/v1/locations`.
- Validate end-to-end search with email/phone/name and multi-location filters via the web app; rebuild if gateway routing changes.
- Search refinements in `fresumes/components/common/FresumesSearchBar.js`:
  - Removed auto-clearing of free-text `q` when email or phone is detected; text queries now work alongside email/phone filters.
  - Normalized and deduplicated `locations` before sending to backend; always passed as a trimmed array.
- Impact: Enables combined text + email/phone searches and ensures stable multi-location filtering.
- How to validate on running stack (`docker-compose` already up):
  - In `/app`, try queries like `react developer in Seattle, Austin` and `john@example.com Bangalore`.
  - Confirm results update without clearing the text field and the locations filter applies correctly.
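A minimal sketch of the location normalization described above, assuming that "normalized" means trimming and collapsing whitespace and that duplicates are compared case-insensitively. Both of those rules, and the name `normalizeLocations`, are assumptions rather than the documented behaviour of `FresumesSearchBar.js`.

```typescript
// Accepts a single string, an array, or nothing, and always returns a clean array.
function normalizeLocations(input: string | readonly string[] | undefined): string[] {
  const raw: readonly string[] =
    input === undefined ? [] : typeof input === "string" ? [input] : input;
  const seen = new Set<string>();
  const result: string[] = [];
  for (const entry of raw) {
    // Trim and collapse internal runs of whitespace.
    const cleaned = entry.trim().replace(/\s+/g, " ");
    if (cleaned === "") continue;
    // Assumed rule: "Seattle" and "seattle" count as the same location.
    const key = cleaned.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    result.push(cleaned);
  }
  return result;
}
```

The first spelling encountered is the one kept, so the order the user typed the locations in is preserved.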
- Image 404s after search (root cause and fix):
  - Cause: Typesense had stale IDs not present on disk/Mongo, causing `/api/images/:id/:page` to return 404.
  - Fix: Reindexed Typesense from Mongo (`docker-compose exec backend npm run reindex`). Added missing-only mode to `generateImages` script for future checks.
  - Validation: Backend image endpoint returns 200 for valid IDs; stale IDs no longer appear in search results.
- Integrated AI-assisted search refinement using GPT-4o Mini in backend advanced search.
  - Adds optional `ai=1` flag to `advanced-search` requests to enable refinement per-query.
  - Environment toggle `AI_SEARCH_ENABLED=1` enables refinement globally (backend service env).
  - Refinement outputs structured filters (names, emails, phones, skills, job titles, locations, languages) and a corrected free-text `q`, merged before Typesense query.
  - Affected files:
    - `backend/src/utils/enhancedResumeParser.py`: new `refine_search_query_with_chatgpt` and CLI `--refine-search` mode.
    - `backend/src/utils/enhancedResumeParser.ts`: wrapper `refineSearchQueryWithAI` to call Python helper.
    - `backend/src/controllers/resumeController.ts`: merges AI-refined fields into `searchResumesAdvanced` parameters.
    - `backend/src/services/typesenseService.ts`: already supports multi-value arrays; constructs effective `q`.
  - Build & restart backend:
    - `docker-compose build backend; docker-compose up -d backend`
  - Usage examples:
    - `GET /api/v1/resumes/advanced-search?q=react developer&skills=react,node&ai=1`
    - Global enable: set `AI_SEARCH_ENABLED=1` in backend env and call advanced search without `ai=1`.
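The two ways of switching refinement on (per request or globally) amount to a simple OR. The sketch below assumes the literal value `1` is the only one that counts as enabled; the function name `isAiRefinementEnabled` and the plain-object view of the query string and environment are hypothetical.

```typescript
type StringMap = Record<string, string | undefined>;

// Refinement runs when the request carries ai=1 or the backend env sets AI_SEARCH_ENABLED=1.
function isAiRefinementEnabled(query: StringMap, env: StringMap): boolean {
  return query["ai"] === "1" || env["AI_SEARCH_ENABLED"] === "1";
}
```

Passing the environment in as an argument, rather than reading it inside the function, keeps the check easy to exercise with different configurations.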
- Multi-token `q` union in basic search:
  - Enhanced `GET /resumes/search` to split `q` on comma, "and", `&`, or `|` and return the union of unique results across tokens.
  - Works for locations, job titles, skills, names, and other fields included in `query_by`.
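The split-and-union behaviour can be sketched as below. This is an illustration under assumptions: the names `splitQueryTokens` and `searchUnion` are hypothetical, "and" is treated as a separator only when it stands alone as a word, results are deduplicated by an `id` field, and the per-token search is passed in as a synchronous callback to keep the example self-contained.

```typescript
interface Hit {
  id: string;
}

// Split on comma, &, |, or the standalone word "and"; drop empty tokens.
function splitQueryTokens(q: string): string[] {
  return q
    .split(/\s*(?:,|&|\||\band\b)\s*/i)
    .map((token) => token.trim())
    .filter((token) => token.length > 0);
}

// Run one search per token and keep the first occurrence of each document id.
function searchUnion<T extends Hit>(q: string, searchOne: (token: string) => T[]): T[] {
  const byId = new Map<string, T>();
  for (const token of splitQueryTokens(q)) {
    for (const hit of searchOne(token)) {
      if (!byId.has(hit.id)) byId.set(hit.id, hit);
    }
  }
  return Array.from(byId.values());
}
```

The word-boundary check on "and" keeps names such as `Portland` intact, and because the result is a union keyed by id, swapping the token order changes only the ordering of hits, not which documents are returned.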
- Multi-location parameter handling:
  - Frontend now serializes `locations` arrays using `|` to preserve full strings with commas (e.g., `Lille, France|Paris, France`).
  - Backend advanced search parses `locations` by `|` and filters with OR over exact matches.
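To show why `|` is used as the separator, here is a small round-trip sketch. The helper names are hypothetical and do not correspond to `objectToQueryParamArray` or the controller code; the point is only that a comma-based split would break `Lille, France` in two, while a pipe-based split does not.

```typescript
// Frontend side: join cleaned location strings with "|".
function serializeLocations(locations: readonly string[]): string {
  return locations
    .map((loc) => loc.trim())
    .filter((loc) => loc !== "")
    .join("|");
}

// Build the query-string fragment; encodeURIComponent escapes commas, spaces, and pipes.
function locationsQueryParam(locations: readonly string[]): string {
  return `locations=${encodeURIComponent(serializeLocations(locations))}`;
}

// Backend side: split the decoded parameter value on "|" only.
function parseLocations(param: string | undefined): string[] {
  if (!param) return [];
  return param
    .split("|")
    .map((part) => part.trim())
    .filter((part) => part !== "");
}
```

The fully percent-encoded form produced here decodes to the same value as the lighter encoding shown in the validation example, so either form should reach the backend as `Lille, France|Paris, France`.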
- Files changed:
  - `backend/src/controllers/resumeController.ts`: multi-token union handling in `searchResumes`; advanced `locations` parsing uses `|`.
  - `fresumes/infastructure/ResumeRepository.js`: `objectToQueryParamArray` uses `|` when serializing `locations` arrays.
- Build & restart:
  - `docker-compose build backend fresumes && docker-compose up -d backend fresumes`
- Validation:
  - `GET /api/resumes/search?q=Mexico, United States&limit=5` and `GET /api/resumes/search?q=United States, Mexico&limit=5` both return combined results.
  - Verified advanced search with `locations=Lille,%20France|Paris,%20France` returns appropriate OR matches.