Typeahead System

This repository contains documentation and an architecture overview for a Typeahead Search System that I implemented during a two-week work trial at Mercor as a Software Engineer.
The implementation code is omitted because it is company property; this repo instead demonstrates my understanding of the system architecture, design decisions, and implementation approach.

Technologies Used

Frontend

Backend

Database & Cache

Authentication

DevOps & Infrastructure

Problem Statement

Recruiters on the Mercor team platform frequently perform similar searches for contractors, often repeating the same queries or making minor modifications to previous searches. Currently, there is no mechanism to leverage their search history, forcing recruiters to manually re-enter search terms each time. This leads to:

Inefficient workflows and wasted time.
Inconsistent search parameters across similar searches.
Difficulty in recalling effective previous search queries.
Reduced productivity for recruiters managing multiple roles.

Objectives

The primary goal was to implement a "Saved Queries/Typeahead" feature that would:

Automatically save user search queries in the database.
Store both the search text and associated filters (skills, experience, etc.).
Rank suggestions per user based on the frequency and recency of searches.
Provide typeahead suggestions from historical searches as users type in the search bar.
Display up to 5 relevant previous searches in a dropdown.
Allow keyboard navigation through suggestions (arrow keys, Tab, and Enter).
Handle long queries gracefully with truncation in the UI.

Success Metrics

Adoption Rate: Share of recruiters who use the typeahead feature.
Time Savings: Reduction in time spent on search operations.
User Satisfaction: Positive feedback from recruiters.
Performance: Typeahead suggestions appear within 100ms of typing.
Technical: Zero regression in existing search functionality.

System Architecture

The system consisted of multiple components working together to provide a responsive and efficient typeahead search experience:

Backend Components

Create a new MongoDB collection for saving search history.
Use Redis for fast typeahead lookups.
Add a new API endpoint for typeahead suggestions.
Add query deduplication logic to prevent near-duplicates.
Build a service that first retrieves suggestions from Redis and falls back to MongoDB on a cache miss.

Query Deduplication Logic:

The deduplication logic I implemented in the backend to prevent near-duplicates is illustrated in Deduplication Logic.png.
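
The exact rules are not spelled out in this README, so the following is only a minimal sketch of one possible approach. It assumes that "near-duplicates" are queries that differ only in case, extra whitespace, or trailing punctuation, and it ignores filters entirely. The names normalize_query and find_duplicate are hypothetical.

```python
import re
from typing import Dict, Optional


def normalize_query(query: str) -> str:
    """Lowercase, collapse runs of whitespace, and trim edge punctuation."""
    collapsed = re.sub(r"\s+", " ", query.strip().lower())
    return collapsed.strip(" .,;:!?")


def find_duplicate(new_query: str, existing: Dict[str, dict]) -> Optional[dict]:
    """Return the stored record whose normalized text matches, if any.

    `existing` is assumed to be keyed by the normalized query text.
    """
    return existing.get(normalize_query(new_query))
```

With this approach, "REACT  developer" and "react developer." would both map to the stored "react developer" record, so its use_count can be incremented instead of creating a new entry.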

Frontend Components

Enhanced search bar with typeahead functionality.
Dropdown component for displaying suggestions.
Keyboard navigation support.
Integration with existing search functionality.

Data Flow Explanation

User Input Flow:

User types in the SearchBar/TeamSearch component.
Input is captured and debounced (300ms) via the useDebounceValue hook.
The useTypeahead hook manages:
- Suggestion state.
- Selection navigation.
- Keyboard interactions.
- API communication.
A Firebase authentication token is obtained for API requests.

Suggestion Retrieval:

Frontend API Layer:
- Makes authenticated GET request to /team/typeahead?prefix={query}.
- Handles token refresh and request retries.
- Manages error states.
Backend Processing:
- Validates Firebase token via FirebaseTokenAuthentication.
- TypeaheadAPIView processes the request.
Typeahead Service Layer (services/typeahead.py):
- First checks Redis cache (1-hour TTL).
- Falls back to MongoDB for cache misses.
- Manages caching strategy.
- Returns up to 5 most relevant suggestions.
- Sorts by use_count and timestamp.
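
The ranking step of the service layer can be sketched as follows. This is an illustration rather than the code in services/typeahead.py: the function name rank_suggestions is hypothetical, prefix matching is assumed to be case-insensitive, use_count is assumed to take priority over timestamp, and timestamps are assumed to be ISO-8601 strings in a single format so that they sort chronologically as plain strings.

```python
from typing import List

MAX_SUGGESTIONS = 5  # "up to 5 most relevant suggestions"


def rank_suggestions(records: List[dict], user_email: str, prefix: str) -> List[dict]:
    """Return one user's saved queries that start with `prefix`, best first."""
    needle = prefix.lower()
    matches = [
        r for r in records
        if r["user_email"] == user_email and r["query"].lower().startswith(needle)
    ]
    # Most frequently used first; ties are broken by the most recent timestamp.
    matches.sort(key=lambda r: (r["use_count"], r["timestamp"]), reverse=True)
    return matches[:MAX_SUGGESTIONS]
```

Filtering on user_email before ranking keeps suggestions personal to each recruiter, which matches the user-specific behaviour described in this README.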

User Interaction Handling:

Keyboard Navigation:
- ↑/↓: Navigate through suggestions.
- Enter: Select and execute search.
- Tab: Complete suggestion text.
- Escape: Close dropdown.
Mouse Interaction:
- Click: Select suggestion.
Selection Processing:
- Updates search input.
- Applies saved hard_filters.
- Triggers search execution.
- Updates URL parameters.
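
The keyboard handling above amounts to a small state machine. The sketch below expresses it in Python purely for illustration; the real logic lives in the frontend useTypeahead hook. The DropdownState type, the handle_key function, and the wrap-around behaviour at the ends of the list are all assumptions.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class DropdownState:
    suggestions: Tuple[str, ...]
    input_text: str = ""
    selected: int = -1              # -1 means nothing is highlighted
    is_open: bool = True
    submitted: Optional[str] = None  # query sent to search, if any


def handle_key(state: DropdownState, key: str) -> DropdownState:
    """Return the next dropdown state for a single key press."""
    count = len(state.suggestions)
    if key == "Escape":
        return replace(state, is_open=False, selected=-1)
    if not state.is_open or count == 0:
        return state
    if key == "ArrowDown":
        return replace(state, selected=(state.selected + 1) % count)
    if key == "ArrowUp":
        previous = count - 1 if state.selected <= 0 else state.selected - 1
        return replace(state, selected=previous)
    if state.selected >= 0 and key in ("Tab", "Enter"):
        text = state.suggestions[state.selected]
        if key == "Tab":
            # Tab only completes the text; the search is not executed.
            return replace(state, input_text=text)
        return replace(state, input_text=text, submitted=text, is_open=False)
    return state
```

Keeping the transition logic in a pure function like this makes it easy to unit test independently of the rendering layer.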

Search Query Management:

Search Execution:
- Validates query length (minimum 3 characters).
- Processes hard filters.
- Executes search with parameters.
Query Storage:
- Automatically saves in MongoDB:
  - user_email
  - query text
  - hard_filters
  - timestamp
  - use_count
- Maintains rolling history (last 50 queries).
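
The storage rules above can be sketched with an in-memory dictionary standing in for the MongoDB collection. This is an assumption-laden illustration: save_query is a hypothetical name, the history is assumed to hold one user's records keyed by query text, and "last 50 queries" is interpreted as the 50 most recently used distinct queries. Against a real collection, the same effect would typically be achieved with an upsert that increments use_count.

```python
from datetime import datetime, timezone
from typing import Dict, Optional

MIN_QUERY_LENGTH = 3   # queries shorter than this are not searched or saved
MAX_HISTORY = 50       # rolling history size per user


def save_query(history: Dict[str, dict], user_email: str, query: str,
               hard_filters: Optional[dict] = None,
               now: Optional[datetime] = None) -> Optional[dict]:
    """Insert or update one query in a single user's history, then trim it."""
    text = query.strip()
    if len(text) < MIN_QUERY_LENGTH:
        return None
    now = now or datetime.now(timezone.utc)
    record = history.get(text)
    if record is None:
        record = {"user_email": user_email, "query": text,
                  "hard_filters": hard_filters or {},
                  "timestamp": now, "use_count": 1}
        history[text] = record
    else:
        # Repeated query: bump the counter and refresh recency.
        record["use_count"] += 1
        record["timestamp"] = now
        if hard_filters is not None:
            record["hard_filters"] = hard_filters
    # Evict the least recently used entries beyond the rolling limit.
    while len(history) > MAX_HISTORY:
        oldest = min(history.values(), key=lambda r: r["timestamp"])
        del history[oldest["query"]]
    return record
```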

Caching Strategy:

Redis provides fast prefix matching for typeahead.
It caches suggestions with a 1-hour TTL.
Key format: typeahead:{user_email}:{prefix}.
Graceful degradation to MongoDB if Redis is unavailable.
User-specific caching ensures privacy and relevance.
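
A minimal sketch of this caching strategy follows. To stay dependency-free it uses an in-memory stand-in that exposes get and setex like a Redis client, and it signals an outage with the built-in ConnectionError; a real client raises its own exception types (for redis-py, subclasses of redis.exceptions.RedisError). The names cache_key, InMemoryCache, UnavailableCache, and cached_suggestions are hypothetical.

```python
import json
import time
from typing import Callable, List

CACHE_TTL_SECONDS = 3600  # 1-hour TTL


def cache_key(user_email: str, prefix: str) -> str:
    """Build the per-user key in the typeahead:{user_email}:{prefix} format."""
    return f"typeahead:{user_email}:{prefix}"


class InMemoryCache:
    """Stand-in cache with get/setex and expiry, for illustration only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or entry[1] <= self._clock():
            self._data.pop(key, None)  # drop expired entries lazily
            return None
        return entry[0]

    def setex(self, key, ttl_seconds, value):
        self._data[key] = (value, self._clock() + ttl_seconds)


class UnavailableCache:
    """Simulates a cache outage."""

    def get(self, key):
        raise ConnectionError("cache unavailable")

    def setex(self, key, ttl_seconds, value):
        raise ConnectionError("cache unavailable")


def cached_suggestions(cache, user_email: str, prefix: str,
                       load_from_db: Callable[[], List[dict]]) -> List[dict]:
    """Serve from the cache when possible, otherwise fall back to the database."""
    key = cache_key(user_email, prefix)
    try:
        hit = cache.get(key)
        if hit is not None:
            return json.loads(hit)
    except ConnectionError:
        return load_from_db()  # graceful degradation: skip the cache entirely
    suggestions = load_from_db()
    try:
        cache.setex(key, CACHE_TTL_SECONDS, json.dumps(suggestions))
    except ConnectionError:
        pass  # a failed cache write must not fail the request
    return suggestions
```

Because the user's email is part of the key, one recruiter's cached suggestions are never served to another.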

Data Storage

Created a new MongoDB Collection:

Collection Name: "user_search_queries"
Database Name: "typeahead"

Detailed Schema Breakdown

{
  // Required Fields
  user_email: {
    type: String,
    required: true,
    index: true              // Indexed for faster user-specific queries.
  },
  query: {
    type: String,
    required: true,
    index: true              // Indexed for faster prefix matching.
  },
  display_text: {
    type: String,
    required: true           // Truncated version of query (max 50 chars).
  },
  hard_filters: {
    type: Object,            // Stores search filters.
    default: {}
  },

  // Timestamps
  timestamp: {
    type: DateTime,
    required: true,
    index: true              // Updated every time the query is used. Indexed for sorting by recency.
  },
  created_at: {
    type: DateTime,          // First creation time. Useful for analytics & tracking query history.
    required: true
  },
  updated_at: {
    type: DateTime,          // Last modification time. Changes when query filters are modified.
    required: true
  },

  // Usage Statistics - Tracks how often a query is used for sorting suggestions.
  use_count: {
    type: Integer,
    default: 1,
    required: true
  }
}
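
To make the schema concrete, here is a minimal sketch of how a new document matching it might be built. The helper names make_display_text and new_query_document are hypothetical, and marking truncated text with a trailing ellipsis is an assumption; the schema only specifies a 50-character maximum for display_text.

```python
from datetime import datetime, timezone
from typing import Optional

DISPLAY_TEXT_MAX = 50  # max length of display_text per the schema


def make_display_text(query: str, limit: int = DISPLAY_TEXT_MAX) -> str:
    """Return the query unchanged if it fits, else a truncated version."""
    if len(query) <= limit:
        return query
    # Reserve one character for the ellipsis so the result stays within limit.
    return query[: limit - 1].rstrip() + "…"


def new_query_document(user_email: str, query: str,
                       hard_filters: Optional[dict] = None,
                       now: Optional[datetime] = None) -> dict:
    """Build a first-time document with all required fields populated."""
    now = now or datetime.now(timezone.utc)
    return {
        "user_email": user_email,
        "query": query,
        "display_text": make_display_text(query),
        "hard_filters": hard_filters or {},
        "timestamp": now,    # refreshed on every later use
        "created_at": now,   # never changes after creation
        "updated_at": now,   # changes when the filters are modified
        "use_count": 1,
    }
```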

API Endpoint

Endpoint	Method	Description
`/team/search/typeahead`	GET	Get typeahead suggestions based on prefix.

Query Parameter:

prefix: string (required) - The search prefix to match against.

Sample Response:

{
  "suggestions": [
    {
      "query": "react developer",
      "display_text": "React Developer",
      "hardFilters": {
        "tags": ["frontend", "javascript"],
        "status": ["available"]
      },
      "timestamp": "2023-03-01T12:34:56Z"
    }
  ]
}
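
A client for this endpoint could be sketched as below. The base URL and the use of a Bearer scheme in the Authorization header for the Firebase token are assumptions, as are the function names. The path follows the API Endpoint table; the Data Flow section writes it as /team/typeahead, so treat the exact path as an assumption too.

```python
import json
from typing import List
from urllib.parse import urlencode
from urllib.request import Request

TYPEAHEAD_PATH = "/team/search/typeahead"  # path as listed in the endpoint table


def build_typeahead_request(base_url: str, prefix: str, id_token: str) -> Request:
    """Build (but do not send) an authenticated GET request for suggestions."""
    query_string = urlencode({"prefix": prefix})
    url = f"{base_url.rstrip('/')}{TYPEAHEAD_PATH}?{query_string}"
    # Assumption: the Firebase ID token is sent as a Bearer token.
    return Request(url, headers={"Authorization": f"Bearer {id_token}"}, method="GET")


def parse_suggestions(body: str) -> List[dict]:
    """Extract the suggestions list from a JSON response body."""
    payload = json.loads(body)
    return payload.get("suggestions", [])
```

The request object can be passed to urllib.request.urlopen to perform the call; building and parsing are kept separate here so both can be tested without a network connection.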

Risks and Mitigations

Future Enhancements

While they were out of scope for the initial implementation, the following enhancements could be considered for future iterations:

Semantic similarity for better deduplication of queries.
Filter-aware suggestions: Suggest queries that are relevant to currently applied filters.
Popular searches: Show trending searches across the platform.
Advanced RediSearch features such as fuzzy matching for typo tolerance.

Conclusion

The Typeahead feature significantly improved the efficiency of recruiters using the Mercor team platform. By leveraging search query data and implementing a high-performance caching layer with Redis, we provided a seamless, Google/Amazon-like experience where previous searches appear as suggestions without requiring any explicit action from the user. The implementation focuses on:

Performance: Fast responses through Redis.
Context preservation: Maintaining filters with queries.
Minimal backend changes: Leveraging existing infrastructure.
Graceful degradation: Ensuring reliability through fallback mechanisms.

This approach allowed us to deliver a high-quality feature quickly while maintaining a path for future enhancements. The feature was unobtrusive yet helpful, reducing the cognitive load on recruiters and helping them quickly access their most relevant previous searches.
