Exposes endpoints to shorten URLs, redirect to original URLs, and fetch analytics.
Contains the logic for URL shortening, cleanup, and analytics.
Stores the mappings between short URLs and long URLs, plus the access analytics.
Stores short-URL mappings in memory for fast lookups.
- Expire Service - handles removal of old URL mappings.
- FlushHits Service - handles flushing hit counts to the persistent database.
Shortens a given long URL and returns the corresponding short URL.
Request

```
POST /api/shorten
{
  "long_url": "http://example.com"
}
```

Response

```
{
  "short_url": "http://short.ly/abcd123"
}
```
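The docs don't pin down how the short identifier is generated; below is a minimal sketch of one common approach, a random base62 code (the `generateCode` name and the code length are illustrative, not part of the project):

```go
import (
	"crypto/rand"
	"math/big"
)

// Alphabet used for short codes such as "abcd123".
const base62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// generateCode returns a cryptographically random base62 string of length n.
// On the rare collision with an existing short_url, the caller would retry.
func generateCode(n int) (string, error) {
	code := make([]byte, n)
	for i := range code {
		idx, err := rand.Int(rand.Reader, big.NewInt(int64(len(base62))))
		if err != nil {
			return "", err
		}
		code[i] = base62[idx.Int64()]
	}
	return string(code), nil
}
```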
Redirects the user to the long URL corresponding to the short URL.
Request

```
GET /abcd123
```

Response

Redirect to `http://example.com`.
Returns analytics for a specific short URL (e.g., number of accesses).
Request

```
GET /api/stats/abcd123
```

Response

```
{
  "hits": 100
}
```
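A sketch of the lookup behind this endpoint, assuming the official Go MongoDB driver and the `stats` collection described in the data model below (the `getHits` helper is illustrative). Because hit counts are flushed from Redis every few seconds, the stored value may lag slightly behind real traffic:

```go
import (
	"context"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

// getHits returns the hits counter for a short URL id from the stats collection.
func getHits(ctx context.Context, stats *mongo.Collection, shortID string) (int64, error) {
	var doc struct {
		Hits int64 `bson:"hits"`
	}
	if err := stats.FindOne(ctx, bson.M{"url": shortID}).Decode(&doc); err != nil {
		return 0, err // mongo.ErrNoDocuments when the short URL is unknown
	}
	return doc.Hits, nil
}
```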
- URL Mapping - A mapping between the short URL and the original long URL.
- Statistics - For analytics, we need to track how many times a short URL is accessed.
- `_id`: Unique ID for each entry.
- `short_url`: Unique identifier (e.g., `abcd123`).
- `url`: The original long URL.
- `created_at`: Timestamp when the short URL was created.
- `_id`: Unique ID for each entry.
- `url`: The shortened URL's unique identifier (e.g., `abcd123`).
- `hits`: The number of times this short URL was hit.
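For reference, a sketch of how these documents could map onto Go structs with the official MongoDB driver (struct names are illustrative; the field tags follow the schema above):

```go
import (
	"time"

	"go.mongodb.org/mongo-driver/bson/primitive"
)

// URLMapping mirrors a document in the urls collection.
type URLMapping struct {
	ID        primitive.ObjectID `bson:"_id,omitempty"`
	ShortURL  string             `bson:"short_url"`
	URL       string             `bson:"url"`
	CreatedAt time.Time          `bson:"created_at"`
}

// Stat mirrors a document in the stats collection.
type Stat struct {
	ID   primitive.ObjectID `bson:"_id,omitempty"`
	URL  string             `bson:"url"` // the short URL's unique id
	Hits int64              `bson:"hits"`
}
```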
- Cache short-URL-to-long-URL mappings for faster redirect lookups.
- Cache the mappings for a short duration (e.g., 10 seconds). On every redirect, update the analytics data (e.g., increment the hit count); see the sketch below.
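A minimal sketch of the redirect path under this strategy, assuming `go-redis` and the official MongoDB driver (the handler shape and the `url:` cache-key prefix are illustrative; the `stats:` prefix matches the FlushHits worker below):

```go
import (
	"context"
	"net/http"
	"time"

	"github.com/redis/go-redis/v9"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

const cacheTTL = 10 * time.Second // short TTL keeps cache memory bounded

// redirect resolves a short id to its long URL (cache-aside) and bumps the
// hit counter in Redis off the request path.
func redirect(rdb *redis.Client, urls *mongo.Collection) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		ctx := r.Context()
		id := r.URL.Path[1:] // e.g. "abcd123"

		longURL, err := rdb.Get(ctx, "url:"+id).Result()
		if err == redis.Nil {
			// Cache miss: fall back to MongoDB, then populate the cache.
			var doc struct {
				URL string `bson:"url"`
			}
			if err := urls.FindOne(ctx, bson.M{"short_url": id}).Decode(&doc); err != nil {
				http.NotFound(w, r)
				return
			}
			longURL = doc.URL
			rdb.Set(ctx, "url:"+id, longURL, cacheTTL)
		} else if err != nil {
			http.Error(w, "cache error", http.StatusInternalServerError)
			return
		}

		// Aggregate the hit in Redis in a separate goroutine (batching).
		go rdb.Incr(context.Background(), "stats:"+id)

		http.Redirect(w, r, longURL, http.StatusFound)
	}
}
```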
Standalone process that expires old URL mappings and performs cleanup in the system (sketched below).
- Loop every 3 seconds and perform cleanup if old records are found.
- Scan the `urls` collection, filtering for records older than 15 minutes.
- Loop over the found records.
- Delete each record from the `urls` and `stats` collections, then invalidate the cached entry in Redis if it exists.
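A sketch of that loop, under the same driver assumptions as above (names are illustrative):

```go
import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

// expireLoop runs every 3 seconds and removes mappings older than 15 minutes,
// along with their stats records and any cached entry in Redis.
func expireLoop(ctx context.Context, urls, stats *mongo.Collection, rdb *redis.Client) {
	ticker := time.NewTicker(3 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
		cutoff := time.Now().Add(-15 * time.Minute)
		cur, err := urls.Find(ctx, bson.M{"created_at": bson.M{"$lt": cutoff}})
		if err != nil {
			continue
		}
		for cur.Next(ctx) {
			var doc struct {
				ShortURL string `bson:"short_url"`
			}
			if cur.Decode(&doc) != nil {
				continue
			}
			urls.DeleteOne(ctx, bson.M{"short_url": doc.ShortURL})
			stats.DeleteOne(ctx, bson.M{"url": doc.ShortURL})
			rdb.Del(ctx, "url:"+doc.ShortURL) // invalidate cache if present
		}
		cur.Close(ctx)
	}
}
```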
Standalone process that collects hits aggregated in Redis and flushes the stats to MongoDB (sketched below).
- Loop every 3 seconds and write to MongoDB if new keys are found.
- Scan Redis with the match pattern `stats:*`.
- Loop over the found keys.
- Write each one to the relevant record in the `stats` collection, then delete the keys in Redis.
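A sketch of the flush loop (per-key updates for brevity; a real worker might batch them with `BulkWrite`). The README describes write-then-delete; this sketch uses `GETDEL` (Redis ≥ 6.2) so the read and delete are atomic and no increments are lost between them:

```go
import (
	"context"
	"strconv"
	"strings"
	"time"

	"github.com/redis/go-redis/v9"
	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

// flushLoop runs every 3 seconds, drains stats:* counters from Redis, and
// folds them into the stats collection with $inc (upserting new records).
func flushLoop(ctx context.Context, stats *mongo.Collection, rdb *redis.Client) {
	ticker := time.NewTicker(3 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
		iter := rdb.Scan(ctx, 0, "stats:*", 100).Iterator()
		for iter.Next(ctx) {
			key := iter.Val()
			// GETDEL reads and removes the counter atomically; increments
			// arriving afterwards go into a fresh key for the next pass.
			val, err := rdb.GetDel(ctx, key).Result()
			if err != nil {
				continue
			}
			hits, err := strconv.ParseInt(val, 10, 64)
			if err != nil {
				continue
			}
			id := strings.TrimPrefix(key, "stats:")
			stats.UpdateOne(ctx,
				bson.M{"url": id},
				bson.M{"$inc": bson.M{"hits": hits}},
				options.Update().SetUpsert(true),
			)
		}
	}
}
```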
To ensure scalability and handle a high traffic rate, a few measures were taken when designing and implementing persistence for access analytics and mapping retrieval:
- Batching - access events are handled during the redirect logic in a separate goroutine and aggregated in Redis using the INCR command; a separate worker process then collects the keys from Redis and writes them to MongoDB in batches.
- Cache-Aside - caching of URL mappings is implemented in the redirect logic using a lazy-loading (cache-aside) strategy, with a TTL on keys to avoid using too much cache memory and to avoid overloading MongoDB with mapping retrievals.
- Rate Limiting - TODO: rate limit the `/api/shorten` endpoint to avoid overloading the database with inserts and using too much storage (one possible approach sketched below).
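One possible shape for that TODO, a per-IP token bucket built on `golang.org/x/time/rate` (entirely illustrative; nothing like this exists in the project yet):

```go
import (
	"net"
	"net/http"
	"sync"

	"golang.org/x/time/rate"
)

// limitShorten wraps the shorten handler with a per-IP token bucket:
// 1 request/second with a burst of 5. Tuning the limits and evicting
// idle entries are left out for brevity.
func limitShorten(next http.Handler) http.Handler {
	var (
		mu       sync.Mutex
		limiters = map[string]*rate.Limiter{}
	)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr
		}
		mu.Lock()
		lim, ok := limiters[ip]
		if !ok {
			lim = rate.NewLimiter(rate.Limit(1), 5)
			limiters[ip] = lim
		}
		mu.Unlock()
		if !lim.Allow() {
			http.Error(w, "too many requests", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```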