A small Express + TypeScript project demonstrating an LRU in-memory cache with TTL and background sweeping, concurrent-request coalescing, an async queue for simulated DB calls, and a combined rate limiter (token-bucket + sliding-window burst control).
- Language: TypeScript
- Framework: Express
- LRU cache with TTL (60s) + background sweeper
- Cache stats (hits / misses / size)
- Concurrent-request coalescing (requests for same id share one fetch)
- Async queue for DB fetches (configurable concurrency)
- Rate limiting: 10 req/min sustained + burst: max 5 req in any 10s
- Endpoints: GET /users/:id, POST /users, DELETE /cache, GET /cache-status
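A condensed sketch of the request flow (illustrative only; it leans on the `LruCache`, `checkLimit`, `getUser`, and `fetchQueue` helpers sketched in the design notes further down, and details may differ from the actual code):

```typescript
import express from 'express';

// Illustrative sketch of the request flow; LruCache, checkLimit, getUser,
// and fetchQueue are the helpers sketched in the design notes below.
const app = express();
app.use(express.json());

const cache = new LruCache<unknown>(100, 60_000);

app.get('/users/:id', async (req, res) => {
  // Rate limiting first: 429 with a message naming the limit that was hit.
  const limited = checkLimit(req.ip ?? 'unknown');
  if (limited) return res.status(429).json({ error: limited });

  const cached = cache.get(req.params.id);
  if (cached !== undefined) return res.json(cached); // cache hit: near-instant

  const user = await getUser(req.params.id); // coalesced, queued DB fetch
  cache.set(req.params.id, user);
  return res.json(user);
});

app.get('/cache-status', (_req, res) =>
  res.json({ cache: cache.stats(), queue: fetchQueue.stats() })
);

app.delete('/cache', (_req, res) => {
  cache.clear();
  res.sendStatus(204);
});

app.listen(3000, () => console.log('listening on :3000'));
```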
- Test caching (first vs subsequent)
- Request once (first fetch; ~200ms):
curl -s http://localhost:3000/users/1
- Request again immediately (should be near-instant):
curl -s http://localhost:3000/users/1
- Test concurrent-request coalescing
- Spawn multiple concurrent requests for same id — only one simulated DB fetch should run:
for i in {1..10}; do curl -s http://localhost:3000/users/1 & done; wait
Observe the total time: similar to one DB fetch plus tiny overhead.
- Test async queue (concurrency control)
- Send many requests for different ids and watch queue running/queued values via /cache-status:
for i in $(seq 1 20); do curl -s "http://localhost:3000/users/$(( (i % 4) + 1 ))" & done; wait
curl http://localhost:3000/cache-status | jq
- Test rate limiting
- Burst test (5 allowed in 10s):
for i in {1..6}; do curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000/users/1 & done; wait
The 6th request should return 429 with a burst-limit message.
- Type: LRU (least recently used) in-memory cache.
- TTL: each entry carries an expiresAt timestamp (default 60 seconds).
- Eviction: LRU eviction when capacity exceeded; background sweeper runs periodically (e.g., every 5s) to remove expired entries.
- Stats: hits, misses, and size are available via cache.stats() and shown in /cache-status.
- Concurrency safety: get() moves the accessed node to the head (most recently used); writes re-check the cache before setting, so concurrently resolving fetches don't overwrite each other.
- LRU keeps frequently accessed items in memory; TTL keeps stale data from living indefinitely.
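A minimal sketch of the cache described above. The actual class may use a doubly linked list for recency; this sketch exploits `Map`'s insertion order instead, and names and defaults are illustrative:

```typescript
type Entry<V> = { value: V; expiresAt: number };

class LruCache<V> {
  private map = new Map<string, Entry<V>>();
  private hits = 0;
  private misses = 0;

  constructor(private capacity = 100, private ttlMs = 60_000, sweepMs = 5_000) {
    // Background sweeper: periodically drop expired entries; unref() keeps
    // the timer from holding the process open.
    setInterval(() => this.sweep(), sweepMs).unref();
  }

  get(key: string): V | undefined {
    const entry = this.map.get(key);
    if (!entry || entry.expiresAt <= Date.now()) {
      if (entry) this.map.delete(key); // lazily evict an expired entry
      this.misses++;
      return undefined;
    }
    // Delete + re-insert marks the key most recently used,
    // since Map iterates in insertion order.
    this.map.delete(key);
    this.map.set(key, entry);
    this.hits++;
    return entry.value;
  }

  set(key: string, value: V): void {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, { value, expiresAt: Date.now() + this.ttlMs });
    // Over capacity: evict the least recently used key (first in order).
    if (this.map.size > this.capacity) {
      this.map.delete(this.map.keys().next().value as string);
    }
  }

  clear(): void {
    this.map.clear();
  }

  private sweep(): void {
    const now = Date.now();
    for (const [key, entry] of this.map) {
      if (entry.expiresAt <= now) this.map.delete(key);
    }
  }

  stats() {
    return { hits: this.hits, misses: this.misses, size: this.map.size };
  }
}
```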
- Sustained limit: implemented with a token bucket that refills at 10 tokens/min. Tokens are fractional and refilled continuously; each request consumes 1 token.
- Burst control: implemented as a sliding-window per-client (per IP) timestamp list — allows at most 5 requests in any rolling 10-second window.
- Combined effect: short bursts of up to 5 requests are allowed (consuming burst headroom plus tokens), while long-term sustained traffic is limited to 10 requests per minute.
- Responses: when a limit is exceeded, the API returns 429 with a JSON error explaining which limit was hit.
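A sketch of the combined limiter with per-client state keyed by IP, as described above (all names and the in-memory `Map` are assumptions):

```typescript
interface ClientState {
  tokens: number;     // fractional tokens, refilled continuously
  lastRefill: number; // ms timestamp of the last refill
  window: number[];   // request timestamps inside the sliding window
}

const RATE_PER_MIN = 10; // sustained: 10 req/min
const BURST_MAX = 5;     // burst: max 5 req per window
const WINDOW_MS = 10_000;
const clients = new Map<string, ClientState>();

// Returns null if the request is allowed, or a message naming the limit hit.
function checkLimit(ip: string): string | null {
  const now = Date.now();
  const state =
    clients.get(ip) ?? { tokens: RATE_PER_MIN, lastRefill: now, window: [] };
  clients.set(ip, state);

  // Continuous refill: tokens accrue at RATE_PER_MIN per 60s, capped at the
  // bucket size, so unused capacity never exceeds one minute's worth.
  state.tokens = Math.min(
    RATE_PER_MIN,
    state.tokens + ((now - state.lastRefill) / 60_000) * RATE_PER_MIN
  );
  state.lastRefill = now;

  // Sliding-window burst check: at most BURST_MAX timestamps in WINDOW_MS.
  state.window = state.window.filter((t) => now - t < WINDOW_MS);
  if (state.window.length >= BURST_MAX) return 'burst limit: max 5 requests per 10s';
  if (state.tokens < 1) return 'sustained limit: max 10 requests per minute';

  state.tokens -= 1;
  state.window.push(now);
  return null;
}
```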
- Queue type: simple array-based async queue (AsyncQueue) that accepts task functions returning promises.
- Concurrency: the queue runs up to N tasks concurrently (configurable), preventing unlimited parallel DB calls.
- Simulated DB calls: simulateDbFetch(id) is enqueued via fetchQueue.push(() => simulateDbFetch(id)). The task returns a promise and the request awaits it — the Node event loop is not blocked.
- Coalescing: pendingFetches map ensures multiple concurrent requests for the same id share one queued task (they await the same Promise). After completion we cache the result.
- Why: This limits load on downstream systems (real DB), keeps predictable concurrency, and reduces resource exhaustion risks.
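A sketch of the queue and coalescing pattern together. The `AsyncQueue`, `fetchQueue`, `pendingFetches`, and `simulateDbFetch` names come from the description above; the implementation details are assumptions:

```typescript
type Task<T> = () => Promise<T>;

class AsyncQueue {
  private queue: Array<() => void> = [];
  private running = 0;

  constructor(private concurrency = 2) {}

  // Enqueue a task; the returned promise settles when the task does.
  push<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this.queue.push(() => {
        this.running++;
        task()
          .then(resolve, reject)
          .finally(() => {
            this.running--;
            this.drain(); // a slot freed up: start the next queued task
          });
      });
      this.drain();
    });
  }

  // Start queued tasks while there is spare concurrency.
  private drain(): void {
    while (this.running < this.concurrency && this.queue.length > 0) {
      this.queue.shift()!();
    }
  }

  stats() {
    return { running: this.running, queued: this.queue.length };
  }
}

const fetchQueue = new AsyncQueue(2);
const pendingFetches = new Map<string, Promise<unknown>>();

// Coalescing: concurrent requests for the same id share one in-flight
// promise; callers cache the result after awaiting it.
function getUser(id: string): Promise<unknown> {
  let pending = pendingFetches.get(id);
  if (!pending) {
    pending = fetchQueue
      .push(() => simulateDbFetch(id))
      .finally(() => pendingFetches.delete(id)); // clear once settled
    pendingFetches.set(id, pending);
  }
  return pending;
}

// Simulated DB call (~200ms latency).
function simulateDbFetch(id: string): Promise<{ id: string }> {
  return new Promise((res) => setTimeout(() => res({ id }), 200));
}
```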
- Install Prometheus (macOS / Homebrew) and create a config file:
brew install prometheus
touch /opt/homebrew/etc/prometheus.yml
- Paste the following into /opt/homebrew/etc/prometheus.yml:
```yaml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "express_app"
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:3000"]
```
- Start Prometheus:
prometheus --config.file=/opt/homebrew/etc/prometheus.yml
- Then verify the app's metrics endpoint directly:
curl http://localhost:3000/metrics | sed -n '1,120p'
curl http://localhost:3000/metrics | grep app_http_requests_total -n
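One way to expose /metrics is prom-client; the sketch below is an assumption about how the app wires it up (only the counter name `app_http_requests_total` is taken from the grep above):

```typescript
import express from 'express';
import client from 'prom-client';

const app = express();
const register = new client.Registry();
client.collectDefaultMetrics({ register }); // Node runtime/process metrics

// Counter matching the name grepped above; the labels are an assumption.
const httpRequests = new client.Counter({
  name: 'app_http_requests_total',
  help: 'Total HTTP requests handled',
  labelNames: ['method', 'route', 'status'],
  registers: [register],
});

// Count each response once it has finished.
app.use((req, res, next) => {
  res.on('finish', () =>
    httpRequests.inc({ method: req.method, route: req.path, status: res.statusCode })
  );
  next();
});

// Expose metrics in the Prometheus text format.
app.get('/metrics', async (_req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

app.listen(3000);
```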
