This document provides detailed architectural insights, design decisions, and implementation details for the NexusPay digital wallet platform.
- Core Architectural Decisions
- Technology Stack & Justification
- Data Flow & Transaction Lifecycle
- Error Handling & Recovery
- Performance Considerations
- Production Deployment
- Testing Strategy
- Future Enhancements
Decision: Split the application into distinct services based on business domains:
- Account Service: Manages user accounts, balances, and account-related operations
- Transaction Service: Handles transaction initiation, validation, and logging
- API Gateway: Single entry point for all client requests
- Discovery Server: Service registry for dynamic service discovery
Rationale:
- Isolation: Each service can fail independently without affecting others
- Scalability: Services can be scaled independently based on load patterns
- Technology Diversity: Different services can use different technologies if needed
- Team Autonomy: Different teams can own and develop services independently
- Data Ownership: Each service owns its data, preventing tight coupling
Decision: Use Kafka as the central nervous system for asynchronous communication between services.
Why Kafka over alternatives:
- Durability: Messages are persisted to disk and replicated, ensuring no data loss
- Ordering: Kafka guarantees message ordering within partitions
- Replay Capability: Can replay events for recovery or new consumers
- High Throughput: Can handle millions of messages per second
- Stream Processing: Enables real-time analytics and event streaming
KRaft Mode: Eliminates Zookeeper dependency, simplifying deployment and reducing operational overhead.
Problem: In distributed systems, network failures can cause duplicate requests or message deliveries.
Solution: Implement idempotency at two critical layers:
Layer 1, the API layer (Transaction Service):
- Mechanism: Client sends an `Idempotency-Key` header with each request
- Storage: Redis stores processed keys with a TTL
- Behavior: If a duplicate key is detected, return the original response without reprocessing
Layer 2, the Kafka consumer layer (Account Service):
- Mechanism: Database unique constraint on transaction IDs
- Storage: A PostgreSQL table tracks processed events
- Behavior: If a duplicate event is detected, skip processing silently
Why Both Layers:
- API layer prevents duplicate client requests (user double-clicks, network retries)
- Consumer layer prevents duplicate Kafka message delivery (at-least-once semantics)
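The two layers can be sketched with in-memory stand-ins (a map for Redis, a set for the database unique constraint); the class and method names here are illustrative, not taken from the NexusPay code:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the two idempotency layers described above.
public class IdempotencySketch {
    // API layer: Redis stand-in mapping Idempotency-Key -> cached response
    private final Map<String, String> responseCache = new ConcurrentHashMap<>();
    // Consumer layer: stand-in for the DB unique constraint on transaction IDs
    private final Set<String> processedTxnIds = ConcurrentHashMap.newKeySet();

    // Returns the cached response for a duplicate key, otherwise processes once.
    public String handleRequest(String idempotencyKey, String payload) {
        return responseCache.computeIfAbsent(idempotencyKey,
                k -> "processed:" + payload); // a real service would also set a TTL
    }

    // Returns true if the event was applied, false if it was a duplicate delivery.
    public boolean consumeEvent(String txnId) {
        return processedTxnIds.add(txnId); // add() is false when already present
    }
}
```

A duplicate request (same key, even with a different payload) gets the original response back; a redelivered event is silently skipped.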
Problem: Need to ensure that database updates and event publishing happen atomically.
Solution:
- Save the transaction intent to the local database
- Publish the corresponding event to Kafka
- Coordinate both operations within a single transactional boundary
Benefits:
- Atomicity: Either both succeed or both fail
- Consistency: No orphaned database records or missed events
- Reliability: Event publishing is tied to the database write succeeding; the transactional outbox pattern (described under error handling below) covers the case where Kafka itself is unreachable
Problem: Multiple concurrent transactions could cause race conditions in balance updates.
Solution: Use PESSIMISTIC_WRITE locks when updating account balances.
Trade-offs:
- Pros: Prevents data corruption, ensures consistency
- Cons: Reduces throughput, potential for deadlocks
- Justification: In financial systems, correctness is more important than performance
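One common mitigation for the deadlock risk noted in the trade-offs (a suggestion, not something this document prescribes) is to always acquire locks in a deterministic order, e.g. sorted by account ID, so two concurrent transfers between the same pair of accounts can never lock in opposite orders. A sketch with in-memory `ReentrantLock`s standing in for database row locks; all names are illustrative:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Deadlock avoidance: lock accounts in sorted-ID order, never in request order.
public class OrderedLocking {
    private static final Map<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    private static ReentrantLock lockFor(String accountId) {
        return locks.computeIfAbsent(accountId, id -> new ReentrantLock());
    }

    // Locks both accounts in deterministic order, runs the balance update, unlocks.
    public static void transfer(String fromId, String toId, Runnable update) {
        String first = fromId.compareTo(toId) <= 0 ? fromId : toId;
        String second = first.equals(fromId) ? toId : fromId;
        lockFor(first).lock();
        try {
            lockFor(second).lock();
            try {
                update.run();
            } finally {
                lockFor(second).unlock();
            }
        } finally {
            lockFor(first).unlock();
        }
    }
}
```

With row locks the same idea applies: order the `SELECT ... FOR UPDATE` statements by primary key.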
Decision: Each service has its own dedicated database instance.
Benefits:
- Data Isolation: Services cannot directly access other services' data
- Technology Choice: Each service can choose optimal database technology
- Independent Scaling: Databases can be scaled independently
- Fault Isolation: Database failures affect only one service
Decision: Use Spring Cloud Gateway as single entry point.
Responsibilities:
- Routing: Route requests to appropriate microservices
- Load Balancing: Distribute requests across service instances
- Cross-cutting Concerns: Authentication, rate limiting, logging (in production)
- Protocol Translation: Convert external protocols to internal protocols
Decision: Use Netflix Eureka for service registration and discovery.
Benefits:
- Dynamic Routing: Services register themselves automatically
- Health Monitoring: Unhealthy instances are removed from registry
- Load Balancing: Client-side load balancing (historically via Ribbon, now typically Spring Cloud LoadBalancer)
- Resilience: No single point of failure for service location
| Category | Technology | Justification |
|---|---|---|
| Backend Framework | Java 17, Spring Boot 3 | Industry standard for enterprise microservices. Excellent ecosystem, mature tooling, and strong community support. |
| API Gateway | Spring Cloud Gateway | Reactive, non-blocking gateway with excellent Spring ecosystem integration. Supports load balancing and service discovery. |
| Service Discovery | Netflix Eureka | Battle-tested service registry with health monitoring and automatic failover capabilities. |
| Message Broker | Apache Kafka (KRaft) | Provides durability, ordering, and high throughput needed for financial transactions. KRaft mode eliminates Zookeeper complexity. |
| Primary Database | PostgreSQL | ACID compliance essential for financial data. Excellent performance, reliability, and advanced features like row-level locking. |
| Cache/Session Store | Redis | High-performance in-memory store perfect for idempotency keys and session management. Sub-millisecond latency. |
| Containerization | Docker & Docker Compose | Consistent environments across development and production. Simplified dependency management and deployment. |
| Build Tool | Apache Maven | Mature dependency management with excellent Spring Boot integration and plugin ecosystem. |
1. Client Request: User initiates a P2P transfer via mobile app
2. API Gateway: Routes request to Transaction Service with load balancing
3. Validation: Transaction Service validates request and checks business rules
4. Persistence: Transaction intent saved to PostgreSQL with PENDING status
5. Event Publishing: TransactionInitiated event published to Kafka topic
6. Async Processing: Account Service consumes event from Kafka
7. Balance Updates: Account Service updates balances with pessimistic locking
8. Completion: Transaction status updated to COMPLETED or FAILED
```mermaid
sequenceDiagram
    participant Client
    participant Gateway as API Gateway
    participant TxnSvc as Transaction Service
    participant Redis
    participant Kafka
    participant AcctSvc as Account Service
    participant DB as PostgreSQL
    Client->>Gateway: POST /api/transactions (Idempotency-Key)
    Gateway->>TxnSvc: Route request
    TxnSvc->>Redis: Check idempotency key
    Redis-->>TxnSvc: Key not found
    TxnSvc->>DB: Save transaction (PENDING)
    TxnSvc->>Kafka: Publish TransactionInitiated event
    TxnSvc->>Redis: Store idempotency key + response
    TxnSvc-->>Gateway: Return transaction details
    Gateway-->>Client: 201 Created
    Kafka->>AcctSvc: Deliver event
    AcctSvc->>DB: Check if already processed
    AcctSvc->>DB: Lock accounts (PESSIMISTIC_WRITE)
    AcctSvc->>DB: Update balances
    AcctSvc->>DB: Mark transaction as processed
    AcctSvc->>Kafka: Publish TransactionCompleted event
```
Failure scenario: Account Service is down.
Flow:
- Transaction Service continues accepting requests
- Events accumulate in Kafka topics
- When Account Service recovers, it processes all missed events
- Kafka's durability ensures no transaction is lost
Recovery Process:
- Kafka retains messages based on retention policy
- Consumer groups track processing offsets
- Service restart resumes from last committed offset
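The offset-based recovery above can be simulated in a few lines. A list stands in for a retained Kafka partition, and a single counter stands in for the consumer group's committed offset; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a restarted consumer resumes from its last committed offset,
// so events published while it was down are processed, not lost.
public class OffsetResume {
    private final List<String> log = new ArrayList<>(); // retained partition log
    private int committedOffset = 0;                    // consumer-group offset

    public void publish(String event) { log.add(event); }

    // Processes everything after the committed offset, then commits.
    public List<String> pollAndCommit() {
        List<String> batch = new ArrayList<>(log.subList(committedOffset, log.size()));
        committedOffset = log.size();
        return batch;
    }
}
```

Events published between two polls (i.e. while the consumer was "down") are delivered on the next poll because the log is durable and the offset only advances on commit.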
Failure scenario: the database is unavailable.
Flow:
- Service immediately returns HTTP 503 (Service Unavailable)
- No partial state changes due to transactional boundaries
- Client can safely retry with same Idempotency-Key
Resilience Mechanisms:
- Connection pooling with health checks
- Automatic connection retry with exponential backoff
- Circuit breaker pattern to prevent cascade failures
Failure scenario: Kafka is unavailable.
Flow:
- Transaction Service returns HTTP 503 error (fails fast)
- No transaction is accepted unless it can be reliably processed
- Prevents data inconsistency between services
Monitoring:
- Kafka broker health checks
- Producer acknowledgment timeouts
- Consumer lag monitoring
Problem: Transaction Service can reach database but not Kafka
Solution:
- Use transactional outbox pattern
- Store events in database table first
- Separate process publishes events to Kafka
- Ensures eventual consistency
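The outbox flow above can be sketched end to end. In-memory lists stand in for the database tables and the Kafka broker, and `synchronized` stands in for the database transaction; all names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the transactional outbox: the business row and the outbox row are
// written in one atomic step, and a separate relay drains the outbox to Kafka.
public class OutboxSketch {
    private final List<String> transactions = new ArrayList<>(); // business table
    private final List<String> outbox = new ArrayList<>();       // outbox table
    private final List<String> broker = new ArrayList<>();       // Kafka stand-in

    // One "database transaction": both writes succeed or neither is visible.
    public synchronized void saveWithEvent(String txnId) {
        transactions.add(txnId);
        outbox.add("TransactionInitiated:" + txnId);
    }

    // Relay process: publish pending outbox rows to the broker, then clear them.
    public synchronized int relay() {
        int published = outbox.size();
        broker.addAll(outbox);
        outbox.clear();
        return published;
    }

    public synchronized List<String> brokerContents() { return broker; }
}
```

If the relay fails (Kafka down), rows simply stay in the outbox and are published on the next run, which is exactly the eventual-consistency guarantee the bullets describe.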
Partition Key: `hash(accountId) % partition_count`
Benefits:
- Transactions for same account processed in order
- Parallel processing across different accounts
- Even distribution of load
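The mapping can be sketched in plain Java. Note that Kafka's default partitioner actually hashes the record key with murmur2 rather than `hashCode()`, but the property the bullets rely on is the same: a given account always maps to the same partition, so its transactions stay ordered.

```java
// Sketch of the documented scheme: hash(accountId) % partition_count.
public class PartitionKey {
    public static int partitionFor(String accountId, int partitionCount) {
        // Math.floorMod avoids a negative partition when hashCode() is negative
        return Math.floorMod(accountId.hashCode(), partitionCount);
    }
}
```

In practice this is achieved simply by using the account ID as the Kafka record key; the producer's partitioner does the hashing.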
```yaml
# HikariCP Configuration
spring.datasource.hikari:
  maximum-pool-size: 20
  minimum-idle: 5
  connection-timeout: 30000
  idle-timeout: 600000
  max-lifetime: 1800000
```
Asynchronous processing benefits:
- Non-blocking I/O prevents thread starvation
- Reactive streams handle backpressure
- Event-driven design eliminates polling overhead
Key Pattern: `idempotency:{key}`
TTL: 24 hours
Cache Hit Scenarios:
- Duplicate API requests (immediate response)
- Recently accessed account data
- Transaction status lookups
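A minimal stand-in for the Redis entries above: the `idempotency:{key}` naming and 24-hour TTL come from this document, while the in-memory map and method names are illustrative (Redis itself would handle expiry via `SET ... EX`):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of idempotency-key storage with a 24h TTL, Redis-style.
public class TtlKeyStore {
    private static final long TTL_MILLIS = 24L * 60 * 60 * 1000;

    private record Entry(String response, long expiresAt) {}

    private final Map<String, Entry> store = new ConcurrentHashMap<>();

    // Caches the response under "idempotency:{key}" with a 24h expiry.
    public void put(String key, String response, long nowMillis) {
        store.put("idempotency:" + key, new Entry(response, nowMillis + TTL_MILLIS));
    }

    // Returns the cached response, or null if absent or expired.
    public String get(String key, long nowMillis) {
        Entry e = store.get("idempotency:" + key);
        if (e == null || e.expiresAt <= nowMillis) return null;
        return e.response;
    }
}
```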
```sql
-- Primary indexes for fast lookups
CREATE INDEX idx_account_user_id ON accounts(user_id);
CREATE INDEX idx_transaction_from_account ON transactions(from_account_id);
CREATE INDEX idx_transaction_to_account ON transactions(to_account_id);
CREATE INDEX idx_transaction_status ON transactions(status);
CREATE INDEX idx_transaction_timestamp ON transactions(created_at);

-- Composite (partial) index for complex queries
CREATE INDEX idx_account_balance_lookup ON accounts(user_id, status) WHERE status = 'ACTIVE';
```
Network optimizations:
- Keep-alive connections reduce handshake overhead
- Connection multiplexing for HTTP/2
- Service mesh for inter-service communication
Implementation Details:
- OAuth 2.0 with PKCE for mobile clients
- JWT tokens with short expiration (15 minutes)
- Refresh token rotation for security
- Role-based access control (RBAC)
```yaml
# Security Groups (AWS Example)
api-gateway:
  ingress:
    - port: 443
      source: 0.0.0.0/0  # HTTPS only
  egress:
    - port: 8081-8082
      source: backend-sg
backend-services:
  ingress:
    - port: 8081-8082
      source: gateway-sg
  egress:
    - port: 5432
      source: database-sg
    - port: 9092
      source: kafka-sg
```
Example trace correlation:
```java
@RestController
public class TransactionController {

    private final Tracer tracer;
    private final TransactionService transactionService;

    public TransactionController(Tracer tracer, TransactionService transactionService) {
        this.tracer = tracer;
        this.transactionService = transactionService;
    }

    @PostMapping("/transactions")
    @Traced(operationName = "create-transaction")
    public ResponseEntity<Transaction> createTransaction(
            @RequestBody TransactionRequest request,
            @RequestHeader("Idempotency-Key") String idempotencyKey) {
        Span span = tracer.nextSpan()
                .tag("transaction.amount", String.valueOf(request.getAmount()))
                .tag("idempotency.key", idempotencyKey);
        try (Tracer.SpanInScope ws = tracer.withSpanInScope(span)) {
            return transactionService.createTransaction(request, idempotencyKey);
        } finally {
            span.end();
        }
    }
}
```
Prometheus metrics:
```yaml
metrics:
  - name: transaction_requests_total
    type: counter
    labels: [method, status, service]
  - name: transaction_duration_seconds
    type: histogram
    labels: [service, endpoint]
  - name: account_balance_updates_total
    type: counter
    labels: [account_type, currency]
  - name: kafka_messages_processed_total
    type: counter
    labels: [topic, partition, status]
```
Critical alerts:
```yaml
alerts:
  - name: HighTransactionFailureRate
    condition: |
      rate(transaction_requests_total{status="error"}[5m]) /
      rate(transaction_requests_total[5m]) > 0.05
    severity: critical
  - name: DatabaseConnectionPoolExhausted
    condition: hikari_connections_active >= hikari_connections_max * 0.9
    severity: warning
  - name: KafkaConsumerLag
    condition: kafka_consumer_lag_sum > 1000
    severity: warning
```
Kubernetes HPA example:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: transaction-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: transaction-service
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
Read Replicas:
```yaml
# PostgreSQL Read Replicas
datasource:
  primary:
    url: jdbc:postgresql://primary-db:5432/nexuspay
    username: ${DB_USER}
    password: ${DB_PASS}
  read-replicas:
    - url: jdbc:postgresql://replica-1:5432/nexuspay
    - url: jdbc:postgresql://replica-2:5432/nexuspay
```
Partitioning Strategy:
```sql
-- Partition transactions by date
CREATE TABLE transactions_2025_01 PARTITION OF transactions
    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
CREATE TABLE transactions_2025_02 PARTITION OF transactions
    FOR VALUES FROM ('2025-02-01') TO ('2025-03-01');
```

```java
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
@Testcontainers
class TransactionIntegrationTest {

    @Container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:14")
            .withDatabaseName("testdb")
            .withUsername("test")
            .withPassword("test");

    @Container
    static KafkaContainer kafka = new KafkaContainer(DockerImageName.parse("confluentinc/cp-kafka:7.4.0"));

    @Container
    static GenericContainer<?> redis = new GenericContainer<>("redis:alpine")
            .withExposedPorts(6379);

    @Autowired
    private TestRestTemplate restTemplate;

    @Autowired
    private AccountRepository accountRepository;

    @Test
    void shouldProcessTransactionEndToEnd() {
        // Given: Two accounts with sufficient balance
        Account fromAccount = createAccount("user1", new BigDecimal("1000"));
        Account toAccount = createAccount("user2", new BigDecimal("500"));

        // When: Transaction is initiated with an idempotency key
        TransactionRequest request = TransactionRequest.builder()
                .fromAccountId(fromAccount.getId())
                .toAccountId(toAccount.getId())
                .amount(new BigDecimal("100"))
                .build();
        HttpHeaders headers = new HttpHeaders();
        headers.set("Idempotency-Key", UUID.randomUUID().toString());

        ResponseEntity<Transaction> response = restTemplate.exchange(
                "/api/transactions",
                HttpMethod.POST,
                new HttpEntity<>(request, headers),
                Transaction.class);

        // Then: Transaction should be accepted
        assertThat(response.getStatusCode()).isEqualTo(HttpStatus.CREATED);

        // And: Balances should be updated asynchronously
        await().atMost(Duration.ofSeconds(10)).untilAsserted(() -> {
            Account updatedFrom = accountRepository.findById(fromAccount.getId()).orElseThrow();
            Account updatedTo = accountRepository.findById(toAccount.getId()).orElseThrow();
            assertThat(updatedFrom.getBalance()).isEqualByComparingTo(new BigDecimal("900"));
            assertThat(updatedTo.getBalance()).isEqualByComparingTo(new BigDecimal("600"));
        });
    }
}
```
k6 load test script:
```javascript
// k6 Load Test Script
import http from 'k6/http';
import { check } from 'k6';
import { uuidv4 } from 'https://jslib.k6.io/k6-utils/1.4.0/index.js';

export let options = {
  stages: [
    { duration: '2m', target: 100 }, // Ramp up
    { duration: '5m', target: 100 }, // Steady state
    { duration: '2m', target: 200 }, // Spike test
    { duration: '5m', target: 200 }, // Steady high load
    { duration: '2m', target: 0 },   // Ramp down
  ],
  thresholds: {
    http_req_duration: ['p(95)<1000'], // 95% under 1s
    http_req_failed: ['rate<0.1'],     // Error rate under 10%
  },
};

export default function () {
  // Create transaction
  const transactionPayload = JSON.stringify({
    fromAccountId: 'acc-123',
    toAccountId: 'acc-456',
    amount: '10.00',
  });
  const headers = {
    'Content-Type': 'application/json',
    'Idempotency-Key': uuidv4(),
  };
  const response = http.post(
    'http://localhost:8080/api/transactions',
    transactionPayload,
    { headers }
  );
  check(response, {
    'transaction created': (r) => r.status === 201,
    'response time < 1000ms': (r) => r.timings.duration < 1000,
  });
}
```
Chaos Monkey configuration:
```yaml
# Chaos Monkey Configuration
chaos:
  monkey:
    enabled: true
    watcher:
      enabled: true
    assaults:
      - level: 5
        runtime:
          - assault: killApplication
            level: 5
          - assault: latencyAssault
            level: 3
            latencyRangeStart: 1000
            latencyRangeEnd: 5000
          - assault: memoryAssault
            level: 2
            memoryFillIncrementFraction: 0.15
```
Purpose: Real-time fraud detection using machine learning
Implementation:
```mermaid
graph LR
    TxnService --> FraudML[Fraud Detection ML]
    FraudML --> RiskScore[Risk Score]
    RiskScore --> Decision{Risk Level}
    Decision -->|Low| Approve[Auto Approve]
    Decision -->|Medium| Review[Manual Review]
    Decision -->|High| Block[Auto Block]
```
Features:
- Real-time scoring (< 100ms)
- Machine learning models for pattern detection
- Rule-based checks for known fraud patterns
- Integration with external fraud databases
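The rule-based side of the flow can be sketched as a score computed from a few signals and mapped to the Approve / Review / Block decision in the diagram. The specific signals, weights, and thresholds below are invented for illustration, not product rules:

```java
// Sketch of rule-based risk scoring feeding the decision node above.
public class RiskScoring {
    public enum Decision { APPROVE, REVIEW, BLOCK }

    public static int score(double amount, boolean newDevice, int txnsLastHour) {
        int s = 0;
        if (amount > 1_000) s += 40;    // unusually large transfer
        if (newDevice) s += 30;         // first time seen on this device
        if (txnsLastHour > 10) s += 30; // burst of activity
        return s;
    }

    public static Decision decide(int score) {
        if (score >= 70) return Decision.BLOCK;  // High risk -> auto block
        if (score >= 40) return Decision.REVIEW; // Medium risk -> manual review
        return Decision.APPROVE;                 // Low risk -> auto approve
    }
}
```

In the real system an ML model would contribute to (or replace) the score, with this rule layer as a fast, explainable backstop.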
Purpose: Real-time push notifications for transaction events
Architecture:
```yaml
notification-service:
  inputs:
    - kafka-topic: transaction-events
    - kafka-topic: account-events
  outputs:
    - push-notifications: Firebase/APNS
    - email: SendGrid/SES
    - sms: Twilio
```
Database Schema (multi-currency support):
```sql
-- Enhanced account table
ALTER TABLE accounts ADD COLUMN currency VARCHAR(3) NOT NULL DEFAULT 'USD';
ALTER TABLE accounts ADD COLUMN exchange_rate DECIMAL(10,6);

-- Exchange rates table
CREATE TABLE exchange_rates (
    id BIGSERIAL PRIMARY KEY,
    from_currency VARCHAR(3) NOT NULL,
    to_currency VARCHAR(3) NOT NULL,
    rate DECIMAL(10,6) NOT NULL,
    timestamp TIMESTAMP NOT NULL,
    UNIQUE(from_currency, to_currency, timestamp)
);
```
Real-time Exchange Rates:
```java
@Service
public class ExchangeRateService {

    private final ExternalRateProvider externalRateProvider;
    private final ExchangeRateRepository exchangeRateRepository;

    public ExchangeRateService(ExternalRateProvider externalRateProvider,
                               ExchangeRateRepository exchangeRateRepository) {
        this.externalRateProvider = externalRateProvider;
        this.exchangeRateRepository = exchangeRateRepository;
    }

    @Scheduled(fixedRate = 60000) // Update every minute
    public void updateExchangeRates() {
        // Fetch from external API (e.g., Fixer.io)
        List<ExchangeRate> rates = externalRateProvider.getCurrentRates();
        exchangeRateRepository.saveAll(rates);
    }

    public BigDecimal convertAmount(BigDecimal amount, String fromCurrency, String toCurrency) {
        if (fromCurrency.equals(toCurrency)) {
            return amount;
        }
        ExchangeRate rate = exchangeRateRepository
                .findLatestRate(fromCurrency, toCurrency)
                .orElseThrow(() -> new ExchangeRateNotFoundException());
        return amount.multiply(rate.getRate());
    }
}
```
Kubernetes Blue-Green Deployment:
```yaml
# Argo Rollouts blue-green strategy
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: transaction-service
spec:
  replicas: 5
  strategy:
    blueGreen:
      activeService: transaction-service-active
      previewService: transaction-service-preview
      autoPromotionEnabled: false
      scaleDownDelaySeconds: 30
      prePromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: transaction-service-preview
      postPromotionAnalysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: transaction-service-active
```
Circuit breaker client (Resilience4j with Vavr's Try):

```java
@Service
public class AccountServiceClient {

    private static final Logger log = LoggerFactory.getLogger(AccountServiceClient.class);

    private final CircuitBreaker circuitBreaker;
    private final RestTemplate restTemplate = new RestTemplate();

    public AccountServiceClient() {
        this.circuitBreaker = CircuitBreaker.ofDefaults("accountService");
        circuitBreaker.getEventPublisher()
                .onStateTransition(event ->
                        log.info("Circuit breaker state transition: {}", event));
    }

    public Account getAccount(String accountId) {
        Supplier<Account> decoratedSupplier = CircuitBreaker
                .decorateSupplier(circuitBreaker, () ->
                        restTemplate.getForObject("/accounts/" + accountId, Account.class));
        return Try.ofSupplier(decoratedSupplier)
                .getOrElseThrow(throwable -> {
                    log.error("Failed to get account {}: {}", accountId, throwable.getMessage());
                    return new ServiceUnavailableException("Account service is down");
                });
    }
}
```
This comprehensive documentation provides the detailed architectural insights, implementation patterns, and operational considerations needed to understand and extend the NexusPay platform.