This guide provides a comprehensive overview of the PerformanceOptimizedTransformer, a specialized component designed to significantly boost performance in high-throughput JSON transformation scenarios.
The PerformanceOptimizedTransformer is a decorator that wraps any existing transformer implementation and adds several optimization techniques. It's particularly useful in event processing pipelines that handle thousands or millions of events per second.
Result caching stores previously computed transformations to avoid redundant processing of identical inputs.
- Cache Hit: Return cached result (microseconds vs. milliseconds)
- Cache Miss: Transform normally and store result
- Configurable Cache Size: Limit memory usage
- Thread-Safe: Concurrent cache access
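The cache-hit/cache-miss flow above can be sketched as a small decorator. This is a minimal illustration, not the library's actual implementation; the class name, the `String`-to-`String` delegate, and the eviction policy are all assumptions made for brevity:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of a result-caching decorator: a size-bounded map
// sits in front of an arbitrary transformation function.
public class CachingDecoratorSketch {
    private final Function<String, String> delegate;
    private final Map<String, String> cache;

    public CachingDecoratorSketch(Function<String, String> delegate, int maxSize) {
        this.delegate = delegate;
        // Access-ordered LinkedHashMap + removeEldestEntry gives a simple
        // LRU-style bound on memory usage (the "configurable cache size")
        this.cache = new LinkedHashMap<>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
    }

    public String transform(String input) {
        String cached = cache.get(input);      // cache hit: return stored result
        if (cached != null) {
            return cached;
        }
        String result = delegate.apply(input); // cache miss: transform normally
        cache.put(input, result);              // store the result for next time
        return result;
    }
}
```

Note that this single-threaded sketch omits the thread-safety machinery discussed later; the real component is described as using concurrent collections.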
Structural fingerprinting identifies documents that share the same structure (field names and types) but contain different values.
- Structure Recognition: Creates a fingerprint based on JSON structure, not values
- Field Names: Records all property names in the document
- Type Information: Preserves type information without actual values
- Sorting: Ensures consistent fingerprints regardless of field order
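The four properties above can be demonstrated with a toy fingerprint function. This sketch assumes documents are represented as plain `Map`/`List` structures rather than the library's actual JSON types, and every name in it is hypothetical:

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a structural fingerprint: it records sorted field
// names and value *types*, never the values themselves, so documents with
// the same shape collapse onto a single fingerprint.
public class FingerprintSketch {
    public static String fingerprint(Object node) {
        if (node instanceof Map<?, ?> map) {
            // Sorting keys makes the fingerprint independent of field order
            TreeMap<String, String> fields = new TreeMap<>();
            for (Map.Entry<?, ?> e : map.entrySet()) {
                fields.put(String.valueOf(e.getKey()), fingerprint(e.getValue()));
            }
            return "obj" + fields;
        }
        if (node instanceof List<?> list) {
            // Arrays are fingerprinted by their element shape
            return "arr[" + (list.isEmpty() ? "" : fingerprint(list.get(0))) + "]";
        }
        // Leaves contribute only their type, not their value
        if (node instanceof String)  return "string";
        if (node instanceof Number)  return "number";
        if (node instanceof Boolean) return "boolean";
        return "null";
    }
}
```

Two documents with the same fields and types, in any order, yield the same fingerprint; changing a value's type changes it.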
Value-aware hashing ensures that different input values produce different results while preserving the structural optimizations.
- Composite Cache Key: Combines structural fingerprint with value hash
- Value Extraction: Extracts critical values that affect transformation
- Hash Computation: Creates unique identifier for each value combination
- Collision Avoidance: Ensures different inputs get different results
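One way to realize the composite cache key described above, sketched here for flat documents only (the method and class names are illustrative, not the actual API):

```java
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical sketch of a composite cache key: the structural part keeps
// same-shape documents together, while the value hash separates different
// inputs. Flat documents only, for brevity.
public class CompositeKeySketch {
    public static String cacheKey(Map<String, Object> doc) {
        TreeMap<String, Object> sorted = new TreeMap<>(doc); // stable field order
        StringBuilder structure = new StringBuilder();
        int valueHash = 0;
        for (Map.Entry<String, Object> e : sorted.entrySet()) {
            // Structural part: field name and type only
            structure.append(e.getKey()).append(':')
                     .append(e.getValue() == null
                             ? "null" : e.getValue().getClass().getSimpleName())
                     .append(';');
            // Value part: fold each value's hash into one identifier
            valueHash = 31 * valueHash + Objects.hashCode(e.getValue());
        }
        return structure + "#" + Integer.toHexString(valueHash);
    }
}
```

Documents with the same shape share everything before the `#`, while different values diverge after it, which is what lets the cache distinguish inputs without re-deriving the structure.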
Path pre-computation prepares frequently accessed paths for faster processing.
- Path Resolution: Pre-resolves common paths for direct access
- Resolver Caching: Stores optimized access patterns
- Query Optimization: Avoids repeated path parsing
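The idea behind pre-resolving paths can be sketched as parsing each path into segments once and reusing them for every lookup. This toy version handles dot-separated paths over plain maps only (no `items[]` array syntax), and all names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of path pre-computation: paths are split into segment
// arrays once, up front, so later lookups skip string parsing entirely.
public class PrecomputedPathsSketch {
    private final Map<String, String[]> resolved = new HashMap<>();

    public void precomputePaths(String... paths) {
        for (String p : paths) {
            resolved.put(p, p.split("\\.")); // parse once, reuse on every lookup
        }
    }

    @SuppressWarnings("unchecked")
    public Object get(Map<String, Object> doc, String path) {
        // Use the pre-resolved segments when available; parse on the fly otherwise
        String[] segments = resolved.getOrDefault(path, path.split("\\."));
        Object current = doc;
        for (String segment : segments) {
            if (!(current instanceof Map)) {
                return null; // path walks through a non-object node
            }
            current = ((Map<String, Object>) current).get(segment);
        }
        return current;
    }
}
```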
The PerformanceOptimizedTransformer is ideal for:
- High-Volume Event Processing: Systems processing thousands or millions of events per second
- Similar Document Structures: When many documents share the same structure
- CPU-Bound Applications: When transformation CPU usage is a bottleneck
- Latency-Sensitive Systems: When minimizing transformation time is critical
```java
// Create your normal transformer
TransformationEngine baseTransformer = new FieldRenameTransformer("user.name", "user.fullName");

// Wrap it with the performance optimizer
PerformanceOptimizedTransformer optimizedTransformer =
        new PerformanceOptimizedTransformer(baseTransformer);

// Use it like any other transformer
JsonElement result = optimizedTransformer.transform(input);
```

```java
// Create with custom settings
PerformanceOptimizedTransformer optimizedTransformer = new PerformanceOptimizedTransformer(
        baseTransformer, // The underlying transformer to optimize
        5000,            // Cache size (number of results to store)
        true             // Enable structural fingerprinting
);
```

```java
// Pre-compute paths that will be accessed frequently
optimizedTransformer.precomputePaths(
        "user.profile.name",
        "user.profile.email",
        "metadata.timestamp",
        "items[].price"
);
```

```java
// Clear the cache when needed (e.g., after configuration changes)
optimizedTransformer.clearCache();

// Schedule periodic cache clearing for long-running applications
scheduler.scheduleAtFixedRate(
        () -> optimizedTransformer.clearCache(),
        1, 1, TimeUnit.HOURS);
```

Here's a comprehensive example showing how to integrate the PerformanceOptimizedTransformer in a Spring Boot application:
```java
@Configuration
public class TransformerConfig {

    private static final Logger log = LoggerFactory.getLogger(TransformerConfig.class);

    @Bean
    public TransformationEngine userProfileTransformer() {
        // Create the base transformers
        List<TransformationEngine> transformers = new ArrayList<>();
        transformers.add(new FieldRenameTransformer("user.firstName", "user.givenName"));
        transformers.add(new FieldRenameTransformer("user.lastName", "user.familyName"));
        transformers.add(new DateFormatTransformer("user.createdAt", "unix", "ISO-8601"));
        transformers.add(new DefaultValueTransformer("user.status", new JsonPrimitive("active")));

        // Create a composite transformer
        TransformationEngine compositeTransformer = new CompositeTransformationEngine(transformers);

        // Wrap with the performance optimizer
        PerformanceOptimizedTransformer optimizedTransformer =
                new PerformanceOptimizedTransformer(compositeTransformer, 10000, true);

        // Pre-compute frequently accessed paths
        optimizedTransformer.precomputePaths(
                "user.firstName", "user.lastName", "user.createdAt", "user.status"
        );
        return optimizedTransformer;
    }

    @Bean
    public ScheduledExecutorService cacheMaintenanceScheduler(
            @Qualifier("userProfileTransformer") TransformationEngine transformer) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // Schedule cache clearing every 6 hours
        scheduler.scheduleAtFixedRate(() -> {
            if (transformer instanceof PerformanceOptimizedTransformer) {
                ((PerformanceOptimizedTransformer) transformer).clearCache();
                log.info("Cleared transformation cache");
            }
        }, 6, 6, TimeUnit.HOURS);
        return scheduler;
    }
}
```
The transformer builds a cache key using a strong 128-bit hash over the input's structure and values in a single pass. This avoids extra traversals and minimizes collisions without a separate structural fingerprint.
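A single-pass key of this kind can be illustrated by feeding both field names (structure) and values into one 128-bit digest during a single traversal. MD5 stands in here only because it is a readily available 128-bit digest in the JDK; the actual hash function used by the component is not specified in this guide, and all names below are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a single-pass 128-bit cache key: one traversal
// feeds both structure (sorted field names) and values into the digest,
// so no separate structural fingerprint pass is needed.
public class SinglePassKeySketch {
    public static String cacheKey(Map<String, Object> doc) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5"); // 128-bit digest
            digest(md, doc);
            return HexFormat.of().formatHex(md.digest());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is always present in the JDK
        }
    }

    private static void digest(MessageDigest md, Object node) {
        if (node instanceof Map<?, ?> map) {
            // Sort keys so field order cannot change the key
            TreeMap<String, Object> sorted = new TreeMap<>();
            for (Map.Entry<?, ?> e : map.entrySet()) {
                sorted.put(String.valueOf(e.getKey()), e.getValue());
            }
            for (Map.Entry<String, Object> e : sorted.entrySet()) {
                md.update(e.getKey().getBytes(StandardCharsets.UTF_8)); // structure
                digest(md, e.getValue());                               // values
            }
        } else {
            // Leaves contribute both their type and their value in the same pass
            String leaf = node == null ? "null"
                    : node.getClass().getSimpleName() + "=" + node;
            md.update(leaf.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```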
The transformer is fully thread-safe:
- Uses `ConcurrentHashMap` for cache storage
- Performs atomic read and write operations
- Handles race conditions gracefully
- Works correctly in multi-threaded environments
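The core of that guarantee can be sketched with `ConcurrentHashMap.computeIfAbsent`, whose check-then-compute-then-store step is atomic per key. This is an illustration of the pattern, not the component's actual internals, and the names are hypothetical:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch of the thread-safe cache pattern: computeIfAbsent
// guarantees that at most one thread computes the value for a given key,
// so concurrent callers never corrupt the cache or duplicate work.
public class ThreadSafeCacheSketch {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> delegate;

    public ThreadSafeCacheSketch(Function<String, String> delegate) {
        this.delegate = delegate;
    }

    public String transform(String input) {
        // Atomic check-then-transform-then-store for each key
        return cache.computeIfAbsent(input, delegate);
    }

    public void clearCache() {
        cache.clear(); // safe to call concurrently with transform
    }
}
```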
- Initialize During Startup: Create optimized transformers at application startup
- Monitor Memory Usage: Adjust cache size based on available memory
- Clear Periodically: For long-running applications, clear cache occasionally
- Pre-compute Common Paths: Identify and pre-compute frequently accessed paths
- Combine with CompositeTransformer: Use with composite transformers for maximum benefit
- Measure Actual Performance: Benchmark with your specific workload
- Memory Usage: Caching requires memory proportional to cache size and document complexity
- Complex Value Dependencies: May not fully optimize transformations with complex value interdependencies
- Very Large Documents: May have diminishing returns for extremely large documents
The PerformanceOptimizedTransformer provides significant performance benefits for high-throughput JSON transformation scenarios. By intelligently caching results and optimizing path access, it can reduce transformation time by orders of magnitude for repeated or similar documents.
For systems processing thousands or millions of events, this can translate to substantial resource savings and throughput improvements without changing the transformation logic itself.