MDS Kafka Streaming: Design and Concepts
The Mobility Data Space (MDS) enables secure, federated sharing of mobility and transportation data between organizations. This document explains the design and core concepts for implementing real-time data streaming with Apache Kafka within the MDS ecosystem, using the Eclipse Dataspace Connector (EDC) framework and the MDS Kafka extension.
A data space is a federated ecosystem in which organizations share data under agreed-upon rules and policies. Data spaces enable secure, interoperable data sharing while each participant keeps sovereignty and control over its own data.
A connector is a framework that facilitates secure data sharing between data space participants. The Eclipse Dataspace Connector (EDC) provides:
- Data Plane: Handles actual data transfer operations
- Control Plane: Manages metadata, contracts, and policies
Kafka streaming delivers data in real time over Apache Kafka topics, enabling high-throughput, fault-tolerant data distribution across the data space.
Contract negotiation is the process of establishing data usage agreements between providers and consumers. This includes:
- Policy evaluation
- Terms agreement
- Access credential generation
- Usage monitoring
Endpoint Data References (EDRs) are secure references that contain access credentials for data endpoints. EDRs include:
- Authentication tokens
- Connection parameters
- Access permissions
- Usage constraints
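The four parts of an EDR listed above can be pictured as a small record. This is a minimal sketch only: the class name `EndpointDataReference`, its field names, and the expiry-based usage constraint are hypothetical and are not taken from the EDC or the MDS Kafka extension.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class EndpointDataReference:
    """Hypothetical EDR shape; all field names are illustrative assumptions."""
    auth_token: str          # authentication token
    connection: dict         # connection parameters (e.g. bootstrap servers, topic)
    permissions: frozenset   # access permissions, e.g. {"read"}
    expires_at: datetime     # one possible usage constraint: a validity window

    def is_usable(self, operation: str, now: datetime) -> bool:
        """Return True if the operation is permitted and the EDR has not expired."""
        return operation in self.permissions and now < self.expires_at


issued = datetime(2024, 1, 1, tzinfo=timezone.utc)
edr = EndpointDataReference(
    auth_token="example-token",
    connection={"bootstrap.servers": "kafka.example.com:9092", "topic": "mobility-events"},
    permissions=frozenset({"read"}),
    expires_at=issued + timedelta(hours=1),
)
```

With this shape, a consumer would check `edr.is_usable("read", now)` before opening a connection and request a fresh EDR once the check fails.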
[tbd]
Contract negotiation follows a standardized state machine:
[tbd]
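As a rough illustration, and assuming the negotiation state names of the Dataspace Protocol (which may differ from what this document eventually specifies), the state machine could be modelled as a transition table. The `advance` helper is hypothetical.

```python
from enum import Enum


class State(Enum):
    REQUESTED = "REQUESTED"
    OFFERED = "OFFERED"
    ACCEPTED = "ACCEPTED"
    AGREED = "AGREED"
    VERIFIED = "VERIFIED"
    FINALIZED = "FINALIZED"
    TERMINATED = "TERMINATED"


# Allowed transitions, assumed from the Dataspace Protocol negotiation model.
TRANSITIONS = {
    State.REQUESTED: {State.OFFERED, State.AGREED, State.TERMINATED},
    State.OFFERED: {State.REQUESTED, State.ACCEPTED, State.TERMINATED},
    State.ACCEPTED: {State.AGREED, State.TERMINATED},
    State.AGREED: {State.VERIFIED, State.TERMINATED},
    State.VERIFIED: {State.FINALIZED, State.TERMINATED},
    State.FINALIZED: set(),    # terminal: agreement is in force
    State.TERMINATED: set(),   # terminal: negotiation was aborted
}


def advance(current: State, target: State) -> State:
    """Move to the target state, rejecting transitions the table does not allow."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target


# Shortest successful path: the provider agrees directly to the consumer's request.
state = State.REQUESTED
for step in (State.AGREED, State.VERIFIED, State.FINALIZED):
    state = advance(state, step)
```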
Key Message Types
- Contract Request: Consumer initiates negotiation for a specific dataset
- Contract Agreement: Provider responds with terms and conditions
- EDR (Endpoint Data Reference): Contains credentials and connection details for data access
The MDS Kafka Extension provides:
1. data-plane-kafka
Manages Kafka topic access by:
- Creating dynamic Kafka credentials
- Managing Access Control Lists (ACLs)
- Handling SASL authentication
- Supporting OIDC-based client registration
2. data-plane-kafka-spi
Defines the DataAddress format for Kafka assets:
{
  "type": "Kafka",
  "topic": "mobility-events",
  "kafka.bootstrap.servers": "kafka.example.com:9092",
  "kafka.sasl.mechanism": "OAUTHBEARER",
  "kafka.security.protocol": "SASL_SSL",
  "oidc.discovery.url": "https://auth.example.com/.well-known/openid-configuration",
  "oidc.client.registration.endpoint": "https://auth.example.com/clients"
}
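To show how a consumer might use such a DataAddress, the sketch below derives Kafka client properties from it. It assumes, without confirmation from the extension's documentation, that `kafka.`-prefixed entries map to standard Kafka client properties once the prefix is stripped. The function name `kafka_client_config` is hypothetical.

```python
def kafka_client_config(data_address: dict) -> dict:
    """Collect the `kafka.`-prefixed entries and strip the prefix (assumed convention)."""
    if data_address.get("type") != "Kafka":
        raise ValueError("not a Kafka DataAddress")
    prefix = "kafka."
    return {
        key[len(prefix):]: value
        for key, value in data_address.items()
        if key.startswith(prefix)
    }


# Same values as the example above; the OIDC entries are omitted for brevity.
data_address = {
    "type": "Kafka",
    "topic": "mobility-events",
    "kafka.bootstrap.servers": "kafka.example.com:9092",
    "kafka.sasl.mechanism": "OAUTHBEARER",
    "kafka.security.protocol": "SASL_SSL",
}

config = kafka_client_config(data_address)
topic = data_address["topic"]
```

The resulting `config` holds `bootstrap.servers`, `sasl.mechanism`, and `security.protocol`, which are property names a Kafka client understands; the topic is kept separate because it is a subscription target, not a client property.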
The extension implements multi-layered security:
- EDC-level: Contract-based access control
- Kafka-level: SASL authentication with dynamic credentials
- Network-level: TLS encryption for all communications
- Application-level: OIDC tokens for client authentication
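Some of these layers can be checked mechanically before an asset is published. The following sketch is an assumption about how such a pre-flight check could look, not part of the extension; `security_problems` is a hypothetical helper, and it only inspects the DataAddress keys shown earlier in this document.

```python
from urllib.parse import urlparse

# Kafka `security.protocol` values that encrypt traffic with TLS.
TLS_PROTOCOLS = {"SSL", "SASL_SSL"}


def security_problems(data_address: dict) -> list:
    """Return a list of human-readable findings; an empty list means no issue found."""
    problems = []
    if data_address.get("kafka.security.protocol") not in TLS_PROTOCOLS:
        problems.append("kafka.security.protocol does not use TLS")
    if not data_address.get("kafka.sasl.mechanism"):
        problems.append("kafka.sasl.mechanism is not set")
    for key, value in data_address.items():
        # OIDC endpoints carry credentials, so they should be reached over https.
        if key.startswith("oidc.") and urlparse(value).scheme != "https":
            problems.append(f"{key} is not an https URL")
    return problems


secure = {
    "kafka.security.protocol": "SASL_SSL",
    "kafka.sasl.mechanism": "OAUTHBEARER",
    "oidc.client.registration.endpoint": "https://auth.example.com/clients",
}
insecure = {
    "kafka.security.protocol": "PLAINTEXT",
    "kafka.sasl.mechanism": "OAUTHBEARER",
    "oidc.client.registration.endpoint": "http://auth.example.com/clients",
}
```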
Key metrics to monitor:
- Kafka Consumer Metrics: Lag, throughput, error rates
- Database Metrics: Connection pool, query performance
- Application Metrics: Event processing rates, error counts
- Infrastructure Metrics: CPU, memory, network usage
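Of the metrics above, consumer lag is the one most specific to Kafka: for each partition it is the distance between the log end offset and the consumer group's committed offset. A minimal sketch of that arithmetic follows; the offsets are made-up sample values, and in practice they would come from the Kafka client or an admin tool.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag = log end offset - committed offset, never negative."""
    lag = {}
    for partition, end in end_offsets.items():
        # Simplification: a partition with no commit yet is counted from offset 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


end_offsets = {0: 1500, 1: 1480, 2: 1510}   # sample log end offsets per partition
committed = {0: 1500, 1: 1400}              # partition 2 has no committed offset yet

lag = consumer_lag(end_offsets, committed)
total_lag = sum(lag.values())
```

A steadily growing `total_lag` suggests that consumers cannot keep up with producers, which is usually the signal to alert on.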
- Consumer Pool: Configure thread pool sizes based on expected load
- Database: Use connection pooling and consider read replicas
- Kafka: Partition topics appropriately for parallel processing
- Load Balancing: Use multiple backend instances behind a load balancer
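Partition count bounds parallelism because, within one consumer group, each partition is read by at most one consumer. The sketch below uses a simplified round-robin assignment to make that visible; Kafka's real assignors are more involved, and `assign_round_robin` is a hypothetical helper.

```python
def assign_round_robin(num_partitions: int, consumers: list) -> dict:
    """Spread partitions over consumers one at a time (simplified illustration)."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


balanced = assign_round_robin(6, ["c1", "c2", "c3"])
# With fewer partitions than consumers, the surplus consumer gets nothing to do.
starved = assign_round_robin(2, ["c1", "c2", "c3"])
idle = [consumer for consumer, parts in starved.items() if not parts]
```

In the second case one consumer stays idle, so adding backend instances beyond the partition count does not increase throughput for that topic.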
- Kafka Consumer: Adjust max.poll.records and fetch.min.bytes
- Database: Tune connection pool size and query timeouts
- JVM: Configure heap size and garbage collection settings
- Monitoring: Set up alerts for key metrics
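The two consumer settings named above trade latency against throughput. The sketch below layers a tuning profile over a base configuration. The Kafka defaults shown (`max.poll.records` = 500, `fetch.min.bytes` = 1) are the documented client defaults, while the profile values, the profile names, and the `group.id` are illustrative assumptions to be replaced by measured numbers.

```python
# Documented Kafka consumer defaults for the two settings discussed here.
KAFKA_DEFAULTS = {"max.poll.records": 500, "fetch.min.bytes": 1}

# Hypothetical starting points; real values should come from load testing.
PROFILES = {
    "low_latency": {"max.poll.records": 100, "fetch.min.bytes": 1},
    "high_throughput": {"max.poll.records": 2000, "fetch.min.bytes": 65536},
}


def tuned_config(base: dict, profile: str) -> dict:
    """Merge defaults, then the caller's base settings, then the chosen profile."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile}")
    return {**KAFKA_DEFAULTS, **base, **PROFILES[profile]}


config = tuned_config({"group.id": "mds-consumer"}, "high_throughput")
```

A larger `fetch.min.bytes` makes the broker wait for more data before answering a fetch, which improves batching at the cost of latency; a larger `max.poll.records` hands more records to each poll loop iteration, so processing time per iteration must stay within the poll interval.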