An application is cloud native when it is designed to run on the cloud (private, public or hybrid) and can adapt to different scenarios, i.e. it is scalable.
The cloud simply means making computing resources (compute, storage, networking) available over the Internet, on demand.
Main cloud native features are:
- Core elements:
- [[#Microservices]]: an architectural style where an application is built as a collection of atomic, resilient services, each with a limited responsibility w.r.t. the entire application and developed by its own small team
- [[#Containers]]
- Serverless computing: developers can focus just on the logic (code), while the cloud provider is responsible for the whole computing infrastructure, scaling and availability
- automation
- cost efficient
- faster time-to-market
- service meshes
- immutable infrastructure
- declarative APIs
- [[#Event-Driven Architecture]] (EDA): components communicate only through events, in a producer-consumer fashion.
- Events represent state changes (or triggers) that lead to a further action
- Producer and consumer approach
- enables real-time interactions
- enables loose coupling and flexibility: new services can subscribe without changing existing ones
- Continuous delivery: frequent updates with minimal downtime
- Elastic scaling: applications can expand and contract resources based on usage
- Horizontal scaling: add more instances of equal services when demand increases (load balancers are used)
- Vertical scaling: dynamically adjust computing resources (cpu, ram)
- More responsive and scalable
- Enables real-time processing
- Example application domains: Healthcare, Finance, Retail
Innovations
- AI/ML
- Blockchain
- Digital Twin
- 5G
- Edge Computing
- Zero Trust Architecture
- Automation at Scale
Pros
- Flexible resource allocation
- Cost reduction
- eliminate fixed capital costs (no need to buy hardware)
- pay-as-you-go
- efficient use of resources
- Business continuity (DevOps)
- Release faster
- Automate processes
- Faster development and deployment
- Resilience: self-healing and quick recovery
- Efficiency: dynamic resource allocation, hardware potential maximization
- Agility
Cons
- Security: data privacy and compliance more difficult to handle across multiple environments
- Vendor lock-in: limited mobility to migrate between cloud providers
- Skill gaps: need for Docker and Kubernetes specialists
![[Pasted image 20250828120616.png|400]]
- Development process: waterfall → agile (see [[#Evolution of System Design]])
- Architecture: monolithic (tightly coupled modules) → [[#Microservices]] (more modular)
- Development & Packaging: physical servers → virtual servers/containers (see [[#Evolution of Servers]])
- Infrastructure: self-hosted → cloud infrastructure
![[Pasted image 20250828121118.png|600]] It helps organizations assess their current status towards becoming fully cloud-native. The maturity model consists of several stages:
- Cloud-Aware: applications deployed in the cloud but not optimized for it
- Cloud-Enabled: applications deployed in the cloud, with only some parts optimized for it
- Cloud-Native: applications deployed in and optimized for the cloud
- Cloud-Powered: applications deployed in and optimized for the cloud, and the organization fully embraces the cloud approach (in development, deployment, monitoring, scaling)
- Codebase: one codebase (tracked by a VCS), many deployments
- Dependencies: declare and isolate. Use files such as requirements.txt (Python), package.json (Node)
- Configuration: store config in env, separated from code
- Backing services: treat backing services as attached resources. This enables swapping services by changing config rather than code
- Build, release, run: strictly separate the stages, each with its own configs/envs
- Processes: run the app as stateless processes, with no persistence. To ensure persistence, a backing service (see 4) should be used
- Port binding: export internal service through a specific port
- Concurrency: scale out via the (stateless) process model
- Disposability: fast startup, graceful shutdown, seamless restarts
- Dev/prod parity: ensure minimal differences
- Logs: stream and aggregate externally, using a centralized system
- Admin process: create dedicated admin processes in the same environment used by the app, to avoid running admin/debug/maintenance commands inside the main long-running app service
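As a concrete illustration of factor III (config in the environment), here is a minimal sketch in Python; the variable names (`APP_DATABASE_URL`, `APP_PORT`) and defaults are made up for the example:

```python
import os

# Hypothetical settings loader for factor III: every deploy-specific value
# is read from the environment, never hardcoded in the source.
def load_config():
    return {
        "database_url": os.environ.get("APP_DATABASE_URL", "sqlite:///dev.db"),
        "port": int(os.environ.get("APP_PORT", "8000")),
    }

# The same codebase yields different deployments just by changing env vars
os.environ["APP_PORT"] = "9000"
config = load_config()
```

Dev, staging and prod then differ only in their environment, never in the code itself.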
This is the professional figure responsible for the Cloud Native transformation. Their tasks are:
- Design (the system) for failure
- Enable scalability
- Security by design
- Collaboration with DevOps
There are two main approaches when it comes to designing a system.
Sequential development. It is characterized by:
- a structured and rigid approach
- a timeline subdivided into phases, with a strong emphasis on documentation and traceability
- a rigid process, so poor responsiveness to changes
- late testing → bugfixes at higher cost ![[Pasted image 20260120160100.png|600]]
Iterative development: development is broken into short cycles. An alternative to the [[#Waterfall Model]], focused on:
- flexibility
- fast delivery
- customer value
- individuals + interactions > process + tools
- working software > documentation
- reacting to change > following a rigid plan
- Pair programming
- Test-Driven Development (TDD)
- Continuous Integration (CI)
- Constant refactoring
- Frequent releases ![[Pasted image 20260120160609.png|400]]
- 2-4 week sprints, each of which must deliver a usable increment of the software
- Backlogs are lists of prioritized tasks:
- Product backlog: general list of tasks
- Sprint backlog: list of tasks selected for the specific sprint
- Core events
- Daily scrum
- Sprint review
- Retrospective
- Roles
- Product owner
- Scrum master
- Development team
Bare-metal physical servers → VMs/containers/serverless ![[Pasted image 20250828163832.png]]
A physical server that can host just one workload and operates independently.
- inefficient: high cost with low utilization rate (<20%)
- complex maintenance and management
Process of abstracting physical hardware resources into multiple virtual resources. The idea is to maximize hardware utilization: multiple VMs with multiple applications can run simultaneously on a single host, managed by a hypervisor which dynamically allocates resources to them.
- Each VM is isolated w.r.t. the other ones: multiple OS running on the same hardware
- Resource sharing principle: CPU, RAM, storage are shared across multiple VMs. Hardware potential is maximized, reducing idle time
- Enhanced portability
- Enhanced scalability. Legacy systems can be migrated to VMs (so to the cloud) easily
- Cost saving: there is no need for a single machine per application
- Disaster Recovery: VMs can be easily snapshotted and restored
It’s the software that allocates resources to each VM and runs independently of them. VMs communicate with the HV to access the real hardware.
- Type 1 HV: bare-metal, runs on the physical hardware
- performance and efficiency
- used in real data centers (ex: ESXI)
- Type 2 HV: software installed on existing OS
- used in less demanding scenarios (ex: VirtualBox, Parallels)
- Provisioning: the VM is created from a predefined configuration
- Operation: VM runs apps by using shared resources
- Snapshot and backup
- Deprovisioning: VM is shut down when no longer needed
- Server Virtualization: physical servers are divided into multiple VMs
- Network Virtualization: physical networks are divided into multiple independent networks
- Storage Virtualization: storage resources are pooled across multiple services, but they are presented as a unified entity
- Isolate VMs to prevent cross-VM attacks
- Update the [[#Hypervisors]]
- Implement robust access control policies
Tip
Containers became very popular with the introduction of Docker (initial release in 2013, version 1.0 in 2014).
They encapsulate an application's entire source code and its dependencies, to ensure consistent execution across different environments and systems. They share the host OS, but remain isolated and conflict-free.
Containers are stateless by definition: DO NOT store persistent data inside them. Data must be stored in external databases or distributed storage solutions. This allows restarting/deleting a container without worrying about losing data.
Using containers enables the [[#Rolling Deployment]] practice: gradually replace the old container version with the new one, without downtime. Of course, robust automation is required to enable seamless deployments and rollbacks.
- Less overhead than [[#VMs]]
- Quicker startup/shutdown than [[#VMs]]
- More portable than [[#VMs]]
- The main drawback is that cross-OS compatibility is limited.
- Engine: manages the container lifecycle (Docker, OrbStack); made up of:
- Docker daemon (or runtime): executes containers, interacting with the OS kernel
- Client: CLI or GUI that enables the user to send commands to the runtime
- Image: immutable atomic package, containing dependencies and configurations
- Containers: running instances of the downloaded images
- Registry: service that manages (create/share/destroy) Docker container images
- Visibility
- public, so cloud-hosted
- private, so self-hosted ![[Pasted image 20260119164732.png|700]]
- Ephemeral: containers are designed to be short-lived. They should be stopped and replaced when not needed
- Automation: containers are commonly used in CI/CD pipelines ![[Pasted image 20260119115128.png]]
Updates are rolled out gradually, by replacing old containers one at a time. During the intermediate phase, users will hit a mix of old and new container versions. ![[Pasted image 20260120171931.png|700]]
- blue: older version of the system; current live version
- green: newer version of the system; the updated version. Once the green instance has been tested and is ready, traffic is fully routed from blue to green; green then becomes the new blue. ![[Pasted image 20260120171955.png|700]]
The new version of the system is reachable only by a restricted subset of users. Once it has been tested, the fully updated system is deployed. ![[Pasted image 20260120172042.png|700]]
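The rolling strategy can be sketched with a toy simulation (the version strings and pool structure are illustrative, not a real orchestrator API):

```python
# Toy rolling deployment: instances are replaced one at a time, so the
# pool keeps serving traffic and intermediate states mix both versions.
def rolling_update(instances, new_version):
    states = []
    for i in range(len(instances)):
        instances[i] = new_version      # replace exactly one instance
        states.append(list(instances))  # snapshot of the live pool
    return states

history = rolling_update(["v1", "v1", "v1"], "v2")
```

The first snapshot mixes versions (the zero-downtime middle phase described above), while the final snapshot is all new instances.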
It enables enterprises to manage complex microservice environments efficiently
- Automated deployment
- Efficient scaling: the number of containers can be adjusted based on performance
- Service discovery: mechanism that lets services automatically find and talk to each other without hard-coding IP addresses
- Serverless compatible: developers can focus on the code rather than on the infrastructure
Containers are designed to run just one service, so all output logs are centralized and sent via stdout and stderr. Log collectors can be used (ex: Fluentd).
More advanced monitoring tools can be used to aggregate logs from multiple containers on multiple OS hosts (ex: Prometheus + Grafana).
#### Containers | Security
- Vulnerability scanning: tools can be used to statically scan containers’ images
- Isolation: isolate processes and limit resource access
Containers can be deployed close to the data source in an edge computing scenario, to enable real-time data processing. This practice can:
- improve performance
- reduce latency
- [ [[#Containers Security]] ] Use non-root containers to avoid privilege escalation attacks
- [ [[#Containers Security]] ] Use robust ACLs to manage permissions
- [ [[#Containers Security]] ] Implement network segmentation for container communication: separating containers into different virtual networks so they don’t all see or talk to each other by default.
- [ [[#Containers Auditing]] ] Monitor both the container(s) and the host environment
- Scan images at multiple stages of the pipeline, to catch issues early, in both dev and prod (ex: Clair).
- Start from containerizing few core services, not all at once
- Secure every part of the container lifecycle
- Automate the pipeline for image creation, testing, deployment and so on (ex: Jenkins)
- Use orchestration tools to manage large-scale deployments
Modern environments combine [[#VMs]] with [[#Containers]], so a hybrid approach is used.
- VMs are used to provide OS-level isolation; one per workload
- Containers are used to provide process-level isolation; one per part of a workload
The idea is to deliver services over the Internet without worrying about infrastructure or hardware, delegating everything to cloud computing providers. This abstraction can be applied at different layers of a cloud app: the infrastructure, the platform, or the app itself.
Deliver computing power, storage and network resources via the cloud.
Customers have control over:
- VM
- runtime
- OS
- apps
- Network configuration
- Storage
Offer a platform with tools for building, testing and deploying applications, without managing infrastructure or the OS of the machine.
Customers have control over:
- development tools, database, middleware
- HIDDEN infrastructure layers
PaaS styles:
- Instance-based PaaS: application code deployed to VMs
- Framework-based PaaS: code conform to specific framework
- Metadata-based PaaS: code in the form of metadata, instead of normal source code
Deliver fully functional software applications over the Internet.
- subscription-based
- HIDDEN infrastructure layers
- vendor lock-in ![[Pasted image 20250829112810.png|500]]
In general, the software offered is used by multiple customers, so the software must support multi-tenancy: customers use the same infrastructure, but their data is logically separated.
Data can be partitioned in two ways (see [[#Resource Management]]):
- Siloing: dedicated database per tenant
- Pooling: shared database, different tenant IDs
Each tenant is created after the user has gone through the control plane component, which handles onboarding and authentication.
Different tenants can access different subsets of the services offered by the SaaS, depending on their auth level.
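The pooled approach can be sketched with a shared table where rows from all tenants live together and are separated only by a `tenant_id` column (the `orders` schema and tenant names are made up for the example):

```python
import sqlite3

# Pooled multi-tenancy sketch: one shared database, logical separation
# via a tenant_id column on every row.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (tenant_id TEXT, item TEXT)")
db.executemany("INSERT INTO orders VALUES (?, ?)",
               [("acme", "laptop"), ("acme", "mouse"), ("globex", "desk")])

def orders_for(tenant_id):
    # every query MUST filter on tenant_id, otherwise tenants leak data
    rows = db.execute("SELECT item FROM orders WHERE tenant_id = ?",
                      (tenant_id,)).fetchall()
    return [r[0] for r in rows]
```

In the silo approach, by contrast, each tenant would get its own database and the filter would be unnecessary.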
When resources (computing, storage, network) are shared, three different approaches can be used:
- Silo: each tenant has a dedicated resource
- ⊕ high security and high performances
- ⊖ high costs
- Pool: resources are shared across the tenants
- ⊕ more efficient, lower costs
- ⊖ potentially insecure
- Hybrid: some resources are shared, others are dedicated
- Public Cloud: infrastructure shared with the public
- scales on-demand
- pay-as-you-go
- Private Cloud: infrastructure dedicated to the single organization
- highly customizable
- higher costs
- Hybrid Cloud: combines public and private
- flexible (between public and private approach)
- Community Cloud: infrastructure shared among specific organizations (ex: GARR) with a common mission
![[Pasted image 20260120182725.png|700]]
The following are some of the principles used in the cloud-native scenario. A principle is a set of rules guiding system design decisions. Principles define:
- the system structure
- the interactions and relationships between components
- how to manage the complexity of the system
All services are exposed via APIs, typically REST.
- Description before implementation: documentation is a contract between services
All components are tightly coupled in a single application; APIs are exposed.
- Suitable for small applications
- The entire system is bound by its slowest component
The application is broken into multiple independent components, that can be built and deployed independently.
- modular approach
- single components can be scaled
In this scenario, multiple database technologies are used, each of them suitable for its dedicated purpose.
- Relational DB ← transaction-heavy systems (ex: MySQL, Postgres)
- NoSQL DB ← scalability and high throughput (ex: MongoDB)
- Graph DB ← complex relationships and queries (ex: Neo4j)
The idea is to develop the software by tying the implementation closely to the business concepts; also known as [[#Domain-Driven Design]] (DDD) ![[Pasted image 20260121113735.png]]
The API definition is tailored to the customer. It defines how microservices will interact, from the user's perspective.
- API documented with tools like Swagger, Confluence, JIRA
Promotes self-direction, self-sufficiency and autonomy.
- decentralize microservices → more testable, scalable
- infrastructure as code (ex: Kubernetes)
- decentralize DevOps → each pipeline is assigned to one team
- decentralize data → [[#Cloud Native Polyglot Persistence Principle]]
Migrating logic and data to service-based platforms.
Prevent system-wide failures by isolating issues, so that the other services can keep running despite the failed ones. Avoids single-point-of-failure scenarios.
#TODO
Implement security in multiple layers of the application:
- logic → secure coding, authentication
- data → encryption
- network → firewalls, security groups, VPNs
The more control the user has, the less protection is ensured by the provider.
Example: Amazon AWS
- Infrastructure services (ex: [[#EC2]]) → more customizable, more user responsibility
- Container services (ex: RDS) → mid
- Managed services (ex: [[#S3]], KMS, [[#DynamoDB]]) → less customizable, less user responsibility
Vulnerabilities and attacks in production are reduced because the product was built to be secure from the ground up. Multiple layers of security are used (see [[#Defense in Depth Principle]]) and services are independent from each other (see [[#Cloud Native Isolate Failure Principle (IFP)]]).
One responsibility per container. This allows the same image to be deployed in different scenarios, with different configurations.
Understand the internal state of the system by leveraging its auditing features (see [[#Containers Auditing]]), such as logs and metrics. More complex tools can be used, such as Grafana, Prometheus, New Relic.
Containers should react to lifecycle events.
- SIGTERM, sent by docker stop → graceful shutdown
- SIGKILL → sent by the engine if the container has not shut down within 10s
- post-start → perform initialization tasks
- pre-stop → perform cleanup tasks
A container’s image must not be changed once built. When multiple behaviors are needed, they must be set in the configuration.
- image reusability
- rollback enhanced
Containers are meant to be ephemeral, so they are designed for quick startup and shutdown.
- allows auto-scaling features (required by cloud native approach)
A container must contain everything needed at build time, such as libraries, dependencies, etc. Configuration and state live outside the container and are supplied via configuration.
Each container should declare its resource requirements and communicate them to the hosting platform. Containers that go beyond their limits are eligible for termination.
Make proper use of functions in the source code, and reuse images.
- use semantic versioning (semver)
- limit custom code by relying on libraries
Changes to one module should not impact others. Each microservice is responsible for its own state. This enables the [[#Cloud Native Isolate Failure Principle (IFP)]]
- limit dependencies
Components that change for the same reason should be packed together.
- Reduces the number of services to be redeployed when an update is necessary
Decouple software into isolated modules; break larger problems into smaller ones and define clear boundaries.
- separate UI, API, logic, DB
Layering an application means organizing it into horizontal and vertical layers.
Layering in traditional applications
- Presentation
- Business logic
- Persistence
- Database (?)
Layering in cloud-native applications
- Presentation
- Integration: event-driven architecture
- Legacy: integration with internal enterprise application
- Third-party SaaS: integration with externals
Orthogonality means that components/modules can be modified without affecting the other ones. Talking about components/modules assumes that a modular design is in place. ![[Pasted image 20260121150500.png|300]] Orthogonal design is based on the cohesion and coupling principles:
- Cohesion: measures how strongly the elements inside a module are related to each other.
- High cohesion: each element performs a well-defined task, so the module is easier to understand, maintain and reuse.
- Low cohesion
- Functional Cohesion: each element performs a well-defined task
- Sequential Cohesion: each element performs a task that is part of a sequence of the module
- Communicational Cohesion: elements operate on the same data or reference the same structure. Reusability is reduced because the module is tied to a specific data source
- Procedural Cohesion: elements are grouped by their order of execution inside the module
- Temporal Cohesion: elements are grouped because they must execute at the same time. This denies reusability
- Logical Cohesion: elements are grouped because they perform similar tasks
- Coincidental Cohesion: elements are grouped for no particular reason
- Coupling: measures the degree of interdependence between modules/microservices
- No Coupling: modules do not communicate/depend (ex: server and clients that exchange details over socket)
- Message Coupling: modules communicate without passing parameters (ex: dependency injection)
- Data Coupling: modules communicate by passing parameters with no sharing of unnecessary information
- Stamp Coupling: modules communicate by passing parameters where only part of data is required, maybe using complex structures
- Control Coupling: one module controls the other one by passing control parameters
- External Coupling: modules share an externally imposed data format/protocol/interface
- Common Coupling: modules depend on shared global data
- Content Coupling: one module modifies the internal behavior of another one
Set of principles used to reduce dependencies, improve maintainability/flexibility/scalability of the software, in order to make it cloud-native compliant.
Every module must have its own responsibility/purpose w.r.t. a specific domain (see [[#Separation of Concerns (SoC)]]).
Software entities should be open for extension, but closed for modification. Enhancements should be made without modifying existing functionality.
Objects of a superclass should be replaceable with objects of its subclasses without affecting correctness of application.
Example: let's assume that we have just defined the Document class and subsequently derived the Letter, SMS, and Novel classes from it. Wherever our code requires a Document object, it should be possible to use any object instantiated from one of Document's child classes, such as SMS, without in any way compromising the proper functioning of the program.
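The Document example can be sketched in Python (the class bodies are made up for illustration):

```python
class Document:
    def __init__(self, text):
        self.text = text

    def word_count(self):
        return len(self.text.split())

class SMS(Document):
    pass  # inherits the contract unchanged

class Novel(Document):
    def word_count(self):
        # a subclass may refine behavior but must honor the same contract:
        # same inputs accepted, same kind of result returned
        return len(self.text.split())

def summarize(doc):
    # written against Document; must work for every subclass (LSP)
    return f"{doc.word_count()} words"
```

A subclass that, say, raised an exception for multi-word texts would violate LSP, because `summarize` could no longer treat it as a Document.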
Consumers should not be forced to depend on methods or properties they do not use. This leads to less dead code.
High-level modules should not depend on low-level modules; both should depend on abstractions. This leads to loose coupling.
Note
It is not directly related to microservices. A full monolith or a single microservice can follow the hexagonal architecture.
The idea is to separate the core business logic from the external services (decoupling).
- Flexible → components can be swapped easily (ex: MySQL → MongoDB)
- Testable
The hexagon is made up of:
- core: business logic
- ports: entry/exit points of the application to/from the outside world. External systems are connected via dedicated adapters ![[Pasted image 20260121161942.png|500]]
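A minimal Python sketch of the hexagon, with made-up names: the core (`OrderService`) talks only to a port (`OrderRepository`), so swapping the adapter (in-memory → MySQL → MongoDB) leaves the core untouched:

```python
from abc import ABC, abstractmethod

class OrderRepository(ABC):          # port: exit point toward persistence
    @abstractmethod
    def save(self, order): ...

    @abstractmethod
    def all(self): ...

class InMemoryRepository(OrderRepository):   # adapter (could be MySQL, MongoDB...)
    def __init__(self):
        self._orders = []

    def save(self, order):
        self._orders.append(order)

    def all(self):
        return list(self._orders)

class OrderService:                  # core: business logic, adapter-agnostic
    def __init__(self, repo: OrderRepository):
        self.repo = repo

    def place(self, item):
        self.repo.save({"item": item})
        return len(self.repo.all())
```

Testability follows directly: unit tests can inject the in-memory adapter instead of a real database.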
![[Screenshot 2026-01-21 alle 16.30.19.png]]
Define the specific operations the system must support, independently of the technical implementation of the system. Operation → Task/Process.
FTGO: operations are placing an order, updating an order, processing a payment.
Step 1 → [Functional requirements] → Step 2
Map each operation to a dedicated microservice.
FTGO: create two microservices, one for orders and the other for payments. More precisely:
- Accounting service
- Supplier Management:
- Restaurant service
- Courier service: couriers management is separated to enhance their distinct role in the scenario
- Consumer service
- Kitchen service
- Delivery service: merges both Courier availability and Delivery management because they are strictly related
![[Pasted image 20260121171459.png|400]]
Another way to define services is based on DDD (see [[#Cloud Native Modeled with Business Domain Principle]]). Basically the domain is broken into subdomains and one service is created for each subdomain.
FTGO: ![[Pasted image 20260122101516.png|400]]
Each service defined must satisfy the [[#Single Responsibility Principle (SRP)]] and the [[#Common Closure Principle (CCP)]]. Having many services might increase network latency. Solutions can be:
- batch API calls
- combine services that need frequent interaction
In a microservice-enabled scenario, maintaining consistency across services is a critical task, especially when updating data concurrently. The solution is:
- use SAGA #tolink
For each service, the following can be defined:
- operations: functions of the current service
- collaborators: other services called by the current one’s operations
FTGO: ![[Pasted image 20260122102717.png|500]] ![[Pasted image 20260122102743.png|500]]
It is a good practice to version the APIs, and to increment the version when changes are made.
The format to follow is MAJOR.MINOR.PATCH:
- MAJOR: increment for non-backward-compatible API changes
- MINOR: increment for backward-compatible API feature additions
- PATCH: increment for backward-compatible API bugfixes
[!note] Backward-compatible: a newer version of the API remains compatible with already existing clients.
APIs with different version can exist at the same time, so there must be some way to call that specific API version:
- URI Versioning: version is embedded in the URI (ex: GET /v2/orders/xyz)
- potential breakage of existing links
- MIME-Type Versioning: version is included in the HTTP headers, via content negotiation
- requires client support to read/request it
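Under MAJOR.MINOR.PATCH, a backward-compatibility check can be sketched as follows (a simplification: pre-release tags and build metadata are ignored):

```python
# Parse a "MAJOR.MINOR.PATCH" string into a comparable tuple of ints.
def parse(version):
    major, minor, patch = (int(p) for p in version.split("."))
    return major, minor, patch

# A newer API is backward-compatible with an existing client as long as
# the MAJOR number is unchanged and the API is not older than the client.
def backward_compatible(client_version, api_version):
    return (parse(api_version)[0] == parse(client_version)[0]
            and parse(api_version) >= parse(client_version))
```

So a client built against 2.1.0 can safely call a 2.3.5 API, but not a 3.0.0 one.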
Practices that enable microservices to communicate and collaborate on processing requests.
Client sends a request and waits for a response.
The request is made through a proxy interface implemented by an adapter class. Components involved are:
- RPI Proxy: routes requests to the service
- RPI Server Adapter: handles received requests by invoking the business logic
- Response Flow: not an entity, just the process by which the service processes a request and sends back a response
Service discovery mechanisms might be required to locate services or to handle partial failures.
Resources are identified using URIs and data is represented using JSON or XML formats. HTTP requests use different verbs (methods):
- GET: retrieve a resource
- POST: create new resources
- PUT: update existing resources
- DELETE: delete existing resources
Service discovery mechanisms might be required to locate services.
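The verb semantics can be made concrete with a toy in-memory handler (not a real HTTP server; the status codes follow the usual REST conventions):

```python
# Toy CRUD handler: maps REST verbs onto an in-memory resource store.
store = {}

def handle(method, resource_id, body=None):
    if method == "POST":                   # create
        store[resource_id] = body
        return 201, body
    if method == "GET":                    # retrieve
        if resource_id in store:
            return 200, store[resource_id]
        return 404, None
    if method == "PUT":                    # update (full replace)
        store[resource_id] = body
        return 200, body
    if method == "DELETE":                 # delete
        store.pop(resource_id, None)
        return 204, None
    return 405, None                       # method not allowed
```

A real service would dispatch on (verb, URI) pairs the same way, with the URI identifying the resource.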
REST APIs are often documented using tools such as
- OpenAPI: standardized Interface Definition Language (IDL) that allows developers to define APIs in a human-readable format.
- Some tools allow automatic OpenAPI document generation
- Helps to automatically generate client stubs and server skeletons
- Initially part of the Swagger project
- Swagger: tool for writing OpenAPI definitions; provides a GUI for API testing ![[Pasted image 20260122112744.png|150]]
- Info: title, version, description, license
- Servers: one or more base URLs
- Paths → Operations:
- endpoints and respective HTTP method
- parameters (see below)
- request body (see below)
- responses (see below)
- Components: reusable entities
- schemas
- parameters
- responses
- callbacks
gRPC uses:
- HTTP/2 for data transport
- Protocol Buffers for message serialization: they provide a binary format more efficient than text-based formats (like the JSON used in [[#REST]]), at the cost of a more rigid definition.
Communication between gRPC entities can happen with different approaches:
- Unary RPC: single request, single response
- Server Streaming RPC: single request, stream of responses
- Client Streaming RPC: stream of requests, single response
- Bidirectional Streaming RPC: stream of requests, stream of responses (suitable for real-time interaction)
![[Pasted image 20260122114557.png|400]] Partial failures occur when the client blocks (ex: waiting for an unresponsive service), which can lead to cascading failures. Many patterns are used, often in combination.
In the Circuit Breaker pattern, a proxy between consumer and microservice is used:
- it trips when failures exceed a threshold
- it prevents further request attempts by opening the circuit
- after a timeout, the client can retry and, if successful, the circuit is closed
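A minimal sketch of the pattern (the thresholds, timeout and exception types are illustrative):

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None          # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                # open circuit: fail fast instead of waiting on the service
                raise RuntimeError("circuit open: request rejected")
            self.opened_at = None      # half-open: allow one trial request
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0              # success closes the circuit again
        return result
```

Failing fast like this is what stops one unresponsive service from tying up threads in all of its callers.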
Additional recovery strategies can be used like:
- use network timeouts
- impose a cap on the number of requests
- return meaningful errors
- use fallback responses (ex: cached data)
Recall the [[#Cloud Native Isolate Failure Principle (IFP)]] as a basis for the Bulkhead pattern: when a microservice fails, it won’t impact the ability of others to respond.
Faults are broken into:
- Transient Faults: likely to succeed upon repetition
- Non-Transient Faults: unlikely to succeed upon repetition, so frequent retries are avoided by using an exponential backoff
The client will retry failed service operations.
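Retry with exponential backoff can be sketched as follows (attempt counts and delays are illustrative):

```python
import time

# Each retry waits twice as long as the previous one, giving the failing
# service room to recover before the next attempt.
def retry(fn, attempts=4, base_delay=0.01):
    delay = base_delay
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise              # out of attempts: propagate the fault
            time.sleep(delay)
            delay *= 2             # 0.01s, 0.02s, 0.04s, ...
```

For non-transient faults the final re-raise is what matters: the caller gives up instead of hammering a service that will not recover.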
Since autoscaling, failures and upgrades are frequent in a microservices scenario, automatic service discovery mechanisms are required: statically hardcoded service endpoints would be painful to maintain (recall the [[#Cloud Native Location-Independent Principle (LIP)]]).
A Service Registry is used: a database that holds the network location of services.
- referenced by each service instance
- can provide a list of available services
- it is updated on services start, stop, fail
- responds to clients’ queries for service instance locations, routing requests accordingly
- can perform health checks
The Service Registry can be Application-Level… ![[Pasted image 20260122121817.png|400]] …or Platform-Provided ![[Pasted image 20260122122409.png|400]]
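An application-level registry can be sketched in memory (the API names and the random load-balancing choice are illustrative):

```python
import random

class ServiceRegistry:
    def __init__(self):
        self.instances = {}            # service name -> set of locations

    def register(self, name, location):
        # called by an instance on startup
        self.instances.setdefault(name, set()).add(location)

    def deregister(self, name, location):
        # called on shutdown, or by a failed health check
        self.instances.get(name, set()).discard(location)

    def lookup(self, name):
        # pick one instance (random choice = naive client-side balancing)
        locations = self.instances.get(name)
        if not locations:
            raise LookupError(f"no instance of {name!r} available")
        return random.choice(sorted(locations))
```

Real registries (e.g. Consul, Eureka, or Kubernetes' built-in DNS) add health checks and leases on top of this register/deregister/lookup core.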
Clients exchange messages through channels. Messages are made up by:
- Header: metadata
- Body: data in text or binary format
Messages can be:
- Documents: data
- Commands: specify an operation
- Events: often represent state change Channels can be:
- Point-to-Point: messages are delivered to one client
![[Pasted image 20260122151543.png|400]]
- used for commands
- Algorithm:
	- the client sends a command message, with a msgId (`MessageId`)
	- if the message was requesting data, the service replies with the data and the corresponding msgId (`CorrelationId`)
	- else it was a One-Way Notification sent by the client to the service, so no response is expected
- Publish-Subscribe: messages are delivered to all the clients subscribed to the topic
![[Pasted image 20260122151628.png|400]]
- used for events
- patterns:
- Publish/Subscribe: the producer publishes messages to a channel that multiple consumers can read
- Publish/Async Responses: client publishes a message specifying a reply channel
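The publish-subscribe channel can be sketched with an in-memory broker (callback-based delivery is a simplification of what real message brokers do):

```python
from collections import defaultdict

class Broker:
    def __init__(self):
        self.subscribers = defaultdict(list)   # topic -> consumer callbacks

    def subscribe(self, topic, callback):
        # new services can subscribe without changing existing producers
        self.subscribers[topic].append(callback)

    def publish(self, topic, event):
        # publish-subscribe: EVERY subscriber of the topic gets the event
        for callback in self.subscribers[topic]:
            callback(event)
```

A point-to-point channel would instead deliver each message to exactly one of the registered consumers.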
Services can exchange messages without an intermediary.
Pros
- Less latency
- Less traffic
Cons
- Each service must know the other ones
Example: ZeroMQ
Services exchange messages through an intermediary (broker).
Pros
- Loose coupling
- Message buffering
Cons
- Single point of failure vulnerability
Example: [[#RabbitMQ]], Kafka
It allows different applications and platform to communicate via messaging.
- reliable: guarantees message delivery with ack, routing, etc.
- flexible: supports different patterns
Core components:
- Queue: a buffer that stores messages
- durable: stored on storage
- non-durable: stored on memory
- Exchange: routes messages to queues via a routing mechanism
- Binding: defines how messages are routed
- Message: data being transferred
- persistent
- transient
Possible messages flows:
- Producer → Exchange: producer sends the message
- Exchange → Queue: exchange routes the messages to the correct queue
- Queue → Consumer: consumer processes the message
Types of exchange:
- Direct Exchange: routes messages based on an exact match between the message routing key and the queue’s binding key.
- supports multicast
- Fanout Exchange: routes messages to all queues bound to the exchange, ignoring routing keys.
- supports broadcast
- Topic Exchange: routes messages based on pattern matching between routing key and binding key.
- Example bindings:
- logs.* → logs.error, logs.info
- logs.# → logs.error.db, logs.system.cpu
- Headers Exchange: routes messages based on message headers instead of routing keys.
- Headers act like parameters for selecting clients
- Used for more complex routing
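The `*`/`#` rules of a Topic Exchange can be sketched as a small routing-key matcher (a simplified, illustrative reimplementation, not RabbitMQ's actual code): `*` matches exactly one dot-separated word, `#` matches zero or more.

```python
def topic_match(binding_key: str, routing_key: str) -> bool:
    """Return True if routing_key matches the topic-exchange binding_key.
    '*' matches exactly one word, '#' matches zero or more words."""
    def match(b, r):
        if not b:
            return not r          # both exhausted → match
        if b[0] == "#":
            # '#' absorbs zero or more words
            return any(match(b[1:], r[i:]) for i in range(len(r) + 1))
        if not r:
            return False
        if b[0] == "*" or b[0] == r[0]:
            return match(b[1:], r[1:])
        return False
    return match(binding_key.split("."), routing_key.split("."))

print(topic_match("logs.*", "logs.error"))      # True
print(topic_match("logs.*", "logs.error.db"))   # False: '*' is one word only
print(topic_match("logs.#", "logs.system.cpu")) # True
```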
AMQP supports the Point-to-Point channel type for messaging (see [[#Asynchronous Messaging]] → Channel Types):
- the producer sends a message with a routing key
- the message is stored in the queue where `queue name == routing key`
- only one consumer will consume messages from the queue
It is based on the [[#Advanced Message Queuing Protocol (AMQP)]].
Remote Procedure Calling (see [[#Remote Procedure Invocation (RPI)]]) can be implemented over RabbitMQ: ![[Pasted image 20260122162837.png]]
- The client sends a request message to a specific `rpc_queue`, specifying a `correlation_id` as the id for the communication and a callback queue where the server will store the result.
- The server processes the request and sends the result to the callback queue, keeping the `correlation_id`.
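The correlation-id mechanism can be sketched with in-memory queues standing in for the RabbitMQ ones (all names here are illustrative; a real client would use an AMQP library such as pika):

```python
import queue
import threading
import uuid

# In-memory stand-ins for the rpc_queue and the callback queue
rpc_queue = queue.Queue()
callback_queue = queue.Queue()

def server():
    """Consume one request, process it, and reply on the callback queue,
    echoing the correlation_id so the client can match the response."""
    request = rpc_queue.get()
    result = request["payload"] * 2  # toy processing
    callback_queue.put({"correlation_id": request["correlation_id"],
                        "result": result})

def rpc_call(payload):
    corr_id = str(uuid.uuid4())
    rpc_queue.put({"correlation_id": corr_id, "payload": payload})
    worker = threading.Thread(target=server)
    worker.start()
    reply = callback_queue.get()
    worker.join()
    # Match the response to the request via the correlation id
    assert reply["correlation_id"] == corr_id
    return reply["result"]

print(rpc_call(21))  # 42
```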
In general, in a RabbitMQ-enabled scenario, the actors are:
- Producers (P): send tasks to a queue
- Consumers (C_1, C_2)
So RabbitMQ can distribute tasks across the available workers, preventing any single one from being overloaded. Queues and messages can be made persistent to survive system crashes. ![[Pasted image 20260126155405.png | 200]]
Increasing the number of service instances can increase throughput, but may disrupt the expected order of message processing (concurrent processing scenario). In this competing receivers scenario, a sharded message channel is used, which maintains message order even with concurrent processing.
It is an open standard used to document asynchronous and event-driven APIs (the analogue of Swagger for [[#REST]]). As with Swagger, server and client stubs can be generated from the documentation. AsyncAPI Generator is used to automatically create documentation. AsyncAPI Studio is a web-based GUI that can help developers. ![[Pasted image 20260122164744.png|150]] ![[Pasted image 20260122165312.png|150]]
- Servers: defines one or more message brokers, their details and security policies
- Channels: represent the communication paths through which messages are exchanged
- topics
- queues
- routing keys
- …
- Channels → Channel: summary, description, messages
- Channels → Messages: define the structure of the data
- headers
- payload
- Operations: define the action taken on a channel
- send/receive
- channel reference
- message reference
- Tags and External Docs
- Components: reusable entities
- Schemas
- Messages
- Parameters
- Security Schemes
- Bindings for other protocols
- Traits
- Correlation IDs
Duplicated Messages might occur because (FTGO):
- “Order Created” event is processed, then followed by an “Order Cancelled” event
- “Order Created” event isn’t acknowledged, so both “Order created” and “Order Cancelled” must be redelivered
- “Order Created” might be the only one re-delivered
Solutions could be:
- Idempotent Message Handlers: multiple invocation of the same operation don’t have additional effect
- Track messages and discard duplicates
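A minimal sketch of the second solution, with an in-memory set standing in for a message-tracking table (all names are illustrative):

```python
# Duplicate-discarding message handler: processed message IDs are recorded,
# so a redelivered message has no additional effect.
processed_ids = set()   # in production this would be a database table
orders = {}

def handle(message):
    if message["id"] in processed_ids:
        return "discarded duplicate"
    processed_ids.add(message["id"])
    if message["type"] == "OrderCreated":
        orders[message["orderId"]] = "CREATED"
    elif message["type"] == "OrderCancelled":
        orders[message["orderId"]] = "CANCELLED"
    return "processed"

msg = {"id": "m1", "type": "OrderCreated", "orderId": 42}
print(handle(msg))  # processed
print(handle(msg))  # discarded duplicate: redelivery has no extra effect
```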
Asynchronous messaging decouples services, so resilience and availability is improved.
While with synchronous, like [[#REST]], availability is reduced because all services must be available at the same time.
System availability is calculated by multiplying all the services’ availabilities: $$ avail_{tot}=\prod_{i} avail_{service,i} $$
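For example, a synchronous chain of five services, each at 99.5% availability, already drops total availability noticeably:

```python
# Total availability is the product of the individual availabilities:
# avail_tot = avail_service ** n for n identical services in the chain.
avail_service = 0.995
n_services = 5
avail_tot = avail_service ** n_services
print(round(avail_tot, 4))  # 0.9752
```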
Order Service receives a request from the client, so it will interact asynchronously with the Consumer and Restaurant Service to process the request.
- No service is blocked waiting for a response
- However, the API gateway will use a synchronous protocol such as REST ![[Pasted image 20260122174622.png|500]]
Some services might persist data received from other services in the system to avoid excess of requests. However, storage and data aging problems must be handled. ![[Pasted image 20260122174952.png|500]]
The final flow is the following one:
![[Pasted image 20260122175430.png|500]]
Basically, when the client requests a createOrder operation, it is executed and an immediate response is received, without waiting for further processing, with a PENDING state.
After the other services asynchronously perform their tasks, the order will get a VALIDATED state.
In a monolithic approach, data is in a single database so queries can use a single select to collect and join data. In microservice architectures data is scattered across multiple services, so multiple data sources.
The idea of the API Composition Pattern is to locate the API composition logic in a specific component of the architecture called the API Composer. There are less and more structured approaches to implement the API Composer:
- Client composition: the composer is embedded in the web application itself.
- Gateway composition: the composer is combined with the API gateway; the component will both expose external APIs and centralize API composition logic.
- Service composition: the composer is a standalone service; usually used for more complex API composition logic. ![[Pasted image 20260129122031.png]]
The composer can perform query calls in two ways:
- Parallel Calls: latency is minimized because multiple services are invoked simultaneously
- Sequential Calls: needed when a service call depends on the result of another one
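The two call styles can be sketched with `asyncio`, using stub coroutines in place of real service calls (names are illustrative): the consumer lookup must be sequential because it depends on the order's result, while the independent calls run in parallel via `asyncio.gather`.

```python
import asyncio

# Hypothetical service-call stubs for an API Composer sketch
async def get_order(order_id):
    await asyncio.sleep(0.01)
    return {"orderId": order_id, "consumerId": 7}

async def get_consumer(consumer_id):
    await asyncio.sleep(0.01)
    return {"consumerId": consumer_id, "name": "Ada"}

async def get_delivery(order_id):
    await asyncio.sleep(0.01)
    return {"orderId": order_id, "status": "PENDING"}

async def compose(order_id):
    # Sequential call: the consumer lookup depends on the order's result
    order = await get_order(order_id)
    # Parallel calls: independent services invoked simultaneously
    consumer, delivery = await asyncio.gather(
        get_consumer(order["consumerId"]), get_delivery(order_id))
    return {**order, "consumer": consumer, "delivery": delivery}

result = asyncio.run(compose(1))
print(result["consumer"]["name"])  # Ada
```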
Pros
- simplicity
- reusability
- flexibility
Cons
- increased overhead ← scattered data across data sources
- reduced availability ← more services involved
- risk of data inconsistency between data sources
- developers may spend more time on improving performances than on business logic features
FTGO: implementing findOrderHistory() query ![[Pasted image 20260129124215.png|450]]
Of course, the query will be filtered on:
- `consumerId`
- `page` for pagination
- and more: `maxOrderAge`, `restaurantName`, etc.
Data needed to perform filtering is - of course - scattered across multiple services.
The API Pattern allows multiple strategies to handle joins:
- In-Memory Join: all data is fetched from the services and the composer performs the join
- inefficient for large data sets
- Bulk Fetch by ID: all data is queried from the services using appropriate parameters
- services must expose bulk fetch APIs
One of the critical aspects when using microservices is transaction management: data is spread across different services because each one of them has its own database, but we still want to implement transactional operations. A service often needs to:
- update the database
- publish messages
Recall the [[#Asynchronous Messaging Availability]] issue: all services must be available for the transaction to complete.
The approach is to prefer the availability over consistency of the system (see [[#CAP Theorem]]).
A system can only have two out of three properties:
- Consistency: all the nodes see the same data at the same time
- Availability: every request receives a response (success or failure)
- Partition Tolerance: the system continues to operate, despite network splits
One might say to use distributed transactions pattern, but it is not suitable for microservices.
![[Pasted image 20260126163810.png|600]] The outbox table is used as dedicated database table that acts as a temporary message queue.
- When the service is done processing, it inserts results into the database and a message into the outbox table, within a single transaction.
- A separate process, the message relay, polls - following the Polling Publisher Pattern - the outbox table and will send them to a message broker (ex: Kafka, RabbitMQ).
- Once the events have been sent, the corresponding rows are deleted from the outbox table.
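A minimal sketch of the pattern using `sqlite3` as the local database (table names and the relay callback are illustrative):

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, state TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT)")

def create_order(order_id):
    # The business write and the outbox insert share one local transaction,
    # so either both happen or neither does.
    with db:
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, "PENDING"))
        db.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"event": "OrderCreated", "orderId": order_id}),))

def message_relay(publish):
    """Poll the outbox, publish each row (e.g. to Kafka/RabbitMQ),
    then delete the published rows."""
    for row_id, payload in db.execute("SELECT id, payload FROM outbox"):
        publish(json.loads(payload))
        with db:
            db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))

published = []
create_order(1)
message_relay(published.append)
print(published)  # [{'event': 'OrderCreated', 'orderId': 1}]
```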
There is a variant where the message relay reads from the Transaction Log of the database, without using an additional table. This is called the Transaction Log Tailing Pattern.
- Scales better than the [[#The Outbox Pattern]] - polling can be a bottleneck in high-traffic systems
- More complex to implement
It is a distributed design pattern for managing long-running transactions across microservices. It ensures consistency even without relying on distributed transactions.
All steps in a saga pattern-enabled process are triggered by specific events or commands. If a step of the chain fails, preceding actions are reversed to maintain consistency by using appropriate [[#Compensating Transactions]].
![[Pasted image 20260126165936.png|400]] Unlike ACID transactions, SAGA cannot rollback automatically, so compensating transactions must be defined explicitly. Their behavior can differ from one scenario to another.
- Compensatable Transactions: transactions that need a compensating transaction to roll back state - when some writing operations were made. They precede the pivot
- Pivot Transaction: if it succeeds, the saga will complete
- Retriable Transactions: they are guaranteed to succeed; they follow the pivot
In general, each transaction can be executed on a different participant (microservice).
FTGO: ![[Pasted image 20260126170148.png|600]]
The whole process can be coordinated in two ways:
- Choreography: coordination is decentralized, so services listen for/trigger each other and act accordingly
- Orchestration: a saga orchestrator service manages the whole flow, so sends commands to each service, in sequence
Again, services publish events when creating/updating objects. Loose coupling still holds because queues are used as proxies, so participants that use a specific queue do not need to know each other, but just the queue.
- Prone to cyclic dependencies
- not scalable → participants might need to subscribe to a lot of events
FTGO: success flow ![[Pasted image 20260127105100.png|400]] FTGO: error flow ![[Pasted image 20260127105611.png|400]] In the example:
- OrderService exposes POST /orders APIs
- messages are exchanged using [[#RabbitMQ]]
- routing keys are consistent: `<service>.<event>`
- AccountingService is the pivot
FTGO: ![[Pasted image 20260127105939.png|400]]
Here a saga orchestrator is used: it sends commands to each service, in sequence, waiting for their asynchronous replies. A saga orchestrator can be represented using a State Machine: a set of states and transitions between them. Each state machine transition is triggered by the completion of a local transaction, and it determines the next state. It acts as a good compact representation of the whole scenario.
- it ensures that both the database update and message publication occur - atomically - together or not at all
- retries requests if necessary
- tracks the saga process
The other participants must send a reply message to the orchestrator once they updated their local DBs. The reply message triggers the next step.
Pros:
- cyclic dependencies are avoided since participants communicate with the coordinator and not to each other
- less coupling: participants must know just the coordinator
- improved SoC because the saga process is entirely managed by the coordinator Cons:
- the orchestrator might become too complex
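The orchestrated flow with compensating transactions can be sketched as follows (step names are illustrative, loosely following the FTGO example): steps run in sequence, and when one fails the compensating transactions of the preceding compensatable steps are executed in reverse order.

```python
log = []

def make_step(name, fail=False):
    """Build a saga step as an (action, compensation) pair."""
    def action():
        if fail:
            raise RuntimeError(f"{name} failed")
        log.append(name)
    def compensate():
        log.append(f"undo {name}")
    return action, compensate

def run_saga(steps):
    done = []  # compensations for the steps completed so far
    for action, compensate in steps:
        try:
            action()
            done.append(compensate)
        except RuntimeError:
            for comp in reversed(done):  # roll back in reverse order
                comp()
            return "ROLLED_BACK"
    return "COMPLETED"

steps = [make_step("createOrder"),
         make_step("reserveCredit", fail=True),  # the pivot fails
         make_step("approveOrder")]
print(run_saga(steps), log)
# ROLLED_BACK ['createOrder', 'undo createOrder']
```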
FTGO: ![[Pasted image 20260126175231.png|600]]
Saga lacks isolation: concurrent sagas can access data mid-execution, causing anomalies.
- Lost Updates: one saga overwrites changes made by another one without first reading them
- Dirty Reads: one saga reads data from another saga that hasn’t - successfully - completed its updates
- Fuzzy/Nonrepeatable Reads: different steps of one saga read the same data, but receive different values because they have been changed by another saga
The concurrency handling mechanism can be chosen based on the business risk associated with each request.
- Semantic Lock: a compensatable transaction sets a flag (ex: `_PENDING`) in the record being updated, acting as a lock for the record not fully committed. The flag will be cleared by a retriable transaction (saga completes) or a compensating transaction (saga rolls back). When a record is read as locked:
    - Fail and retry
    - Block until lock is released → more isolation, deadlock management is required
- Commutative Updates: ensure operation can be executed in any order, without affecting the final outcome. Commutative operations are easier to roll back.
- Pessimistic View: ensure a logical sequence of the operations in the saga to minimize business risk.
- Reread Value: prevents lost updates by checking one more time the record’s state before making changes, in case another saga updated it after the first read.
- Version: a version field is used to track all the updates made on the record, with the associated timestamp
The organization of business logic inside a microservice determines its scalability and maintainability. Often the decision to make is to choose between the following ones.
It is a procedural approach, so suitable for simple, linear logic.
Each operation is handled by a separate and dedicated script or method, living in service classes, while data access objects (DAOs) handle persistence.
Applications:
- used for CRUD-style microservices
- suitable for light microservices
FTGO:
- OrderService → logic
- OrderDao → data access
- Order → pure data model
It is an object-oriented approach, so suitable for structured scenarios that require reusable logic.
Here data and logic coexist within domain classes. The purpose is to model real-world concepts.
Requires much more design effort, but pays off for maintainability, testability, scalability, evolving business logic.
Cons:
- some relationships and boundaries can be left implicit
Refines the Domain Model, used in complex domain. Each microservice defines a bounded context taken from the business domain. The idea is to align the entire software model with the business model.
Main components are:
- Entity: actor in the scenario defined by its unique identity and not by its attributes, with a dedicated lifecycle
- Value Object: actor in the scenario defined by its attributes
- Factory: creates objects
- Repository: handles persistence and retrieval
- Service: contains domain logic not belonging to a particular Entity/Value Object
All the objects related to a single transactional unit are grouped into an Aggregate. This is an addition w.r.t. [[#Domain Model]]. The aggregate entity acts as a root entity because it is the only entrypoint to access single items.
Aggregates allow clear transaction boundaries definition, so they avoid complex coordination logic across services and they are focused on ensuring local consistency.
A scale trade-off must be considered:
- too small aggregates → too many connections
- too large aggregates → performance degradation
- Reference Only the Aggregate Root: only the root of an aggregate (object) can be accessed from external classes or services
- Inter-Aggregate References by Primary Key: aggregates reference each other by primary key (atomic field) only, not by reference
- One Transaction per Aggregate: each transaction can create/update only one aggregate. Of course, operations involving multiple aggregates use the [[#SAGA Pattern]].
![[Pasted image 20260127125454.png|400]]
- One transaction per Aggregate
- Multiple transactions per Service
Now that aggregate concept is clear, Services can be thought as the (only) entrypoint to the specific aggregate’s business logic.
At any given instant in the whole timespan, each aggregate has its own state.
A Domain Event represents a significant change of state. Domain Events names are usually past-participle verbs (ex: OrderCreated) and have properties that contain information about the event. All the data is contained by a wrapper object of the event.
FTGO: the Order is an aggregate of Order Items and it is the entrypoint to the items. ![[Pasted image 20260127124227.png|400]]
![[Pasted image 20260127170007.png|400]]
- OrderService: is the primary entrypoint to the system logic
It manages two aggregates:
- Order ← persistence managed by OrderRepository
- Restaurant ← persistence managed by RestaurantRepository
- Inbound adapters:
- User → [ REST APIs ] → OrderService
- Listen to RestaurantService → [ OrderEventConsumer ] → update Restaurant data
- OrderCommandHandlers
- SagaReplyAdapter → handles OrderSaga responses and triggers saga actions
- Outbound Adapters:
- OrderRepository → [ DBAdapter ] → Database
It is an asynchronous architecture where components communicate via events. Core components are:
- Event Producer: detects and generate events, sending them to Brokers
- Event Broker: filters and routes events to the appropriate Consumers
- Event Consumer: reacts and processes events, asynchronously (enables decoupling)
- Event Persistence: store events for durability and replay
Features:
- Asynchronous communication → services operate independently, decoupled → low latency
- Loose coupling → services do not need to know each other → better maintainability
- Scalability
- Event Persistence → enables auditability
- Real-time event processing
- Event stream → continuous stream of data
- Single Event Processing (SEP): an event immediately triggers an action (Trigger-Action)
- Event Stream Processing (ESP): filter and processes stream of events (Real-Time Flow)
- Kafka is often used
- Complex Event Processing (CEP): filter and analyzes stream of events to identify meaningful patterns (Multi-Event Analysis)
- Used for logistics, energy management, finance
The idea is to track incremental changes - INSERT, UPDATE, DELETE - in data, in a data store. This approach is very useful to synchronize changes across repositories. All the changes are read from transaction logs, timestamps or triggers. ![[Pasted image 20260127181516.png | 400]] Data streams triggered by CDC events can be processed in real-time. This is called Event Streaming. Changes are delivered as streams to allow immediate processing. Core components are:
- Event Storage: stores data chronologically
- Streaming Technology (Kafka, Kinesis, etc.)
- Stream Processors: enrich, transform, analyze incoming data
Event Streaming is used to:
- capture real-time insights for applications
- track user actions in an e-commerce ![[Pasted image 20260127181558.png|500]]
Event streaming can be combined with message queues for seamless and real-time integration.
In this pattern, state changes are stored as a series of events in event stores, instead of just performing the change. The current state is built just by replaying all the history of events. Moreover, it is possible to revert to a previous state, just by replaying all the history to a specific record.
- Useful for auditing and maintenance
Each time an aggregate is requested, it is loaded using the following algorithm:
- load all the events for the aggregate from the EVENT table
- create an instance using the aggregate’s default constructor
- apply all the events fetched to the instance, using `apply()`
Pros
- User and time are tracked for each state change, so for each event
- Auditing, historical queries, temporal analysis are facilitated
- Events are easier to (de)serialize w.r.t. ORM-based Cons
- Idempotence must be implemented
- Data deletion is more difficult, especially because of GDPR → encryption/pseudonymization can be solutions
- queries made on the whole aggregate state can be complex
FTGO: each order is persisted as multiple rows in the EVENTS table, each one representing a state change (ex: OrderCreated, OrderApproved, OrderShipped).
![[Pasted image 20260128150342.png|400]]
In the traditional approach, persistence is implemented using:
- Class-to-Table Mapping: each application class → database table
- Field-to-Column Mapping: each application class attribute → database table column
- Each object instance → database table row ![[Pasted image 20260128145118.png|400]] Cons
- Can lead to complex mappings
- History not automatically available
- Difficult auditing
- Object-Relational impedance mismatch
Again, each command generates a state change, then a sequence of events, without modifying the current one. Using these two methods allows ensuring the [[#Separation of Concerns (SoC)]] principle.
- `process(command)`: creates and returns a list of events representing the intended state changes.
    - takes the command object which contains the request’s data (FTGO: new order details can be passed when calling `OrderRevisionProposed`)
    - can throw exceptions if validation fails, preventing invalid state changes
    - SoC: `process()` handles validation and event generation
- `apply(event)`: processes each generated event to update the aggregate’s state, so to update the aggregate’s fields based on the event given.
    - SoC: `apply()` handles state update ![[Pasted image 20260128151344.png|300]]
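A minimal event-sourced aggregate following the `process()`/`apply()` split might look like this (the Order states and event names are illustrative):

```python
class Order:
    def __init__(self):
        self.state = None

    def process(self, command):
        """Validate the command and return the resulting events,
        without touching the aggregate's state."""
        if command["type"] == "CreateOrder":
            return [{"type": "OrderCreated"}]
        if command["type"] == "ApproveOrder":
            if self.state != "PENDING":
                raise ValueError("only pending orders can be approved")
            return [{"type": "OrderApproved"}]
        raise ValueError("unknown command")

    def apply(self, event):
        """Update the aggregate's fields based on the event."""
        if event["type"] == "OrderCreated":
            self.state = "PENDING"
        elif event["type"] == "OrderApproved":
            self.state = "APPROVED"

event_store = []  # append-only list of events
order = Order()
for cmd in [{"type": "CreateOrder"}, {"type": "ApproveOrder"}]:
    events = order.process(cmd)
    event_store.extend(events)  # persist, never modifying past events
    for e in events:
        order.apply(e)

# Reloading: replay the stored events on a fresh instance
replayed = Order()
for e in event_store:
    replayed.apply(e)
print(replayed.state)  # APPROVED
```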
When multiple requests try to update the same aggregate, one might overwrite another one’s changes. This anomaly can be prevented with Optimistic Locking: a version column is used to keep track of changes made since the first read of the aggregate. Each aggregate will have a version.
- If version is still the same → no change was made by other transactions → the current transaction updates version field
- If version was changed (by another transaction) → subsequent transaction fails to prevent overwrites
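A sketch of the version check, with a plain dict standing in for the database row:

```python
# Optimistic locking: an update succeeds only if the version read is still
# the current one; otherwise the transaction fails and must retry.
record = {"state": "PENDING", "version": 1}

def update(expected_version, new_state):
    if record["version"] != expected_version:
        return False  # another transaction changed the record: fail
    record["state"] = new_state
    record["version"] += 1
    return True

v = record["version"]              # both transactions read version 1
assert update(v, "APPROVED")       # first writer wins, version becomes 2
assert not update(v, "CANCELLED")  # stale version → overwrite prevented
print(record)  # {'state': 'APPROVED', 'version': 2}
```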
Some long-lived aggregates can accumulate a large number of events over time, making it inefficient to load and replay all events. Snapshots are used to periodically capture and save the current state of an aggregate, at a specific event version N.
With snapshots, the algorithm to load an aggregate changes:
- retrieve the most recent snapshot
- load only the events occurred after the snapshot
- apply all the events fetched in the previous step to the instance, using `apply()` ![[Pasted image 20260128160553.png|400]]
Message brokers may deliver the same message multiple times, so it is better for the service to handle duplicates with idempotence:
- RDBMS-based event stores → a `PROCESSED_MESSAGES` table is used to record message IDs, so if a message ID already exists the message can be discarded
- NoSQL-based event stores → the message ID is embedded in generated events to detect duplicates. Of course, events are not always generated, so using a pseudo event might be handy.
During the lifetime of an event-based application, some changes could be necessary, at different levels of the architecture:
- Schema Level: adding (backward-compatible) or renaming aggregates
- Aggregate Level: adding (backward-compatible) or removing attributes from the aggregates
- Event Level: adding (backward-compatible), modifying or removing fields from an event Of course, some migration operations are needed to switch from the old to the new structure.
Services using Event Sourcing often require SAGAs to maintain data consistency across multiple (micro)services.
- RDBMS-based event stores → supports ACID → seamless SAGA integration
- NoSQL-based event stores → lack of ACID → alternative approaches to implement SAGA
In such scenario the flow is:
- Aggregates emit events when updated
- Event Handlers consume the events and update related aggregates, so new events are emitted
The saga orchestrator coordinates multi-step workflows across services.
Saga participants must ensure idempotence to prevent duplicate processing - by recording and checking a message ID - and must always send reply messages to the saga orchestrator. Saga commands might not change the aggregate state, so no event will be emitted.
The saga orchestrator is just created - a SagaOrchestratorCreated event is triggered - and updated when participants reply to it - a SagaOrchestratorUpdated event is triggered.
It records message IDs as well, to ensure idempotence.
Since it is an event-based scenario, the Event Store will keep all the events to reconstruct the orchestrator.
The orchestrator will just:
- emit a `SagaCommandEvent`, containing all the data needed (channel, payload)
- then the service’s Event Handler will process the event and reply to the orchestrator ![[Pasted image 20260129114645.png|400]]
FTGO: ![[Pasted image 20260129113232.png|400]]
CQRS is a design that separates commands (CUD) from queries (read) to improve scalability and efficiency. Here each service has its dedicated model and data store for:
- Command Side database: publishes domain events on data changes like create, update, delete operations
- Query Side database: processes nontrivial queries with optimized databases or data models. It stores pre-materialized views of data. This database is updated every time a command changes the data its views are built from.
![[Pasted image 20260129130329.png|500]]
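The command/query split can be sketched in a few lines, with dicts standing in for the two databases and a callback list standing in for the message broker (all names are illustrative):

```python
command_store = {}        # command-side database
order_history_view = {}   # query-side pre-materialized view
subscribers = []          # stand-in for a broker's subscriptions

def publish(event):
    for handler in subscribers:
        handler(event)

def create_order(order_id, consumer_id):
    """Command side: persist the write and publish a domain event."""
    command_store[order_id] = {"consumerId": consumer_id}
    publish({"type": "OrderCreated", "orderId": order_id,
             "consumerId": consumer_id})

def on_order_created(event):
    """Query-side event handler: keep the view synchronized."""
    if event["type"] == "OrderCreated":
        order_history_view.setdefault(event["consumerId"], []).append(
            event["orderId"])

subscribers.append(on_order_created)

def find_order_history(consumer_id):
    """Query side: read-only API served from the materialized view."""
    return order_history_view.get(consumer_id, [])

create_order(1, consumer_id=7)
create_order(2, consumer_id=7)
print(find_order_history(7))  # [1, 2]
```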
These services expose APIs for query operations only. The databases used are kept synchronized because services are subscribed to events from other services. These services are mainly used to implement:
- Standalone Queries: (FTGO) the Available Restaurant Service maintains data about available restaurants - exposed with the `findAvailableRestaurants()` query - and it is subscribed to the Restaurant Service
- Cross-Service Views: (FTGO) subscribes to events from Order, Delivery and Accounting Services and keeps a database for queries such as `findOrderHistory()` ![[Pasted image 20260206104513.png|400]]
A better representation can be: ![[Pasted image 20260206110000.png|400]]
- View Database stores data in an optimized way/structure
- The database technology must be suitable for the type of data to store and the type of queries to perform
- SQL vs NoSQL: complex reporting vs flexible data model and stability
- Data Access handles database logic
- Ensure idempotent updates → handle concurrency
    - Pessimistic Locking mechanism: rows to be updated are locked
    - Optimistic Locking mechanism: allows concurrent access but data is verified before commit
    - If the layer is non-idempotent → risk of corruption and inconsistent state
- Reliable Event Tracking:
    - SQL: record event IDs in a dedicated `PROCESSED_EVENTS` table
    - NoSQL: record event IDs in the updated record itself
- Optimized Event Tracking: use monotonically increasing IDs
- Event Handlers subscribed to other services’ events and update the View Database, through Data Access layer
- Query API exposes operations to clients
Usual problems due to event-based architecture arise again:
- (large quantity of) events must be stored somewhere, like on [[#S3]], and processed, like by Spark
- large quantity of events must be processed → periodically take snapshots
A client querying immediately after a command might not see its update due to message delays - the query request arrives at the server before the command has finished processing.
Event tokens can be used as a solution, where everything is based on eventId and max(eventId):
- the command-side returns an event ID token to the client
- the query-side makes sure if the view has processed the event
- returns query result if updated
- else returns error
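A sketch of the token mechanism, with counters standing in for the event store and the view's processed-event tracking:

```python
last_event_id = 0   # command side: IDs of published events
max_processed = 0   # query side: highest event ID applied to the view

def command():
    """Command side: perform the write and return an event ID token."""
    global last_event_id
    last_event_id += 1
    return last_event_id

def query(token):
    """Query side: answer only if the view has processed the event."""
    if max_processed < token:
        return "error: view not yet up to date"
    return "result"

token = command()
print(query(token))    # error: view not yet up to date
max_processed = token  # the view catches up (event gets processed)
print(query(token))    # result
```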
The Aggregate Only scenario does not use event stacking, so just one latest updated aggregate exists, with no history stored. Can be faster, but reconstruction of previous states is not possible. Each SQL row is stateful.
The Event-Sourcing scenario tracks every - immutable - event that is made on the starting aggregate and a snapshot is taken every N events. The approach is more complex, but allows some useful history-wise operations on data.
Often happens that a single query involves multiple aggregates. Components needed are:
- Gateway that wraps the API called externally in a Reactive Streams where multiple services’ operations are invoked
- Services are called in an async fashion, with normalized fallback for non-critical services: values are left null
- Per-Service Circuit Breaker Layer to handle failures
FTGO:
- Client calls API GET `/orders/{id}`
- Gateway wraps the workflow in a Reactive Stream that will emit a result
- Gateway spawns 4 async tasks and collects the results
- Gateway emits the final result
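The fallback behavior for non-critical services can be sketched with `asyncio` (service stubs and names are illustrative): a failed delivery lookup degrades to `None` instead of failing the whole composition.

```python
import asyncio

async def order_service(order_id):
    return {"orderId": order_id, "status": "CREATED"}

async def delivery_service(order_id):
    # Simulated outage of a non-critical service
    raise ConnectionError("delivery service unavailable")

async def call_with_fallback(coro):
    """Normalized fallback: a failed non-critical call yields None."""
    try:
        return await coro
    except ConnectionError:
        return None

async def get_order_details(order_id):
    # Gateway-side composition: services invoked asynchronously
    order, delivery = await asyncio.gather(
        call_with_fallback(order_service(order_id)),
        call_with_fallback(delivery_service(order_id)))
    return {"order": order, "deliveryInfo": delivery}

result = asyncio.run(get_order_details(1))
print(result)  # order filled, deliveryInfo left None by the fallback
```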
Pros:
- The gateway becomes an orchestrator and can evolve to support more services/endpoints
- Async composition solves the domain problem, so components are more decoupled
Recalling the FTGO example, multiple types of client can be found:
- Inside the Firewall: clients that can access high-performance LAN
- Backend service
- Outside the Firewall: other clients that must use a lower-performance network
- Browser
- Mobile
- 3rd-party applications
![[Pasted image 20260206121715.png|400]] The naive approach is the one in figure above, where each client independently talks to the backend.
Cons:
- multiple independent requests → no caching
- developer overhead
- lack of encapsulation
- outside the firewall clients might not be able to use the same protocols used inside the firewall
![[Pasted image 20260206122631.png|400]] An additional component is used: the API Gateway. It acts as the single entry point into a micro-service based application.
Some features are:
- Request Routing: API requests are processed by the proper service
- API Composition (see [[#API Composition Pattern]]): combines responses from multiple services
- Protocol Translation: translates between external-friendly protocols and internal-friendly ones, like REST ←→ gRPC
- Security and performance: Authentication, Authorization, Rate Limiting
- Optimization and monitoring: Caching, Auditing
![[Pasted image 20260206123649.png|400]] API Gateway can be broken into:
- API Layer: contains independent modules for each specific type of client supported
- Common Layer: contains independent shared functions (such as authentication, rate limiting)
API Gateway can work with:
- Synchronous I/O: each network connection is handled by a single thread
- Asynchronous I/O: a single thread processes all network connections
- Centralized Ownership: the whole API gateway is managed by a dedicated team
- Decentralized Ownership:
- the common layer only of the gateway is managed by a dedicated team
- the API layer is managed by a dedicated team per type of client, which handles the client as well ![[Pasted image 20260206123945.png|400]]
The entire gateway module is often owned by a dedicated API Gateway Team that implements APIs requested by other teams that develop client-specific modules.
![[Pasted image 20260206124448.png|400]] This structure defines module ownership even more clearly, with enhanced scalability and reduced complexity. Each team is responsible for its client and the respective API gateway.
Of course, the API Gateway might become the main bottleneck of the application. Other common API Gateway challenges are:
- Sequential calls can lead to high latency → invoke services in parallel
- (Some) Requests might fail → [[#Circuit Breaker Pattern]]
- API Gateway must identify services dynamically
- AWS API Gateway: routing, authentication
- AWS Application Load Balancer: just routing
- Kong
- Traefik
The purpose is to map the server-side data as a graph of objects with fields and relationships. The client is responsible for retrieving exactly the data it requests from the gateway, which avoids overfetching. The standard used is GraphQL, created by Facebook; Apollo GraphQL is a popular platform built on top of it.
![[Pasted image 20260206131452.png|400]] The components are:
- GraphQL Schema (left white square): defines the server-side data model and the queries supported
- Object Types: entities (FTGO: Consumer, Order, Restaurant, DeliveryInfo)
- Fields
- Enums
- Resolver Functions (right white square): maps the schema to the backend services
- [[#API Composition Pattern]] is supported, by combining data from multiple sources
- works with this set of parameters:
- Object: parent (or root) object for top-level queries
- Query Arguments
- Context such as user info, or global data in general
- Proxy Classes: handle interactions with the (FTGO) application
The Query Language used lets the client have the full control over the data returned by the query:
- it is better to select just interested fields, to make the query more efficient
- it is possible to retrieve related data from a single request, so API calls are minimized
- FTGO: fetch a consumer, its orders, etc.
Example: query
```graphql
query {
  consumer(consumerId: 1) {
    id
    firstName
    lastName
    orders {
      orderId
      restaurant {
        id
        name
      }
      deliveryInfo {
        estimatedDeliveryTime
        status
      }
    }
  }
}
```
Example: query with aliases
```graphql
query {
  c1: consumer(consumerId: 1) { id, firstName, lastName }
  c2: consumer(consumerId: 2) { id, firstName, lastName }
}
```
Example: order resolver
Given the Order Service, a resolver that defines multiple queries can be created as follows.
```javascript
const resolvers = {
  Query: {
    orders: resolveOrders, // Fetch orders by consumerId
    order: resolveOrder, // Fetch a single order
    consumer: resolveConsumer // Fetch a consumer
  },
  Order: {
    restaurant: resolveOrderRestaurant, // Resolve nested restaurant
    deliveryInfo: resolveOrderDeliveryInfo // Resolve nested delivery info
  },
  Consumer: {
    orders: resolveConsumerOrders // Resolve consumer.orders
  }
};
```
![[Pasted image 20260206155757.png|600]]
As we can see from the picture above, execution starts from the top resolver, resolveConsumer(), then moves to the nested ones (here just resolveConsumerOrders()) and finally to the lowest level (resolveOrderRestaurant() and resolveOrderDeliveryInfo()).
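The top-down resolver execution above can be sketched in plain Python. This is a toy illustration only: the in-memory data stubs and resolver bodies are hypothetical, and no real GraphQL library is involved.

```python
# Toy sketch of GraphQL-style top-down resolution.
# Data stubs and resolver names are hypothetical, mirroring the example above.

CONSUMERS = {1: {"id": 1, "firstName": "Ada", "lastName": "Lovelace"}}
ORDERS = {1: [{"orderId": 42, "restaurantId": 10}]}
RESTAURANTS = {10: {"id": 10, "name": "Ajanta"}}

def resolve_order(order):
    # Nested resolver: runs only after its parent (the consumer) is resolved
    return {"orderId": order["orderId"],
            "restaurant": RESTAURANTS[order["restaurantId"]]}

def resolve_consumer(consumer_id):
    # Top-level resolver: fetch the parent object, then resolve nested fields
    consumer = dict(CONSUMERS[consumer_id])
    consumer["orders"] = [resolve_order(o) for o in ORDERS.get(consumer_id, [])]
    return consumer

result = resolve_consumer(1)
```

The nesting of the calls reproduces the execution order shown in the picture: parent first, then children.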
A mutation is a write operation, so it can be a create, update or delete. It still returns structured data.
Example: mutation
```graphql
type Mutation {
  createOrder(
    consumerId: Int!
    restaurantId: Int!
    items: [OrderItemInput!]!
  ): Order!
}

input OrderItemInput {
  sku: String!
  qty: Int!
}
```
```graphql
mutation {
  createOrder(
    consumerId: 1
    restaurantId: 10
    items: [
      {sku: "PIZZA_MARGHERITA", qty: 1}
      {sku: "COLA_CAN", qty: 2}
    ]
  ) {
    orderId
    status
    items { sku qty }
    kitchen { status }
    deliveryInfo { status estimatedDeliveryTime }
    payment { status amount }
  }
}
```
When createOrder is called:
- the resolver performs the HTTP POST to the Order Service
- the resolver returns the created object into the GraphQL pipeline so it can resolve additional info (Order.restaurant, Order.deliveryInfo, Order.payment)
Two main tools are used:
- Strawberry: has a code-first approach for Python. The schema is generated from Python classes, and it automatically creates the GraphQL schema and explorer. Example
```python
import strawberry
from typing import List

@strawberry.type
class Order:
    orderId: str
    status: str
    items: List[OrderItem]
```
- Ariadne: has a schema-first approach for Python. Example
```graphql
type Order {
  orderId: ID
  status: String
  items: [OrderItem]
  kitchen: KitchenTicket
  delivery: Delivery
  payment: Payment
}
```
Of course, they have different implementations but produce the same kind of API, so they are largely interchangeable.
The main characteristic of the big data approach is managing data at a scale that cannot be processed by just a few nodes. A big data infrastructure must provide
- horizontal scaling based on the workload requirements:
- from a Short-term high demand scenario where a lot of nodes perform heavy computation
- to a Long-term steady processing scenario where few nodes perform light computation.
- compatibility for on-site, cloud-based or hybrid use, and is tailored for the specific need.
- flexibility: it must accept structured, semi-structured and unstructured data.
- security by providing encryption, access control and auth
- a storage layer
- scalable horizontally to increase capacity and speed
- with disaster recovery
- data clean-up policies
- a resource management system
- with task prioritization mechanisms
- dashboard
- cost tracking
- metadata discovery using a central repository
- that automates datasources to crawl
- that allows searchability and modification
- reporting with rich visualizations
- monitoring of the system itself
- testing of the system
- lifecycle management of the data and the system through phases like planning, design, development, maintenance, deprecation
Two decades of evolution had a meaningful impact on networking and data storage. Definitions of Big Data have also evolved with time:
- 1997 - Big Data used for remote and satellite sensor collection
- 1998 - Growth in storage needs and distributed systems
- 2004 - [[#MapReduce]] introduction
It is a new approach that simplifies the processing of large datasets with:
- Map Function: generates intermediary key-value pairs
- Reduce Function: merges values for output key-value pairs
The idea is basically to allow big data processing on a computing cluster with a bunch of computing nodes. Every node performs a part of the total computation, in order to minimize the amount of data transferred between nodes.
Example: word count
Map phase: iterate through the document content and emit key-value pairs of the form word→count. Reduce phase: group and sum the values of identical keys, returning key-value pairs of the form word→total_count. ![[Pasted image 20260206181049.png|600]] The phases in the image above are:
- Splitting: data is broken into M pieces
- Mapping: pull data locally (on a node) and run the map function that will emit intermediary key-value results
- Partitioning: result is split into R partitions so each partition can be assigned to the proper reducer
- Shuffling and Sorting
- Reducing: intermediary results are processed and final key-value pairs are returned
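The phases above can be sketched in plain Python as a single-process toy (no cluster, no real framework), keeping the same map → partition → reduce structure:

```python
# Toy word count following the MapReduce phases above (single process, no cluster).
from collections import defaultdict

def map_phase(document):
    # Mapping: emit an intermediary (word, 1) pair for each word
    return [(word, 1) for word in document.split()]

def partition(pairs, num_reducers):
    # Partitioning: route each key to a reducer via hashing;
    # shuffling/sorting is implicit in the per-key grouping here
    parts = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        parts[hash(key) % num_reducers][key].append(value)
    return parts

def reduce_phase(grouped):
    # Reducing: sum the values of identical keys
    return {key: sum(values) for key, values in grouped.items()}

pairs = map_phase("to be or not to be")
counts = {}
for part in partition(pairs, num_reducers=2):
    counts.update(reduce_phase(part))
```

Since every key lands in exactly one partition, the reducers never see overlapping keys, which is what lets them run independently on different nodes.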
All the nodes are managed by a master node that assigns tasks and monitors the system. Tasks are idempotent, so if a node fails → its task is reassigned to the same node or to another one. If the master fails → computation is restored from the last snapshot.
![[Pasted image 20260207100959.png|400]]
- Ingestion: collect data from various sources like IoT devices, web apps, logs, etc.
- Extraction: data is collected from the sources
- Transformation: data gets cleaned
- Loading: data is loaded into target system
- Storage: data lakes and warehouses store structured and unstructured data
- Two approaches are used:
- Distributed Key-Value Stores → high availability
- Columnar Database → high query performance
- Eventual Consistency → faster writes
- Computation: data is processed with clustering, regression, pattern recognition algorithms in online or offline fashion
- Presentation: results are presented to stakeholders through dashboards or exposed via APIs
Example: Microservices x MapReduce #TODO
On-Premise means that the organization deploys the entire Big Data platform on an internal data center. [[#Hadoop]] is generally a good choice to keep costs under control.
Costs:
- High starting costs for hardware and maintenance. Old hardware can be considered to lower initial investment.
- Potentially lower long-term costs
Pros:
- compliance to regulations
- backups and history managed
- providers offer different prices w.r.t. the data access frequency needed
Cons:
- pay for data retrieval → backup outputs
It combines [[#Big Data In the Cloud]] cost-efficiency and [[#Big Data On-Premise]] compliance to regulations. Still, [[#Big Data Object Storage]] is used.
- Store rarely accessed data off-premise (in the cloud)
- Use the cloud for heavy ETL (Extract, Transform, Load)
- SLA (Service Level Agreement) guarantees by the provider
Used as:
- primary data warehouse
- shared data marts
- long-term storage for non critical/sensitive data
- Querying large amount of data with complex ad-hoc SQL queries
- Reporting: create interactive dashboards for business insight or third-party sharing
- Alerting: detect anomalies using thresholds or ML
Advanced techniques can be:
- Mining using clustering, classification, collaborative filtering for recommender systems
- Modeling: develop and train ML models
- Impact: use results for decision-making systems
There are various patterns available, one for each business case. They can also work together: data lakes usually feed data warehouses and marts after data processing.
It can store all types of data: structured, semi-structured and unstructured. It uses schema-on-read approach: data schema is applied only when accessing data, while data is stored raw.
Used:
- as a sandbox
- for ML and exploratory analysis
It uses schema-on-write approach: data schema is applied each time data is written. Data is aggregated, so a single source of truth is created for the organization. Usually, data comes from [[#Data Lake]]s.
Used for business intelligence dashboards, decision making-oriented. Can be easily integrated with [[#Big Data Object Storage]].
Pros:
- supports standard SQL for querying massive data storage
Cons:
- cannot scale as easily as [[#Data Lake]]
- requires metadata and documentation
Data is stored following the Column-Oriented Storage Structure:
Names: Mary, Paulina, John, Suzanne, GZ
Ages: 21, 43, 45, 11, 15
Pros:
- more efficient for queries on a specific column
- more space efficient by grouping similar data types together
Cons:
- slower updates
![[Pasted image 20260210125055.png|400]]
- Leader Node coordinates queries and data distribution
- Compute Nodes store data and execute queries
Data is distributed across the nodes and identified by keys.
![[Pasted image 20260210125130.png|400]] Distributed queries can be executed using the tree architecture where
- Leaf Nodes fetch and process data
- Intermediate Nodes aggregate results
The system can auto-scale if the query requires it, to speed up query processing.
![[Pasted image 20260210125153.png|400]]
- Storage Layer
- uses [[#Big Data Object Storage]] approach for everything, from table data to intermediary results and outputs.
- Compute Layer
- built with virtual warehouses, so resource clusters, that can be dynamically created, resized, destroyed
- external clients interact with abstract clusters, not with concrete nodes underneath
- Service Layer allows
- authentication and access control
- query optimization
- transaction management
Pros:
- this approach allows the application of [[#Separation of Concerns (SoC)]] principle
They are [[#Data Warehouse]]s dedicated to specific business areas, such as sales, customers, partners, etc.
With this approach, data is stored as objects where:
- data: raw content in any format (structured or not)
- metadata: timestamps, ownership, access control, etc.
- unique identifier: used to access the data
Using unique identifiers allows easier data sharing.
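A minimal in-memory sketch of these object-storage semantics (the `ObjectStore` API here is hypothetical, for illustration only): each object bundles raw data, metadata, and a unique identifier used for access and sharing.

```python
# In-memory sketch of object-storage semantics: data + metadata + unique id.
# The ObjectStore API is hypothetical, not a real storage client.
import time
import uuid

class ObjectStore:
    def __init__(self):
        self._objects = {}

    def put(self, data, owner):
        object_id = str(uuid.uuid4())  # unique identifier
        self._objects[object_id] = {
            "data": data,  # raw content, in any format (structured or not)
            "metadata": {"timestamp": time.time(), "owner": owner},
        }
        return object_id  # sharing this id is enough to share the data

    def get(self, object_id):
        return self._objects[object_id]

store = ObjectStore()
oid = store.put(b"raw bytes", owner="analytics-team")
obj = store.get(oid)
```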
It’s a type of - usually bulk - data processing where user intervention is not needed and there are no strict time boundaries: delays of hours or days are often acceptable. Task scheduling is required.
Steps in general are:
- clean
- transform
- aggregate
- consolidate
For further details, see [[#Big Data Offline Processing]].
Approach used to process, in real time, data coming from e.g. IoT devices, social media, online transactions, etc. Cleaning, filtering and categorizing steps are applied to the data without storing it, so low latency and agility are ensured.
Used for anomaly detection or real-time analysis.
Data-source applications just send a continuous stream of data in fixed-size blocks. The Hadoop client buffers data locally until a data chunk can be created. Of course, data must be aggregated in small or big chunks before processing:
- Time Window: data in a specific time range is aggregated
- Count Window: a specific amount of data is aggregated
Then data is written (actually, it is sent to the NameNode, see [[#Hadoop Distributed File System (HDFS)]]).
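The count-window aggregation above can be sketched as follows (a toy generator, no Hadoop involved): records are buffered locally and a chunk is emitted once a fixed number of records has accumulated.

```python
# Sketch of count-window aggregation: buffer incoming records locally and
# emit a chunk once a fixed number of records has accumulated.
def count_windows(stream, window_size):
    buffer = []
    for record in stream:
        buffer.append(record)
        if len(buffer) == window_size:
            yield list(buffer)  # chunk ready to be written/sent
            buffer.clear()
    # leftover records stay buffered until the window fills (not emitted here)

chunks = list(count_windows(range(7), window_size=3))
```

A time window would work the same way, but the emission condition would compare timestamps instead of counting records.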
In order to extract data from the source, Message Brokers are used, then data is processed by Stream Processing Engines.
It is a tool created in 2006, inspired by Google’s [[#MapReduce]] (2004). It is used to process large datasets on commodity hardware, moving computation closer to the data source.
In the first version, Hadoop architecture is a tightly coupled stack:
- Storage Layer is powered by the [[#Hadoop Distributed File System (HDFS)]]
- Compute Layer is based on the [[#MapReduce]] approach, to allow data locality
It is optimized for large-scale systems with streaming data access. It scales efficiently even with commodity hardware. Access model is write-once, read-many.
HDFS 1.0 had the following architecture, with:
- NameNode (Master)
- manages file system namespace and metadata
- maintains block mapping and DataNode location
- DataNodes (Slaves)
- store actual data blocks, with redundancy (see [[#Hadoop Distributed File System (HDFS)]])
- serve read/write requests to/from NameNode
- report their status to NameNode
- reduce costs with Just a Bunch Of Disks (JBOD): storage architecture that combines multiple drives into a single logical unit, without data redundancy ![[Pasted image 20260209103033.png|400]]
Read algorithm
- client → NameNode to gather metadata
- NameNode → client with the locations of the file blocks scattered across DataNodes
- client → DataNode opens a connection with the specific node
- DataNode → client data streaming, without involving NameNode
Write algorithm
- client → NameNode requests file creation
- NameNode selects DataNode targets for data block allocation
- client → first DataNode in the pipeline
- first DataNode → other DataNodes in the pipeline, to allow data replication (see [[#HDFS Data Replication]])
- NameNode → client returns metadata
Metadata is conceptually more valuable than actual data: data blocks are meaningless without structure given by metadata. Metadata is used to:
- represent the control plane of the file system
- data represents the content plane
Metadata must already exist before creating/reading content.
If data is unavailable → content is lost
If metadata is unavailable → meaning is lost
Data replication is implemented such that:
- files → split into fixed-size blocks
- each block → multiple replicas on different DataNodes, as many as the replication factor value says.
Of course, replicas are placed across different nodes and racks so that:
- system is resilient to rack-level failures
- network efficiency is increased: when a client requests data, the closest node responds
The NameNode decides the replica placement and continuously updates it.
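A toy sketch of rack-aware replica placement: spread replicas across distinct racks first, so that a single rack failure cannot take out every copy of a block. The placement policy here is deliberately simplified; real HDFS uses a more nuanced strategy (e.g. favoring the writer's local node).

```python
# Toy rack-aware placement: one replica per distinct rack first, then reuse
# racks only if there are fewer racks than replicas. Simplified vs real HDFS.
def place_replicas(nodes_by_rack, replication_factor):
    # nodes_by_rack: {rack_name: [node, ...]} -- hypothetical cluster layout
    placements = []
    # first pass: one replica per distinct rack
    for rack, nodes in nodes_by_rack.items():
        if len(placements) == replication_factor:
            break
        placements.append(nodes[0])
    # second pass: reuse racks if there are fewer racks than replicas
    for rack, nodes in nodes_by_rack.items():
        for node in nodes[1:]:
            if len(placements) == replication_factor:
                break
            placements.append(node)
    return placements

cluster = {"rack1": ["n1", "n2"], "rack2": ["n3"], "rack3": ["n4"]}
replicas = place_replicas(cluster, replication_factor=3)
```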
DataNodes periodically send heartbeats and self-reports. If a DataNode becomes unavailable →
- DataNode is marked as failed and the data blocks that were hosted on it are considered missing
- replication factor must be restored, so a new replica is created on a different DataNode
Data blocks are protected using checksums.
If the NameNode becomes unavailable → the entire system becomes unavailable. High availability is not supported in HDFS 1.0.
![[Pasted image 20260209122915.png|350]] The JobTracker is just the module inside the NameNode that assigns tasks to TaskTrackers, modules inside the DataNodes that execute MapReduce tasks. In the Hadoop 1.0 architecture there is just one JobTracker, making it a potential bottleneck.
The newer version of Hadoop introduces a clear separation between storage and resource management. There is no longer a single master → metadata and job execution support higher availability. There is just one Active NameNode that serves requests, while Standby NameNodes keep metadata synchronized; if the Active one fails, a Standby one is promoted. Moreover, the new version keeps track of metadata changes on shared edit logs. ![[Pasted image 20260209123357.png|500]]
- Storage is managed by [[#Hadoop Distributed File System (HDFS)]] version 2, with support for Active and Standby NameNodes → high availability.
- Resources are managed by [[#Yet Another Resource Negotiator (YARN)]]
In general, version 2.0 is:
- more available
- more scalable
- fault tolerant at both storage and resource levels
- multi-framework support, beyond MapReduce
![[Pasted image 20260209124302.png|400]] The YARN container represents a set of bounded resources - CPU and memory - and is the fundamental unit of resource allocation. However, a YARN container is NOT an OS-level or Docker container (see [[#Containers]]); it is just a logical abstraction and corresponds to a process of the system.
As can be seen above, each node can run multiple containers that are relative to a specific application. The application in YARN is a set of jobs, requested by a client.
The main components are:
- ResourceManager (one per system): it is the global master of the whole system
- receives jobs from clients
- allocates resources for applications based on
- priority
- resource locality
- resource requirements
- number of containers needed
- monitors health and capacity of the system
- DOES NOT execute tasks
- ApplicationMaster (one per application)
- negotiates resources with the ResourceManager
- runs inside a container of a node
- monitors the execution of the application
- NodeManager(s) (one per cluster node)
- responsible for launching, monitoring, terminating containers
- each container can execute just one mapper or reducer task
![[Pasted image 20260210122703.png]]
In general, tasks are executed following the Relaxed Locality approach, where the data necessary for the computation can be fetched from different nodes; if the first node is busy, another can respond.
In an empty directory, just create:
- `docker-compose.yaml` to define the services needed to run Hadoop:
	- `namenode`: metadata storage
	- `datanode` x2: data block storage
	- `resourcemanager`
- a `config` file with env variables:
	- HDFS config
	- MapReduce config
	- YARN config
```yaml
services:
  namenode:
    image: apache/hadoop:3
    hostname: namenode
    ports:
      - 9870:9870
    env_file:
      - ./config
    environment:
      ENSURE_NAMENODE_DIR: "/tmp/hadoop-root/dfs/name"
    command: ["hdfs", "namenode"]
  datanode_1:
    image: apache/hadoop:3
    command: ["hdfs", "datanode"]
    env_file:
      - ./config
  datanode_2:
    image: apache/hadoop:3
    command: ["hdfs", "datanode"]
    env_file:
      - ./config
  resourcemanager:
    image: apache/hadoop:3
    hostname: resourcemanager
    command: ["yarn", "resourcemanager"]
    ports:
      - 8088:8088
    env_file:
      - ./config
  nodemanager:
    image: apache/hadoop:3
    command: ["yarn", "nodemanager"]
    env_file:
      - ./config
```
Example: MapReduce for π estimation
```shell
docker compose exec -ti namenode /bin/bash
yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-HADOOP_VERSION.jar pi 10 15
```
The command above will generate a total of $10 \cdot 15 = 150$ points, where:
- $10$ is the number of mappers
- $15$ is the number of random points generated by each mapper
Of course, the more the points, the more accurate the estimation.
Mapper:
- generates random points
- assigns 1 for points inside and 0 for points outside the circle
- assigns the same key to all points

Output:

| Key | Value |
| --- | --- |
| 1 | 1 (inside the circle) |
| 1 | 0 (outside the circle) |
| 1 | 1 (inside the circle) |
| 1 | 0 (outside the circle) |
| ... | ... |
Reducer: aggregates all mapper outputs
- calculates the ratio of the points under the same key (so all of them)
- estimates $\pi$

Output:

| Key | Values |
| --- | --- |
| 1 | [1, 0, 1, 0, ...] |
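The mapper/reducer split above can be sketched in plain Python as a single-process toy (the seeds and point counts are arbitrary; a quarter circle of radius 1 is sampled, so the inside ratio is scaled by 4):

```python
# Monte Carlo pi estimation mirroring the mapper/reducer split above.
import random

def mapper(num_points, seed):
    # emit (key=1, value) pairs: value is 1 if the point falls inside
    # the quarter circle of radius 1, else 0
    rng = random.Random(seed)
    pairs = []
    for _ in range(num_points):
        x, y = rng.random(), rng.random()
        pairs.append((1, 1 if x * x + y * y <= 1 else 0))
    return pairs

def reducer(all_pairs):
    # all pairs share the same key: take the inside-ratio, scaled to estimate pi
    values = [v for _, v in all_pairs]
    return 4 * sum(values) / len(values)

# 10 mappers x 15 points each = 150 points, matching the yarn command above
pairs = [p for m in range(10) for p in mapper(15, seed=m)]
pi_estimate = reducer(pairs)
```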
$$ dS = c \cdot r \cdot a \cdot \frac{(1+g)^y}{(1-i)(1-b)} $$
where:
- $dS$: total storage needed [TB or PB]
- $c \in [0,1]$: compression ratio
- $r \in \mathbb N$: replication factor (default 3)
- $a \in \mathbb R$: initial data size
- $g \in [0,1]$: yearly growth rate
- $y \in \mathbb N$: planning period (in years)
- $i \in [0,1]$: intermediate storage for temporary data
- $b \in [0,1]$: buffer for unexpected spikes or peak loads
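The formula can be wrapped in a small Python helper; the parameter values below are hypothetical, chosen only to show the arithmetic.

```python
# Sketch of the storage-sizing formula above; all parameter values hypothetical.
def total_storage(c, r, a, g, y, i, b):
    # dS = c * r * a * (1 + g)^y / ((1 - i) * (1 - b))
    return c * r * a * (1 + g) ** y / ((1 - i) * (1 - b))

# e.g. 100 TB raw, 3x replication, 50% compression, 20% yearly growth over
# 3 years, 25% intermediate storage, 20% buffer
ds = total_storage(c=0.5, r=3, a=100, g=0.2, y=3, i=0.25, b=0.2)
```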
$$ dN = \frac{c \cdot r \cdot a}{d \cdot n \cdot (1-i)(1-b)} $$
where:
- $dN$: total number of DataNodes needed
- $c \in [0,1]$: compression ratio
- $r \in \mathbb N$: replication factor (default 3)
- $a \in \mathbb R$: total raw data size, before hardware considerations
- $d \in \mathbb R$: disk size per node
- $n \in \mathbb N$: number of disks per node
- $i \in [0,1]$: intermediate storage for temporary data
- $b \in [0,1]$: buffer for unexpected spikes or peak loads
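Similarly, the DataNode formula as a Python helper (hypothetical values; the result is rounded up, since nodes are discrete):

```python
# Sketch of the DataNode-sizing formula above; parameter values hypothetical.
import math

def datanodes_needed(c, r, a, d, n, i, b):
    # dN = c * r * a / (d * n * (1 - i) * (1 - b)), rounded up to whole nodes
    return math.ceil(c * r * a / (d * n * (1 - i) * (1 - b)))

# e.g. 500 TB raw, 3x replication, 50% compression, 12 disks of 4 TB per node,
# 25% intermediate storage, 20% buffer
dn = datanodes_needed(c=0.5, r=3, a=500, d=4, n=12, i=0.25, b=0.2)
```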
Hive is a SQL-based data warehouse layer that sits on top of [[#Hadoop Distributed File System (HDFS)]]: the user writes queries in HiveQL and the output is a query enabled for distributed execution (MapReduce/Apache Tez/Spark). Basically, it allows the user to query HDFS-based storage where:
- Query → Hive
- Resources → YARN
- Storage level → Hadoop (so HDFS)
Features:
- Hive provides the user a logical schema over all the stored files, so queries are easier to create
- works with managed and external tables
- supports various drivers to work with the Hadoop ecosystem
![[Pasted image 20260210183904.png|400]]
- Hive Clients
- CLI
- Thrift server for cross language communication
- JDBC/ODBC drivers
- Hive Driver
- Compiler: queries → parsed tree, semantic analysis
- Execution Engine: logical plans → execution-ready workflows
- Metastore: stores tables, partitions, database information. Ensures a relation between data to be queried and actual directories
- File-based storage → basic use
- DB-based storage → scalable
Main components are:
- Tables: correspond to a directory in HDFS with one or more files. Hive does not modify the data; it just stores schema and metadata.
- Partitions: correspond to subdirectories in HDFS. Filtering on partition columns → skipping subdirectories
- Buckets: each bucket corresponds to a physical file inside the partition and has an assigned hash. Bucketing is performed at load time.
![[Pasted image 20260210182658.png|400]]
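Bucketing at load time boils down to hashing the bucketing column modulo the bucket count, so every row lands in exactly one physical file. The hash below is a toy stand-in for illustration; Hive's actual hash function differs.

```python
# Sketch of bucketing at load time: each row is routed to a bucket (a physical
# file inside its partition) by hashing the bucketing column.
# The character-sum hash is a toy stand-in, NOT Hive's real hash function.
def bucket_for(key, num_buckets):
    return sum(ord(ch) for ch in str(key)) % num_buckets

rows = ["user_a", "user_b", "user_c"]
assignment = {row: bucket_for(row, num_buckets=4) for row in rows}
```

Because the mapping is deterministic, queries that filter or join on the bucketed column can read only the matching files instead of the whole partition.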
Both row and columnar formats are supported:
- Row-Based Format ← choose for frequent writes, real-time updates
	- TEXTFILE: plain-text, human readable
		- inefficient
	- SEQUENCEFILE: binary with compression
		- suitable for ETL workflows
	- AVRO: supports schema evolution
		- balanced
- Columnar-Based Format ← choose for large-scale analytics, compression
	- ORC (Optimized Row Columnar): high compression
		- fast for analytical queries
	- PARQUET: schema-based
		- efficient for nested data
- Input data is stored as CSV, on HDFS
- Tables defined using schema-on-read
- Execution Engine set to MapReduce
- HiveQL query is submitted via JDBC client
- Hive Driver parses the query → consults Metastore for metadata → generates an execution plan
- Execution plan → one or more MapReduce stages
It is a distributed computing framework for big-data processing. It interfaces directly with various data stores, such as [[#Hadoop Distributed File System (HDFS)]], [[#Apache Hive]], Cassandra, [[#S3]].
- Supports various APIs for Java, Python, etc.
- Integrates with [[#Yet Another Resource Negotiator (YARN)]], Mesos, etc.
It follows a Master-Slave Architecture ![[Pasted image 20260210192334.png|300]]
- Driver:
- manages and distributes tasks execution
- creates executors
- Executors
- provide resources to execute tasks
- report status and results
- Cluster Manager
- Two schedulers available to choose from
- Standalone Scheduler: default scheduler when Hadoop is not installed
- YARN: default scheduler for Hadoop, optimized for MapReduce
- mediates resource allocation
- manages the state of physical resources
- Two schedulers available to choose from
- Spark UI (not in picture): provides a web interface to monitor execution
- Local Mode: runs both driver and executors on a single JVM on the client. Used for quick development
- Client Mode: runs the driver on a single JVM and the executors on the cluster. Used when local data configuration is needed
- Cluster Mode: runs both driver and executors on the cluster. Used in production
- `spark-sql`: write queries for structured datasets
- `spark-submit`: submit applications written in Java, Scala, Python, R
Examples
- Local Mode:
```shell
./bin/spark-submit --master local[8] /examples.jar 100
```
- Standalone Cluster:
```shell
./bin/spark-submit --master spark://host:7077 --executor-memory 20G /examples.jar
```
- YARN Cluster Mode:
```shell
./bin/spark-submit --master yarn --deploy-mode cluster --num-executors 50 /examples.jar
```
It is the core execution engine of Spark, responsible for connecting to the cluster and coordinating resources. It is owned by the Driver and manages communication between the Driver and the Cluster (see [[#Apache Spark Architecture]]).
It is the main entry point for a Spark application and provides a unified interface to interact with Spark. Internally, manages a [[#SparkContext]] that can be accessed by the SparkSession itself. Applications should create a SparkSession instead of creating SparkContext directly.
It is the Spark’s core data abstraction: a read-only collection of objects that is broken in partitions to enable parallel processing by independent task.
An RDD describes how its data is derived, not where it is stored.
- Lineage information: points to parent RDD and tracks all the transformation made on it
- Computation logic: function to compute each partition from its parents
- Partition metadata: number of partitions and their indices
- Preferred location: hints about data locality used to schedule tasks
Some Spark terminology:
- A new collection is created every time a transformation is applied.
- Transformations are lazy and define just how to process data.
- A job is created every time an action (code function) is invoked and is the logical unit of work
- Running a job means executing a Directed Acyclic Graph (DAG): it represents the logical execution plan.
- DAG components
- nodes are the transformations
- narrow transformations → each output partition depends on at most one input partition, so data exchange across the cluster is NOT required ![[Pasted image 20260211125051.png|500]] In the scenario in the image, both A and B options are equivalent
- wide transformations → each output partition depends on multiple input partitions, so data exchange across the cluster is required ![[Pasted image 20260211125931.png|500]]
- edges are the dependencies
- the DAG is broken into smaller execution units, called stages. A stage is a set of tasks (see below) that can be executed in parallel, across the cluster, without shuffling data → stage boundaries are defined by wide transformations (see above)
- before actually executing actions and producing results, Spark first lazily loads all the transformations, to guarantee an optimized execution
If a partition gets lost → it is recomputed using lineage: the history of transformations.
Features:
- Resilient because of the lineage mechanism (the recorded history of transformations)
- Immutable: transformations always create a new RDD, without ever changing the original one
- Partitioned: data is split into logical partitions
Cons:
- code can be difficult to understand
- spark cannot further optimize operations inside lambdas
- slower for Python/R w.r.t. Java version
Define a RDD from numbers
```python
numberRDD = spark.sparkContext.parallelize(range(1, 10))
numberRDD.collect() # Output: [1, 2, 3, 4, 5, 6, 7, 8, 9]
```
Transform one RDD with filtering using a lambda function: a small anonymous function with just one expression, often used for short-lived functionality
```python
evenNumberRDD = numberRDD.filter(lambda num: num % 2 == 0)
evenNumberRDD.collect() # Output: [2, 4, 6, 8]
```
Define a RDD from a textfile
```python
logRDD = spark.sparkContext.textFile("/FileStore/tables/sampleFile.log")
```
Transform by filtering INFO and ERROR, then map to key-value where log_level-1 and reduce to key-value where log_level-count
```python
resultRDD = logRDD\
    .filter(lambda line: line.split(" - ")[2] in ["INFO", "ERROR"])\
    .map(lambda line: (line.split(" - ")[2], 1))\
    .reduceByKey(lambda x, y: x + y)
```
Here there are 2 stages: `filter` and `map` are narrow transformations, so they are pipelined into a single stage; `reduceByKey` is a wide transformation that triggers a shuffle, starting a second stage.
Export results
```python
resultRDD.collect() # Output: [('INFO', 2), ('ERROR', 3)]
```
Here the result is a Pair RDD because of the key-value structure.
Spark provides a built in UI for monitoring on default port 4040. It can be handy to monitor, optimize performance and manage Spark resources.
![[Pasted image 20260211122254.png|600]]
- Jobs: displays DAG, stages, tasks
- Stages: detailed information on the stage
- Storage: shows cached RDDs and their partitions
- Environment: Spark config
- Executors: resource usage
- SQL: insights into SQL queries
When calling a function in Spark, it can be a:
- transformation: takes an existing RDD and outputs a new, transformed RDD
	- narrow (see [[#Resilient Distributed Dataset (RDD)]])
	- wide (see [[#Resilient Distributed Dataset (RDD)]])
- action: produces a result or writes data
- `take(N)`: returns the first N elements of the RDD
- `first()`: returns the first element of the RDD
- `top(N)`: returns the first N elements of the sorted RDD
- `saveAsTextFile(dir)`: writes the RDD to text files in `dir`
- `reduce(lambda)`: combines RDD elements to produce an aggregated result, processing the groups with the lambda passed
- `foreach(lambda)`: executes the lambda for each element of the RDD
- `map(lambda)`: applies the lambda to each element of the RDD, returning the RDD of the respective outputs
- `flatMap(lambda)`: applies map (with the lambda passed) and then flattens the result
- `filter(lambda)`: keeps only the elements that satisfy the lambda
- `RDD1.union(RDD2)`: combines two RDDs
- `distinct()`: removes duplicates and returns an RDD with unique elements
- `sortBy(lambda)`: sorts an RDD
- `RDD1.intersection(RDD2)`: finds common elements between RDDs
- `RDD1.subtract(RDD2)`: removes RDD2 elements from RDD1
- `groupByKey()`: just groups elements with the same key (no aggregation)
- `reduceByKey(lambda)`: aggregates values with the same key using the lambda
	- better than `groupByKey()`: uses local aggregation, so data is initially reduced inside its own node and only then shuffled across the cluster
- `sortByKey()`: sorts key-value elements w.r.t. the key
- `RDD1.join(RDD2)`: joins two RDDs on their keys
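The local-aggregation advantage of `reduceByKey` can be sketched in plain Python, with toy lists standing in for the partitions on each node (no Spark involved):

```python
# Pure-Python sketch of why reduceByKey beats groupByKey: values are combined
# locally on each "node" (partition) before any data crosses the network.
from collections import defaultdict

def local_combine(partition):
    # map-side combine: one (key, partial_sum) per key per partition
    combined = defaultdict(int)
    for key, value in partition:
        combined[key] += value
    return list(combined.items())

def reduce_by_key(partitions):
    # only the small, pre-combined pairs are "shuffled" and merged
    merged = defaultdict(int)
    for partition in partitions:
        for key, partial in local_combine(partition):
            merged[key] += partial
    return dict(merged)

partitions = [[("INFO", 1), ("ERROR", 1), ("INFO", 1)],
              [("ERROR", 1), ("ERROR", 1)]]
counts = reduce_by_key(partitions)
```

With `groupByKey`, all five raw pairs would cross the network; here only three combined pairs do.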
Caching enable storing frequently used RDDs in memory or disk to improve performance.
There are multiple caching modes for RDDs:
- `MEMORY_ONLY`: unserialized data in memory
- `MEMORY_ONLY_SER`: serialized data in memory
- `MEMORY_AND_DISK`: unserialized data in memory, spilled to disk when memory is full
- `MEMORY_AND_DISK_SER`: serialized data in memory, spilled to disk when memory is full
- `DISK_ONLY`: data on disk only
- `OFF_HEAP`: serialized RDD stored off-heap
Actions:
- `cache()`: cache the RDD
- `unpersist()`: clear the cached RDD
Partitions define the degree of parallelism in Spark.
- `repartition(N)`: redistributes data into N partitions, shuffling data across the nodes
- `coalesce(N)`: reduces the number of partitions to N, avoiding a full shuffle when possible (unlike `repartition`)
- `partitionBy(N, partitionFunc)`: partitions a pair RDD by key into N partitions, using the given partitioning function
- `createOrReplaceTempView(name)`: registers the DataFrame as a temporary view, so it can be queried with SQL
- `createGlobalTempView(name)`: persists the view across Spark sessions
Used to enable - limited - data sharing across tasks. Two types are available in Spark:
- Broadcast Variables: provide read-only large data sharing across nodes
- cached on each node that uses them
- avoid redundant data transfer
	- `unpersist()` for reuse
	- `destroy()` for permanent deletion
- Accumulators: provide an easy way to aggregate results from various tasks, back to the driver. Supports only additive operations
They are distributed collections of data organized into rows and columns. It is an abstraction on top of RDDs, used for efficient processing of structured data. Dimension can go from KB to PB.
A DataFrame can be generated from:
- files
- storage media
- RDDs
Common actions:
- `printSchema()`: displays the schema of a DataFrame in a tree format
- `select(*cols)`: selects specific columns from a DataFrame
- `filter(condition)`: filters rows w.r.t. a condition
- `groupBy(*cols)`: groups rows by the given columns
Example: load data from file
```python
sales_df = spark.read.option("sep", "\t").option("header", "true") \
    .csv("hdfs:///sample_data/sales-data-sample.csv")
```
where:
- `option("sep", "\t")` specifies the delimiter for a tab-separated file
- `option("header", "true")` specifies that the file contains a header row
Example: load data to Parquet
Parquet is just a columnar storage format that preserves the schema.
```python
# Save as Parquet format
sales_df.write.parquet("sales.parquet")

# Read from Parquet and query
parquetSales = spark.read.parquet("sales.parquet")
parquetSales.createOrReplaceTempView("parquetSales")
ip = spark.sql("SELECT ip FROM parquetSales WHERE id BETWEEN 10 AND 19")
ip.show()
```
Example: working with JSONs
```python
# Save as JSON format
sales_df.write.json("sales.json")

# Read from JSON and query
jsonSales = spark.read.json("sales.json")
jsonSales.createOrReplaceTempView("jsonSales")
ip = spark.sql("SELECT ip FROM jsonSales WHERE id BETWEEN 10 AND 19")
ip.show()
```
Datasets are strongly-typed collections of objects, suitable for relational and functional transformations. Available on Scala and Java only.
Available functions are:
- transformations: `map()`, `flatMap()`, `filter()`, `select()`, `aggregate()`
- actions: `count()`, `show()`, `save()`
Example: load data from CSV into dataset
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder
.appName("Dataset Example")
.getOrCreate()
// Load dataset from a CSV file
val salesDS = spark.read
.option("sep", "\t")
.option("header", "true")
.csv("hdfs:///sample_data/sales-data-sample.csv")
.as[SalesRecord] // Convert DataFrame to Dataset using a case class
// Show the content of the Dataset
salesDS.show()
// Define the case class for strong typing
case class SalesRecord(product: String, quantity: Int, price: Double)
Of course, schema mismatches can occur. To handle them, Spark provides three strategies:
- PERMISSIVE (default): invalid fields are replaced with nulls
- DROPMALFORMED: invalid records (those with invalid fields) are dropped
- FAILFAST: invalid records abort the process
Example: deal with schema mismatches
val salesDS = spark.read
.option("sep", "\t")
.option("header", "true")
.option("mode", "PERMISSIVE") // Set the schema mismatch mode
.csv("hdfs:///sample_data/sales-data-sample.csv")
.as[SalesRecord]
salesDS.show()
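The semantics of the three modes can be sketched in plain Python. This is an illustrative simulation, not Spark's actual implementation; the record layout mirrors the SalesRecord case class above, and the parsing logic is deliberately simplified.

```python
# Illustrative sketch of Spark's three schema-mismatch modes (simplified:
# real PERMISSIVE nulls only the malformed fields, not the whole record).

def parse_records(rows, mode="PERMISSIVE"):
    """Parse (product, quantity, price) string triples, handling bad fields."""
    parsed = []
    for product, quantity, price in rows:
        try:
            record = (product, int(quantity), float(price))
        except ValueError:
            if mode == "PERMISSIVE":       # invalid fields become None
                record = (product, None, None)
            elif mode == "DROPMALFORMED":  # the whole record is dropped
                continue
            elif mode == "FAILFAST":       # abort on the first bad record
                raise
            else:
                raise ValueError(f"unknown mode: {mode}")
        parsed.append(record)
    return parsed

rows = [("pen", "3", "1.5"), ("book", "oops", "9.0")]
```

With the malformed second row, PERMISSIVE keeps it with nulls, DROPMALFORMED silently drops it, and FAILFAST propagates the parse error.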
Create the example.py file:
from pyspark.sql import SparkSession
from pyspark.sql.functions import desc
spark = SparkSession.builder.appName('Sales Application').getOrCreate()
sales_df = spark.read.option("header", "true").csv("/user/data/sales.csv")
result = sales_df.groupBy("COUNTRY_CO
result.show()
Then run the command:
spark-submit \
--master yarn \
--deploy-mode client \
--num-executors 3 \
--executor-memory 2g \
--total-executor-cores 1 \
example.py
LocalStack is a fully functional local cloud environment that emulates AWS services. It is useful for local testing and development.
It emulates a variety of AWS services, can be used with provisioning tools (Terraform, AWS CDK, Pulumi) and runs inside a container.
A platform for developing, shipping, and running apps in isolated containers: an application is packaged together with its dependencies, ensuring it runs consistently across various environments.
- Dockerfile: a script that contains directives on how to build a Docker Image (see [[#Dockerfile]])
- Docker Image: snapshot of an application and its dependencies (see [[#Docker Image]])
- Docker Container: a running instance of a Docker Image
[!example] Manage containers
- docker run
- docker ps
- docker stop
- docker rm
- docker exec -it [container-id] bash
Directives to build images are broken into the following components:
- base image to use
- commands to run at startup
- files to copy
- software to install
- env variables
- ports to expose
Everything is translated to the keywords:
[!example]
- FROM <image>
- WORKDIR <path>
- COPY <host-path> <image-path>
- RUN <cmd>
- ENV <key> <value>
- EXPOSE <port-number>
- USER <username-or-uid>
- CMD ["<cmd>", "<arg1>", ...]
# Use the official Ubuntu 18.04 base image from the ECR Public Gallery
FROM public.ecr.aws/docker/library/ubuntu:18.04
# Install dependencies and Apache
RUN apt-get update && \
apt-get -y install apache2 && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
# Write the Hello World message
RUN echo 'Hello World!' > /var/www/html/index.html
# Configure the startup script for Apache
RUN echo '. /etc/apache2/envvars' > /root/run_apache.sh && \
echo 'mkdir -p /var/run/apache2' >> /root/run_apache.sh && \
echo 'mkdir -p /var/lock/apache2' >> /root/run_apache.sh && \
echo '/usr/sbin/apache2 -D FOREGROUND' >> /root/run_apache.sh && \
ln -sf /proc/self/fd/1 /var/log/apache2/access.log && \
chmod 755 /root/run_apache.sh
# Expose port 80
EXPOSE 80
# Launch the script when the container starts
CMD ["/bin/bash", "/root/run_apache.sh"]
Once the Dockerfile is defined, the image can be built with:
[!example] Build Docker image
docker build -t localstack-ecr-image .
Once an image is built, it can be pushed to a Docker Registry.
When pushing to a self-hosted registry, the image name must follow the pattern:
[registry-host]:[registry-port]/[repository-name]:[tag-name]
Before pushing, the image must be tagged using the command:
docker tag SOURCE_IMAGE[:TAG] TARGET_IMAGE[:TAG]
The examples below show how to tag and push a Docker Image to a LocalStack Docker Registry (useful for [[#ECR]]).
[!example] Push Docker Image to Registry
- docker tag localstack-ecr-image 000000000000.dkr.ecr.us-east-1.localhost.localstack.cloud:4566/localstack-ecr-repository to tag the image
- docker push 000000000000.dkr.ecr.us-east-1.localhost.localstack.cloud:4566/localstack-ecr-repository to push the image
It simplifies the management of multi-container Docker applications. The whole infrastructure is defined in the docker-compose.yml file and the main commands are:
[!example] Manage Compose applications
- docker compose up
- docker compose down
Example of compose
version: "3"
services:
  web:
    image: nginx
    ports:
      - "80:80"
  db:
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: example
Used to monitor the LocalStack instance and all the services connected to it.
It’s just a thin wrapper around the AWS CLI, used to interact with LocalStack.
The Amazon Simple Storage Service (S3) provides a scalable and secure object storage service.
[!example] Manage S3 buckets
- awslocal s3api list-buckets
- awslocal s3 mb s3://NAME *
- awslocal s3 cp PATH_TO_FILE s3://NAME **
- awslocal s3api list-objects --bucket NAME
- awslocal s3api get-object --bucket NAME --key FILE_NAME PATH_TO_DOWNLOAD
- awslocal s3api delete-object --bucket NAME --key FILE_NAME
- awslocal s3api delete-bucket --bucket NAME
* mb → make bucket
** cp → like the linux command to copy files
There is a tiny difference between s3 and s3api:
- s3: it’s designed to feel like working with local files, so a single command can use multiple API calls
- s3api: every command maps almost 1:1 to AWS S3 API calls, so more low level than s3 command
Elastic Compute Cloud (EC2) is an AWS service that provides on-demand, scalable computing capacity. EC2 instances act as virtual servers that users can manage.
EC2 instances access is secured by a key pair:
- public key: stored by Amazon EC2
- private key: stored by the user
Warning
In order to allow EC2 to create new containers on the host, "/var/run/docker.sock:/var/run/docker.sock" must be mapped in the volumes.
[!example] Manage SSH keys
- ssh-keygen -t rsa -b 2048 *
- awslocal ec2 import-key-pair --key-name KEY_NAME --public-key-material fileb://PATH_TO_PUBKEY **
Both commands differ slightly from the ones in the slides:
* AWS supports 2048-bit RSA key pairs
** fileb:// is a workaround for file://
Before creating an EC2 instance, a Virtual Private Cloud (VPC) must be set up.
- logically isolated section of the AWS cloud:
- defines a pool of IPs and allows subnets (6 created by default)
- each resource runs inside a subnet and must be attached to at least one security group, which defines its network rules
- subnets are tied to a specific Availability Zone (AZ)
	- in reality → they give geographic redundancy
	- in LocalStack → they are simulated
- a security group acts as a virtual firewall, controlling inbound and outbound traffic
[!example] Manage VPC and Subnets
- awslocal ec2 describe-vpcs to list all VPCs and gather VPC_ID
- awslocal ec2 describe-availability-zones
- awslocal ec2 describe-subnets --filters "Name=vpc-id,Values=VPC_ID" to list the subnets in the VPC with id VPC_ID
[!example] Manage VPC Security Group
- awslocal ec2 authorize-security-group-ingress --group-id default --protocol tcp --port 8000 --cidr 0.0.0.0/0 to authorize incoming TCP traffic on port 8000 from any source (defined with Classless Inter-Domain Routing notation)
- awslocal ec2 describe-security-groups --filters "Name=vpc-id,Values=VPC_ID" to fetch the security groups in a specific VPC
Now an EC2 instance can be created, and a generic script user_script.sh can be fed to it.
[!example] Manage EC2 instance
- awslocal ec2 run-instances --image-id ami-df5de72bdb3b --count 1 --instance-type t3.nano --key-name my-key --security-group-ids SECURITY_GROUP_ID --user-data file://./user_script.sh
- ssh -p 22 -i ~/.ssh/id_rsa root@127.0.0.1
where:
- --instance-type t3.nano determines the hardware used; t3 means general purpose
- --image-id ami-df5de72bdb3b is given externally by Amazon AWS or the organization and is the template for the OS to boot on the EC2 instance. It must be compatible with the selected instance type
Moreover, the EC2 instance can be monitored via the Resource Browser by accessing Instance→System Status in the LocalStack web app.
EC2 machines can be emulated with two different methods:
- Mock/CRUD
- Pro
This configuration can be changed from the EC2_VM_MANAGER option:
- docker → Docker containers are created (see [[#EC2 VM Managers Docker]])
- libvirt → real virtual machines are created
- mock → no real virtualization, all resources are stored as in-memory representations
Docker engine is used to emulate EC2 instances, so Docker socket must be mounted to use it. EC2 instances emulated will be subject to Docker containers limitations.
- Containers created will be named localstack-ec2.INSTANCE_ID
- User data (like the Python script used before)
	- can be found in the container at /var/lib/cloud/instances/INSTANCE_ID
	- can be sent via CLI with the UserData or ModifyInstanceAttribute variables
- Logs can be found in the container at /var/log/cloud-init-output.log
- The network address is contained in PublicIpAddress
[!example] A summary of the emulated actions (APIs) available for EC2 instances:
- CreateImage: capture a snapshot of a running instance into a new AMI (uses docker commit under the hood)
- DescribeImages: list AMIs
- DescribeInstances: Docker-backed instances are marked with ec2_vm_manager: docker
- RunInstances (used before)
- StopInstances → pause
- StartInstances → resume
- TerminateInstances → stop
Warning
Actions can be written into:
- PascalCase: keywords used when interacting with an AWS instance via REST APIs (using classic AWS)
	https://ec2.amazonaws.com/?Action=ModifyInstanceAttribute&InstanceId=i-1234567890abcdef0&InstanceType.Value=m1.small&AUTHPARAMS
- kebab-case: keywords used when interacting with an AWS instance via CLI (LocalStack)
Such EC2 instances can be used as Elastic Block Store (EBS) block devices:
- EBS Volumes
- EBS Snapshots
On LocalStack the volume size unit is MiB (GiB in AWS).
#TODO Example: create an ext3 file system on the block device
The Amazon Elastic Container Registry (ECR) is a fully managed service for storing and managing Docker images. It can integrate with Lambda, Elastic Container Service ([[#ECS]]) and Elastic Kubernetes Service (EKS).
The service is private by default and lives in a specific AWS region. On LocalStack, everything is emulated and local.
See [[#Docker]] chapter to see how Docker images can be created and managed.
[!example] Manage ECR
awslocal ecr create-repository --repository-name localstack-ecr-repo --image-scanning-configuration scanOnPush=true
to create a repository where newly pushed images are scanned for vulnerabilities
ECR instances can be monitored from the LocalStack web app under Status→ECR section.
The Amazon Elastic Container Service (ECS) is a fully managed service to orchestrate Docker images.
Features:
- Logical Separation for containerized apps; useful to implement various stages - dev/test/prod - of the same app
- Resource Management and Organization
- Run multiple copies (replicas) of the same task
- Scale apps for traffic spikes
- Load balance traffic with Elastic Load Balancing (ELB)
- pay-as-you-go
Containers orchestration can be done using two approaches:
- AWS Fargate: serverless and infrastructure-free; the user just defines the container task and Fargate provisions compute on demand
- focus on the application
- auto-scaling
- no idle cost
- ideal for event driven apps or for spiky workloads
- [[#EC2]]: full control; a cluster of EC2 instances is created
- manual instance type and resources
- auto-scale or manual scale
- pay for EC2 instance 24/7
- ideal for predictable workload
- ECS Agent runs inside each EC2 instance and is responsible for task management and reporting
![[Pasted image 20260214111520.png|400]]
[!example] Create ECS
awslocal ecs create-cluster --cluster-name my-cluster
Now, a task can be defined in a JSON file:
{
"containerDefinitions": [
{
"name": "server",
"cpu": 10,
"memory": 10,
"image": "000000000000.dkr.ecr.us-east-1.localhost.localstack.cloud:4566/localstack-ecr-repository:latest",
"portMappings": [
{
"containerPort": 80,
"protocol": "tcp",
"appProtocol": "http"
}
],
"essential": true,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-create-group": "true",
"awslogs-group": "myloggroup",
"awslogs-stream-prefix": "myprefix",
"awslogs-region": "us-east-1"
}
}
}
],
"family": "my-family",
"requiresCompatibilities": [
"FARGATE"
]
}
where:
family is a custom identifier, used to name different versions of the same task. If two tasks of the same family are registered, the version number increases
[!example] Create ECS task
awslocal ecs register-task-definition --cli-input-json file://task_definition.json
to create a task definition (it is NOT running yet) and the associated logs
[!example] Run ECS task
- awslocal ecs create-service --service-name my-service --cluster my-cluster --task-definition my-family --launch-type FARGATE --desired-count 1 --network-configuration "awsvpcConfiguration={subnets=[\"<SUBNET-ID>\"],securityGroups=[\"SG-ID\"],assignPublicIp=\"ENABLED\"}" to create a service running the task
- awslocal ecs update-service --service my-service --cluster my-cluster --desired-count 2 to update a service configuration
where:
- to get SUBNET_ID see [[#EC2]]
- to get SG_ID see [[#EC2]]
[!example] Manage ECS logs
- awslocal logs filter-log-events --log-group-name my-log-group to access all logs
- awslocal logs filter-log-events --log-group-name my-log-group | select -first 20
- awslocal logs filter-log-events --log-group-name my-log-group --query "events[].message"
It is an AWS managed service to create, deploy and manage REST/HTTP/WebSockets APIs. It can integrate with [[#AWS Lambda]], [[#EC2]], Cognito, etc.
Features:
- RESTful APIs creation based on HTTP and stateless
- main front door for
- workloads running on EC2
- [[#AWS Lambda]]
- web apps
- real-time apps
It allows running code without worrying about infrastructure.
- Functions (code) run only when needed, automatically
- pay-as-you-go
- multiple runtimes supported
Example: API Gateway to route REST APIs to Lambda functions
- start LocalStack
- create Lambda function, zip and upload it
'use strict'
const apiHandler = (event, context, callback) => {
const name = event.queryStringParameters.name;
const surname = event.queryStringParameters.surname;
callback(null, {
statusCode: 200,
body: JSON.stringify({
message: `Hello from Lambda: ${name} ${surname}`
}),
});
};
module.exports = {
apiHandler,
};
where:
- event (object) contains information from the invoker
- context (object) contains details about the invocation, the function and the execution environment
- callback (function) can be used in non-async handlers to send a response
[!example] Upload Lambda function
awslocal lambda create-function --function-name apigw-lambda \ --runtime nodejs16.x --handler lambda.apiHandler --memory-size 128 \ --zip-file fileb://function.zip --role arn:aws:iam::111111111111:role/apigw
where:
--handler is the name of the method within our code that Lambda calls to run our function (see the source code above)
- create REST API and the associated resource
[!example] Manage REST API Create the REST API
awslocal apigateway create-rest-api --name 'API Gateway Lambda integration'
List the resources of the API
awslocal apigateway get-resources --rest-api-id <REST_API_ID>
where REST_API_ID is returned by the successful execution of the creation command.
Create the resource
awslocal apigateway create-resource \ --rest-api-id <REST_API_ID> \ --parent-id <PARENT_ID> \ --path-part "hello"
Associate the GET method to the resource
awslocal apigateway put-method \ --rest-api-id <REST_API_ID> \ --resource-id <RESOURCE_ID> \ --http-method GET \ --request-parameters file://parameters.json \ --authorization-type "NONE"
where parameters.json is:
{
"httpMethod":"GET",
"authorizationType":"NONE",
"apiKeyRequired":false,
"requestParameters":{
"method.request.querystring.name":true,
"method.request.querystring.surname":true
}
}
here requestParameters can take values of the form method.request.{location}.{name}, where {location} can be:
- querystring
- path
- header
- connect to Lambda
[!example] Connect resource to lambda
awslocal apigateway put-integration \ --rest-api-id <REST_API_ID> --resource-id <RESOURCE_ID> \ --http-method GET --type AWS_PROXY \ --integration-http-method POST \ --uri arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:000000000000:function:apigw-lambda/invocations \ --passthrough-behavior WHEN_NO_MATCH
where:
- --type AWS_PROXY means that the API Gateway passes the request through and returns the response with no transformations
- --passthrough-behavior WHEN_NO_MATCH means that when there is no Content-Type match, the API Gateway forwards the request anyway
- --uri is used to interact with the backend. It has a specific structure:
	- arn:aws:apigateway:{region}:lambda → ARN prefix, the region used and the AWS service used (lambda)
	- path/2015-03-31 → version of the gateway, active since that date
	- functions/{lambda_arn}
	- invocations → just a suffix
- --integration-http-method POST because while the API Gateway exposes a GET, it interacts with the backend using a POST
- Deploy and test
[!example] Deploy the API
awslocal apigateway create-deployment \ --rest-api-id <REST_API_ID> \ --stage-name test
where:
- --stage-name can be dev/prod/test
[!example] Test the API
curl "http://localhost:4566/restapis/cor3o5oeci/test/_user_request_/hello?name=Mario&surname=Rossi"
where the API endpoint follows the pattern:
http://localhost:4566/restapis/{REST_API_ID}/{STAGE}/_user_request_/{PATH}?{KEY-VALUE-PAIRS}.
The expected output is:
{
"message":"Hello from Lambda: Mario Rossi"
}
#TODO Exercise 1 #TODO Exercise 2
It is a serverless workflow engine for orchestrating multiple AWS services. It is programmed using Amazon States Language (ASL).
Can be managed from Resource Browser as well.
[!example] Create a state machine
awslocal stepfunctions create-state-machine \ --name "CreateAndListBuckets" \ --definition file://definition.json \ --role-arn "arn:aws:iam::000000000000:role/stepfunctions-role"
where:
definition.json represents the state machine itself and is written in ASL
{
"Comment": "Create bucket and list buckets",
"StartAt": "CreateBucket",
"States": {
"CreateBucket": {
"Type": "Task",
"Resource": "arn:aws:states:::aws-sdk:s3:createBucket",
"Parameters": {
"Bucket": "new-sfn-bucket"
},
"Next": "ListBuckets"
},
"ListBuckets": {
"Type": "Task",
"Resource": "arn:aws:states:::aws-sdk:s3:listBuckets",
"End": true
}
}
}
At each stage, the engine checks whether the current state is an End State; if it is not, execution continues to the next state.
- States, StartAt are mandatory
- Comment, Version, TimeoutSeconds are optional
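The execution loop can be sketched as a minimal pure-Python interpreter. This is an illustrative simulation, not the real Step Functions engine; the handler callables stand in for AWS SDK calls like s3:createBucket.

```python
# Minimal sketch of an ASL runner: start at StartAt, run each Task,
# follow "Next" until a state declares "End": true.

def run_state_machine(definition, handlers, state_input=None):
    state_name = definition["StartAt"]
    data = state_input
    visited = []
    while True:
        state = definition["States"][state_name]
        visited.append(state_name)
        if state["Type"] == "Task":
            data = handlers[state["Resource"]](data, state.get("Parameters", {}))
        if state.get("End"):        # End State reached: stop the execution
            return visited, data
        state_name = state["Next"]  # otherwise continue to the next state

definition = {
    "StartAt": "CreateBucket",
    "States": {
        "CreateBucket": {"Type": "Task", "Resource": "createBucket",
                         "Parameters": {"Bucket": "new-sfn-bucket"},
                         "Next": "ListBuckets"},
        "ListBuckets": {"Type": "Task", "Resource": "listBuckets", "End": True},
    },
}
buckets = []
handlers = {
    "createBucket": lambda data, params: buckets.append(params["Bucket"]),
    "listBuckets": lambda data, params: list(buckets),
}
```

Running this definition visits CreateBucket then ListBuckets and ends with the bucket list as output, mirroring the two-state machine above.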
[!example] [ASL] Create Bucket
"CreateBucket":{ "Type":"Task", "Resource":"arn:aws:states:::aws-sdk:s3:createBucket", "Parameters":{"Bucket":"new-sfn-bucket"}, "Next":"ListBuckets" }
[!example] Manage state machine
Start an execution of the state machine
awslocal stepfunctions start-execution \ --state-machine-arn "arn:aws:states:us-east-1:000000000000:stateMachine:CreateAndListBuckets"
Monitor the execution
awslocal stepfunctions describe-execution --execution-arn "<YOUR-EXECUTION-ARN>"
Example: branched execution ![[Screenshot 2026-02-16 alle 08.04.34.png|500]] The idea is to create three independent lambdas and connect them together via the state machine changes.
- Create the lambdas source code, zip it and upload to LocalStack (see [[#AWS Lambda]])
- Create the workflow definition, create state machine and upload (see commands above)
{
"Comment": "Branched state machine example",
"StartAt": "FirstState",
"States": {
"FirstState": {
"Type": "Task",
"Resource": "arn:aws:states:::lambda:invoke",
"Parameters": {
"FunctionName": "arn:aws:lambda:us-east-1:000000000000:function:generator"
},
"OutputPath": "$.Payload",
"Next": "ChoiceState"
},
"ChoiceState": {
"Type": "Choice",
"Choices": [
{
"Variable": "$.number",
"NumericEquals": 0,
"Next": "EvenState"
},
{
"Variable": "$.number",
"NumericEquals": 1,
"Next": "OddState"
}
],
"Default": "DefaultState"
},
"EvenState": {
"Type": "Pass",
"Result": "The number was zero (even).",
"End": true
},
"OddState": {
"Type": "Pass",
"Result": "The number was one (odd).",
"End": true
},
"DefaultState": {
"Type": "Fail",
"Error": "InvalidInput",
"Cause": "The number was neither 0 nor 1."
}
}
}
[!example] Branching logic
... "ChoiceState": { "Type": "Choice", "Choices": [ { "Variable": "$.number", "NumericEquals": 0, "Next": "EvenState" }, { "Variable": "$.number", "NumericEquals": 1, "Next": "OddState" } ], "Default": "DefaultState" }, ...
where:
- ChoiceState adds branching
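How a Choice state picks the next state can be sketched in pure Python: evaluate each rule in order and fall back to Default when none matches. This is an illustrative sketch, not the real engine; only the NumericEquals comparator from the example above is implemented.

```python
# Illustrative evaluation of an ASL Choice state (NumericEquals only).

def resolve_path(data, path):
    """Resolve a simple '$.a.b' style path against a dict."""
    node = data
    for key in path.lstrip("$.").split("."):
        node = node[key]
    return node

def evaluate_choice(choice_state, data):
    for rule in choice_state["Choices"]:      # rules are tried in order
        if resolve_path(data, rule["Variable"]) == rule["NumericEquals"]:
            return rule["Next"]
    return choice_state["Default"]            # no rule matched

choice = {
    "Type": "Choice",
    "Choices": [
        {"Variable": "$.number", "NumericEquals": 0, "Next": "EvenState"},
        {"Variable": "$.number", "NumericEquals": 1, "Next": "OddState"},
    ],
    "Default": "DefaultState",
}
```

With input {"number": 0} this yields EvenState, with 1 OddState, and anything else falls through to DefaultState, matching the branched definition above.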
Example: parallel execution ![[Screenshot 2026-02-16 alle 08.05.19.png|500]] The procedure is the same:
- Create the lambdas source code, zip it and upload to LocalStack (see [[#AWS Lambda]])
- Create the workflow definition, create state machine and upload (see commands above)
[!example] Parallel logic
... "Parallel State": { "Type": "Parallel", "Next": "combine", "Branches": [ { "StartAt": "lambda1", "States": {...} }, { "StartAt": "lambda2", "States": {...} } ] }
where the engine waits until every branch has terminated before moving on.
Example: input/output execution ![[Screenshot 2026-02-16 alle 08.05.51.png|300]] where:
- InputPath: filters the incoming input and…
[!example] InputPath
"InputPath":"$.customer"
- Parameters: …transform it before passing to the Task
[!example] Parameters
"Parameters":{ "orderId.$":"$.orderId", "timestamp.$":"$$.State.EnteredTime", "fixedValue":"StaticData" }
- ResultSelector: transforms the Task output
- ResultPath: merges the result with the original state and…
[!example] ResultPath
"ResultPath":"$.shippingDetails"
- OutputPath: …is passed to the NextState
[!example] OutputPath
"OutputPath":"$.shippingDetails"
When using these filters, any that are not defined let the entire input pass through unchanged. For a complete execution example, see the slides.
Keep in mind that:
- $ refers to the State Input, so the data being passed from the previous state into the current one.
- $$ refers to the Context Object, so the metadata about the execution itself that isn't part of the payload.
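The distinction can be shown with a minimal path resolver, assuming a dict-based state input and context object (illustrative only, not the real ASL path engine):

```python
# '$'  paths resolve against the state input (the payload),
# '$$' paths resolve against the context object (execution metadata).

def resolve(path, state_input, context):
    if path.startswith("$$"):
        root, rest = context, path[2:]
    else:
        root, rest = state_input, path[1:]
    node = root
    for key in rest.strip(".").split("."):
        if key:                    # skip empty segment for the bare '$' path
            node = node[key]
    return node

state_input = {"orderId": "o-42"}
context = {"State": {"EnteredTime": "2026-02-16T08:00:00Z"}}
```

So "$.orderId" reads from the payload while "$$.State.EnteredTime" reads the metadata, exactly the split used in the Parameters example above.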
The Amazon Simple Queue Service (SQS) enables communication based on message queues between two decoupled applications.
Features:
- stores messages sent by producer until consumer is ready to process, in order of arrival
- Types of queues:
- Standard Queue (default): high throughput, with at-least-once delivery that can lead to duplicated or out-of-order messages
- FIFO Queue: strict ordering and exactly-once delivery
- Visibility Timeout: when a message is consumed, if it isn't deleted within the visibility timeout, it becomes visible again, allowing another consumer to pick it up
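The visibility-timeout mechanics can be sketched with a small in-memory queue. This is a simulation of the semantics only, not the SQS API; timestamps are passed explicitly to keep the example deterministic.

```python
import time

# Pure-Python sketch of SQS visibility-timeout semantics: a received message
# is hidden until the timeout expires; if it is not deleted in time, it
# becomes visible again and another consumer can pick it up.

class VisibilityQueue:
    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self.messages = {}          # receipt handle -> [body, invisible_until]
        self._next_handle = 0

    def send(self, body):
        self.messages[str(self._next_handle)] = [body, 0.0]
        self._next_handle += 1

    def receive(self, now=None):
        now = time.monotonic() if now is None else now
        for handle, entry in self.messages.items():
            if entry[1] <= now:     # message is currently visible
                entry[1] = now + self.visibility_timeout
                return handle, entry[0]
        return None                 # nothing visible right now

    def delete(self, handle):
        self.messages.pop(handle, None)
```

A consumer that crashes after receive() simply never calls delete(), so the message reappears once the timeout elapses.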
[!example] Manage queues
- Create a queue
awslocal sqs create-queue --queue-name my-queue
the queue URL is returned
- List queues
awslocal sqs list-queues
- Get queue attributes*
awslocal sqs get-queue-attributes --attribute-names All \ --queue-url http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/my-queue
- Send message to queue
awslocal sqs send-message --message-body "Hello World" \ --queue-url http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/my-queue
- Receive message from queue
awslocal sqs receive-message \ --queue-url http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/my-queue
- Delete message from queue
awslocal sqs delete-message --receipt-handle <RECEIPT_HANDLE> \ --queue-url http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/my-queuewhere
RECEIPT_HANDLEis given when reading a message
- Purge the queue (delete all)
awslocal sqs purge-queue \ --queue-url http://sqs.us-east-1.localhost.localstack.cloud:4566/000000000000/my-queue
* Queue attributes can be:
- ApproximateNumberOfMessages returns the number of messages available
- MaximumMessageSize
- MessageRetentionPeriod
- VisibilityTimeout
Example: Integrate SQS with [[#AWS Lambda]]:
[!example] Event Source Mapping
awslocal lambda create-event-source-mapping \ --function-name my-function --event-source-arn <QUEUE_ARN>
In the lambda, the queue is accessible by:
exports.lambda_handler = async (event) => {
for (const record of event.Records) {
await processMessageAsync(record);
}
return {
statusCode: 200,
body: JSON.stringify({ message: "Processing complete" })
};
};
async function processMessageAsync(record) {
try {
console.log(`Processed message ${record.body}`);
// Placeholder for actual async work
await Promise.resolve(1);
} catch (err) {
console.error("An error occurred:", err.message);
throw err;
}
}
The Amazon Simple Notification Service (SNS) is a serverless messaging service that can distribute a message to multiple subscribers.
[!example] Manage SNS
- Create a topic
awslocal sns create-topic --name my-topic
TopicArnis returned
- List all topics
awslocal sns list-topics
- Set topic attributes
awslocal sns set-topic-attributes \ --topic-arn <TOPIC_ARN> \ --attribute-name DisplayName \ --attribute-value MyTopicDisplayName
- List topic attributes
awslocal sns get-topic-attributes \ --topic-arn <TOPIC_ARN>
- Publish message to the topic
awslocal sns publish --message file://message.txt \ --topic-arn <TOPIC_ARN>
messageIdis returned
- create a queue and a topic
- subscribe the queue to the topic
[!example]
- Subscribe queue to topic
awslocal sns subscribe --topic-arn <TOPIC_ARN> \ --protocol sqs --notification-endpoint <QUEUE_ARN>
SubscriptionArnis returned
- List subscriptions
awslocal sns list-subscriptions
- Unsubscribe
awslocal sns unsubscribe \ --subscription-arn <SUBSCRIPTION_ARN>
- send a message to the queue, via the topic, with sns publish
- check whether the message has arrived with sqs receive-message
![[Pasted image 20260216101702.png|300]]
This represents the fanout scenario, where an SNS topic distributes a message to multiple endpoints.
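The fanout pattern boils down to the topic delivering an independent copy of each message to every subscriber, sketched here with plain Python lists standing in for SQS queues (illustrative only, not the SNS API):

```python
# Pure-Python sketch of SNS fanout: publishing to a topic pushes a copy
# of the message to every subscribed queue.

class Topic:
    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        for queue in self.subscribers:   # every subscriber gets its own copy
            queue.append(message)

topic = Topic()
queue_a, queue_b = [], []
topic.subscribe(queue_a)
topic.subscribe(queue_b)
topic.publish("order-created")
```

After the publish, both queues hold their own "order-created" message and can be consumed independently, which is what decouples the downstream services.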
It is an Infrastructure as Code service by Amazon, driven through JSON or YAML files.
Features:
- the declarative template specifies resources, configurations, dependencies, etc. and the service automates all the implementation process
- once the template is executed (across multiple AWS accounts or regions) → becomes a stack that can be managed
- preview of changes before actually modifying the stack
- offers rollback and self-healing
- repeatable and scalable
[!example] Create a stack template
{ "Resources": { "LocalBucket": { "Type": "AWS::S3::Bucket", "Properties": { "BucketName": "cfn-quickstart-bucket" } } } }
[!example] Manage stack templates
- Deploy a stack
awslocal cloudformation deploy --stack-name my-stack \ --template-file "./stack.json"
- Delete a stack
awslocal cloudformation delete-stack --stack-name my-stack
It is another Infrastructure as Code service by Amazon, whose purpose is to make AWS development open-source-friendly. The user writes code for a custom AWS infrastructure, deploys it, and [[#CloudFormation]] allocates the resources needed to execute the project. Supported languages are: TS/JS, Python, Java, .NET, Go.
The core building blocks are the constructs:
- L1 constructs: low-level, direct mapping to AWS [[#CloudFormation]]
- L2 constructs: provide some abstractions for common patterns
- L3 constructs: high-level, provides entire solutions
Features:
- it provides higher abstraction on top of [[#CloudFormation]]
- modular code
- ensures AWS best-practices
First, the tool must be installed separately:
[!example] Install CDK CLI and check version
npm install -g aws-cdk-local aws-cdk
cdklocal --version
[!example] Init a CDK project
- Initialize and deploy
cdklocal init sample-app --language=javascript
- Bootstrap the project: essential to connect it to the AWS instance because it prepares all required resources
cdklocal bootstrap
If the project is meant to run on multiple regions or accounts, it must be bootstrapped for each one of them:
cdklocal bootstrap aws://<ACCOUNT_ID>/<REGION>
- Preview the deployment
cdklocal synth
- Deployment to AWS
cdklocal deploy
- Destroy deployment
cdklocal destroy
The initialized project follows this structure:
my-cdk-app/
├── bin/
│ └── my-cdk-app.js # Entry point for the CDK application
├── lib/
│ └── my-cdk-app-stack.js # Main stack definition for AWS resources
├── cdk.json # CDK app configuration
├── package.json # Project dependencies and scripts
├── README.md # Project documentation
└── .gitignore # Files to ignore in git
- bin/ will be the entrypoint for the app
[!example] my-cdk-app.js
#!/usr/bin/env node const cdk = require('aws-cdk-lib'); const { QueueSubscriptionStack } = require('../lib/queue_subscription-stack'); const app = new cdk.App(); // Ensure there is a space between 'new' and your Stack class name new QueueSubscriptionStack(app, 'QueueSubscriptionStack', { /* * If you don't specify 'env', this stack will be environment-agnostic. * Account/Region details can be passed here if needed. */ });
- lib/ holds the core business logic, organized in .js files with classes that extend cdk.Stack
[!example] my-cdk-app-stack.js
const cdk = require('aws-cdk-lib'); const sqs = require('aws-cdk-lib/aws-sqs'); class QueueSubscriptionStack extends cdk.Stack { constructor(scope, id, props) { super(scope, id, props); // Define the SQS Queue const queue = new sqs.Queue(this, 'QueueSubscriptionQueue', { queueName: 'my-queue', visibilityTimeout: cdk.Duration.seconds(300) }); // ... additional resources (SNS Subscriptions, etc.) go here } } module.exports = { QueueSubscriptionStack };
- cdk.json defines how to run the CDK app with additional configuration
- package.json lists the node dependencies, scripts and metadata
#TODO
It is a fully managed key-value NoSQL database provided by AWS. It allows:
- scalability
- encryption
- replication
- auto-scaling
- backup
[!example] Manage DynamoDB instance
- Create instance
awslocal dynamodb create-table --table-name my-table \ --key-schema AttributeName=id,KeyType=HASH \ --attribute-definitions AttributeName=id,AttributeType=S \ --billing-mode PAY_PER_REQUEST --region us-east-1
- List tables
awslocal dynamodb list-tables
- Insert items
awslocal dynamodb put-item --table-name my-table \ --item file://table-item.json
- Query num of items
awslocal dynamodb describe-table --table-name my-table \ --query 'Table.ItemCount'
In AWS both [[#Orchestration-based Coordination]] and [[#Choreography-based Coordination]] of SAGA can be implemented with a combination of [[#Step Functions]] and [[#AWS Lambda]].
![[Pasted image 20260216125940.png|500]]
As said in the [[#SAGA Pattern]] chapter, each compensatable operation must have its respective compensating transaction (in red).
[!example] Compensating Transactions in Step Functions
"Catch": [ { "ErrorEquals": [ "States.ALL" ], "ResultPath": "$.reserveFlightError", "Next": "CancelFlightReservation" } ]
where:
- Catch defines the error handlers for the state
- ErrorEquals lists the error names to catch (States.ALL matches any error)
- "ResultPath": "$.reserveFlightError" specifies where to store the error information
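The compensation logic of the orchestration-based SAGA can be sketched generically in pure Python. This is an illustration of the pattern, not the Step Functions implementation; the step names are hypothetical placeholders.

```python
# Orchestration-based SAGA sketch: run steps in order; on failure, run the
# compensating transactions of the completed steps in reverse order.

def run_saga(steps):
    """steps: list of (name, action, compensation) triples."""
    completed = []
    log = []
    for name, action, compensation in steps:
        try:
            action()
            log.append(name)
            completed.append((name, compensation))
        except Exception:
            # roll back: compensate already-completed steps, newest first
            for done_name, comp in reversed(completed):
                comp()
                log.append(f"compensate:{done_name}")
            return log, "rolled_back"
    return log, "committed"

def declined():
    raise RuntimeError("payment declined")

steps = [
    ("ReserveFlight", lambda: None, lambda: None),
    ("ReserveHotel", lambda: None, lambda: None),
    ("TakePayment", declined, lambda: None),
]
```

When the third step fails, the two earlier reservations are compensated in reverse order, which is exactly what the red compensating transactions in the diagram do.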
#TODO
It is a serverless event bus service by Amazon that enables building event-driven applications by routing events from various sources like [[#AWS Lambda]], [[#Step Functions]], [[#SQS]] queues.
Applications:
- automating event-driven microservices or workflows (e.g.: trigger a [[#AWS Lambda]] on [[#S3]] object upload)
- SaaS integration with third-party apps
EventBridge can work with a variety of rule types:
- Scheduled Rules: trigger rules at regular time intervals
- Examples:
	- rate(5 minutes) → every 5 minutes
	- cron(0 12 * * ? *) → every day at 12:00
- Event Pattern Rules: trigger rules when certain event occurs on an AWS component
- Example
{
"source": ["aws.s3"],
"detail-type": ["AWS API Call via CloudTrail"],
"detail": {
"eventName": ["PutObject"]
}
}
- Custom Event Rules: trigger rules when a certain event occurs on a microservice developed by the user
- SaaS Integration Rules: trigger rules when certain event occurs on a third-party software
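The core of event-pattern matching can be sketched as follows: a pattern field holds a list of accepted values, and nested objects are matched recursively. This is a simplified illustration; real EventBridge supports many more operators (prefix, numeric ranges, anything-but, and so on).

```python
# Simplified sketch of EventBridge event-pattern matching: an event matches
# when, for every key in the pattern, the event's value is in the pattern's
# list of accepted values (recursing into nested objects).

def matches(pattern, event):
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):       # nested pattern: recurse
            if not matches(expected, event[key]):
                return False
        elif event[key] not in expected:     # expected is a list of literals
            return False
    return True

pattern = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventName": ["PutObject"]},
}
event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventName": "PutObject", "bucket": "my-bucket"},
}
```

Extra fields in the event (like bucket here) are ignored; only the fields named in the pattern constrain the match.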
Rules can trigger actions on a variety of targets:
- [[#AWS Lambda]]
- [[#SNS]] topics
- [[#SQS]] queues
- [[#Step Functions]]
[!example] Manage EventBridge instances
- Create a rule
awslocal events put-rule --name my-scheduled-rule \ --schedule-expression "rate(2 minutes)"
to run the rule every 2 minutes
- Add lambda function as target, so add permission to do it
awslocal lambda add-permission --function-name my-events \ --statement-id my-scheduled-rule \ --action "lambda:InvokeFunction" \ --principal events.amazonaws.com --source-arn <SCHEDULED_RULE_ARN>
- Add lambda as target
awslocal events put-targets --rule my-scheduled-rule \ --targets file://targets.jsonwhere the
targets.jsoncontains all the lambdas’ ARNs.
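A minimal `targets.json` might look like the following (the `Id` is an arbitrary label and the ARN is a placeholder to be filled with the output of `awslocal lambda create-function` or `get-function`):
```json
[
    {
        "Id": "my-events-target",
        "Arn": "<LAMBDA_ARN>"
    }
]
```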
[!example] Check logs to test
```shell
awslocal logs describe-log-groups
awslocal logs describe-log-groups --log-group-name-prefix /aws/lambda/my-events
localstack logs
```
Example: create an EventBridge rule using CDK #TODO
[!example] Manage custom event bus
As seen for [[#AWS SAGA Pattern]], implementing Microservices on AWS means combining [[#AWS Lambda]] and [[#API Gateway]].
- API GW acts as an internal proxy
- API GW as single external endpoint with extra features such as authentication
Example: single API GW
![[Screenshot 2026-02-16 alle 16.03.23.png|500]]
Example: multiple API GWs
![[Screenshot 2026-02-16 alle 16.03.39.png|500]]
As seen in the [[#Asynchronous Messaging]] communication pattern.
![[Screenshot 2026-02-16 alle 16.06.29.png|500]]
However, business actions are still synchronous.
- Approach with [[#SNS]], where each subscriber receives a copy of the message and processes it.
    ![[Screenshot 2026-02-16 alle 16.06.51.png|500]]
- Approach with [[#EventBridge]] (bus), where a different tailored message is sent to every target.
    ![[Screenshot 2026-02-16 alle 16.07.58.png|500]]
It is an AWS service for creating serverless GraphQL APIs to query databases, microservices, APIs, etc.
[!example] Query a DynamoDB table
- First, get an API id to interact with a given table
    ```shell
    awslocal appsync create-graphql-api --name NotesApi \
        --authentication-type API_KEY
    ```
- Then, create an API key
    ```shell
    awslocal appsync create-api-key --api-id <API_ID>
    ```
- Define a GraphQL schema to further query the database
    ```shell
    awslocal appsync start-schema-creation --api-id <API_ID> \
        --definition file://schema.graphql
    ```
    see below for the GraphQL schema.
- Create a Data Source for the DynamoDB table. The Data Source is an AWS abstraction used to interact with a table, and it is necessary because it provides AppSync with a connector that has an IAM Role associated.
    ```shell
    awslocal appsync create-data-source --name AppSyncDB --api-id <API_ID> \
        --type AMAZON_DYNAMODB \
        --dynamodb-config tableName=DynamoDBNotesTable,awsRegion=us-east-1
    ```
- Now create a Resolver (see [[#API Gateway GraphQL Implementation]]) to query the DB
    ```shell
    awslocal appsync create-resolver --api-id <API_ID> --type-name Query \
        --field-name allNotes --data-source-name AppSyncDB \
        --request-mapping-template file://request-template.vtl \
        --response-mapping-template file://response-template.vtl
    ```
[!example] schema.graphql
```graphql
type Note {
    NoteId: ID!
    title: String
    content: String
}

type PaginatedNotes {
    notes: [Note!]!
}

type Query {
    allNotes(limit: Int): PaginatedNotes!
    getNote(NoteId: ID!): Note
}

type Mutation {
    saveNote(NoteId: ID!, title: String!, content: String!): Note
    deleteNote(NoteId: ID!): Note
}

schema {
    query: Query
    mutation: Mutation
}
```
[!example] request-template.vtl
```vtl
#set($limit = $util.defaultIfNullOrEmpty($context.arguments.limit, 10))
{
    "version": "2018-05-29",
    "operation": "Scan",
    "limit": $limit
}
```
[!example] response-template.vtl
```vtl
#if($context.result && $context.result.errorMessage)
    $util.error($context.result.errorMessage, $context.result.errorType)
#else
{
    "notes": $util.toJson($context.result.items)
}
#end
```
Just create an HTTP POST request via `curl`, adding the `x-api-key` header with the API key as value, with body:
```json
{
    "query": "query { allNotes { notes { NoteId title content } } }"
}
```
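The same request can be sketched with Python's standard library. This is a minimal sketch: the endpoint URL and API key below are placeholders (with real AWS the URL comes from the `create-graphql-api` output, the key from `create-api-key`):

```python
import json
from urllib import request

APPSYNC_URL = "http://localhost:4566/graphql/<API_ID>"  # placeholder endpoint
API_KEY = "<API_KEY>"                                   # placeholder key

# GraphQL queries travel as a JSON body with a single "query" field
payload = json.dumps({
    "query": "query { allNotes { notes { NoteId title content } } }"
}).encode()

req = request.Request(
    APPSYNC_URL,
    data=payload,
    headers={"Content-Type": "application/json", "x-api-key": API_KEY},
    method="POST",
)
# response = request.urlopen(req)  # uncomment against a running endpoint
```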
It is a serverless data integration service by AWS used to implement ETLs. The process follows these steps:
- Extract: data is fetched (e.g. from [[#S3]]) in raw format
- Transform: data gets cleaned, optimized and enriched (e.g. with Glue PySpark jobs)
- Load: data is loaded into another database (e.g. [[#S3]] or RDS table)
The entire Glue job is encoded in `etl-script-bucket.py`:
[!example] etl-script-bucket.py
```python
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.dynamicframe import DynamicFrame
from pyspark.sql.functions import col

# INIT: resolve job arguments (raw_data_bucket is passed as --raw_data_bucket,
# see arguments.json below)
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'raw_data_bucket'])
raw_data_bucket = args['raw_data_bucket']
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

# EXTRACT: read the raw CSV data from S3
input_data_path = f"s3://{raw_data_bucket}/raw-data/"
print(f"Reading data from: {input_data_path}")
datasource0 = glueContext.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": [input_data_path]},
    format="csv",
    format_options={"withHeader": True, "separator": ";"},
)

# TRANSFORM: drop rows with a null or empty id, name or age
data = datasource0.toDF()
transformed_data = data.filter(
    ~(
        col("id").isNull() | (col("id") == "")
        | col("name").isNull() | (col("name") == "")
        | col("age").isNull() | (col("age") == "")
    )
)

# LOAD: write the cleaned data back to S3 as CSV
dynamic_frame_transformed = DynamicFrame.fromDF(transformed_data, glueContext, "transformedDF")
glueContext.write_dynamic_frame.from_options(
    frame=dynamic_frame_transformed,
    connection_type="s3",
    connection_options={"path": f"s3://{raw_data_bucket}/transformed-data/"},
    format="csv",
    format_options={"withHeader": True, "separator": ";"},
)

# commit so job bookmarks (enabled in arguments.json) record progress
job.commit()
```
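The TRANSFORM step of the script can be illustrated without Spark: it keeps only the rows where `id`, `name` and `age` are all present and non-empty. A plain-Python sketch of the same filter over a few sample rows (the data is made up for illustration):

```python
# Plain-Python sketch of the Glue job's TRANSFORM step:
# keep only rows where id, name and age are all non-null and non-empty.
rows = [
    {"id": "1", "name": "Ada", "age": "36"},
    {"id": "",  "name": "Bob", "age": "41"},   # dropped: empty id
    {"id": "2", "name": None,  "age": "29"},   # dropped: null name
]

def is_complete(row, fields=("id", "name", "age")):
    return all(row.get(f) not in (None, "") for f in fields)

clean = [r for r in rows if is_complete(r)]
print(clean)  # [{'id': '1', 'name': 'Ada', 'age': '36'}]
```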
[!example] Manage glue jobs
- Create a job
    ```shell
    awslocal glue create-job --name MyGlueETLJob \
        --role <ROLE_ARN> \
        --command file://command-bucket.json \
        --default-arguments file://arguments.json \
        --max-retries 2 --timeout 10
    ```
- Start job
    ```shell
    awslocal glue start-job-run --job-name MyGlueETLJob
    ```
where:
[!example] command-bucket.json
```json
{
    "Name": "glueetl",
    "ScriptLocation": "s3://my-raw-data-bucket/scripts/etl-script-bucket.py",
    "PythonVersion": "3"
}
```
[!example] arguments.json
```json
{
    "--job-bookmark-option": "job-bookmark-enable",
    "--raw_data_bucket": "my-raw-data-bucket"
}
```
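The `--raw_data_bucket` entry shows how custom parameters reach the script: Glue passes each `--key value` pair on the job's command line, and `getResolvedOptions` exposes it as `args["key"]`. A minimal plain-Python sketch of that behavior (`resolve_options` is a hypothetical stand-in, not the real `awsglue` implementation):

```python
# Sketch of how Glue job arguments are resolved from the command line:
# each requested name is looked up as a "--name value" pair in argv.
def resolve_options(argv, names):
    args = {}
    for name in names:
        flag = f"--{name}"
        if flag in argv:
            args[name] = argv[argv.index(flag) + 1]
    return args

argv = ["script.py", "--JOB_NAME", "MyGlueETLJob",
        "--raw_data_bucket", "my-raw-data-bucket"]
args = resolve_options(argv, ["JOB_NAME", "raw_data_bucket"])
print(args["raw_data_bucket"])  # my-raw-data-bucket
```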
When running the start-job-run command, the following happens under the hood:
- a Spark environment is started, and Spark jobs are set up
- ETL script is run
- CloudWatch is available to monitor execution