Commit fb03de6

Merge pull request #2050 from odincodeshen/main
Prototyping Safety-Critical for Autonomous Application on Neoverse
2 parents 64be96f + d477b05 commit fb03de6

File tree

10 files changed

+759
-1
lines changed

assets/contributors.csv

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ Odin Shen,Arm,odincodeshen,odin-shen-lmshen,,
8282
Avin Zarlez,Arm,AvinZarlez,avinzarlez,,https://www.avinzarlez.com/
8383
Shuheng Deng,Arm,,,,
8484
Yiyang Fan,Arm,,,,
85-
Julien Jayat,Arm,,,,
85+
Julien Jayat,Arm,JulienJayat-Arm,julien-jayat-a980a397,,
8686
Geremy Cohen,Arm,geremyCohen,geremyinanutshell,,
8787
Barbara Corriero,Arm,,,,
8888
Nina Drozd,Arm,NinaARM,ninadrozd,,
Lines changed: 143 additions & 0 deletions
@@ -0,0 +1,143 @@
1+
---
2+
title: Functional Safety for automotive software development
3+
weight: 2
4+
5+
### FIXED, DO NOT MODIFY
6+
layout: learningpathall
7+
---
8+
9+
## Why Functional Safety Matters in Automotive Software
10+
11+
[Functional Safety](https://en.wikipedia.org/wiki/Functional_safety) refers to a system's ability to detect potential faults and respond appropriately to ensure that the system remains in a safe state, preventing harm to individuals or damage to equipment.
12+
13+
This is particularly important in **automotive, autonomous driving, medical devices, industrial control, robotics and aerospace** applications, where system failures can lead to severe consequences.
14+
15+
In software development, Functional Safety focuses on minimizing risks through **software design, testing, and validation** to ensure that critical systems operate in a predictable, reliable, and verifiable manner. This means developers must consider:
16+
- **Error detection mechanisms**
17+
- **Exception handling**
18+
- **Redundancy design**
19+
- **Development processes compliant with safety standards**
20+
21+
### Definition and Importance of Functional Safety
22+
23+
The core of Functional Safety lies in **risk management**, which aims to reduce the impact of system failures.
24+
25+
In autonomous vehicles, Functional Safety ensures that if sensor data is incorrect, the system can enter a **safe state**, preventing incorrect driving decisions.
26+
27+
The three core objectives of Functional Safety are:
28+
1. **Prevention**
29+
    - Reducing the likelihood of errors through rigorous software development processes and testing. For example, in an electric vehicle, the battery system monitors temperature to prevent overheating.
30+
2. **Detection**
31+
- Quickly identifying errors using built-in diagnostic mechanisms (e.g., Built-in Self-Test, BIST).
32+
3. **Mitigation**
33+
- Controlling the impact of failures to ensure the overall safety of the system.
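
The detection and mitigation objectives above can be sketched as a small plausibility check. This is a minimal illustration, not a certified implementation: the `SensorReading` type, the threshold values, and the two-state model are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class SystemState(Enum):
    NORMAL = "normal"
    SAFE_STATE = "safe_state"


@dataclass
class SensorReading:
    value_mps: float  # measured speed in metres per second
    age_ms: float     # time elapsed since the sample was taken


# Hypothetical limits, chosen only for illustration.
MAX_PLAUSIBLE_SPEED_MPS = 70.0
MAX_SAMPLE_AGE_MS = 100.0


def detect_fault(reading: SensorReading) -> bool:
    """Detection: flag readings that are out of range (including NaN) or stale."""
    out_of_range = not (0.0 <= reading.value_mps <= MAX_PLAUSIBLE_SPEED_MPS)
    stale = reading.age_ms > MAX_SAMPLE_AGE_MS
    return out_of_range or stale


def next_state(reading: SensorReading) -> SystemState:
    """Mitigation: any detected fault moves the system to the safe state."""
    return SystemState.SAFE_STATE if detect_fault(reading) else SystemState.NORMAL
```

A reading such as `SensorReading(500.0, 10.0)` fails the range check, so `next_state` returns `SystemState.SAFE_STATE` instead of letting the value reach a driving decision.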
34+
35+
This approach is critical in applications such as **autonomous driving, flight control, and medical implants**, where failures can result in **severe consequences**.
36+
37+
### ISO 26262: Automotive Functional Safety Standard
38+
39+
[ISO 26262](https://www.iso.org/standard/68383.html) is a functional safety standard specifically for **automotive electronics and software systems**. It defines a comprehensive safety lifecycle aligned with the [V-model](https://en.wikipedia.org/wiki/V-model), covering all phases from **requirements analysis through design, development, and testing to maintenance**.
40+
41+
Key Concepts of ISO 26262:
42+
- **ASIL (Automotive Safety Integrity Level)**
43+
- Evaluates the risk level of different system components (A, B, C, D, where **D represents the highest safety requirement**).
44+
    - For example, a dashboard light failure (low risk) might be rated ASIL A, while a brake system failure (high risk) is rated ASIL D.
45+
    - See [Automotive Safety Integrity Level](https://en.wikipedia.org/wiki/Automotive_Safety_Integrity_Level) for details.
46+
- **HARA (Hazard Analysis and Risk Assessment)**
47+
- Analyzes hazards and assesses risks to determine necessary safety measures.
48+
- **Safety Mechanisms**
49+
- Includes real-time error detection, system-level fault tolerance, and defined fail-safe or fail-operational fallback states.
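
To make the link between HARA and ASIL more concrete, the sketch below classifies a hazard from its severity (S1 to S3), exposure (E1 to E4), and controllability (C1 to C3) classes. It relies on a commonly cited shorthand (summing the three class indices) that is generally understood to reproduce the ASIL determination table in ISO 26262-3. Treat it as an assumption for illustration: the standard itself is the authority, and the function name `determine_asil` is hypothetical.

```python
def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Return the ASIL (or QM) for S1-S3, E1-E4, C1-C3 class indices.

    Assumption: uses the sum-of-indices shorthand rather than the table
    from the standard. The S0/E0/C0 classes are not modelled here.
    """
    if not (1 <= severity <= 3 and 1 <= exposure <= 4 and 1 <= controllability <= 3):
        raise ValueError("expected S in 1..3, E in 1..4, C in 1..3")

    total = severity + exposure + controllability
    # Sums below 7 map to QM (quality management, no ASIL requirement).
    return {7: "ASIL A", 8: "ASIL B", 9: "ASIL C", 10: "ASIL D"}.get(total, "QM")
```

With this shorthand, only the worst case in every dimension (S3, E4, C3) reaches ASIL D, which matches the idea that D represents the highest safety requirement.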
50+
51+
Typical Application Scenarios:
52+
- **Autonomous Driving Systems**:
53+
- Ensures that even if sensors (e.g., LiDAR, radar, cameras) provide faulty data, the vehicle will not make dangerous decisions.
54+
- **Powertrain Control**:
55+
    - Prevents unintended acceleration or loss of braking that could lead to loss of control.
56+
- **Battery Management System (BMS)**:
57+
- Prevents battery overheating or excessive discharge in electric vehicles.
58+
59+
For more details, you can check this video: [What is Functional Safety?](https://www.youtube.com/watch?v=R0CPzfYHdpQ)
60+
61+
62+
### Common Use Cases of Functional Safety in Automotive
63+
- **Autonomous Driving**:
64+
- Ensures the vehicle can operate safely or enter a fail-safe state when sensors like LiDAR, radar, or cameras malfunction.
65+
- Functional Safety enables real-time fault detection and fallback logic to prevent unsafe driving decisions.
66+
67+
- **Powertrain Control**:
68+
- Monitors throttle and brake signals to prevent unintended acceleration or braking loss.
69+
- Includes redundancy, plausibility checks, and emergency overrides to maintain control under failure conditions.
70+
71+
- **Battery Management Systems (BMS)**:
72+
- Protects EV batteries from overheating, overcharging, or deep discharge.
73+
- Safety functions include temperature monitoring, voltage balancing, and relay cut-off mechanisms to prevent thermal runaway.
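
As an illustration of the redundancy and plausibility checks mentioned for powertrain control, the following sketch compares two redundant pedal position sensors. The function name, the 5% disagreement limit, and the choice to return zero torque demand on a fault are assumptions for the example, not values taken from any standard.

```python
def throttle_command(sensor_a_pct: float, sensor_b_pct: float,
                     max_disagreement_pct: float = 5.0) -> float:
    """Return a throttle demand in percent, or 0.0 if the inputs are implausible."""
    readings = (sensor_a_pct, sensor_b_pct)

    # Range check: each channel must report a physically possible position.
    in_range = all(0.0 <= r <= 100.0 for r in readings)

    # Plausibility check: the redundant channels must agree with each other.
    agree = abs(sensor_a_pct - sensor_b_pct) <= max_disagreement_pct

    if not (in_range and agree):
        # Emergency override: drop the torque request rather than guess.
        return 0.0

    # Use the lower reading so a drifting sensor cannot add acceleration.
    return min(readings)
```

Choosing the lower of two agreeing readings is a conservative design choice: a small sensor drift can only reduce the requested acceleration, never increase it.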
74+
75+
These use cases highlight the need for a dedicated architectural layer that can enforce Functional Safety principles with real-time guarantees.
76+
A widely adopted approach in modern automotive platforms is the Safety Island—an isolated compute domain designed to execute critical control logic independently of the main system.
77+
78+
### Safety Island: Enabling Functional Safety in Autonomous Systems
79+
80+
In automotive systems, a **General ECU (Electronic Control Unit)** typically runs non-critical tasks such as infotainment or navigation, whereas a **Safety Island** is dedicated to executing safety-critical control logic (e.g., braking, steering) with strong isolation, redundancy, and determinism.
81+
82+
The table below compares the characteristics of a General ECU and a Safety Island in terms of their role in supporting Functional Safety.
83+
84+
| Feature | General ECU | Safety Island |
85+
|------------------------|----------------------------|--------------------------------------|
86+
| Purpose | Comfort / non-safety logic | Safety-critical decision making |
87+
| OS/Runtime | Linux, Android | RTOS, Hypervisor, or bare-metal |
88+
| Isolation | Soft partitioning | Hard isolation (hardware-enforced) |
89+
| Functional Safety Req | None to moderate | ISO 26262 ASIL-B to ASIL-D compliant |
90+
| Fault Handling | Best-effort recovery | Deterministic safe-state response |
91+
92+
This contrast highlights why safety-focused software needs a dedicated hardware domain with certified execution behavior.
93+
94+
**Safety Island** is an independent safety subsystem separate from the main processor. It is responsible for monitoring and managing system safety. If the main processor fails or becomes inoperable, Safety Island can take over critical safety functions such as **deceleration, stopping, and fault handling** to prevent catastrophic system failures.
95+
96+
Key Capabilities of Safety Island
97+
- **System Health Monitoring**
98+
- Continuously monitors the operational status of the main processor (e.g., ADAS control unit, ECU) and detects potential errors or anomalies.
99+
- **Fault Detection and Isolation**
100+
- Independently evaluates and initiates emergency handling if the main processing unit encounters errors, overheating, computational failures, or unresponsiveness.
101+
- **Providing Essential Safety Functions**
102+
- Even if the main system crashes, Safety Island can still execute minimal safety operations, such as:
103+
- Autonomous Vehicles → Safe stopping (Fail-Safe Mode)
104+
- Industrial Equipment → Emergency power cutoff or speed reduction
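
The health-monitoring capability described above is often built around a heartbeat: the main processor reports periodically, and the monitor reacts when reports stop. The sketch below shows the idea with explicit timestamps so that it stays deterministic. The class name, the timeout value, and the `"safe_stop"` action label are hypothetical.

```python
from typing import Optional


class HeartbeatMonitor:
    """Tracks the last heartbeat received from the main processor."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_beat_s: Optional[float] = None

    def heartbeat(self, now_s: float) -> None:
        """Called whenever the main processor reports that it is alive."""
        self.last_beat_s = now_s

    def main_processor_alive(self, now_s: float) -> bool:
        """A processor that never reported, or reported too long ago, is not alive."""
        if self.last_beat_s is None:
            return False
        return (now_s - self.last_beat_s) <= self.timeout_s


def safety_island_step(monitor: HeartbeatMonitor, now_s: float) -> str:
    """Decide the action for this cycle: keep running or take over and stop safely."""
    return "continue" if monitor.main_processor_alive(now_s) else "safe_stop"
```

For example, with a 0.1 s timeout, a heartbeat at `t = 0.0` keeps the system in `"continue"` at `t = 0.05`, while a check at `t = 0.5` with no further heartbeat yields `"safe_stop"`.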
105+
106+
107+
### Why Safety Island Matters for Functional Safety
108+
109+
Safety Island plays a critical role in Functional Safety by ensuring that the system can handle high-risk scenarios and minimize catastrophic failures.
110+
111+
How Safety Island Enhances Functional Safety
112+
1. **Acts as an Independent Redundant Safety Layer**
113+
- Even if the main system fails, it can still operate independently.
114+
2. **Supports ASIL-D Safety Level**
115+
- Monitors ECU health status and executes emergency safety strategies (e.g., emergency braking).
116+
3. **Provides Independent Fault Detection and Recovery Mechanisms**
117+
- **Fail-Safe**: Activates a **safe mode**, such as limiting vehicle speed or switching to manual control.
118+
- **Fail-Operational**: Ensures that high-safety applications (e.g., aerospace systems) can continue operating under certain conditions.
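
One common way to combine the two strategies is a 2-out-of-3 voter over redundant channels: while at least two channels agree, the system keeps operating (fail-operational), and when no majority exists it falls back to a safe mode (fail-safe). The sketch below is a simplified illustration. The function names and the tolerance parameter are assumptions, not part of any Arm or ISO 26262 interface.

```python
from typing import Optional, Sequence, Tuple


def vote_2oo3(values: Sequence[float], tolerance: float) -> Optional[float]:
    """Return the median if at least two of three channels agree, else None."""
    if len(values) != 3:
        raise ValueError("expected exactly three redundant channel values")

    low, mid, high = sorted(values)
    # After sorting, any agreeing pair must include the median, so it is
    # enough to compare the median with its two neighbours.
    if (mid - low) <= tolerance or (high - mid) <= tolerance:
        return mid
    return None


def select_mode(values: Sequence[float], tolerance: float) -> Tuple[str, Optional[float]]:
    """Keep operating on a majority value, or fall back to the safe mode."""
    voted = vote_2oo3(values, tolerance)
    if voted is None:
        return ("fail_safe", None)
    return ("fail_operational", voted)
```

With channels `[10.0, 10.1, 55.0]` and a tolerance of `0.5`, the faulty third channel is outvoted and the system continues with `10.1`. With `[1.0, 5.0, 9.0]` no two channels agree, so the result is `("fail_safe", None)`.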
119+
120+
For more insights on **Arm's Functional Safety solutions**, you can refer to: [Arm Functional Safety Compute Blog](https://community.arm.com/arm-community-blogs/b/automotive-blog/posts/functional-safety-compute)
121+
122+
123+
### Functional Safety in the Software Development Lifecycle
124+
125+
Functional Safety impacts **both hardware and software development**, particularly in areas such as requirement changes, version management, and testing validation.
126+
For example, in ASIL-D level applications, every code modification requires a complete impact analysis and regression testing to ensure that new changes do not introduce additional risks.
127+
128+
### Functional Safety Requirements in Software Development
129+
These practices ensure the software development process meets industry safety standards and can withstand system-level failures:
130+
- **Requirement Specification**
131+
- Clearly defining **safety-critical requirements** and conducting risk assessments.
132+
- **Safety-Oriented Programming**
133+
- Following **MISRA C, CERT C/C++ standards** and using static analysis tools to detect errors.
134+
- **Fault Handling Mechanisms**
135+
- Implementing **redundancy design and health monitoring** to handle anomalies.
136+
- **Testing and Verification**
137+
- Using **Hardware-in-the-Loop (HIL)** testing to ensure software safety in real hardware environments.
138+
- **Version Management and Change Control**
139+
    - Using tools such as **Git, Jira, and Polarion** to track changes for safety audits.
140+
141+
This learning path builds on the previous [container-based learning path](https://learn.arm.com/learning-paths/automotive/openadkit1_container) and introduces Functional Safety design practices from the earliest development stages.
142+
143+
By establishing an ASIL-partitioned software development environment and leveraging [**SOAFEE**](https://www.soafee.io/) technologies, developers can improve software consistency and maintainability in Functional Safety applications.
Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
---
2+
title: How to Use Data Distribution Service (DDS)
3+
weight: 3
4+
5+
### FIXED, DO NOT MODIFY
6+
layout: learningpathall
7+
---
8+
9+
### Introduction to DDS
10+
Data Distribution Service (DDS) is a real-time, high-performance middleware designed for distributed systems.
11+
It is particularly valuable in automotive software development, including applications such as **autonomous driving (AD)** and **advanced driver assistance systems (ADAS)**.
12+
13+
DDS offers a decentralized architecture that enables scalable, low-latency, and reliable data exchange—making it ideal for managing high-frequency sensor streams.
14+
15+
In modern vehicles, multiple sensors (LiDAR, radar, cameras) must continuously communicate with compute modules.
16+
17+
DDS ensures these components share data seamlessly and in real time, both within the vehicle and across infrastructure (e.g., V2X systems like traffic lights and road sensors).
18+
19+
20+
### Why Automotive Software Needs DDS
21+
22+
Next-generation automotive software architectures, such as [SOAFEE](https://www.soafee.io/), depend on deterministic, distributed communication. Traditional client-server models introduce latency and single points of failure, while DDS’s publish-subscribe model enables direct, peer-to-peer communication across system components.
23+
24+
For example, a LiDAR sensor broadcasting obstacle data can simultaneously deliver updates to perception, SLAM, and motion planning modules—without redundant network traffic or central coordination.
25+
26+
Additionally, DDS provides a flexible Quality of Service (QoS) configuration, allowing engineers to fine-tune communication parameters based on system requirements. Low-latency modes are ideal for real-time decision-making in vehicle control, while high-reliability configurations ensure data integrity in safety-critical applications like V2X communication.
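
The effect of one QoS setting can be illustrated with a toy model of a history policy that keeps only the most recent samples, similar in spirit to the KEEP_LAST history policy in DDS. This is not a DDS API: `QosProfile`, `TopicHistory`, and the two example profiles are hypothetical names, and reliability is recorded only as a label rather than simulated.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Deque, List


@dataclass(frozen=True)
class QosProfile:
    reliable: bool       # descriptive only; retransmission is not simulated here
    history_depth: int   # how many recent samples are retained per topic


class TopicHistory:
    """Retains at most `history_depth` samples, discarding the oldest first."""

    def __init__(self, qos: QosProfile) -> None:
        self._samples: Deque[Any] = deque(maxlen=qos.history_depth)

    def write(self, sample: Any) -> None:
        self._samples.append(sample)

    def read_all(self) -> List[Any]:
        return list(self._samples)


# Hypothetical profiles: control loops want only the freshest value,
# while a safety message stream keeps a short backlog.
CONTROL_QOS = QosProfile(reliable=False, history_depth=1)
V2X_QOS = QosProfile(reliable=True, history_depth=10)
```

With `CONTROL_QOS`, writing three samples leaves only the last one available to a reader, which suits a control loop that should never act on stale data. With `V2X_QOS`, the ten most recent samples remain available.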
27+
28+
These capabilities make DDS an essential backbone for autonomous vehicle stacks, where real-time sensor fusion and control coordination are critical for safety and performance.
29+
30+
### DDS Architecture and Operation
31+
32+
DDS uses a **data-centric publish-subscribe (DCPS)** model, allowing producers and consumers of data to communicate without direct dependencies. This modular approach enhances system flexibility and maintainability, making it well-suited for complex automotive environments.
33+
34+
DDS organizes communication within **domains**, which act as isolated scopes. Inside each domain:
35+
- ***Topics*** represent named data streams (e.g., /vehicle/speed, /perception/objects)
36+
- ***DataWriters*** (publishers) send data to topics
37+
- ***DataReaders*** (subscribers) receive data from topics
38+
This structure enables concurrent, decoupled communication between multiple modules without hardcoding communication links.
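
The relationship between domains, topics, writers, and readers can be sketched with a small in-process model. It only mimics the concepts: real DDS implementations add discovery, typed topics, QoS, and network transports, and the `Domain` class and its methods here are hypothetical names rather than a DDS API.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Callback = Callable[[Any], None]


class Domain:
    """An isolated scope: readers only see data written in the same domain."""

    def __init__(self, domain_id: int) -> None:
        self.domain_id = domain_id
        self._readers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def create_reader(self, topic: str, on_data: Callback) -> None:
        """Register a subscriber callback for a named topic."""
        self._readers[topic].append(on_data)

    def create_writer(self, topic: str) -> Callback:
        """Return a function that publishes a sample to every reader of the topic."""
        def write(sample: Any) -> None:
            for on_data in self._readers[topic]:
                on_data(sample)
        return write


# Usage: one writer, two independent readers on the same topic.
vehicle_domain = Domain(domain_id=0)
planning_inbox: List[Any] = []
logging_inbox: List[Any] = []
vehicle_domain.create_reader("/perception/objects", planning_inbox.append)
vehicle_domain.create_reader("/perception/objects", logging_inbox.append)

publish_objects = vehicle_domain.create_writer("/perception/objects")
publish_objects({"id": 1, "kind": "pedestrian"})
```

The writer never references the planning or logging code directly. Both inboxes receive the sample because they subscribed to the same topic name, which is the decoupling the list above describes.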
39+
40+
Each domain contains multiple **topics**, representing specific data types such as vehicle speed, obstacle detection, or sensor fusion results. **Publishers** use **DataWriters** to send data to these topics, while **subscribers** use **DataReaders** to receive the data. This architecture supports concurrent data processing, ensuring that multiple modules can work with the same data stream simultaneously.
41+
42+
For example, in an autonomous vehicle, LiDAR, radar, and cameras continuously generate large amounts of sensor data. The perception module subscribes to these sensor topics, processes the data, and then publishes detected objects and road conditions to other components like path planning and motion control. Since DDS automatically handles participant discovery and message distribution, engineers do not need to manually configure communication paths, reducing development complexity.
43+
44+
45+
### Real-World Use in Autonomous Driving
46+
DDS is widely used in autonomous driving systems, where real-time data exchange is crucial. A typical use case involves high-frequency sensor data transmission and decision-making coordination between vehicle subsystems.
47+
48+
For instance, a LiDAR sensor generates millions of data points per second, which need to be shared with multiple modules. DDS allows this data to be published once and received by multiple subscribers, including perception, localization, and mapping components. After processing, the detected objects and road features are forwarded to the path planning module, which calculates the vehicle's next movement. Finally, control commands are sent to the vehicle actuators, ensuring precise execution.
49+
50+
This real-time data flow must complete within milliseconds to enable safe autonomous driving. DDS is designed to keep transmission delay low, enabling rapid response to dynamic road conditions. In emergency scenarios, such as detecting a pedestrian or sudden braking by a nearby vehicle, DDS propagates the data to all subscribers with minimal delay, allowing the system to take corrective action quickly.
51+
52+
For example, [Autoware](https://www.autoware.org/), an open-source autonomous driving software stack built on ROS 2, uses DDS to handle high-throughput communication across its modules.
53+
54+
The **Perception** stack publishes detected objects from LiDAR and camera sensors to a shared topic, which is then consumed by the **Planning** module in real-time. Using DDS allows each subsystem to scale independently while preserving low-latency and deterministic communication.
55+
56+
### Publish-Subscribe Model and Data Transmission
57+
Let’s explore how DDS’s publish-subscribe model fundamentally differs from traditional communication methods in terms of scalability, latency, and reliability.
58+
59+
Traditional client-server communication requires a centralized server to manage data exchange. This architecture introduces several drawbacks, including increased latency and network congestion, which can be problematic in real-time automotive applications.
60+
61+
DDS adopts a publish-subscribe model, enabling direct communication between system components. Instead of relying on a central entity to relay messages, DDS allows each participant to subscribe to relevant topics and receive updates as soon as new data becomes available. This approach reduces dependency on centralized infrastructure and improves overall system performance.
62+
63+
For example, in an automotive perception system, LiDAR, radar, and cameras continuously publish sensor data. Multiple subscribers, including object detection, lane recognition, and obstacle avoidance modules, can access this data simultaneously without additional network overhead. DDS automatically manages message distribution, ensuring efficient resource utilization.
64+
65+
DDS supports multiple transport mechanisms, together with automatic discovery, to optimize communication efficiency:
66+
- **Shared memory transport**: Ideal for ultra-low-latency communication within an ECU, minimizing processing overhead.
67+
- **UDP or TCP/IP**: Used for inter-device communication, such as V2X applications where vehicles exchange safety-critical messages.
68+
- **Automatic participant discovery**: Eliminates the need for manual configuration, allowing DDS nodes to detect and establish connections dynamically.
69+
70+
#### Comparison of DDS and Traditional Communication Methods
71+
72+
The following table highlights how DDS improves upon traditional client-server communication patterns in the context of real-time automotive applications:
73+
74+
| **Feature** | **Traditional Client-Server Architecture** | **DDS Publish-Subscribe Model** |
75+
|-----------------------|--------------------------------------------|--------------------------- |
76+
| **Data Transmission** | Relies on a central server | Direct peer-to-peer communication |
77+
| **Latency** | Higher latency | Low latency |
78+
| **Scalability** | Limited by server capacity | Suitable for large-scale systems |
79+
| **Reliability** | Server failure affects the whole system | No single point of failure |
80+
| **Use Cases** | Small-scale applications | V2X, autonomous driving |
81+
82+
These features make DDS a highly adaptable solution for automotive software engineers seeking to develop scalable, real-time communication frameworks.
83+
84+
In this section, you learned how DDS enables low-latency, scalable, and fault-tolerant communication for autonomous vehicle systems.
85+
86+
Its data-centric publish-subscribe architecture eliminates the limitations of traditional client-server models and forms the backbone of modern automotive software frameworks such as ROS 2 and SOAFEE.
87+
88+
To get started with open-source DDS on Arm platforms, refer to the [Cyclone DDS installation guide](https://learn.arm.com/install-guides/cyclonedds).
89+
