
Commit 0da3412

Apply feedback from Javi, Mich and Gera (#31)

* Apply feedback from Javi and Mich
* refactor TDD
* clarify and add example EXPECT vs ASSERT
* Address comments and try to solve the specific-ci problem
* document pre-commit in modules 1 and 6
* Minor modifications in guide
* Rename Module 1
* Minor change

Co-authored-by: JesusSilvaUtrera <jesus.silva@ekumenlabs.com>
Co-authored-by: Xavier Ruiz <xavier.ruiz@ekumenlabs.com>

1 parent 6bfbf53 commit 0da3412

10 files changed: +155 additions, −50 deletions

README.md

Lines changed: 3 additions & 3 deletions
@@ -51,16 +51,16 @@ By the end of the workshop, participants will be able to:

 ## 📋 Workshop structure

-This workshop is organized into six modules that progressively develop the participant’s understanding of **testing in ROS 2**, from code quality fundamentals to complete Continuous Integration pipelines.
+This workshop is organized into six modules that progressively develop the participant’s understanding of **testing in ROS 2**, from static analysis fundamentals to complete Continuous Integration pipelines.

 Each module combines conceptual material with practical exercises that apply the ideas directly to real ROS 2 code. All exercises are designed to be executed in a consistent environment using the provided Docker setup.

 > [!IMPORTANT]
 > Before starting, build the Docker environment provided for this workshop. It includes all dependencies and tools required for the exercises. Follow the detailed instructions in the [Docker README](./docker/README.md).

-1. **[Module 1 – Linters](modules/module_1/README.md)**
+1. **[Module 1 – Static Analysis Tools](modules/module_1/README.md)**

-   Understand how automated linters and static analysis tools enforce consistency, readability, and safety across ROS 2 codebases.
+   Understand how automated formatters, linters and static analyzers enforce consistency, readability, and safety across ROS 2 codebases.

 2. **[Module 2 – Unit Testing](modules/module_2/README.md)**

docker/Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -6,11 +6,11 @@ ENV DEBIAN_FRONTEND=noninteractive
 ENV ROS_DISTRO=jazzy

 # Copy requirement files and install dependencies (ignore comments and empty lines)
-COPY docker/requirements.txt .
+COPY docker/apt_packages.txt .
 RUN apt-get update && \
-    apt-get install --no-install-recommends -y $(grep -vE '^\s*#' requirements.txt | grep -vE '^\s*$') && \
+    apt-get install --no-install-recommends -y $(grep -vE '^\s*#' apt_packages.txt | grep -vE '^\s*$') && \
     rm -rf /var/lib/apt/lists/*
-RUN rm requirements.txt
+RUN rm apt_packages.txt

 # Some dependencies need to be installed with pip instead of apt
 RUN apt-get update && apt-get install -y --no-install-recommends python3-pip && \
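The two chained `grep -v` filters in the `RUN` line above can be checked in isolation. A quick shell sketch (the demo file name and its contents are illustrative, not the workshop's actual package list):

```shell
# Build a sample package list with comments and blank lines
cat > /tmp/apt_packages_demo.txt <<'EOF'
# core tools
git

curl
  # indented comment
EOF

# First grep drops comment lines (including indented ones), second grep
# drops blank/whitespace-only lines, leaving only package names for apt-get
grep -vE '^\s*#' /tmp/apt_packages_demo.txt | grep -vE '^\s*$'
```

Only `git` and `curl` survive the pipeline, which is exactly the argument list `apt-get install` receives in the Dockerfile.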

modules/module_1/README.md

Lines changed: 82 additions & 24 deletions
Large diffs are not rendered by default.

modules/module_2/README.md

Lines changed: 20 additions & 8 deletions
@@ -79,7 +79,19 @@ A few core concepts are especially useful:
 - `EXPECT_*` records a failure but allows the test to continue.
 - `ASSERT_*` aborts the test immediately on failure.

-  Use `EXPECT_*` for checks that can accumulate, and `ASSERT_*` when later steps would be meaningless if the check fails.
+  Consider testing that a function returns a `std::vector<int>` with the right length and expected contents:
+
+  ```cpp
+  std::vector<int> vec = get_vector();
+  // If the length is wrong, further checks (indexing) would be invalid -> abort the test
+  ASSERT_EQ(3u, vec.size());  // stop the test immediately if size != 3
+  // Now it is safe to check contents; these can be EXPECT so we see all mismatches at once
+  EXPECT_EQ(10, vec[0]);
+  EXPECT_EQ(20, vec[1]);
+  EXPECT_EQ(30, vec[2]);
+  ```
+
+  Use `ASSERT_*` for preconditions that must hold for the remaining assertions to make sense (avoiding crashes and meaningless failures). Use `EXPECT_*` for value checks where continuing the test to collect multiple failures is useful.

 - **Fixtures**: allow code to be reused across multiple tests. Define a test class deriving from `::testing::Test` and use `TEST_F` instead of `TEST`.
 - **Parameterized tests**: the same test logic can be executed against multiple input values with `TEST_P`. This reduces duplication and is especially helpful when validating algorithms across many corner cases.
@@ -142,17 +154,17 @@ ROS 2 wraps GoogleTest/GoogleMock with lightweight CMake helpers so tests build

 ## How to Write Tests

-Writing good unit tests is as much about structure as it is about logic. Two key concepts guide this process: **Test-Driven Development (TDD)** and the **Arrange-Act-Assert (AAA)** pattern.
-
-**Test-Driven Development (TDD)** is an iterative approach where tests are written before the actual code. Each cycle begins by defining a small, failing test that expresses a desired behavior. The minimal code needed to make the test pass is then implemented, followed by a short refactoring step to clean up or generalize the design. This rhythm of red → green → refactor encourages clear requirements, modular code, and continuous verification.
-
-The **AAA** pattern provides a simple mental model for structuring each test.
+A good unit test is clear, concise, and focused. The best way to achieve this is to follow the Arrange-Act-Assert (AAA) pattern, which provides a simple mental model for structuring each test:

 - **Arrange**: prepare the environment, inputs, and objects needed for the test.
 - **Act**: execute the function or behavior being tested.
 - **Assert**: verify that the observed result matches the expected outcome.

-Following this structure makes tests easy to read, maintain, and reason about. Each test should describe one behavior clearly, without hidden dependencies or side effects.
+Following this pattern leads to tests that are consistent, self-explanatory, and easy to debug when they fail.
+
+Beyond how tests are written, it’s also important to consider when they are written. This leads to a popular development workflow known as **Test-Driven Development (TDD)**. TDD follows an iterative approach where tests are written before the actual code. Each cycle begins by defining a small, failing test that expresses a desired behavior. The minimal code needed to make the test pass is then implemented, followed by a short refactoring step to clean up or generalize the design. This rhythm of red → green → refactor encourages clear requirements, modular code, and continuous verification.
+
+While TDD helps drive better design decisions and encourages modular, testable architectures, the same testing principles can be applied in traditional “test-after” workflows. The key takeaway is that **testability should guide design**, regardless of whether tests come before or after the code.
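To make the Arrange-Act-Assert structure concrete, here is a minimal sketch in Python's `unittest` (the `Rectangle` class is hypothetical and defined inline for the example; the module's own C++ tests follow the same three-phase shape with GoogleTest macros):

```python
import unittest


class Rectangle:
    """Hypothetical class under test, defined inline for the example."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class TestRectangle(unittest.TestCase):
    def test_area_of_3x4_rectangle_is_12(self):
        # Arrange: prepare the object and inputs the test needs
        rect = Rectangle(width=3, height=4)

        # Act: execute exactly the behavior under test
        result = rect.area()

        # Assert: verify the observed result matches the expectation
        self.assertEqual(result, 12)


# Run the single test directly (a real project would use a test runner)
outcome = TestRectangle("test_area_of_3x4_rectangle_is_12").run()
assert outcome.wasSuccessful()
```

Each test exercises one behavior, with no hidden setup shared between the three phases, which is what makes failures easy to localize.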
 ## Exercises
@@ -187,4 +199,4 @@ The task is complete when tests are run and the output shows **0 errors** and **
 - [Google Test Repo](https://github.com/google/googletest)
 - [Google Test Macros](https://google.github.io/googletest/reference/testing.html)
 - [Google Test Assertions](https://google.github.io/googletest/reference/assertions.html)
-- [Google Mock Basics](https://google.github.io/googletest/gmock_for_dummies.html)
+- [Google Mock Basics](https://google.github.io/googletest/gmock_for_dummies.html)

modules/module_4/README.md

Lines changed: 24 additions & 1 deletion
@@ -7,6 +7,7 @@ In this module, the focus is to explore **integration testing in ROS 2** to veri
 - [Motivation](#motivation)
 - [The launch\_testing Framework](#the-launch_testing-framework)
 - [Registering the tests](#registering-the-tests)
+- [Avoiding flaky tests](#avoiding-flaky-tests)
 - [Alternative: launch\_pytest](#alternative-launch_pytest)
 - [Exercises](#exercises)
 - [Exercise 1](#exercise-1)

@@ -101,6 +102,28 @@ And in the `package.xml`:
 <test_depend>launch_testing_ament_cmake</test_depend>
 ```

+### Avoiding flaky tests
+
+The most common mistake in integration testing is writing a **flaky test**. A flaky test is one that passes sometimes and fails other times, even when no code has changed. This is almost always caused by a race condition.
+
+Flaky (bad) test logic:
+
+1. Launch nodes.
+2. Immediately publish a message (for example, on `/scan`).
+3. Check for an expected result (for example, a log message).
+
+**Why it fails**: the nodes in `generate_test_description` are launched, but they are not guaranteed to be ready or subscribed to their topics by the time the test case runs. The `ReadyToTest()` action only means the launch process is complete. If the test publishes its message before the node is subscribed, the message is dropped and the test fails.
+
+Reliable (good) test logic:
+
+1. Launch nodes.
+2. In the test case, create the publisher.
+3. Wait for the system to be ready. A simple, robust way is to wait until the publisher sees that a subscriber is connected.
+4. Once the subscription has been confirmed, publish the message.
+5. Check for the expected result.
+
+This event-driven approach is deterministic and eliminates the race condition.
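The wait in step 3 is essentially a poll-with-timeout loop. A minimal, framework-agnostic sketch in Python — in a real `launch_testing` case the condition would be `self.scan_publisher.get_subscription_count() > 0` and the spin callable `rclpy.spin_once(node, timeout_sec=0.1)`; the helper name `wait_for` is an illustration, not a launch_testing API:

```python
import time


def wait_for(condition, timeout_sec=10.0, poll_period_sec=0.1, spin=None):
    """Poll `condition` until it returns True or `timeout_sec` elapses.

    Returns True as soon as the condition holds, False on timeout.
    In a launch_testing test case, `spin` would spin the rclpy node so
    discovery callbacks can run between polls.
    """
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        if spin is not None:
            spin()  # e.g. lambda: rclpy.spin_once(node, timeout_sec=0.1)
        if condition():
            return True
        time.sleep(poll_period_sec)
    return False


# Stand-in for a subscriber count that becomes nonzero on the third poll
polls = {"count": 0}


def subscriber_connected():
    polls["count"] += 1
    return polls["count"] >= 3


assert wait_for(subscriber_connected, timeout_sec=2.0)
```

On success the test proceeds to publish; on a `False` return the test should fail explicitly (for example with an assertion on the subscription count) rather than publishing into the void.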
 ### Alternative: launch_pytest

 While using `launch_testing` with `unittest` is the classic approach, support for more modern options like `pytest` is also available. `pytest` is a powerful, modern third-party framework (`unittest` is part of the Python standard library) that has become the most popular choice for Python testing in the community, and it is also gaining popularity within the ROS ecosystem.

@@ -130,7 +153,7 @@ Now, run the tests. This will fail because the test script is incomplete:
 colcon test --packages-up-to module_4 --event-handlers console_direct+
 ```

-The incomplete test script already handles launching the nodes. You need to fill in the logic inside the `unittest.TestCase` to verify its behavior.
+The incomplete test script already handles launching the nodes. Fill in the logic inside the `unittest.TestCase` to verify its behavior.

 The additions to the Python test script must:

modules/module_4/test/test_detection_launch.py

Lines changed: 16 additions & 5 deletions
@@ -3,7 +3,7 @@
 import unittest

 from launch import LaunchDescription
-from launch_ros.actions import Node, ComposableNodeContainer
+from launch_ros.actions import ComposableNodeContainer
 from launch_ros.descriptions import ComposableNode
 import launch_testing
 import launch_testing.actions

@@ -109,12 +109,23 @@ def test_obstacle_triggers_red_light(self, proc_output):
 #
 # 1. Create a publisher to the /scan topic.
 #
-# 2. Create and publish a LaserScan message that will trigger the detector.
-#    Use the helper function.
+# 2. Wait for the subscriber to be ready (THIS IS THE CRITICAL PART!)
+#    - Create a loop that checks `self.scan_publisher.get_subscription_count()`
+#    - Use a timeout (for example 10 seconds) to prevent an infinite loop.
+#    - Inside the loop, spin the node (`rclpy.spin_once`)
+#    - After the loop, use `self.assertGreater` to fail the test if
+#      no subscriber appeared.
 #
-# 3. Use 'proc_output.assertWaitFor' to check for the "RED LIGHT" message.
+# 3. Create and publish a LaserScan message that will trigger the detector.
+#    Use the helper function above.
 #
-assert False  # Replace this 'pass' statement with your test logic
+# 4. Use 'proc_output.assertWaitFor' to check for the "RED LIGHT" message.
+#    - Give it a timeout (for example, 5 seconds).
+#
+# 5. (Optional but good practice) Add a try/except block around
+#    `assertWaitFor` to provide a clearer failure message if it times out.
+#
+assert False  # Replace this 'assert' with the necessary code for the test
 # ====================== END EDIT ==================================

modules/module_5/README.md

Lines changed: 1 addition & 5 deletions
@@ -33,12 +33,9 @@ Its purpose is to validate the robot's ability to meet a high-level requirement,

 The importance of E2E testing in robotics includes:

-<!-- TODO: Review this, and see if it matches more with field debugging -->
-
 - **Validating the "Mission"**: It's the only test level that answers the question: "Does the robot actually achieve its goal?"
 - **Testing Against Reality**: By using data recorded from the real world (or a high-fidelity simulator), rosbags provide a "ground truth" scenario. This makes it possible to test complex, emergent behaviors and edge cases that are impossible to script in a simple integration test.
 - **Ultimate Regression-Proofing**: An E2E test is the ultimate safety net. If a change in any package (perception, control, navigation) breaks the robot's ability to complete its mission, a good E2E test will catch it.
-- **Debugging Complex Failures**: When a robot fails in the field, a rosbag of that failure is invaluable. It can be replayed in a simulator over and over until the root cause (for example, a race condition, a state machine logic error) is found.

 ## The rosbag Toolset

@@ -104,7 +101,7 @@ This command acts like a "data simulator", providing a perfectly repeatable stre

 This is the most common and intuitive form of E2E testing. It involves a human operator launching the system, providing a scenario (usually via a rosbag), and visual or log-based verification of the result.

-This is perfect for debugging, or for a final "sanity check" before merging a major feature.
+This is perfect for a final "sanity check" before merging a major feature.

 ### Using Pre-recorded Rosbags

@@ -115,7 +112,6 @@ A typical manual test session looks like this:
 3. Observe and Verify: The engineer watches the output:
    - In `RViz`: "Does the robot's navigation visualization show it reaching the goal?"
    - In the terminal: "Did the mission control node log 'MISSION_COMPLETE'?"
-4. Analyze: If it fails, now it's possible to debug the running nodes, knowing the input data is identical every single time.

 This workflow is incredibly powerful but has one major drawback: it's not automated. It relies on a human to launch, observe, and judge success.

modules/module_6/README.md

Lines changed: 5 additions & 0 deletions
@@ -48,6 +48,11 @@ Integrating CI into the workflow is essential because it:

 For projects hosted on GitHub, the easiest and most popular way to implement CI is with **GitHub Actions**.

+> [!TIP]
+> **Use Pre-commit Hooks to Optimize Your Workflow**
+>
+> While **CI is the mandatory quality gate** for merging, it's often faster for developers to catch simple style errors **locally** before they push. Tools like **pre-commit hooks** (as discussed in Module 1) run fast checks like formatting locally, saving the developer time waiting for the CI pipeline to run only to fail on a style inconsistency. They complement CI by ensuring your commits are clean and focused on functional changes.
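A matching `.pre-commit-config.yaml` might look like the following sketch (the hook repositories and `rev` pins are illustrative, not the workshop's actual configuration):

```yaml
# .pre-commit-config.yaml (illustrative)
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # pin a real release tag in practice
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
  - repo: https://github.com/pre-commit/mirrors-clang-format
    rev: v18.1.8
    hooks:
      - id: clang-format
```

After running `pre-commit install` once, the configured hooks run automatically on every `git commit`.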
 ### Introduction to GitHub Actions

 GitHub Actions is a CI/CD platform built directly into GitHub. Automation workflows are defined in a **YAML file** in a special directory in the repository: `.github/workflows/`. GitHub automatically detects these files and runs them based on a set of custom-defined rules or triggers, such as when code is pushed, a pull request is opened, or a scheduled job is due.
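As a sketch, a minimal workflow file for this repository could look like the following (the file name, trigger set, and package arguments are assumptions for illustration; `tools/run_ci_build.sh` is the script updated in this commit):

```yaml
# .github/workflows/ci.yml (illustrative)
name: CI
on:
  push:
    branches: [main]
  pull_request:

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build and test the workshop packages; the script now fails the job
      # on test failures thanks to --return-code-on-test-failure
      - name: Build and test
        run: ./tools/run_ci_build.sh module_4
```

With the `pull_request` trigger, the pipeline runs on every PR, acting as the mandatory quality gate described above.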

tools/run_ci_build.sh

Lines changed: 1 addition & 1 deletion
@@ -27,4 +27,4 @@ rosdep install --from-paths modules --ignore-src -y
 colcon build --packages-up-to "$@" --symlink-install --event-handlers console_direct+

 # Test
-colcon test --packages-up-to "$@" --event-handlers console_direct+
+colcon test --packages-up-to "$@" --event-handlers console_direct+ --return-code-on-test-failure
