-
Notifications
You must be signed in to change notification settings - Fork 481
Description
Description
Adding many timers in a for-loop is very slow since it takes overall quadratic time.
Motivation
So I stumbled upon this issue when I was creating 10.000 timers in a for-loop (I was benchmarking events throughput).
perf shows 96% of the time is spent in CallbackGroup::add_timer.
Design / Implementation Considerations
So I looked what is going on and saw this:
CallbackGroup::add_timer(const rclcpp::TimerBase::SharedPtr timer_ptr)
{
std::lock_guard<std::mutex> lock(mutex_);
timer_ptrs_.push_back(timer_ptr);
timer_ptrs_.erase(
std::remove_if(
timer_ptrs_.begin(),
timer_ptrs_.end(),
[](rclcpp::TimerBase::WeakPtr x) {return x.expired();}),
timer_ptrs_.end());
}So every time we add a timer, we do a linear search over the list of all existing timers to find and remove timers that have been destroyed meanwhile. This takes (N^2+N)/2 operations, i.e. is a quadratic time algorithm, which is quite bad.
The same is done for all entities, i.e. not only timers but also subscriptions, publishers etc.
Tested on Humble, but the code is the same in rolling, so the issue is still present in rolling as well.
Discussion and possible fixes
The offending linear search was added as a fix for an otherwise growing memory demand, i.e. memory leak issue in #814.
I think the root cause of having to add such a fix is generally the ownership of entities. Since they are owned by the user and not by the rclcpp::Node, the rclcpp::Node does not get to know when an entity is destroyed. But since the rclcpp::Node and its associated classes such as the CallbackGroup work with these entities, they must hold references to them. Therefore, when an entity is destroyed, there will always be an invalid (weak) reference to this entity stored in the CallbackGroup, and potentially many other internal classes. This reference needs to be cleaned up somehow. And currently, since we do not know which entities were destroyed, we need to check all entities on each occasion.
This issue was discussed already in the API Review 2020.
I think the best solution to this problem, fixing the root cause, is to move the ownership of entities to a central class (e.g. rclcpp::Node)
which has methods for deleting entities.
But since that's a big API-breaking change, here are some alternative proposals (including their downsides)
- The executor cleans up the CallbackGroups regularly (ignores separation of concerns)
- Every entity calls a newly added callback
clean_up_all_referencesin its destructor: Entities already clean up things in their destructors, but still has to be implemented for every entity (complexity) - Add an CallbackGroup::add_entities method: Does one instead of N linear searches (hacky)
- Make the cleanup optional and switchable at runtime: Since entities almost always live for as long as the node, users can opt-in to not perform any cleanup to gain performance
Additional Information
ROS 2 Humble on Ubuntu 22.04. rclcpp v16.0.12. RMW used is rmw_cyclonedds_cpp, v1.3.4.

Reproducer code:
Reproducer
#include <atomic>
#include <chrono>
#include <iostream>
#include <memory>
#include <vector>
#include "rclcpp/rclcpp.hpp"
using namespace std::chrono_literals;
class TimerBenchmarkNode : public rclcpp::Node {
public:
TimerBenchmarkNode(size_t total_timers, std::chrono::milliseconds period)
: Node("timer_benchmark"), total_timers_(total_timers), completed_(0) {
timers_.reserve(total_timers_);
start_time_ = std::chrono::steady_clock::now();
for (size_t i = 0; i < total_timers_; ++i) {
auto timer = this->create_wall_timer(period, [this, i]() {
// Cancel the timer inside the callback
timers_[i]->cancel();
// Count completions
if (++completed_ == total_timers_) {
auto end_time = std::chrono::steady_clock::now();
auto duration_ms =
std::chrono::duration_cast<std::chrono::milliseconds>(end_time - start_time_).count();
RCLCPP_INFO(this->get_logger(), "Processed %zu timers in %ld ms (%.0f timers/sec)",
total_timers_, duration_ms, (total_timers_ * 1000.0) / duration_ms);
// Destroy timers explicitly
// timers_.clear();
}
});
timers_.push_back(timer);
}
}
private:
size_t total_timers_;
std::vector<rclcpp::TimerBase::SharedPtr> timers_;
std::atomic<size_t> completed_;
std::chrono::time_point<std::chrono::steady_clock> start_time_;
};
int main(int argc, char** argv) {
rclcpp::init(argc, argv);
getchar();
size_t total_timers = 10000;
auto node = std::make_shared<TimerBenchmarkNode>(total_timers, 0ms);
rclcpp::spin(node);
rclcpp::shutdown();
return 0;
}