Conversation

@QuantamHD
Contributor

I'm looking to have a unified thread pool shared between OpenSTA and OpenROAD, and having this method would make it easier to understand how much parallelism is available in the pool.

Signed-off-by: Ethan Mahintorabi <[email protected]>
@jjcherry56
Collaborator

Why don't you just use Sta::threadCount() (inherited from StaState)?

@QuantamHD
Contributor Author

QuantamHD commented Nov 4, 2025

I didn't see that, thanks for the pointer!

I still think it would be nice to have this method on DispatchQueue itself, because this is roughly what I'm trying to do. Ideally you only need the DispatchQueue rather than the full StaState. It's particularly useful for a parallel_for, where you need to know how many threads you have in order to size the chunks.

// Standard headers required by the snippet below.
#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <iterator>
#include <mutex>
#include <type_traits>

/**
 * @brief Executes a function for each element in a range in parallel
 * using sta::DispatchQueue.
 *
 * This function partitions the range [begin, end) into chunks and submits
 * a task for each chunk to the provided dispatch queue.
 *
 * It blocks until all tasks are completed.
 *
 * @tparam Iterator Type of the iterator (must be Random Access).
 * @tparam Function Type of the callable function (e.g., lambda).
 * @param pool The sta::DispatchQueue to execute tasks on.
 * @param begin The iterator to the beginning of the range.
 * @param end The iterator to the end of the range.
 * @param func The function to apply to each element.
 */
template <typename Iterator, typename Function>
void parallel_for(sta::DispatchQueue& pool, Iterator begin, Iterator end,
                  const Function& func) {
  // Static assertion to ensure we have iterators that support
  // O(1) advance (like std::vector::iterator).
  static_assert(
      std::is_base_of<
          std::random_access_iterator_tag,
          typename std::iterator_traits<Iterator>::iterator_category>::value,
      "parallel_for requires Random Access Iterators (e.g., "
      "std::vector::iterator).");

  const size_t data_size = std::distance(begin, end);
  if (data_size == 0) {
    return;
  }

  // --- Synchronization Primitives ---
  // We use our own counter and CV to make this a blocking call
  // that waits *only* for the tasks we post here.
  std::atomic<size_t> task_counter(0);
  std::mutex m;
  std::condition_variable cv;

  // --- Chunking Logic ---
  // Guard against a zero-sized pool so the division below is safe.
  const size_t hardware_threads =
      std::max<size_t>(1, pool.getThreadCount());
  const size_t num_tasks = std::min(hardware_threads, data_size);
  size_t chunk_size = data_size / num_tasks;

  task_counter = num_tasks;  // We will post this many tasks

  // --- Submit Tasks ---
  for (size_t i = 0; i < num_tasks; ++i) {
    Iterator chunk_start = begin + (i * chunk_size);
    Iterator chunk_end = (i == num_tasks - 1)
                             ? end  // Last chunk gets the remainder
                             : chunk_start + chunk_size;

    // Dispatch the task to the pool.
    // The lambda must match the queue's required signature: void(int).
    pool.dispatch([&, chunk_start, chunk_end, func](int /* thread_id */) {
      // --- This is the core work ---
      func(chunk_start, chunk_end);
      // This task is done. Decrement the counter and notify the waiter
      // *under the mutex*: if the waiter could observe the counter hit
      // zero before this task locks m, it might return and destroy m and
      // cv while this task is still about to use them.
      std::lock_guard<std::mutex> lock(m);
      if (--task_counter == 0) {
        cv.notify_one();
      }
    });
  }

  // --- Wait for all tasks to complete ---
  std::unique_lock<std::mutex> lock(m);
  cv.wait(lock, [&] { return task_counter == 0; });
}
