DEV: Backfill embeddings concurrently. #941

romanrizzi · 2024-11-21T20:42:02Z

We are adding a new method for generating and storing embeddings in bulk, which relies on Concurrent::Promises::Future. Generating an embedding consists of three steps:

1. Prepare text.
2. HTTP call to retrieve the vector.
3. Save to DB.

Each step is executed independently on whatever thread the pool gives us.

We use a custom thread pool instead of the global executor because we want control over how many threads we spawn, which limits concurrency. This also avoids firing thousands of HTTP requests at once when working with large batches.
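The effect of capping the pool can be sketched with Ruby's standard library alone. This is not the PR's code: it swaps `Concurrent::Promises::Future` for plain `Thread` workers pulling from a `Queue`, and `prepare_text`, `fetch_embedding`, `bulk_embeddings`, `POOL_SIZE`, and the record hashes are hypothetical stand-ins. It only illustrates how a fixed number of workers keeps the count of in-flight requests bounded, however large the batch is.

```ruby
# Sketch only: stdlib threads stand in for the PR's Concurrent::Promises pool.
# All names below are hypothetical and exist purely for illustration.

POOL_SIZE = 4 # assumed cap on concurrent "HTTP" calls

# Step 1: build the text to embed from a record.
def prepare_text(record)
  "#{record[:title]} #{record[:body]}".strip
end

# Step 2: stand-in for the HTTP call; sleeps to mimic network latency.
def fetch_embedding(text)
  sleep 0.01
  text.bytes.first(3).map { |b| b / 255.0 }
end

def bulk_embeddings(records, pool_size: POOL_SIZE)
  jobs = Queue.new
  records.each { |record| jobs << record }
  jobs.close # pop returns nil once the closed queue is drained

  store = {} # stand-in for the embeddings table
  lock = Mutex.new
  in_flight = 0
  peak = 0

  workers = Array.new(pool_size) do
    Thread.new do
      while (record = jobs.pop)
        text = prepare_text(record)
        lock.synchronize do
          in_flight += 1
          peak = in_flight if in_flight > peak
        end
        vector = fetch_embedding(text)
        lock.synchronize do
          in_flight -= 1
          store[record[:id]] = vector # Step 3: "save to DB"
        end
      end
    end
  end
  workers.each(&:join)

  [store, peak]
end

records = Array.new(20) { |i| { id: i + 1, title: "Topic #{i + 1}", body: "body" } }
store, peak = bulk_embeddings(records)
puts "stored #{store.size} embeddings, peak concurrent calls: #{peak}"
```

The reported peak never exceeds `pool_size`, which is the property the custom pool is meant to guarantee. With an unbounded executor, the same batch could start one request per record.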

Ran benchmarks locally with the following script:

require "benchmark"

def benchmark_embeddings
  ActiveRecord::Base.logger.level = 1 # Logger::INFO, keeps SQL debug output out of the run
  truncation = DiscourseAi::Embeddings::Strategies::Truncation.new
  vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(truncation)

  # bmbm runs a rehearsal pass first, then the measured pass.
  Benchmark.bmbm do |x|
    x.report("With concurrent (90 topics)") do
      DB.exec("DELETE FROM ai_topic_embeddings") # start each run from an empty table
      vector_rep.gen_bulk_reprensentations(Topic.includes(:tags, :posts).all)
    end

    x.report("Without concurrent (90 topics)") do
      DB.exec("DELETE FROM ai_topic_embeddings")
      Topic.includes(:tags, :posts).all.each { |t| vector_rep.generate_representation_from(t) }
    end
  end
end

Results:

                                     user     system      total        real
With concurrent (90 topics)      2.418655   0.084879   2.503534 (  3.716947)
Without concurrent (90 topics)   5.097356   0.268728   5.366084 ( 63.132123)
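
As a reading aid, the ratios implied by the table can be computed directly. The figures below are copied from the `total` and `real` columns above; the variable names are illustrative. Wall-clock time drops far more than CPU time, which is consistent with the sequential run spending most of its time waiting on HTTP, though this is a single local run.

```ruby
# Figures copied from the "total" and "real" columns of the results above.
concurrent = { total: 2.503534, real: 3.716947 }
sequential = { total: 5.366084, real: 63.132123 }

wall_speedup = sequential[:real] / concurrent[:real]
cpu_ratio = sequential[:total] / concurrent[:total]

puts format("wall-clock speedup: %.1fx", wall_speedup) # prints "wall-clock speedup: 17.0x"
puts format("CPU time ratio: %.1fx", cpu_ratio)        # prints "CPU time ratio: 2.1x"
```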


romanrizzi commented on lib/embeddings/vector_representations/base.rb Nov 21, 2024 (outdated, resolved)

Base automatically changed from tidyup_embeddings to main November 25, 2024 16:12

romanrizzi force-pushed the async_embeddings_backfill branch from 3d30d70 to 2d6d39f Compare November 25, 2024 16:52

romanrizzi marked this pull request as ready for review November 25, 2024 16:52

romanrizzi force-pushed the async_embeddings_backfill branch from 2d6d39f to 5736b49 Compare November 26, 2024 17:05

xfalcox approved these changes Nov 26, 2024


romanrizzi merged commit ddf2bf7 into main Nov 26, 2024
6 checks passed

romanrizzi deleted the async_embeddings_backfill branch November 26, 2024 17:12

romanrizzi added a commit that referenced this pull request Nov 26, 2024

Revert "DEV: Backfill embeddings concurrently. (#941)"

125ff71

This reverts commit ddf2bf7.

romanrizzi mentioned this pull request Nov 26, 2024

Revert "DEV: Backfill embeddings concurrently." #959

Closed
