This repository was archived by the owner on Jul 22, 2025. It is now read-only.

@romanrizzi romanrizzi commented Nov 21, 2024

We are adding a new method for generating and storing embeddings in bulk, which relies on `Concurrent::Promises::Future`. Generating an embedding consists of three steps:

  • Prepare the text
  • Make an HTTP call to retrieve the vector
  • Save the vector to the DB

The second step runs independently on whatever thread the pool gives us.

We use a custom thread pool instead of the global executor so we can cap how many threads we spawn and thereby limit concurrency. This also keeps us from firing thousands of HTTP requests at once when working with large batches.
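To show why a bounded pool caps the number of simultaneous requests, here is a small sketch that uses only the Ruby standard library rather than concurrent-ruby. The `bounded_each` helper, the limit of 4 threads, and the simulated request are all assumptions made for illustration; the PR itself uses a concurrent-ruby thread pool.

```ruby
# Hypothetical helper: calls the block for every item, using at most
# max_threads worker threads. Items must not be nil or false.
def bounded_each(items, max_threads:, &block)
  queue = Queue.new
  items.each { |item| queue << item }
  queue.close # once drained, pop returns nil and the workers exit

  workers =
    Array.new(max_threads) do
      Thread.new do
        while (item = queue.pop)
          block.call(item)
        end
      end
    end

  workers.each(&:join)
end

in_flight = 0
peak = 0
processed = 0
lock = Mutex.new

bounded_each((1..50).to_a, max_threads: 4) do |_id|
  lock.synchronize do
    in_flight += 1
    peak = [peak, in_flight].max
  end

  sleep 0.005 # simulated HTTP request

  lock.synchronize do
    in_flight -= 1
    processed += 1
  end
end

puts "processed #{processed} items, peak concurrent requests: #{peak}"
```

However large the batch is, the peak never exceeds `max_threads`, which is the property we want from a dedicated pool.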

Ran benchmarks locally with the following script:

require "benchmark"

def benchmark_embeddings
  # Raise the log level so per-query debug logging does not skew the timings.
  ActiveRecord::Base.logger.level = Logger::INFO
  truncation = DiscourseAi::Embeddings::Strategies::Truncation.new
  vector_rep = DiscourseAi::Embeddings::VectorRepresentations::Base.current_representation(truncation)

  # bmbm runs a rehearsal pass before the measured pass.
  Benchmark.bmbm do |x|
    x.report("With concurrent (90 topics)") do
      # Start from an empty table so every embedding is regenerated.
      DB.exec("DELETE FROM ai_topic_embeddings")
      vector_rep.gen_bulk_reprensentations(Topic.includes(:tags, :posts).all)
    end

    x.report("Without concurrent (90 topics)") do
      DB.exec("DELETE FROM ai_topic_embeddings")
      # Baseline: generate the embeddings one topic at a time.
      Topic.includes(:tags, :posts).all.each { |t| vector_rep.generate_representation_from(t) }
    end
  end
end

Results:

                                     user     system      total        real
With concurrent (90 topics)      2.418655   0.084879   2.503534 (  3.716947)
Without concurrent (90 topics)   5.097356   0.268728   5.366084 ( 63.132123)
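
For a quick read of these numbers, the wall-clock speedup can be computed from the `real` column. The snippet below is plain arithmetic on the figures reported above; the variable names are only illustrative.

```ruby
# Values copied from the "real" and "total" columns of the results above.
concurrent_real = 3.716947
sequential_real = 63.132123
sequential_cpu = 5.366084

speedup = sequential_real / concurrent_real
# Wall-clock time of the sequential run not accounted for by CPU time.
sequential_wait = sequential_real - sequential_cpu

puts format("wall-clock speedup: %.1fx", speedup)
puts format("sequential time outside CPU work: %.1fs", sequential_wait)
```

Most of the sequential run's wall-clock time is not CPU time, which is consistent with it being spent waiting, presumably on the HTTP calls.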

Base automatically changed from tidyup_embeddings to main November 25, 2024 16:12
@romanrizzi romanrizzi force-pushed the async_embeddings_backfill branch from 3d30d70 to 2d6d39f Compare November 25, 2024 16:52
@romanrizzi romanrizzi marked this pull request as ready for review November 25, 2024 16:52
@romanrizzi romanrizzi force-pushed the async_embeddings_backfill branch from 2d6d39f to 5736b49 Compare November 26, 2024 17:05
@romanrizzi romanrizzi merged commit ddf2bf7 into main Nov 26, 2024
6 checks passed
@romanrizzi romanrizzi deleted the async_embeddings_backfill branch November 26, 2024 17:12
romanrizzi added a commit that referenced this pull request Nov 26, 2024