
Benchmark Inference Pipeline #1223

@CarsonDavis

Description

We want to run a benchmark of the inference pipeline to see how many documents it is able to successfully process on staging.

To do this, we will first create a benchmarking script that can run in any environment (local, staging, etc.), feed full texts into the classifier API, and track the results. After confirming the script works locally, we can benchmark the staging server.

Implementation Considerations

Write a script that loops through the available full texts in COSMOS, up to a maximum of 5,000, and sends them one by one to the classifier API.

The script should record the returned job_ids and periodically poll the classifier to determine how many classifications succeeded and how long they took.

You will need to reference the API documentation located here: https://github.com/NASA-IMPACT/llm-app-classifier-pipeline.
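A minimal sketch of such a script is below, assuming a submit-then-poll workflow. The endpoint paths (`/classify`, `/jobs/{job_id}`), payload fields, and response fields are placeholders for illustration only; the real ones are defined in the llm-app-classifier-pipeline docs linked above. Pulling the full texts out of COSMOS is treated as a separate step.

```python
import time
import requests

# NOTE: endpoint paths, payload fields, and response fields below are assumptions;
# check the llm-app-classifier-pipeline API docs for the real interface.
CLASSIFIER_BASE_URL = "http://localhost:8000"  # swap for the staging URL when benchmarking staging
MAX_DOCUMENTS = 5000


def submit_documents(full_texts):
    """Send full texts one by one and collect the job_id returned for each.

    full_texts is an iterable of (doc_id, text) pairs pulled from COSMOS.
    """
    job_ids = {}
    for doc_id, text in list(full_texts)[:MAX_DOCUMENTS]:
        resp = requests.post(
            f"{CLASSIFIER_BASE_URL}/classify",  # hypothetical endpoint
            json={"document_id": doc_id, "full_text": text},  # assumed payload shape
            timeout=30,
        )
        resp.raise_for_status()
        job_ids[doc_id] = resp.json()["job_id"]  # assumed response field
    return job_ids


def poll_jobs(job_ids, poll_interval=10, max_wait=3600):
    """Poll the classifier until every job reaches a terminal status or we time out."""
    statuses = {doc_id: "unknown" for doc_id in job_ids}
    deadline = time.time() + max_wait
    while time.time() < deadline:
        pending = [d for d, s in statuses.items() if s not in ("success", "failed")]
        if not pending:
            break
        for doc_id in pending:
            resp = requests.get(
                f"{CLASSIFIER_BASE_URL}/jobs/{job_ids[doc_id]}",  # hypothetical endpoint
                timeout=30,
            )
            if resp.ok:
                statuses[doc_id] = resp.json().get("status", "unknown")
        time.sleep(poll_interval)
    return statuses
```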

Deliverable

  • stats on classification completion rates
    • number of documents sent
    • number of documents in each status (failed, unknown, success)
  • stats on classification throughput
    • number of documents sent
    • time taken to classify all the documents (see the aggregation sketch below)
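Once the statuses are collected, the deliverable stats can be aggregated roughly as follows. This reuses the hypothetical `submit_documents`/`poll_jobs` helpers from the sketch above and is only illustrative.

```python
import time
from collections import Counter


def run_benchmark(full_texts):
    """Submit, poll, and report the completion-rate and throughput stats."""
    start = time.time()
    job_ids = submit_documents(full_texts)
    statuses = poll_jobs(job_ids)
    elapsed = time.time() - start

    counts = Counter(statuses.values())
    sent = len(job_ids)
    print(f"documents sent:        {sent}")
    print(f"status breakdown:      {dict(counts)}")  # e.g. {'success': ..., 'failed': ..., 'unknown': ...}
    print(f"total time (s):        {elapsed:.1f}")
    if elapsed > 0:
        print(f"throughput (docs/min): {sent / elapsed * 60:.2f}")
```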
