Description
We want to benchmark the inference pipeline to see how many documents it can successfully process on staging.
To do this, we would first create a benchmarking script that can run in any environment (local, staging, etc.), feed full texts into the classifier API, and track the results. After confirming the script works locally, we can then benchmark the staging server.
Implementation Considerations
Make a script that loops through the available full texts within COSMOS, up to a maximum of 5000, and sends them one by one to the classifier API.
It should record the returned job_ids and then poll the classifier to see how many classifications succeeded and how long they took. A rough sketch of this flow is given below.
You will need to reference the API documentation located here: https://github.com/NASA-IMPACT/llm-app-classifier-pipeline.
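As a starting point, here is a minimal sketch of the submit-and-poll loop described above. The endpoint paths, payload shape, and response field names (`/classify`, `/jobs/{job_id}`, `job_id`, `status`) are placeholders; the actual contract must be taken from the API documentation linked above.

```python
import time
import requests

CLASSIFIER_URL = "https://classifier.example.gov"  # placeholder; point at local or staging
MAX_DOCS = 5000

def submit_documents(full_texts):
    """Send each full text to the classifier and collect the returned job ids."""
    job_ids = []
    for text in full_texts[:MAX_DOCS]:
        # Endpoint and payload shape are assumptions; see the linked API docs.
        resp = requests.post(f"{CLASSIFIER_URL}/classify", json={"text": text}, timeout=30)
        resp.raise_for_status()
        job_ids.append(resp.json()["job_id"])
    return job_ids

def poll_jobs(job_ids, poll_interval=10, max_wait=3600):
    """Poll each job until it reaches a terminal state; return statuses and elapsed time."""
    start = time.monotonic()
    statuses = {}
    pending = set(job_ids)
    while pending and (time.monotonic() - start) < max_wait:
        for job_id in list(pending):
            # Status endpoint and field names are assumptions as well.
            resp = requests.get(f"{CLASSIFIER_URL}/jobs/{job_id}", timeout=30)
            status = resp.json().get("status", "unknown")
            if status in ("success", "failed"):
                statuses[job_id] = status
                pending.discard(job_id)
        if pending:
            time.sleep(poll_interval)
    # Jobs that never reached a terminal state within max_wait count as "unknown".
    for job_id in pending:
        statuses[job_id] = "unknown"
    elapsed = time.monotonic() - start
    return statuses, elapsed
```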
Deliverable
- stats on classification completion rates:
  - number of documents sent
  - number of documents in each status (failed, unknown, success)
- stats on classification throughput:
  - number of documents sent
  - total time taken to classify all the documents
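These numbers fall out directly from the polling results; a sketch of how the script could report them (following the placeholder names in the code above):

```python
from collections import Counter

def report(statuses, elapsed_seconds):
    """Print completion-rate and throughput stats from the polling results."""
    counts = Counter(statuses.values())
    total = len(statuses)
    print(f"documents sent: {total}")
    for status in ("success", "failed", "unknown"):
        print(f"  {status}: {counts.get(status, 0)}")
    print(f"total time: {elapsed_seconds:.1f}s")
    if elapsed_seconds > 0:
        print(f"throughput: {total / elapsed_seconds:.2f} docs/sec")
```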