[BUG] Check model availability before running benchmark

I thought I had disabled the judges by not providing a judge name. After some debugging I realized that the judge was still pointed at `llama3-70b`, a model I have no API keys for.

The codebase should check that both the model and the judge are available before starting the run. If either one is unavailable, it should clearly warn the user or stop the process instead of starting the benchmark.
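
A rough sketch of the kind of preflight check requested here. All names (`check_availability`, `ModelUnavailableError`, `probe`) are hypothetical and not taken from the codebase. It assumes the project can supply a `probe` callable that raises when a model cannot be reached, for example by sending a minimal request or by looking up the expected API key.

```python
from typing import Callable, Iterable


class ModelUnavailableError(RuntimeError):
    """Raised when a required model cannot be reached before the run starts."""


def check_availability(
    names: Iterable[str],
    probe: Callable[[str], None],
) -> None:
    """Probe each model once and fail fast if any of them is unreachable.

    `probe` is a hypothetical callable that raises on failure; how it talks
    to the provider is left to the codebase.
    """
    failures = {}
    for name in names:
        try:
            probe(name)
        except Exception as exc:
            # Collect every failure so the user sees all problems at once.
            failures[name] = str(exc)
    if failures:
        details = "; ".join(f"{n}: {msg}" for n, msg in failures.items())
        raise ModelUnavailableError(f"Unavailable before run start: {details}")
```

Calling this with the benchmarked model and the resolved judge name (including any default such as `llama3-70b`) before the first benchmark step would surface a missing API key immediately rather than partway through the run.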