Add OLMES L-Eval Benchmarking Scripts and Environment Validation#551

Draft
hemanth346 wants to merge 3 commits into staging from olmes-leval-benchmark

Conversation

@hemanth346 (Contributor) commented Feb 28, 2026

Pull Request Template

Description

  • Implemented quick environment validation script (quick_validation.py) to check Python version, core libraries, device availability, and metric libraries.
  • Created requirements.txt for necessary dependencies with version specifications.
  • Developed run_validation.py to execute the OLMES L-Eval benchmarking tasks, including model loading, smoke tests, and metric verification.
  • Added sample validation results in JSON format for reference.
  • Introduced setup.sh script for easy environment setup using UV for dependency management.
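
For readers who want a feel for what a quick environment check of this kind involves, here is a minimal sketch. It is not the code from quick_validation.py: the minimum Python version, the package lists (`CORE_LIBS`, `METRIC_LIBS`), and all function names are assumptions made for illustration, and the actual script may check different things in a different way.

```python
import importlib.util
import json
import sys

# Hypothetical defaults; the PR's requirements.txt may list other packages.
MIN_PYTHON = (3, 9)
CORE_LIBS = ["torch", "transformers", "datasets"]
METRIC_LIBS = ["rouge_score", "nltk"]


def check_python(min_version=MIN_PYTHON):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= tuple(min_version)


def check_modules(names):
    """Map each top-level module name to whether it can be located (no import)."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def detect_device():
    """Return 'cuda', 'mps', or 'cpu'; fall back to 'cpu' if torch is unusable."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    try:
        import torch
    except Exception:
        # A broken torch install should not crash the validation run.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


def run_checks():
    """Collect all checks into one JSON-serializable report."""
    return {
        "python_ok": check_python(),
        "python_version": ".".join(map(str, sys.version_info[:3])),
        "core_libs": check_modules(CORE_LIBS),
        "metric_libs": check_modules(METRIC_LIBS),
        "device": detect_device(),
    }


if __name__ == "__main__":
    print(json.dumps(run_checks(), indent=2))
```

Using `importlib.util.find_spec` rather than a real import keeps the check fast and avoids side effects from heavy libraries; only the device probe imports torch, and only when it is present.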

Checklist

  • I have added tests that prove my fix is effective or that my feature works.
  • I have added necessary documentation (if applicable).
  • My code follows the style guidelines, gitflow branching strategy, and naming conventions of this project [Contribution Guidelines](https://github.com/The-School-of-AI/LLM/tree/main/experiments/

Reviewers

  • Reviewer 1: A member from your own team.
  • Reviewer 2: A member from the repo owners team (@The-School-of-AI/llm-repo-owners).

Note: Every pull request requires at least 2 reviewers/approvers before it can be merged.

@pankaj1311 pankaj1311 marked this pull request as draft March 2, 2026 13:07
2 participants