test(evals): add 13 more math/time tasks; smoke-tested with mock prov… #2

Workflow file for this run

name: CI
on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install package
        run: |
          python -m pip install --upgrade pip
          pip install -e .[repair]
      - name: Run tests
        env:
          # Force mock provider to avoid external API calls
          LLM_PROVIDER: mock
          USE_TOOL_CALLS: '0'
        run: |
          pytest -q
      - name: Run evals (smoke)
        env:
          LLM_PROVIDER: mock
          USE_TOOL_CALLS: '0'
        run: |
          python evals/run_evals.py --n 6