Add benchmark results for deepseek/deepseek-v3.2-exp #115

github-actions · 2025-09-29T17:15:15Z

This PR adds benchmark results for the deepseek/deepseek-v3.2-exp model.

The following files have been updated:

src/benchmark/results.json - Raw benchmark results
src/benchmark/validation-results.json - Validation results against human baseline

This PR was automatically generated by the benchmark workflow.

Note: If you don't want to merge this PR, close it and the model will be added to the untested list to prevent re-processing.

@alrocar

vercel · 2025-09-29T17:15:20Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
llm-benchmark	Ready	Preview	Comment	Sep 29, 2025 5:16pm

github-actions · 2025-09-29T17:18:20Z

Model Rejected

The model deepseek/deepseek-v3.2-exp has been added to the untested models list to prevent automatic re-processing.

What this means:

This model will not be automatically benchmarked again
It's now in the "untested" category (not failed, just not wanted)
You can manually remove it from the untested list if you change your mind
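
As a rough illustration of the behaviour described above, the sketch below shows how an untested list could gate automatic re-processing. It is not the repository's actual implementation: the `UntestedEntry` shape and the `shouldBenchmark` and `removeUntested` helpers are hypothetical names chosen for this example, and only the model, reason, and PR number are taken from this PR.

```typescript
// Hypothetical entry shape; the field names are assumptions, not the repo's schema.
interface UntestedEntry {
  provider: string;
  model: string;
  reason: string;
  pr?: number;
}

// A model is eligible for automatic benchmarking only if it is not on the untested list.
function shouldBenchmark(
  provider: string,
  model: string,
  untested: UntestedEntry[],
): boolean {
  return !untested.some((e) => e.provider === provider && e.model === model);
}

// Returns a new list without the given model, mirroring what a manual "remove" would do.
function removeUntested(
  provider: string,
  model: string,
  untested: UntestedEntry[],
): UntestedEntry[] {
  return untested.filter((e) => !(e.provider === provider && e.model === model));
}

// Entry corresponding to this PR (values taken from the rejection record).
const untestedModels: UntestedEntry[] = [
  {
    provider: "deepseek",
    model: "deepseek-v3.2-exp",
    reason: "PR rejected/closed without merging",
    pr: 115,
  },
];
```

Under these assumptions, `shouldBenchmark("deepseek", "deepseek-v3.2-exp", untestedModels)` is false until the entry is removed, after which the model becomes eligible again.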

To manage this model:

# Remove from untested list (if you want to try again later)
npm run manage-failed-models remove deepseek deepseek-v3.2-exp

# List all untested models
npm run manage-failed-models list

@alrocar

- Model: deepseek/deepseek-v3.2-exp
- Reason: PR rejected/closed without merging
- PR: #115

This prevents the model from being automatically benchmarked again.

feat: add benchmark results for deepseek/deepseek-v3.2-exp

6bfef62

vercel bot deployed to Preview September 29, 2025 17:16

alrocar closed this Sep 29, 2025
