GitHub - g4ix/advLab1-HITS


corr
data
docs
plots
.gitignore
Analysis.ipynb
model_dimensions.json

About

Project for an advanced lab investigating LLM benchmarks from an information retrieval (IR) perspective. Instead of focusing on model performance, we evaluated benchmark robustness: which questions truly differentiate models, and whether leaderboard rankings reflect real differences or are dominated by easy, high-hubness items.
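
The contents of Analysis.ipynb are not reproduced on this page, so the following is only a minimal sketch of how HITS could be applied to benchmark results, not the project's actual code. It assumes a binary model-by-question correctness matrix in which models act as hubs and questions as authorities. The names `hits` and `correct`, the helper `_normalize`, and the toy data are all hypothetical.

```python
import math


def _normalize(vector):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else vector


def hits(matrix, iterations=50):
    """Run HITS power iteration on a 0/1 model-by-question matrix.

    Assumption: matrix[m][q] == 1 means model m answered question q correctly.
    Rows (models) are treated as hubs, columns (questions) as authorities.
    """
    n_models = len(matrix)
    n_questions = len(matrix[0])
    hub = [1.0] * n_models
    auth = [1.0] * n_questions
    for _ in range(iterations):
        # Authority update: a question collects the hub scores of the models that solved it.
        auth = [
            sum(matrix[m][q] * hub[m] for m in range(n_models))
            for q in range(n_questions)
        ]
        # Hub update: a model collects the authority scores of the questions it solved.
        hub = [
            sum(matrix[m][q] * auth[q] for q in range(n_questions))
            for m in range(n_models)
        ]
        auth = _normalize(auth)
        hub = _normalize(hub)
    return hub, auth


# Hypothetical toy data: 3 models (rows) x 4 questions (columns).
correct = [
    [1, 1, 1, 0],  # model A
    [1, 1, 0, 0],  # model B
    [1, 0, 0, 0],  # model C
]

hub, auth = hits(correct)
print("hub scores (models):", hub)
print("authority scores (questions):", auth)
```

In this toy setup the question every model solves receives the largest authority score, while the question no model solves scores zero. That illustrates how easy items can dominate a score-based ranking without helping to tell models apart.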

Topics: benchmark, information-retrieval, ir, hits-algorithm, llm