## Generative Nix: Surveying LLM Proficiency In NixOS

Effort: small (90 hours)
I would not trust any survey that took less than 100 hours to conduct.
For reference, see NixOS/nixpkgs#410741 (comment) for a possible "survey", although this one cannot be conducted externally.
Survey in the sense of "see what's out there", as one might survey a landscape to make a map--not survey as in "let's poll a bunch of people". Sorry for any miscommunication.
> Survey in the sense of "see what's out there", as one might survey a landscape to make a map--not survey as in "let's poll a bunch of people".
IIUC, you want to benchmark and rank LLMs to determine the current best one for Nix. With LLMs constantly being made obsolete by better ones, would it not be better to establish a benchmark suite for continuously updating the ranking instead of providing a one-time ranking?
Delegating this work to the Nix community sounds like a lot of effort, when IMHO LLMs should be the ones promoting and declaring their domain proficiencies.
Either way, take my input with a grain of salt because I am not really interested in using LLMs.
The deliverables for this project include exactly that: a reusable selection of benchmarks for that purpose.
This would be useful for #32