1 | 1 | # BigCodeBench |
2 | 2 |
3 | | -> [!WARNING] |
4 | | -> The project is under active development. Please check back later for more updates. |
5 | | -
6 | 3 | > [!WARNING] |
7 | 4 | > Use BigCodeBench with caution. Unlike [EvalPlus](https://github.com/evalplus/evalplus), BigCodeBench uses a much less constrained execution environment to support tasks with diverse library dependencies, which may pose security risks. We recommend running the evaluation in a sandbox such as [Docker](https://docs.docker.com/get-docker/). |
8 | 5 |
@@ -54,7 +51,13 @@ We inherit the design of the EvalPlus framework, which is a flexible and extensi |
54 | 51 | To get started, please first set up the environment: |
55 | 52 |
56 | 53 | ```shell |
| 54 | +# Install to use bigcodebench.evaluate |
57 | 55 | pip install bigcodebench --upgrade |
| 56 | +pip install -I -r https://raw.githubusercontent.com/bigcode-project/bigcodebench/main/Requirements/requirements-eval.txt |
| 57 | + |
| 58 | +# Install to use bigcodebench.generate |
| 59 | +# We strongly recommend installing the generate dependencies in a separate environment |
| 60 | +pip install bigcodebench[generate] --upgrade |
58 | 61 | ``` |
59 | 62 |
60 | 63 | <details><summary>⏬ Install nightly version <i>:: click to expand ::</i></summary> |
@@ -158,6 +161,10 @@ We provide a tool namely `bigcodebench.sanitize` to clean up the code: |
158 | 161 | bigcodebench.sanitize --samples samples.jsonl |
159 | 162 | # Sanitized code will be produced to `samples-sanitized.jsonl` |
160 | 163 |
| 164 | +# 💡 If you want to get the calibrated results: |
| 165 | +bigcodebench.sanitize --samples samples.jsonl --calibrate |
| 166 | +# Sanitized code will be produced to `samples-sanitized-calibrate.jsonl` |
| 167 | + |
161 | 168 | # 💡 If you are storing code in directories: |
162 | 169 | bigcodebench.sanitize --samples /path/to/vicuna-[??]b_temp_[??] |
163 | 170 | # Sanitized code will be produced to `/path/to/vicuna-[??]b_temp_[??]-sanitized` |
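The comments in this hunk imply a naming rule for the sanitizer's output: `-sanitized` (or `-sanitized-calibrate` with `--calibrate`) is added before the `.jsonl` extension, or appended to a directory name. A minimal sketch of that rule follows, useful for locating the output in a script. `sanitized_path` is a hypothetical helper, not part of BigCodeBench, and the directory-plus-calibrate case is an assumption, since the diff does not show it.

```shell
# Hypothetical helper (not part of BigCodeBench): print the output path that
# the README comments describe for a given --samples argument.
sanitized_path() {
  samples="$1"
  suffix="-sanitized"
  # An optional second argument "calibrate" mirrors the --calibrate flag.
  if [ "${2:-}" = "calibrate" ]; then
    suffix="-sanitized-calibrate"
  fi
  case "$samples" in
    # File input: insert the suffix before the .jsonl extension.
    *.jsonl) printf '%s%s.jsonl\n' "${samples%.jsonl}" "$suffix" ;;
    # Directory input: append the suffix to the directory name.
    # (With "calibrate" this is an assumption; the README does not show it.)
    *) printf '%s%s\n' "$samples" "$suffix" ;;
  esac
}

sanitized_path samples.jsonl
sanitized_path samples.jsonl calibrate
```

The helper only builds a string from the documented examples; it does not call `bigcodebench.sanitize` or check that the file exists.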