Commit c640c72

add environment.md and remove AutoRoundMLLM usage in readme (#1042)
* add environment.md and remove AutoRoundMLLM usage in readme
1 parent 6a8ac7e commit c640c72

5 files changed: 142 additions and 6 deletions.

README.md

Lines changed: 7 additions & 4 deletions
@@ -117,6 +117,8 @@ pip install auto-round-lib
 ### CLI Usage
 The full list of supported arguments is provided by calling `auto-round -h` on the terminal.

+> **ModelScope is supported for model downloads, simply set `AR_USE_MODELSCOPE=1`.**
+
 ```bash
 auto-round \
     --model Qwen/Qwen3-0.6B \
@@ -125,6 +127,7 @@ auto-round \
     --output_dir ./tmp_autoround
 ```

+
 We offer another two recipes, `auto-round-best` and `auto-round-light`, designed for optimal accuracy and improved speed, respectively. Details are as follows.
 <details>
 <summary>Other Recipes</summary>
@@ -252,17 +255,17 @@ results.

 **This feature is experimental and may be subject to changes**.

-By default, AutoRoundMLLM only quantize the text module of VLMs and uses `NeelNanda/pile-10k` for calibration. To
+By default, AutoRound only quantize the text module of VLMs and uses `NeelNanda/pile-10k` for calibration. To
 quantize the entire model, you can enable `quant_nontext_module` by setting it to True, though support for this feature
-is limited. For more information, please refer to the AutoRoundMLLM [readme](./auto_round/mllm/README.md).
+is limited. For more information, please refer to the AutoRound [readme](./auto_round/mllm/README.md).

 ```python
-from auto_round import AutoRoundMLLM
+from auto_round import AutoRound

 # Load the model
 model_name_or_path = "Qwen/Qwen2.5-VL-7B-Instruct"
 # Quantize the model
-ar = AutoRoundMLLM(model_name_or_path, scheme="W4A16")
+ar = AutoRound(model_name_or_path, scheme="W4A16")
 output_dir = "./qmodel"
 ar.quantize_and_save(output_dir)
 ```

auto_round/envs.py

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Note: the design of this module is inspired by vLLM's envs.py
+# For detailed usage and configuration guide, see: docs/environments.md

 import os
 from typing import TYPE_CHECKING, Any, Callable, Optional

docs/environments.md

Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+# AutoRound Environment Variables Configuration
+
+This document describes the environment variables used by AutoRound for configuration and their usage.
+
+## Overview
+
+AutoRound uses a centralized environment variable management system through the `envs.py` module. This system provides lazy evaluation of environment variables and programmatic configuration capabilities.
+
+## Available Environment Variables
+
+### AR_LOG_LEVEL
+- **Description**: Controls the default logging level for AutoRound
+- **Default**: `"INFO"`
+- **Valid Values**: `"TRACE"`, `"DEBUG"`, `"INFO"`, `"WARNING"`, `"ERROR"`, `"CRITICAL"`
+- **Usage**: Set this to control the verbosity of AutoRound logs
+
+```bash
+export AR_LOG_LEVEL=DEBUG
+```
+
+### AR_ENABLE_COMPILE_PACKING
+- **Description**: Enables compile packing optimization
+- **Default**: `False` (equivalent to `"0"`)
+- **Valid Values**: `"1"`, `"true"`, `"yes"` (case-insensitive) for enabling; any other value for disabling
+- **Usage**: Enable this for performance optimizations during packing FP4 tensors into `uint8`.
+
+```bash
+export AR_ENABLE_COMPILE_PACKING=1
+```
+
+### AR_USE_MODELSCOPE
+- **Description**: Controls whether to use ModelScope for model downloads
+- **Default**: `False`
+- **Valid Values**: `"1"`, `"true"` (case-insensitive) for enabling; any other value for disabling
+- **Usage**: Enable this to use ModelScope instead of Hugging Face Hub for model downloads
+
+```bash
+export AR_USE_MODELSCOPE=true
+```
+
+### AR_WORK_SPACE
+- **Description**: Sets the workspace directory for AutoRound operations
+- **Default**: `"ar_work_space"`
+- **Usage**: Specify a custom directory for AutoRound to store temporary files and outputs
+
+```bash
+export AR_WORK_SPACE=/path/to/custom/workspace
+```
+
+## Usage Examples
+
+### Setting Environment Variables
+
+#### Using Shell Commands
+```bash
+# Set logging level to DEBUG
+export AR_LOG_LEVEL=DEBUG
+
+# Enable compile packing
+export AR_ENABLE_COMPILE_PACKING=1
+
+# Use ModelScope for downloads
+export AR_USE_MODELSCOPE=true
+
+# Set custom workspace
+export AR_WORK_SPACE=/tmp/autoround_workspace
+```
+
+#### Using Python Code
+```python
+from auto_round.envs import set_config
+
+# Configure multiple environment variables at once
+set_config(
+    AR_LOG_LEVEL="DEBUG",
+    AR_USE_MODELSCOPE=True,
+    AR_ENABLE_COMPILE_PACKING=True,
+    AR_WORK_SPACE="/tmp/autoround_workspace",
+)
+```
+
+### Checking Environment Variables
+
+#### Using Python Code
+```python
+from auto_round import envs
+
+# Access environment variables (lazy evaluation)
+log_level = envs.AR_LOG_LEVEL
+use_modelscope = envs.AR_USE_MODELSCOPE
+enable_packing = envs.AR_ENABLE_COMPILE_PACKING
+workspace = envs.AR_WORK_SPACE
+
+print(f"Log Level: {log_level}")
+print(f"Use ModelScope: {use_modelscope}")
+print(f"Enable Compile Packing: {enable_packing}")
+print(f"Workspace: {workspace}")
+```
+
+#### Checking if Variables are Explicitly Set
+```python
+from auto_round.envs import is_set
+
+# Check if environment variables are explicitly set
+if is_set("AR_LOG_LEVEL"):
+    print("AR_LOG_LEVEL is explicitly set")
+else:
+    print("AR_LOG_LEVEL is using default value")
+```
+
+## Configuration Best Practices
+
+1. **Development Environment**: Set `AR_LOG_LEVEL=TRACE` or `AR_LOG_LEVEL=DEBUG` for detailed logging during development
+2. **Production Environment**: Use `AR_LOG_LEVEL=WARNING` or `AR_LOG_LEVEL=ERROR` to reduce log noise
+3. **Chinese Users**: Consider setting `AR_USE_MODELSCOPE=true` for better model download performance
+4. **Performance Optimization**: Enable `AR_ENABLE_COMPILE_PACKING=1` if you have sufficient computational resources
+5. **Custom Workspace**: Set `AR_WORK_SPACE` to a directory with sufficient disk space for model processing
+
+## Notes
+
+- Environment variables are evaluated lazily, meaning they are only read when first accessed
+- The `set_config()` function provides a convenient way to configure multiple variables programmatically
+- Boolean values for `AR_USE_MODELSCOPE` are automatically converted to appropriate string representations
+- All environment variable names are case-sensitive
+- Changes made through `set_config()` will affect the current process and any child processes
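
Reviewer's sketch (not part of the commit): the behaviour that `docs/environments.md` documents could be implemented roughly as below. The diff does not show the body of `auto_round/envs.py`, so everything here is an assumption made for illustration: the helper names `_flag`, `ENV_GETTERS` and `get_env` are hypothetical, and the real `set_config` and `is_set` may be written differently. The truthy sets are copied from the "Valid Values" bullets above. This sketch re-reads `os.environ` on every lookup; whether the real module caches a value after the first read is not visible in this diff.

```python
import os
from typing import Any, Callable, Dict, Set

# Truthy spellings taken from the "Valid Values" bullets in docs/environments.md.
_TRUTHY_PACKING: Set[str] = {"1", "true", "yes"}
_TRUTHY_MODELSCOPE: Set[str] = {"1", "true"}


def _flag(name: str, truthy: Set[str]) -> bool:
    """Hypothetical helper: unset or unrecognized values count as False."""
    return os.environ.get(name, "0").lower() in truthy


# Each getter runs only when a variable is looked up, so nothing is read
# from the environment at import time (the "lazy" part).
ENV_GETTERS: Dict[str, Callable[[], Any]] = {
    "AR_LOG_LEVEL": lambda: os.environ.get("AR_LOG_LEVEL", "INFO"),
    "AR_ENABLE_COMPILE_PACKING": lambda: _flag(
        "AR_ENABLE_COMPILE_PACKING", _TRUTHY_PACKING
    ),
    "AR_USE_MODELSCOPE": lambda: _flag("AR_USE_MODELSCOPE", _TRUTHY_MODELSCOPE),
    "AR_WORK_SPACE": lambda: os.environ.get("AR_WORK_SPACE", "ar_work_space"),
}


def get_env(name: str) -> Any:
    """Look up one variable by calling its getter (hypothetical accessor)."""
    if name not in ENV_GETTERS:
        raise KeyError(f"unknown environment variable: {name}")
    return ENV_GETTERS[name]()


def is_set(name: str) -> bool:
    """True when the variable is present in os.environ, not merely defaulted."""
    return name in os.environ


def set_config(**kwargs: Any) -> None:
    """Write values into os.environ, converting booleans to "1" or "0"."""
    for name, value in kwargs.items():
        if name not in ENV_GETTERS:
            raise KeyError(f"unknown environment variable: {name}")
        if isinstance(value, bool):
            value = "1" if value else "0"
        os.environ[name] = str(value)


if __name__ == "__main__":
    set_config(AR_LOG_LEVEL="DEBUG", AR_USE_MODELSCOPE=True)
    print(get_env("AR_LOG_LEVEL"), get_env("AR_USE_MODELSCOPE"))
```

Because this `set_config` writes to `os.environ`, child processes started afterwards inherit the values by default, which is consistent with the last note in the new document. A module-level `__getattr__` would allow the `envs.AR_LOG_LEVEL` attribute style shown in the docs; the plain `get_env` function is used here only to keep the sketch runnable as a single script.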

docs/step_by_step.md

Lines changed: 6 additions & 1 deletion
@@ -10,8 +10,9 @@ This document presents step-by-step instructions for auto-round llm quantization
   + [Dataset operations](#dataset-operations)
 * [3 Quantization](#3-quantization)
   + [Supported Quantization Configurations](#supported-quantization-configurations)
-  + [Hardware Compatibility](#hardware-compatibility)
   + [Supported Export Formats](#supported-export-formats)
+  + [Hardware Compatibility](#hardware-compatibility)
+  + [Environment Configuration](#environment-configuration)
   + [Command Line Usage](#command-line-usage)
   + [API usage](#api-usage)
     - [AutoRound API Usage](#autoround-api-usage)
@@ -149,6 +150,10 @@ adopted within the community, **only 4-bits quantization is supported**. Please

 CPU, Intel GPU, HPU and CUDA for both quantization and inference.

+### Environment Configuration
+
+Before starting quantization, you may want to configure AutoRound's environment variables for optimal performance. For detailed information about available environment variables (logging levels, ModelScope integration, workspace settings, etc.), please refer to the [Environment Variables Guide](./environments.md).
+
 ### Command Line Usage


docs/tips_and_tricks.md

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,9 @@
 **AutoRound** [version 0.4](https://github.com/intel/auto-round) is set for release, introducing major updates to
-support Vision-Language Models (VLMs). During this process, weve gathered insights from quantizing various models.
+support Vision-Language Models (VLMs). During this process, we've gathered insights from quantizing various models.
 While the scope of quantization has been somewhat limited, the following tips may still prove useful as a reference.

+For environment configuration and setup options, please refer to the [Environment Variables Guide](./environments.md).
+
 ### 1. VLM Quantization and Calibration Dataset Choice

 **Background:** VLM models typically consist of two main components: a language model and a vision model. This gives
