add environment.md and remove AutoRoundMLLM usage in readme (#1042)

xin3he · web-flow · commit c640c72526a6 · 2025-11-19T14:52:21.000+08:00
* add environment.md and remove AutoRoundMLLM usage in readme
diff --git a/README.md b/README.md
@@ -117,6 +117,8 @@ pip install auto-round-lib
 ### CLI Usage
 The full list of supported arguments is provided by calling `auto-round -h` on the terminal.
 
+> **ModelScope is supported for model downloads; simply set `AR_USE_MODELSCOPE=1`.**
+
 ```bash
 auto-round \
     --model Qwen/Qwen3-0.6B \
@@ -125,6 +127,7 @@ auto-round \
     --output_dir ./tmp_autoround
 ```
 
+
 We offer another two recipes, `auto-round-best` and `auto-round-light`, designed for optimal accuracy and improved speed, respectively. Details are as follows.
 <details>
   <summary>Other Recipes</summary>
@@ -252,17 +255,17 @@ results.
 
 **This feature is experimental and may be subject to changes**.
 
-By default, AutoRoundMLLM only quantize the text module of VLMs and uses `NeelNanda/pile-10k` for calibration. To
+By default, AutoRound only quantizes the text module of VLMs and uses `NeelNanda/pile-10k` for calibration. To
 quantize the entire model, you can enable `quant_nontext_module` by setting it to True, though support for this feature
-is limited. For more information, please refer to the AutoRoundMLLM [readme](./auto_round/mllm/README.md).
+is limited. For more information, please refer to the AutoRound [readme](./auto_round/mllm/README.md).
 
 ```python
-from auto_round import AutoRoundMLLM
+from auto_round import AutoRound
 
 # Load the model
 model_name_or_path = "Qwen/Qwen2.5-VL-7B-Instruct"
 # Quantize the model
-ar = AutoRoundMLLM(model_name_or_path, scheme="W4A16")
+ar = AutoRound(model_name_or_path, scheme="W4A16")
 output_dir = "./qmodel"
 ar.quantize_and_save(output_dir)
 ```
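The ModelScope note added in the CLI hunk above can also be applied per run instead of through a global `export`. The following is a minimal sketch, assuming the variable is read from the environment of the launched `auto-round` process. The helper `build_cli_invocation` is hypothetical and not part of AutoRound, and only the two flags visible in the hunk are used because the flags between them are elided in this diff.

```python
import os
import shlex
from typing import Dict, List, Tuple


def build_cli_invocation(
    model: str, output_dir: str, use_modelscope: bool = True
) -> Tuple[List[str], Dict[str, str]]:
    """Build (argv, env) for an auto-round run without executing it.

    Hypothetical helper for illustration only.
    """
    # Copy the parent environment so the override stays local to this run.
    env = dict(os.environ)
    if use_modelscope:
        env["AR_USE_MODELSCOPE"] = "1"
    # Only the flags shown in the README hunk; any others are omitted here.
    argv = ["auto-round", "--model", model, "--output_dir", output_dir]
    return argv, env


argv, env = build_cli_invocation("Qwen/Qwen3-0.6B", "./tmp_autoround")
print(shlex.join(argv))
print("AR_USE_MODELSCOPE =", env["AR_USE_MODELSCOPE"])
```

To launch the run, the pair could be passed to `subprocess.run(argv, env=env, check=True)`. Copying the environment rather than mutating `os.environ` leaves the parent process unchanged.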
diff --git a/auto_round/envs.py b/auto_round/envs.py
@@ -12,6 +12,7 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 # Note: the design of this module is inspired by vLLM's envs.py
+# For detailed usage and configuration guide, see: docs/environments.md
 
 import os
 from typing import TYPE_CHECKING, Any, Callable, Optional
diff --git a/docs/environments.md b/docs/environments.md
@@ -0,0 +1,125 @@
+# AutoRound Environment Variables Configuration
+
+This document describes the environment variables used by AutoRound for configuration and their usage.
+
+## Overview
+
+AutoRound uses a centralized environment variable management system through the `envs.py` module. This system provides lazy evaluation of environment variables and programmatic configuration capabilities.
+
+## Available Environment Variables
+
+### AR_LOG_LEVEL
+- **Description**: Controls the default logging level for AutoRound
+- **Default**: `"INFO"`
+- **Valid Values**: `"TRACE"`, `"DEBUG"`, `"INFO"`, `"WARNING"`, `"ERROR"`, `"CRITICAL"`
+- **Usage**: Set this to control the verbosity of AutoRound logs
+
+```bash
+export AR_LOG_LEVEL=DEBUG
+```
+
+### AR_ENABLE_COMPILE_PACKING
+- **Description**: Enables compile packing optimization
+- **Default**: `False` (equivalent to `"0"`)
+- **Valid Values**: `"1"`, `"true"`, `"yes"` (case-insensitive) for enabling; any other value for disabling
+- **Usage**: Enable this to optimize performance when packing FP4 tensors into `uint8`.
+
+```bash
+export AR_ENABLE_COMPILE_PACKING=1
+```
+
+### AR_USE_MODELSCOPE
+- **Description**: Controls whether to use ModelScope for model downloads
+- **Default**: `False`
+- **Valid Values**: `"1"`, `"true"` (case-insensitive) for enabling; any other value for disabling
+- **Usage**: Enable this to use ModelScope instead of Hugging Face Hub for model downloads
+
+```bash
+export AR_USE_MODELSCOPE=true
+```
+
+### AR_WORK_SPACE
+- **Description**: Sets the workspace directory for AutoRound operations
+- **Default**: `"ar_work_space"`
+- **Usage**: Specify a custom directory for AutoRound to store temporary files and outputs
+
+```bash
+export AR_WORK_SPACE=/path/to/custom/workspace
+```
+
+## Usage Examples
+
+### Setting Environment Variables
+
+#### Using Shell Commands
+```bash
+# Set logging level to DEBUG
+export AR_LOG_LEVEL=DEBUG
+
+# Enable compile packing
+export AR_ENABLE_COMPILE_PACKING=1
+
+# Use ModelScope for downloads
+export AR_USE_MODELSCOPE=true
+
+# Set custom workspace
+export AR_WORK_SPACE=/tmp/autoround_workspace
+```
+
+#### Using Python Code
+```python
+from auto_round.envs import set_config
+
+# Configure multiple environment variables at once
+set_config(
+    AR_LOG_LEVEL="DEBUG",
+    AR_USE_MODELSCOPE=True,
+    AR_ENABLE_COMPILE_PACKING=True,
+    AR_WORK_SPACE="/tmp/autoround_workspace",
+)
+```
+
+### Checking Environment Variables
+
+#### Using Python Code
+```python
+from auto_round import envs
+
+# Access environment variables (lazy evaluation)
+log_level = envs.AR_LOG_LEVEL
+use_modelscope = envs.AR_USE_MODELSCOPE
+enable_packing = envs.AR_ENABLE_COMPILE_PACKING
+workspace = envs.AR_WORK_SPACE
+
+print(f"Log Level: {log_level}")
+print(f"Use ModelScope: {use_modelscope}")
+print(f"Enable Compile Packing: {enable_packing}")
+print(f"Workspace: {workspace}")
+```
+
+#### Checking if Variables are Explicitly Set
+```python
+from auto_round.envs import is_set
+
+# Check if environment variables are explicitly set
+if is_set("AR_LOG_LEVEL"):
+    print("AR_LOG_LEVEL is explicitly set")
+else:
+    print("AR_LOG_LEVEL is using default value")
+```
+
+## Configuration Best Practices
+
+1. **Development Environment**: Set `AR_LOG_LEVEL=TRACE` or `AR_LOG_LEVEL=DEBUG` for detailed logging during development
+2. **Production Environment**: Use `AR_LOG_LEVEL=WARNING` or `AR_LOG_LEVEL=ERROR` to reduce log noise
+3. **Users in China**: Consider setting `AR_USE_MODELSCOPE=true` for better model download performance
+4. **Performance Optimization**: Enable `AR_ENABLE_COMPILE_PACKING=1` if you have sufficient computational resources
+5. **Custom Workspace**: Set `AR_WORK_SPACE` to a directory with sufficient disk space for model processing
+
+## Notes
+
+- Environment variables are evaluated lazily, meaning they are only read when first accessed
+- The `set_config()` function provides a convenient way to configure multiple variables programmatically
+- Boolean values for `AR_USE_MODELSCOPE` are automatically converted to appropriate string representations
+- All environment variable names are case-sensitive
+- Changes made through `set_config()` will affect the current process and any child processes
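The notes in the new `docs/environments.md` describe lazy evaluation, boolean parsing, and propagation to child processes. The sketch below illustrates one way those behaviours can fit together. It is not the code in `auto_round/envs.py`: the registry `_READERS`, the class `LazyEnvs`, and the helpers `demo_set_config` and `demo_is_set` are all hypothetical names, the `"1"`/`"0"` encoding of booleans is an assumption, and this version re-reads `os.environ` on every access rather than caching the first value.

```python
import os
import subprocess
import sys
from typing import Any, Callable, Dict

# Strings treated as "on" for boolean flags (assumed set, compared case-insensitively).
_TRUTHY = {"1", "true", "yes"}

# Hypothetical registry: each variable maps to a zero-argument reader, so
# os.environ is consulted on attribute access rather than at import time.
_READERS: Dict[str, Callable[[], Any]] = {
    "AR_LOG_LEVEL": lambda: os.getenv("AR_LOG_LEVEL", "INFO"),
    "AR_ENABLE_COMPILE_PACKING": lambda: os.getenv(
        "AR_ENABLE_COMPILE_PACKING", "0"
    ).lower() in _TRUTHY,
    "AR_WORK_SPACE": lambda: os.getenv("AR_WORK_SPACE", "ar_work_space"),
}


class LazyEnvs:
    """Resolve registered variables on every attribute access."""

    def __getattr__(self, name: str) -> Any:
        reader = _READERS.get(name)
        if reader is None:
            raise AttributeError(f"unknown environment variable: {name}")
        return reader()


def demo_is_set(name: str) -> bool:
    """Return True when the variable is present in the process environment."""
    return name in os.environ


def demo_set_config(**kwargs: Any) -> None:
    """Write known variables into os.environ, encoding booleans as "1"/"0"."""
    for name, value in kwargs.items():
        if name not in _READERS:
            raise KeyError(name)
        if isinstance(value, bool):
            value = "1" if value else "0"
        os.environ[name] = str(value)


envs = LazyEnvs()
demo_set_config(AR_ENABLE_COMPILE_PACKING=True, AR_WORK_SPACE="/tmp/autoround_workspace")
print(envs.AR_ENABLE_COMPILE_PACKING, envs.AR_WORK_SPACE)

# Values written to os.environ are inherited by child processes started afterwards.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['AR_WORK_SPACE'])"],
    capture_output=True,
    text=True,
    check=True,
)
print(child.stdout.strip())
```

Keeping the readers as callables is what makes the lookup lazy: nothing touches `os.environ` until an attribute is requested, so values set after import are still picked up.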
diff --git a/docs/step_by_step.md b/docs/step_by_step.md
@@ -10,8 +10,9 @@ This document presents step-by-step instructions for auto-round llm quantization
   + [Dataset operations](#dataset-operations)
 * [3 Quantization](#3-quantization)
   + [Supported Quantization Configurations](#supported-quantization-configurations)
-  + [Hardware Compatibility](#hardware-compatibility)
   + [Supported Export Formats](#supported-export-formats)
+  + [Hardware Compatibility](#hardware-compatibility)
+  + [Environment Configuration](#environment-configuration)
   + [Command Line Usage](#command-line-usage)
   + [API usage](#api-usage)
     - [AutoRound API Usage](#autoround-api-usage)
@@ -149,6 +150,10 @@ adopted within the community, **only 4-bits quantization is supported**. Please
 
 CPU, Intel GPU, HPU and CUDA for both quantization and inference.
 
+### Environment Configuration
+
+Before starting quantization, you may want to configure AutoRound's environment variables for optimal performance. For detailed information about available environment variables (logging levels, ModelScope integration, workspace settings, etc.), please refer to the [Environment Variables Guide](./environments.md).
+
 ### Command Line Usage
 
 
diff --git a/docs/tips_and_tricks.md b/docs/tips_and_tricks.md
@@ -1,7 +1,9 @@
 **AutoRound** [version 0.4](https://github.com/intel/auto-round) is set for release, introducing major updates to
-support Vision-Language Models (VLMs). During this process, we’ve gathered insights from quantizing various models.
+support Vision-Language Models (VLMs). During this process, we've gathered insights from quantizing various models.
 While the scope of quantization has been somewhat limited, the following tips may still prove useful as a reference.
 
+For environment configuration and setup options, please refer to the [Environment Variables Guide](./environments.md).
+
 ### 1. VLM Quantization and Calibration Dataset Choice
 
 **Background:** VLM models typically consist of two main components: a language model and a vision model. This gives