Commit 09bd66c

feat: 1. update docs and example config
fix: 1. change the error level of uncrawlable scenarios
1 parent b686a5d commit 09bd66c

6 files changed: +375 -84 lines changed


README.md

Lines changed: 20 additions & 8 deletions
@@ -54,9 +54,9 @@ vibecoding, vibe coding, web evaluation, autonomous exploration, web testing aut
 
 ### 📋 Feature Highlights
 
-- **🤖 AI-Powered Testing**: Performs autonomous website testing—explores pages, plans actions, and executes end-to-end flows without manual scripting.
-- **📊 Multi-Dimensional Observation**: Covers functionality, performance, user experience, and basic security; evaluates load speed, design details, and links to surface issues.
-- **🎯 Actionable Recommendations**: Runs in real browsers and provides concrete suggestions for improvement.
+- **🤖 AI-Powered Testing**: Performs autonomous website testing with intelligent planning and reflection—explores pages, plans actions, and executes end-to-end flows without manual scripting. Features 2-stage architecture (lightweight filtering + comprehensive planning) and dynamic test generation for newly appeared UI elements.
+- **📊 Multi-Dimensional Observation**: Covers functionality, performance, user experience, and basic security; evaluates load speed, design details, and links to surface issues. Uses multi-modal analysis (screenshots + DOM structure + text content) and DOM diff detection to discover new test opportunities.
+- **🎯 Actionable Recommendations**: Runs in real browsers with smart element prioritization and automatic viewport management. Provides concrete suggestions for improvement with adaptive recovery mechanisms for robust test execution.
 - **📈 Visual Reports**: Generates detailed HTML test reports with clear, multi-dimensional views for analysis and tracking.
 
 ## 📹 Examples
@@ -137,6 +137,7 @@ python webqa-agent.py
 target:
   url: https://example.com/ # Website URL to test
   description: example description
+  # max_concurrent_tests: 2 # Optional, default parallel 2
 
 test_config: # Test configuration
   function_test: # Functional testing
@@ -145,8 +146,8 @@ test_config: # Test configuration
     business_objectives: example business objectives # Recommended to include test scope, e.g., test search functionality
     dynamic_step_generation: # Optional, configuration for dynamic steps generation
       enabled: True # Optional, default False, recommended to set True to enable dynamic step generation
-      max_dynamic_steps: 5 # Optional, default 5 test steps generated per trigger
-      min_elements_threshold: 2 # Optional, default trigger threshold is 2 DOM element differences
+      max_dynamic_steps: 10 # Optional, default 5, this example uses 10
+      min_elements_threshold: 1 # Optional, default 2, this example uses 1 for higher sensitivity
   ux_test: # User experience testing
     enabled: True
   performance_test: # Performance analysis
@@ -155,28 +156,39 @@ test_config: # Test configuration
     enabled: False
 
 llm_config: # Vision model configuration, currently supports OpenAI SDK compatible format only
-  model: gpt-4.1-2025-04-14 # Recommended
+  model: gpt-4.1-2025-04-14 # Primary model for Stage 2 test planning (Recommended)
+  filter_model: gpt-4o-mini # Lightweight model for Stage 1 element filtering (cost-effective)
   api_key: your_api_key
   base_url: https://api.example.com/v1
+  temperature: 0.1 # Optional, default 0.1
+  # top_p: 0.9 # Optional, if not set, this parameter will not be passed
+  # max_tokens: 8192 # Optional, maximum output tokens (supports generating more test cases)
 
 browser_config:
   viewport: {"width": 1280, "height": 720}
   headless: False # Automatically overridden to True in Docker environment
   language: zh-CN
   cookies: []
+  save_screenshots: False # Whether to save screenshots to local disk (default: False)
+
+report:
+  language: en-US # zh-CN, en-US
+
+log:
+  level: info
 ```
 
 Please note the following important considerations when configuring and running tests:
 
 #### 1. Functional Testing Notes
 
-- **AI Mode**: When specifying the number of test cases to generate in the configuration file, the system may re-plan based on actual page conditions. This may result in the final number of executed test cases differing from the initial configuration to ensure coverage and effectiveness.
+- **AI Mode**: Uses a 2-stage planning architecture where Stage 1 (filter_model) prioritizes elements for efficient analysis, and Stage 2 (primary model) generates comprehensive test cases. The system may reflect and re-plan based on actual page conditions and test coverage, which may result in the final number of executed test cases differing from the initial configuration to ensure effectiveness. When `dynamic_step_generation` is enabled, the system automatically generates additional test steps for newly appeared UI elements (e.g., dropdowns, modals) detected through DOM diff analysis.
 
 - **Default Mode**: The `default` mode focuses on whether UI interactions (e.g., clicks and navigations) complete successfully.
 
 #### 2. User Experience Testing Notes
 
-UX (User Experience) testing focuses on usability, and user-friendliness. The model output in the results provides suggestions based on best practices to guide optimization.
+UX (User Experience) testing focuses on usability and user-friendliness. Uses multi-modal analysis combining screenshots, DOM structure, and text content to evaluate visual quality, detect typos/grammar issues, and validate layout rendering. The model output in the results provides suggestions based on best practices to guide optimization.
 
 ### 🧠 Recommended Models
 
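Reviewer note on the `llm_config` hunk in README.md: the added comments say `temperature` defaults to 0.1 and `top_p` is not passed when it is unset. The sketch below shows one way that behaviour could look. `build_llm_kwargs` is a hypothetical helper, not a function from this repository, and treating `max_tokens` the same way as `top_p` is an assumption the diff does not confirm.

```python
from typing import Any, Dict


def build_llm_kwargs(llm_config: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: turn an llm_config mapping into request kwargs."""
    kwargs: Dict[str, Any] = {
        "model": llm_config["model"],
        # The README comment gives 0.1 as the default temperature.
        "temperature": llm_config.get("temperature", 0.1),
    }
    # Per the README comment, top_p is not passed when it is not set.
    # Assumption: max_tokens is handled the same way.
    for optional_key in ("top_p", "max_tokens"):
        if llm_config.get(optional_key) is not None:
            kwargs[optional_key] = llm_config[optional_key]
    return kwargs
```

Leaving unset keys out of the request, rather than sending `None`, lets the provider apply its own defaults.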

README_zh-CN.md

Lines changed: 20 additions & 8 deletions
Original file line numberDiff line numberDiff line change
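@@ -53,9 +53,9 @@ Vibecoding, Vibe coding, web test automation, browser testing tools, AI-driven
 
 ### 📋 Feature Overview
 
-- **🤖 Autonomous AI Testing**: WebQA-Agent tests websites autonomously, with no hand-written scripts; it explores pages, plans actions, and executes end-to-end flows
-- **📊 Multi-Dimensional Observation**: Covers core scenarios such as functionality, performance, user experience, and security; evaluates page load speed, design details, and links to safeguard overall system quality
-- **🎯 Actionable Recommendations**: Runs in real browsers and outputs concrete optimization and improvement suggestions
+- **🤖 Autonomous AI Testing**: WebQA-Agent has intelligent planning and reflection capabilities and tests websites autonomously, with no hand-written scripts; it explores pages, plans actions, and executes end-to-end flows. It uses a two-stage architecture (lightweight filtering + comprehensive planning) and supports dynamically generating test steps for newly appeared UI elements
+- **📊 Multi-Dimensional Observation**: Covers core scenarios such as functionality, performance, user experience, and security; evaluates page load speed, design details, and links to safeguard overall system quality. It uses multi-modal analysis (screenshots + DOM structure + text content) and DOM diff detection to discover new test opportunities
+- **🎯 Actionable Recommendations**: Runs in real browsers with smart element prioritization and automatic viewport management, outputs concrete optimization and improvement suggestions, and provides adaptive recovery mechanisms to keep test execution robust
 - **📈 Visual Reports**: Generates detailed HTML test reports that present execution results visually across multiple dimensions for easy analysis and tracking
 
 ## 📹 Demo Examples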
@@ -140,6 +140,7 @@ python webqa-agent.py
 target:
   url: https://example.com/ # URL of the website to test
   description: example description
+  # max_concurrent_tests: 2 # Optional, defaults to 2 in parallel
 
 test_config: # Test item configuration
   function_test: # Functional testing
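@@ -148,8 +149,8 @@ test_config: # Test item configuration
     business_objectives: example business objectives # Recommended to include the test scope, e.g., test the search functionality
     dynamic_step_generation: # Optional, dynamic step generation configuration
       enabled: True # Optional, default False; setting it to True is recommended to enable dynamic step generation
-      max_dynamic_steps: 5 # Optional, by default 5 test steps are generated per trigger
-      min_elements_threshold: 2 # Optional, default trigger threshold is 2 DOM element differences
+      max_dynamic_steps: 10 # Optional, default 5, this example uses 10
+      min_elements_threshold: 1 # Optional, default 2, this example uses 1 for higher sensitivity
   ux_test: # User experience testing
     enabled: True
   performance_test: # Performance analysis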
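@@ -158,28 +159,39 @@ test_config: # Test item configuration
     enabled: False
 
 llm_config: # Vision model configuration; currently only the OpenAI SDK-compatible format is supported
-  model: gpt-4.1-2025-04-14 # Recommended
+  model: gpt-4.1-2025-04-14 # Primary model for Stage 2 test planning (recommended)
+  filter_model: gpt-4o-mini # Lightweight model for Stage 1 element filtering (cost-effective)
   api_key: your_api_key
   base_url: https://api.example.com/v1
+  temperature: 0.1 # Optional, default 0.1
+  # top_p: 0.9 # Optional; if not set, this parameter is not passed
+  # max_tokens: 8192 # Optional, maximum output tokens (allows generating more test cases)
 
 browser_config:
   viewport: {"width": 1280, "height": 720}
   headless: False # Automatically overridden to True in Docker environments
   language: zh-CN
   cookies: []
+  save_screenshots: False # Whether to save screenshots to local disk (default: False)
+
+report:
+  language: en-US # zh-CN, en-US
+
+log:
+  level: info
 ```
 
 Please note the following important considerations when configuring and running tests:
 
 #### 1. Functional Testing Notes
 
-- **AI Mode**: When the number of test cases to generate is specified in the configuration file, the agent may re-plan and adjust based on actual test conditions. As a result, the number of test cases finally executed may differ somewhat from the initial setting, to ensure test accuracy and effectiveness
+- **AI Mode**: Uses a two-stage planning architecture: Stage 1 (filter_model) prioritizes elements for efficient analysis, and Stage 2 (primary model) generates comprehensive test cases. The system reflects and re-plans based on actual page conditions and test coverage, so the number of test cases finally executed may differ somewhat from the initial setting, to ensure test effectiveness. When `dynamic_step_generation` is enabled, the system uses DOM diff analysis to automatically generate additional test steps for newly appeared UI elements (e.g., dropdown menus, modals)
 
 - **Default Mode**: In functional testing, the `default` mode mainly verifies that clicks on UI elements execute successfully, including basic interactions such as button clicks and link navigation.
 
 #### 2. User Experience Testing Notes
 
-UX (User Experience) evaluation focuses on web page usability and user-friendliness. The model output in the results gives improvement suggestions based on best practices, for reference in design and development.
+UX (User Experience) evaluation focuses on web page usability and user-friendliness. It uses multi-modal analysis, combining screenshots, DOM structure, and text content, to evaluate visual quality, detect spelling/grammar issues, and validate layout rendering. The model output in the results gives improvement suggestions based on best practices, for reference in design and development.
 
 ### 🧠 Recommended Models
 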
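Reviewer note on the AI Mode paragraph in both READMEs: the two-stage flow (a lightweight `filter_model` narrows the element list, then the primary `model` plans tests) can be pictured roughly as below. This is only an illustration under stated assumptions. `two_stage_plan`, the `ModelCall` type, the prompt wording, the `keep` limit, and the comma-separated reply format are all hypothetical and do not come from the repository. The model calls are injected as plain callables so the sketch runs without network access.

```python
from typing import Callable, Dict, List

# Hypothetical type: a model call that maps a prompt to a text reply.
ModelCall = Callable[[str], str]


def two_stage_plan(
    elements: List[Dict[str, str]],
    objectives: str,
    filter_call: ModelCall,
    planner_call: ModelCall,
    keep: int = 20,
) -> str:
    """Stage 1 narrows the element list; Stage 2 plans tests over what is left."""
    # Stage 1: ask the lightweight model which element ids matter most.
    listing = "\n".join(f"{e['id']}: {e['text']}" for e in elements)
    reply = filter_call(
        f"Objectives: {objectives}\n"
        f"Return the most relevant ids, comma-separated:\n{listing}"
    )
    wanted = [part.strip() for part in reply.split(",") if part.strip()]
    by_id = {e["id"]: e for e in elements}
    # Ignore ids the model made up, and cap the shortlist at `keep` entries.
    shortlisted = [by_id[i] for i in wanted if i in by_id][:keep]

    # Stage 2: the primary model only sees the shortlisted elements.
    shortlist = "\n".join(f"{e['id']}: {e['text']}" for e in shortlisted)
    return planner_call(f"Objectives: {objectives}\nPlan test cases for:\n{shortlist}")
```

With stub callables such as `lambda prompt: "1"` for the filter and `lambda prompt: prompt` for the planner, the returned text contains only the elements the filter kept, which is the cost-saving idea the README attributes to Stage 1.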

config/config.yaml.example

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -9,9 +9,9 @@ test_config: # Test configuration
     type: ai # default or ai
     business_objectives: Test Baidu search functionality, generate 3 test cases
     dynamic_step_generation:
-      enabled: True
-      max_dynamic_steps: 10
-      min_elements_threshold: 1
+      enabled: True # Default: False (this example enables it as recommended)
+      max_dynamic_steps: 10 # Default: 5 (this example increases the limit)
+      min_elements_threshold: 1 # Default: 2 (this example uses 1 for higher sensitivity)
   ux_test:
     enabled: True
   performance_test:
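Reviewer note on the `dynamic_step_generation` defaults documented above (`enabled: False`, `max_dynamic_steps: 5`, `min_elements_threshold: 2`): the sketch below shows how those defaults and the example overrides might interact. The function names are hypothetical, and modelling a "DOM element difference" as a set of element ids that appear after an action is an assumption, not the repository's actual diff logic.

```python
from typing import Any, Dict, Set

# Defaults as stated in the comments added by this commit.
DYNAMIC_STEP_DEFAULTS: Dict[str, Any] = {
    "enabled": False,
    "max_dynamic_steps": 5,
    "min_elements_threshold": 2,
}


def resolve_dynamic_config(user_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Overlay user-supplied values on the documented defaults."""
    return {**DYNAMIC_STEP_DEFAULTS, **user_cfg}


def max_steps_allowed(before: Set[str], after: Set[str], cfg: Dict[str, Any]) -> int:
    """Hypothetical trigger: upper bound on dynamic steps (0 means no trigger).

    Assumption: a DOM element difference is an element id present after an
    action but not before it.
    """
    if not cfg["enabled"]:
        return 0
    new_elements = after - before
    if len(new_elements) < cfg["min_elements_threshold"]:
        return 0
    return cfg["max_dynamic_steps"]
```

Under these assumptions, the example config (`min_elements_threshold: 1`) triggers on a single new element, while the default threshold of 2 would ignore it.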
