Commit 1de7224

Adding unit tests for snippets in README.md - required in issue 176
1 parent 519abed commit 1de7224

2 files changed, +540 -0 lines changed

TESTING_README.md

Lines changed: 268 additions & 0 deletions
@@ -0,0 +1,268 @@
# Testing README Examples

This document describes how to run unit tests that verify all code examples from the main [README.md](README.md) work correctly.

## Overview

The test file `tests/test_readme_examples.py` contains comprehensive unit tests for all code snippets and examples shown in the README. These tests ensure that:

- All imports are valid and accessible
- CLI commands exist and are functional
- Core functions and classes work as documented
- File paths referenced in README exist
- API structures match the documentation
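
As a rough illustration of the first check in the list above, an import-validity test could be built on the standard library's `importlib.util.find_spec`. The helper name `module_is_importable` is hypothetical and is not taken from `tests/test_readme_examples.py`; this is only a sketch of the idea, using the standard-library module `json` as a stand-in for a README import.

```python
import importlib.util


def module_is_importable(name: str) -> bool:
    """Return True if the module `name` can be located on the import path.

    Hypothetical helper for illustration; not part of AgentLab.
    Note: for a dotted name, find_spec imports the parent package first.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # ModuleNotFoundError (a subclass of ImportError) is raised when a
        # parent package of a dotted name is missing.
        return False


def test_stand_in_module_importable():
    # "json" is a stand-in; a real test would use a module named in the README.
    assert module_is_importable("json")
```

A check like this only confirms that the module can be found; it says nothing about whether the functions inside it behave as documented.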

## Prerequisites

Before running the tests, ensure you have:

1. **Python 3.11 or 3.12** (or use the uv virtual environment)
2. **AgentLab installed** with all dependencies:
   ```bash
   uv sync
   ```

3. **Playwright browsers installed**:
   ```bash
   uv run playwright install
   ```

## Running the Tests

### Run All README Tests

To run all tests for README examples:

```bash
uv run pytest tests/test_readme_examples.py -v
```

### Run Specific Test Classes

You can run tests for specific README sections:

```bash
# Test installation and setup
uv run pytest tests/test_readme_examples.py::TestReadmeInstallationAndSetup -v

# Test UI-Assistant examples
uv run pytest tests/test_readme_examples.py::TestReadmeUIAssistant -v

# Test experiment launching examples
uv run pytest tests/test_readme_examples.py::TestReadmeLaunchExperiments -v

# Test analysis examples
uv run pytest tests/test_readme_examples.py::TestReadmeAnalyseResults -v

# Test AgentXray examples
uv run pytest tests/test_readme_examples.py::TestReadmeAgentXray -v

# Test new agent implementation examples
uv run pytest tests/test_readme_examples.py::TestReadmeImplementNewAgent -v

# Test reproducibility features
uv run pytest tests/test_readme_examples.py::TestReadmeReproducibility -v

# Test benchmark examples
uv run pytest tests/test_readme_examples.py::TestReadmeBenchmarks -v
```

### Run Individual Tests

To run a specific test function:

```bash
uv run pytest tests/test_readme_examples.py::TestReadmeLaunchExperiments::test_make_study_creates_study -v
```

## Test Coverage by README Section

### ✅ Installation and Setup (Lines 72-87)

Tests verify:
- `pip install agentlab` works
- `playwright install` command is available
- Package imports succeed

**Related tests:**
- `TestReadmeInstallationAndSetup::test_agentlab_package_installed`
- `TestReadmeInstallationAndSetup::test_playwright_install_command_exists`

### ✅ UI-Assistant (Lines 110-117)

Tests verify:
- `agentlab-assistant` CLI command exists
- Command accepts `--start_url` and `--agent_config` flags
- Generic agent imports work

**Related tests:**
- `TestReadmeUIAssistant::test_agentlab_assistant_command_exists`
- `TestReadmeUIAssistant::test_generic_agent_import`

### ✅ Launch Experiments (Lines 122-149)

Tests verify:
- `make_study()` function works correctly
- `Study.load()` method exists
- `study.find_incomplete()` method exists
- `study.run()` method exists
- All agent imports are valid

**Related tests:**
- `TestReadmeLaunchExperiments::test_make_study_creates_study`
- `TestReadmeLaunchExperiments::test_study_load_import`
- `TestReadmeLaunchExperiments::test_study_find_incomplete`
- `TestReadmeLaunchExperiments::test_agent_imports`
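
The "method exists" checks listed for this section can be expressed without running any experiment. The sketch below is one possible shape for such a check, not the code in `tests/test_readme_examples.py`. `has_callable` and `FakeStudy` are hypothetical names introduced here so that the example runs without AgentLab installed; `FakeStudy` only mirrors the method names mentioned above and is not AgentLab's `Study` class.

```python
def has_callable(obj: object, name: str) -> bool:
    """Return True if `obj` exposes a callable attribute called `name`."""
    return callable(getattr(obj, name, None))


class FakeStudy:
    """Hypothetical stand-in that only mirrors the documented method names."""

    @classmethod
    def load(cls, path):
        return cls()

    def find_incomplete(self):
        return []

    def run(self):
        # Deliberately does nothing: existence checks never call run().
        pass


def test_documented_methods_exist():
    for method_name in ("load", "find_incomplete", "run"):
        assert has_callable(FakeStudy, method_name)
```

Because the check uses `getattr` rather than calling the method, it needs no API keys or benchmark setup.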

### ✅ main.py Examples (Line 147)

Tests verify:
- `main.py` file exists in repository
- All agent imports from `main.py` work
- Study class can be imported

**Related tests:**
- `TestReadmeMainPy::test_main_py_exists`
- `TestReadmeMainPy::test_all_agent_imports_from_main`
- `TestReadmeMainPy::test_study_import_from_main`

### ✅ Analyse Results (Lines 193-203)

Tests verify:
- `inspect_results` module imports correctly
- `load_result_df()` function exists
- `ExpResult` class is accessible

**Related tests:**
- `TestReadmeAnalyseResults::test_inspect_results_import`
- `TestReadmeAnalyseResults::test_load_result_df_function`
- `TestReadmeAnalyseResults::test_exp_result_class`

### ✅ AgentXray (Lines 210-226)

Tests verify:
- `agentlab-xray` CLI command exists and is runnable

**Related tests:**
- `TestReadmeAgentXray::test_agentlab_xray_command_exists`

### ✅ Implement a New Agent (Lines 239-245)

Tests verify:
- `MostBasicAgent` file exists at documented path
- `AgentArgs` API file exists
- `AgentArgs` class can be imported

**Related tests:**
- `TestReadmeImplementNewAgent::test_most_basic_agent_file_exists`
- `TestReadmeImplementNewAgent::test_agent_args_file_exists`
- `TestReadmeImplementNewAgent::test_agent_args_api`

### ✅ Reproducibility (Lines 265-278)

Tests verify:
- `reproducibility_journal.csv` exists
- `ReproducibilityAgent` file exists at documented path
- Study class supports reproducibility features

**Related tests:**
- `TestReadmeReproducibility::test_reproducibility_journal_exists`
- `TestReadmeReproducibility::test_reproducibility_agent_exists`
- `TestReadmeReproducibility::test_study_has_reproducibility_info`

### ✅ Supported Benchmarks (Lines 50-66)

Tests verify:
- Benchmark names (like "miniwob") work with `make_study()`

**Related tests:**
- `TestReadmeBenchmarks::test_miniwob_benchmark_accessible`

## Understanding Test Results

### Successful Test Output

When all tests pass, you'll see:
```
tests/test_readme_examples.py::TestReadmeInstallationAndSetup::test_agentlab_package_installed PASSED
tests/test_readme_examples.py::TestReadmeLaunchExperiments::test_make_study_creates_study PASSED
...
======================== XX passed in X.XXs ========================
```

### Failed Test Output

If a test fails, you'll see detailed error information:
```
tests/test_readme_examples.py::TestReadmeUIAssistant::test_agentlab_assistant_command_exists FAILED

FAILED tests/test_readme_examples.py::TestReadmeUIAssistant::test_agentlab_assistant_command_exists
AssertionError: agentlab-assistant command should work
```

This indicates that the README example may be outdated or there's an installation issue.

## Notes

- **API Keys Not Required**: These tests verify code structure and imports, not actual experiment execution. You don't need API keys (OPENAI_API_KEY, etc.) to run these tests.

- **No Actual Experiments**: Tests that call `make_study()` verify the function works but don't call `study.run()`, which would require:
  - Configured API keys
  - Set up benchmark environments
  - Significant time and resources

- **CLI Command Tests**: Tests for `agentlab-assistant` and `agentlab-xray` verify the commands exist and respond to `--help`, but don't actually launch the UIs.

- **File Existence Tests**: Some tests verify that files mentioned in README (like `main.py`, `reproducibility_journal.csv`) exist at their documented locations.
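
The CLI note above describes checking that a command exists and responds to `--help` without launching its UI. A minimal sketch of that pattern, using only `shutil.which` and `subprocess.run` from the standard library, might look like the following. The helper name `command_responds_to_help` is hypothetical, and the sketch assumes the command exits with status 0 when given `--help`, which is not confirmed here for `agentlab-assistant` or `agentlab-xray`.

```python
import shutil
import subprocess
import sys


def command_responds_to_help(command: str, timeout: float = 30.0) -> bool:
    """Return True if `command` is found and `command --help` exits with 0.

    Hypothetical helper for illustration; assumes --help returns status 0.
    """
    if shutil.which(command) is None:
        # The command is not on PATH (or is not an executable file).
        return False
    try:
        result = subprocess.run(
            [command, "--help"],
            capture_output=True,  # keep help text out of the test log
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def test_python_interpreter_responds_to_help():
    # The running interpreter is a stand-in for a real CLI entry point.
    assert command_responds_to_help(sys.executable)
```

The timeout guards against a command that ignores `--help` and starts a long-running process instead.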

## Continuous Integration

These tests are ideal for CI/CD pipelines to ensure README examples stay up-to-date with code changes.

Example GitHub Actions workflow step:
```yaml
- name: Test README Examples
  run: uv run pytest tests/test_readme_examples.py -v
```

## Troubleshooting

### Test fails with "ModuleNotFoundError"

Make sure you've installed all dependencies:
```bash
uv sync
```

### Test fails with "playwright not found"

Install Playwright browsers:
```bash
uv run playwright install
```

### Test fails with "File not found"

Ensure you're running tests from the repository root directory:
```bash
cd /path/to/AgentLab
uv run pytest tests/test_readme_examples.py -v
```
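
File-existence checks that use paths relative to the current directory fail when pytest is started elsewhere, which is why the command above changes to the repository root first. One way a test could avoid that dependency is to locate the root by walking upward from a known starting path. The sketch below is an assumption about how this could be done, not a description of how `tests/test_readme_examples.py` resolves paths. `find_repo_root` and the choice of `README.md` as the marker file are hypothetical.

```python
from pathlib import Path


def find_repo_root(start: Path, marker: str = "README.md") -> Path:
    """Walk upward from `start` to the first directory containing `marker`.

    Hypothetical helper for illustration; the marker file is an assumption.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"no {marker!r} found at or above {start}")
```

In a test module, this could be called as `find_repo_root(Path(__file__))`, after which paths such as `main.py` can be built from the returned directory regardless of where pytest was invoked.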

## Contributing

When updating the README:

1. **Update code examples** in `README.md`
2. **Update corresponding tests** in `tests/test_readme_examples.py`
3. **Run tests** to verify:
   ```bash
   uv run pytest tests/test_readme_examples.py -v
   ```
4. **Update this document** if test coverage changes

## Related Documentation

- [Main README](README.md) - Complete AgentLab documentation
- [BrowserGym Documentation](https://github.com/ServiceNow/BrowserGym)
- [Contributing Guidelines](CONTRIBUTING.md) (if applicable)
