Commit 9482c2d

quanru and yuyutaotao authored
feat(core): enhance cache management with read-only mode and flexible cache options (#1226)
* feat(core): enhance cache management with read-only mode and flexible cache options
* feat(core): implement enhanced cache configuration with auto-generated IDs and read-only mode support
* feat(web-integration): refactor cache configuration handling across agents and fixtures
* chore(cli): lint
* feat(docs): update caching documentation and add new test configurations for cache functionality
* docs(site): update caching strategies documentation and improve examples for clarity
* docs(site): enhance caching documentation with detailed principles and storage format
* docs(core): update cache doc
* docs(core): update docs for cache
* refactor(web-integration): simplify cache flushing logic and update tests for read-write mode
* docs(site): update cache flushing logic to persist only on test pass
* docs(site): add console logs for cache flushing based on test status

---------

Co-authored-by: yutao <[email protected]>
1 parent 11b11ed commit 9482c2d

25 files changed, +1449 -113 lines changed

apps/site/docs/en/caching.mdx

Lines changed: 154 additions & 42 deletions
Original file line number | Diff line number | Diff line change
@@ -1,92 +1,204 @@
1-
# Caching
1+
# Caching AI planning & locate
22

3-
Midscene supports caching the planning steps and DOM XPaths to reduce calls to AI models and greatly improve execution efficiency.
4-
5-
Android automation supports planning cache but not element location cache (due to lack of XPath support).
3+
Midscene can cache planning steps and matched DOM element information, which reduces AI model calls and greatly improves execution efficiency. Note that the DOM element cache is only supported for web automation tasks.
64

75
**Effect**
86

9-
After enabling the cache, the execution time of AI service related steps can be significantly reduced.
7+
When the cache is hit, execution time drops significantly. In the following example, it fell from 51 seconds to 28 seconds.
108

11-
* **before using cache, 39s**
9+
* **before**
1210

1311
![](/cache/no-cache-time.png)
1412

15-
* **after using cache, 13s**
13+
* **after**
1614

1715
![](/cache/use-cache-time.png)
1816

19-
## Instructions
17+
## Cache files and storage
18+
19+
Midscene's caching mechanism is based on input stability and output reusability. When the same task instructions are repeatedly executed in similar page environments, Midscene will prioritize using cached results to avoid repeated AI model calls, significantly improving execution efficiency.
20+
21+
The core caching mechanisms include:
22+
- **Task instruction caching**: For planning operations (such as `ai`, `aiAction`), Midscene uses the prompt instruction as the cache key to store the execution plan returned by AI
23+
- **Element location caching**: For location operations (such as `aiLocate`, `aiTap`), the system uses the location prompt as the cache key to store element XPath information, and verifies whether the XPath is still valid on the next execution
24+
- **Invalidation mechanism**: When cache becomes invalid, the system automatically falls back to AI model for re-analysis
25+
- **Never cache query results**: Results of queries such as `aiBoolean`, `aiQuery`, and `aiAssert` are never cached.
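The locate flow in the list above (prompt as cache key, validity check, fallback to the model) can be sketched roughly as follows. This is a minimal illustration, not Midscene's implementation: `LocateCache`, `askModel`, and `isValid` are hypothetical names, and the in-memory `Map` stands in for the cache file.

```typescript
// Hypothetical sketch: none of these names come from Midscene itself.
type LocateResult = { xpath: string };

class LocateCache {
  private entries = new Map<string, LocateResult>();
  private askModel: (prompt: string) => LocateResult;
  private isValid: (xpath: string) => boolean;
  aiCalls = 0; // counts fallbacks to the (stand-in) AI model

  constructor(
    askModel: (prompt: string) => LocateResult, // stand-in for an AI model call
    isValid: (xpath: string) => boolean, // stand-in for checking the XPath on the page
  ) {
    this.askModel = askModel;
    this.isValid = isValid;
  }

  locate(prompt: string): LocateResult {
    const cached = this.entries.get(prompt); // the prompt is the cache key
    if (cached && this.isValid(cached.xpath)) {
      return cached; // cache hit: no model call
    }
    // Miss or stale entry: fall back to the model and refresh the entry.
    this.aiCalls += 1;
    const fresh = this.askModel(prompt);
    this.entries.set(prompt, fresh);
    return fresh;
  }
}

// Usage: the page "changes" between the second and third call.
let currentXPath = "/html/body/button[1]";
const locator = new LocateCache(
  () => ({ xpath: currentXPath }),
  (xpath) => xpath === currentXPath,
);
locator.locate("the login button"); // miss: calls the model
locator.locate("the login button"); // hit: served from the cache
currentXPath = "/html/body/div/button[1]";
locator.locate("the login button"); // stale XPath: calls the model again
console.log(locator.aiCalls); // 2
```

The validity check is what keeps a hit safe: a cached XPath is reused only while it still matches the page, so a changed layout costs one extra model call rather than a wrong element.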
26+
27+
Cache contents are saved in the `./midscene_run/cache` directory, in files with the `.cache.yaml` extension.
28+
29+
## Cache strategies
2030

21-
There are two key points to use caching:
31+
By configuring the `cache` option, you can enable caching for your agent.
2232

23-
1. Set `MIDSCENE_CACHE=1` in the environment variable to enable matching cache.
24-
2. Set `cacheId` to specify the cache file name. It's automatically set in Playwright and Yaml mode. If you are using javascript SDK, you should set it manually.
33+
### Disable cache
2534

26-
### Playwright
35+
Configuration: `cache: false`, or omit the `cache` option
2736

28-
When using Playwright integration, you can use the `MIDSCENE_CACHE=1` environment variable to enable caching.
37+
Caching is completely disabled and the AI model is called for every operation. This suits cases where you need real-time results or are debugging. Caching is also disabled by default when the `cache` option is not configured.
2938

30-
The `cacheId` will be automatically set to the test file name.
39+
```javascript
40+
// Direct Agent creation
41+
const agent = new PuppeteerAgent(page, {
42+
cache: false,
43+
});
44+
```
3145

32-
```diff
33-
- playwright test --config=playwright.config.ts
34-
+ MIDSCENE_CACHE=1 playwright test --config=playwright.config.ts
46+
```yaml
47+
# YAML configuration
48+
agent:
49+
cache: false
3550
```
3651
37-
### JavaScript agent, like PuppeteerAgent, AgentOverChromeBridge
52+
### Read-write mode
53+
54+
Configuration: `cache: { id: "my-cache-id" }`
55+
56+
The existing cache is read automatically, and cache files are updated during execution.
57+
58+
```javascript
59+
// Direct Agent creation - explicit cache ID
60+
const agent = new PuppeteerAgent(page, {
61+
cache: { id: "my-cache-id" },
62+
});
63+
```
64+
65+
```yaml
66+
# YAML configuration - explicit cache ID
67+
agent:
68+
cache:
69+
id: "my-cache-test"
70+
```
71+
72+
YAML mode also supports `cache: true` to automatically use the file name as the cache ID.
73+
74+
### Read-only, manual write
3875

39-
You should set the `cacheId` to specify the cache identifier.
40-
And also, you should enable cache-matching by setting the `MIDSCENE_CACHE=1` environment variable.
76+
Configuration: `cache: { strategy: "read-only", id: "my-cache-id" }`
4177

42-
```diff
43-
- tsx demo.ts
44-
+ MIDSCENE_CACHE=1 tsx demo.ts
78+
The cache is only read; cache files are never written automatically. To write them, call `agent.flushCache()` manually. This suits production environments where cache consistency matters.
79+
80+
```javascript
81+
// Direct Agent creation
82+
const agent = new PuppeteerAgent(page, {
83+
cache: { strategy: "read-only", id: "my-cache-id" },
84+
});
85+
86+
// Manual cache write required
87+
await agent.flushCache();
88+
```
89+
90+
```yaml
91+
# YAML configuration
92+
agent:
93+
cache:
94+
id: "my-cache-test"
95+
strategy: "read-only"
4596
```
4697

98+
### Compatibility mode (not recommended)
99+
100+
Configure caching with the `MIDSCENE_CACHE=1` environment variable together with `cacheId`. This is equivalent to read-write mode.
101+
47102
```javascript
48-
const mid = new PuppeteerAgent(originPage, {
49-
cacheId: 'puppeteer-swag-sab', // specify cache id
103+
// Old way, requires MIDSCENE_CACHE=1 environment variable and cacheId
104+
const agent = new PuppeteerAgent(originPage, {
105+
cacheId: 'puppeteer-swag-sab'
50106
});
51107
```
52108

53-
### YAML
109+
```bash
110+
MIDSCENE_CACHE=1 tsx demo.ts
111+
```
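The strategies described in this section can be read as one resolution step from configuration to cache mode. The sketch below is only an illustration of that mapping: `resolveCache`, `CacheOption`, `LegacyInput`, and `autoId` are hypothetical names, and treating `cache: true` without an auto-generated ID as disabled is an assumption rather than documented behaviour.

```typescript
// Hypothetical types and helper, written for illustration only.
type CacheOption = boolean | { id: string; strategy?: "read-only" };

interface LegacyInput {
  cacheId?: string;
  envCacheFlag?: string; // value of the MIDSCENE_CACHE environment variable
}

type ResolvedCache =
  | { mode: "disabled" }
  | { mode: "read-write" | "read-only"; id: string };

function resolveCache(
  cache: CacheOption | undefined,
  legacy: LegacyInput = {},
  autoId?: string, // e.g. a YAML file name or test metadata, when available
): ResolvedCache {
  if (cache === false) return { mode: "disabled" };
  if (cache === true) {
    // Assumption: without an auto-generated ID there is nothing to key the cache on.
    return autoId ? { mode: "read-write", id: autoId } : { mode: "disabled" };
  }
  if (cache) {
    return {
      mode: cache.strategy === "read-only" ? "read-only" : "read-write",
      id: cache.id,
    };
  }
  // `cache` not configured: compatibility mode needs both the flag and a cacheId.
  if (legacy.envCacheFlag === "1" && legacy.cacheId) {
    return { mode: "read-write", id: legacy.cacheId };
  }
  return { mode: "disabled" };
}

console.log(resolveCache({ id: "my-cache-id" }));
console.log(resolveCache({ strategy: "read-only", id: "my-cache-id" }));
console.log(resolveCache(undefined, { cacheId: "puppeteer-swag-sab", envCacheFlag: "1" }));
```

Checking the explicit `cache` option before the legacy inputs mirrors the recommendation above to prefer the new option over `MIDSCENE_CACHE=1`.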
112+
113+
## Using Playwright AI Fixture from `@midscene/web/playwright`
54114

55-
Enable cache-matching by setting the `MIDSCENE_CACHE=1` environment variable.
56-
The `cacheId` will be automatically set to the yaml filename.
115+
When using `PlaywrightAiFixture` from `@midscene/web/playwright`, pass the same `cache` options to control caching behaviour.
57116

58-
```diff
59-
- npx midscene ./bing-search.yaml
60-
+ # Add cache identifier, cacheId is the yaml filename
61-
+ MIDSCENE_CACHE=1 npx midscene ./bing-search.yaml
117+
### Disable cache
118+
119+
```typescript
120+
// fixture.ts in sample code
121+
export const test = base.extend<PlayWrightAiFixtureType>(
122+
PlaywrightAiFixture({
123+
cache: false,
124+
}),
125+
);
62126
```
63127

64-
## Cache strategy
128+
### Read-write mode
129+
130+
```typescript
131+
// fixture.ts in sample code
132+
// Auto-generate cache ID from test metadata
133+
export const test = base.extend<PlayWrightAiFixtureType>(
134+
PlaywrightAiFixture({
135+
cache: true,
136+
}),
137+
);
138+
139+
// fixture.ts in sample code
140+
// Explicitly provide cache ID
141+
export const test = base.extend<PlayWrightAiFixtureType>(
142+
PlaywrightAiFixture({
143+
cache: { id: "my-fixture-cache" },
144+
}),
145+
);
146+
```
65147

66-
Cache contents will be saved in the `./midscene_run/cache` directory with the `.cache.yaml` as the extension name.
148+
### Read-only, manual write
67149

68-
These two types of content will be cached:
150+
```typescript
151+
// fixture.ts in sample code
152+
export const test = base.extend<PlayWrightAiFixtureType>(
153+
PlaywrightAiFixture({
154+
cache: { strategy: "read-only", id: "readonly-cache" },
155+
}),
156+
);
157+
```
69158

70-
1. the result of planning, like calls to `.ai` `.aiAction`
71-
2. The XPaths for elements located by AI, such as `.aiLocate`, `.aiTap`, etc.
159+
When you run the fixture in read-only mode you need to manually persist the cache after your test steps. Use the `agentForPage` helper provided by the fixture to fetch the underlying agent, then call `agent.flushCache()` at the point where you want to write the cache file:
160+
161+
```typescript
162+
test.afterEach(async ({ page, agentForPage }, testInfo) => {
163+
// Only flush cache if the test passed
164+
if (testInfo.status === 'passed') {
165+
console.log('Test passed, flushing Midscene cache...');
166+
const agent = await agentForPage(page);
167+
await agent.flushCache();
168+
} else {
169+
console.log(`Test ${testInfo.status}, skipping Midscene cache flush.`);
170+
}
171+
});
72172

73-
The query results like `aiBoolean`, `aiQuery`, `aiAssert` will never be cached.
173+
test('manual cache flush', async ({ agentForPage, page, aiTap, aiWaitFor }) => {
174+
const agent = await agentForPage(page);
74175

75-
If the cache is not hit, Midscene will call AI model again and the result in cache file will be updated.
176+
await aiTap('first highlighted link in the hero section');
177+
await aiWaitFor('the detail page loads completely');
76178

77-
## Common issues
179+
await agent.flushCache();
180+
});
181+
```
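The difference between read-write and read-only mode comes down to when new results reach the cache file. The sketch below models that under stated assumptions: `SketchCache`, `record`, and `flush` are hypothetical names (only `agent.flushCache()` appears in the documentation), and a `Map` stands in for the cache file on disk.

```typescript
// Hypothetical model of the two strategies; not Midscene's implementation.
type Strategy = "read-write" | "read-only";

class SketchCache {
  private pending = new Map<string, string>(); // results not yet persisted
  private strategy: Strategy;
  private persisted: Map<string, string>; // stand-in for the cache file

  constructor(strategy: Strategy, persisted: Map<string, string>) {
    this.strategy = strategy;
    this.persisted = persisted;
  }

  get(key: string): string | undefined {
    return this.pending.get(key) ?? this.persisted.get(key);
  }

  record(key: string, value: string): void {
    this.pending.set(key, value);
    // Read-write mode persists immediately; read-only mode waits for flush().
    if (this.strategy === "read-write") this.flush();
  }

  flush(): void {
    this.pending.forEach((value, key) => this.persisted.set(key, value));
    this.pending.clear();
  }
}

// Usage: in read-only mode nothing is persisted until flush() is called.
const cacheFile = new Map<string, string>();
const readOnly = new SketchCache("read-only", cacheFile);
readOnly.record("first highlighted link", "/html/body/a[1]");
console.log(cacheFile.size); // 0
readOnly.flush();
console.log(cacheFile.size); // 1
```

Deferring the write is what lets the `afterEach` hook above decide, based on `testInfo.status`, whether a run's results are worth keeping.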
182+
183+
## FAQ
78184

79185
### No cache file is generated
80186

81-
Make sure you have set the `cacheId` in the constructor.
187+
Please ensure you have correctly configured caching:
188+
189+
1. **Direct Agent creation**: Set `cache: { id: "your-cache-id" }` in the constructor
190+
2. **Playwright AI Fixture mode**: Set `cache: true` or `cache: { id: "your-cache-id" }` in fixture configuration
191+
3. **YAML script mode**: Set `agent.cache.id` in the YAML file
192+
4. **Read-only mode**: Ensure you called the `agent.flushCache()` method
193+
5. **Legacy approach**: Set `cacheId` and enable `MIDSCENE_CACHE=1` environment variable
82194

83195
### How to check if the cache is hit?
84196

85197
Check the report file. On a cache hit, the step shows a `cache` tip and its time cost is noticeably lower.
86198

87199
### Why is the cache missed on CI?
88200

89-
You should commit the cache file to the repository (which is in the `./midscene_run/cache` directory). And also, check whether the prompt is the same as the one in the cache file.
201+
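Commit the cache files (in the `./midscene_run/cache` directory) to the repository so that they are available on CI, then recheck the cache hit conditions, for example whether the prompt matches the one in the cache file.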
90202

91203
### Does it mean that AI services are no longer needed after using cache?
92204
