Commit da49791

feat(computer): improve Linux headless display quality (#88)
* feat(computer): improve Linux headless display quality
  - Increase Xvfb resolution from default 1920x1080 to 2560x1440x24
  - Add fluxbox window manager for proper window decorations
  - Install GTK themes (adwaita, gnome-themes-extra) for native UI styling
  - Add Noto CJK fonts for proper CJK character rendering
  - Add dbus-x11 for GTK application support
* ci: add branch trigger for PR visibility
* ci: trigger workflow on pull requests targeting main
* fix(computer): maximize Obsidian window on launch
* fix(computer): reduce flakiness in community plugin installation steps
  - Add aiWaitFor verification after each navigation step to catch failures early
  - Split conditional "Turn on" action and confirmation dialog into separate steps
  - Wait for "Browse" button to be visible before clicking it
  - Increase sleep times between UI interactions
  - Set MIDSCENE_REPLANNING_CYCLE_LIMIT=40 as safety net
* fix(computer): maximize window via xdotool instead of --start-maximized
  The --start-maximized Electron flag doesn't work with fluxbox. Use xdotool to find the Obsidian window and resize it to fill the entire screen.
* fix(computer): maximize window by double-clicking title bar via aiAct
* fix(computer): click maximize button instead of double-clicking title bar
* fix(computer): configure fluxbox to auto-maximize all windows
  Write ~/.fluxbox/apps config before starting fluxbox so all windows launch maximized automatically. This eliminates the need to manually click maximize buttons or double-click title bars.
1 parent 4190529 commit da49791

2 files changed: +61 -12 lines changed


.github/workflows/computer-electron-demo.yaml

Lines changed: 13 additions & 2 deletions
@@ -14,10 +14,15 @@ on:
         description: 'The branch to checkout'
         required: false
         default: 'main'
+  pull_request:
+    branches:
+      - main
+    paths:
+      - 'computer/electron-demo/**'
+      - '.github/workflows/computer-electron-demo.yaml'
   push:
     branches:
       - main
-      - feat/computer-electron-demo
     paths:
       - 'computer/electron-demo/**'
       - '.github/workflows/computer-electron-demo.yaml'
@@ -34,6 +39,7 @@ jobs:
       MIDSCENE_MODEL_NAME: 'qwen-vl-max-latest'
       MIDSCENE_USE_QWEN_VL: 1
       MIDSCENE_COMPUTER_HEADLESS_LINUX: 'true'
+      MIDSCENE_REPLANNING_CYCLE_LIMIT: '40'
       DEBUG: 'midscene:ai:profile:*'
 
     steps:
@@ -63,7 +69,12 @@ jobs:
             libnotify4 \
             libsecret-1-0 \
             libxss1 \
-            xdg-utils
+            xdg-utils \
+            fluxbox \
+            adwaita-icon-theme \
+            gnome-themes-extra \
+            fonts-noto-cjk \
+            dbus-x11
 
       - name: Download Obsidian AppImage
         working-directory: computer/electron-demo

computer/electron-demo/demo.ts

Lines changed: 48 additions & 10 deletions
@@ -103,8 +103,27 @@ function launchApp(
     aiActionContext:
       'You are interacting with Obsidian, a note-taking desktop application. ' +
       'If any dialog or popup appears, dismiss it by clicking the close button or pressing Escape.',
+    xvfbResolution: '2560x1440x24',
   });
 
+  // --- Start window manager for proper window decorations on headless Linux ---
+  if (process.platform === 'linux' && process.env.DISPLAY) {
+    // Configure fluxbox to auto-maximize all windows
+    const fluxboxDir = join(homedir(), '.fluxbox');
+    mkdirSync(fluxboxDir, { recursive: true });
+    writeFileSync(
+      join(fluxboxDir, 'apps'),
+      '[app] (name=.*)\n [Maximized] {yes}\n[end]\n',
+    );
+    const fluxbox = spawn('fluxbox', [], {
+      detached: true,
+      stdio: 'ignore',
+    });
+    fluxbox.unref();
+    console.log('Fluxbox window manager started (auto-maximize enabled)');
+    await sleep(1000);
+  }
+
   // --- Launch Obsidian AFTER Xvfb is ready (DISPLAY is now set) ---
   const child = launchApp(binaryPath, VAULT_DIR);
   console.log(`Obsidian launched (pid: ${child.pid})`);
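
The hunk above writes a single match-everything rule to the fluxbox apps file before starting the window manager. As a minimal sketch, assuming one wanted to scope that rule to one application instead of all windows, the rule text could come from a small helper. `buildMaximizeRule` and `writeFluxboxApps` are hypothetical names that are not part of this commit, and the `obsidian` pattern is an assumed window name that has not been checked against the real application.

```typescript
import { mkdirSync, mkdtempSync, readFileSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Hypothetical helper: build one fluxbox "apps" rule that maximizes windows
// whose name matches the pattern. The default '.*' reproduces the
// match-everything rule written in demo.ts.
function buildMaximizeRule(namePattern: string = '.*'): string {
  return `[app] (name=${namePattern})\n [Maximized] {yes}\n[end]\n`;
}

// Hypothetical helper: write the rule to <fluxboxDir>/apps, creating the
// directory if needed, and return the path that was written.
function writeFluxboxApps(fluxboxDir: string, namePattern?: string): string {
  mkdirSync(fluxboxDir, { recursive: true });
  const appsPath = join(fluxboxDir, 'apps');
  writeFileSync(appsPath, buildMaximizeRule(namePattern));
  return appsPath;
}

// Usage: write into a scratch directory rather than the real ~/.fluxbox.
// 'obsidian' is an assumed window name used only for illustration.
const scratchDir = mkdtempSync(join(tmpdir(), 'fluxbox-'));
const appsPath = writeFluxboxApps(scratchDir, 'obsidian');
console.log(readFileSync(appsPath, 'utf8'));
```

Keeping the rule text in a pure function makes the config easy to check without starting fluxbox. As in the commit, the file has to be written before fluxbox is spawned.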
@@ -152,35 +171,54 @@ function launchApp(
   // --- Install community plugin: LifeOS ---
   console.log('Opening Obsidian settings...');
   await agent.aiAct('press Ctrl+Comma to open Settings');
-  await sleep(2000);
+  await sleep(3000);
 
-  await agent.aiWaitFor('Settings dialog is visible', { timeoutMs: 15000 });
+  await agent.aiWaitFor(
+    'A Settings dialog/panel is visible with a left sidebar containing menu items',
+    { timeoutMs: 15000 },
+  );
+  console.log('Settings dialog is open');
 
   // Navigate to Community plugins
   await agent.aiAct(
-    'click "Community plugins" in the left sidebar of the settings dialog',
+    'In the Settings dialog, click "Community plugins" in the left sidebar',
   );
-  await sleep(1500);
+  await sleep(2000);
+
+  await agent.aiWaitFor(
+    'The Community plugins settings page is visible',
+    { timeoutMs: 10000 },
+  );
+  console.log('Community plugins page is visible');
 
   // Turn on community plugins if not enabled
   await agent.aiAct(
-    'If there is a "Turn on community plugins" button, click it. ' +
-      'If a confirmation dialog appears, click "Turn on" to confirm.',
+    'If there is a "Turn on community plugins" button, click it',
   );
-  await sleep(1500);
+  await sleep(2000);
+
+  // Handle confirmation dialog separately
+  await agent.aiAct(
+    'If a confirmation dialog is visible asking to turn on community plugins, click the "Turn on" button to confirm. Otherwise do nothing.',
+  );
+  await sleep(2000);
 
   // Open the plugin browser
-  await agent.aiAct('click "Browse" button to open the community plugin browser');
+  await agent.aiWaitFor(
+    'A "Browse" button is visible on the Community plugins page',
+    { timeoutMs: 10000 },
+  );
+  await agent.aiAct('click the "Browse" button');
   await sleep(3000);
 
   await agent.aiWaitFor(
-    'Community plugin browser / marketplace is visible with a search box',
+    'Community plugin browser is visible with a search box',
     { timeoutMs: 20000 },
   );
   console.log('Community plugin browser is open');
 
   // Search for LifeOS
-  await agent.aiAct('type "lifeos" in the search box');
+  await agent.aiAct('click the search box and type "lifeos"');
   await sleep(3000);
 
   // Click the LifeOS plugin from results
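
The hunk above repeats one pattern several times: act, pause so the UI can settle, then verify with `aiWaitFor` before moving on. A minimal sketch of how that pattern might be factored out follows. The `StepAgent` interface only mirrors the two calls as they are used in this diff and is an assumption, not the real Midscene agent type. `actThenVerify`, `VerifiedStep`, and the default timings are hypothetical.

```typescript
// Assumed minimal shape of the agent, covering only the calls seen in demo.ts.
interface StepAgent {
  aiAct(prompt: string): Promise<unknown>;
  aiWaitFor(assertion: string, opt?: { timeoutMs?: number }): Promise<unknown>;
}

// Hypothetical description of one act-then-verify step.
interface VerifiedStep {
  act: string; // instruction passed to aiAct
  expect: string; // condition passed to aiWaitFor
  settleMs?: number; // pause between acting and verifying (default 2000)
  timeoutMs?: number; // aiWaitFor timeout (default 10000)
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: perform the action, wait for the UI to settle, then
// verify the expected state so a failed navigation surfaces at this step.
async function actThenVerify(agent: StepAgent, step: VerifiedStep): Promise<void> {
  await agent.aiAct(step.act);
  await sleep(step.settleMs ?? 2000);
  await agent.aiWaitFor(step.expect, { timeoutMs: step.timeoutMs ?? 10000 });
  console.log(`Verified: ${step.expect}`);
}
```

With such a helper, the navigation step in the diff could read `await actThenVerify(agent, { act: 'In the Settings dialog, click "Community plugins" in the left sidebar', expect: 'The Community plugins settings page is visible' })`. The conditional "Turn on" steps do not fit this shape, because they have no single expected state to wait for.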
