Releases: ServiceNow/BrowserGym
v0.2.2: Keyword arguments in high-level action space
browsergym-core
- minor fix: the high-level action parser now properly handles keyword arguments in Python function calls (previously they were converted to positional arguments)
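To illustrate what keeping keyword arguments means, here is a minimal sketch that parses a Python-style call string with the standard `ast` module. It is not BrowserGym's parser: the helper name `parse_action_call` and the sample action string are assumptions made for illustration only.

```python
import ast


def parse_action_call(action: str):
    """Parse one Python-style call, e.g. click("12", button="right").

    Hypothetical helper, not part of BrowserGym.
    Returns (function_name, positional_args, keyword_args).
    """
    tree = ast.parse(action.strip(), mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError(f"not a simple function call: {action!r}")

    # literal_eval only accepts literals, so arbitrary expressions are rejected.
    args = [ast.literal_eval(arg) for arg in call.args]

    kwargs = {}
    for keyword in call.keywords:
        if keyword.arg is None:
            raise ValueError("**kwargs unpacking is not supported")
        # Keep the argument name instead of flattening it into args.
        kwargs[keyword.arg] = ast.literal_eval(keyword.value)

    return call.func.id, args, kwargs


name, args, kwargs = parse_action_call('click("12", button="right")')
```

Here `button="right"` ends up in `kwargs` rather than being appended to `args`, so the callee receives it under the intended parameter name.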
browsergym-webarena
- minor fix: synced with latest webarena version (libwebarena=0.0.3), mostly typo fixes in task intents
v0.2.1: Set-of-Marks, visibility, bbox and more!
browsergym-core
New features
- 🎉 Set-of-Marks 🎉 a new method is available to easily overlay element boxes and bid attributes on top of the screenshot, following ideas from WebVoyager and OSWorld:

      from browsergym.utils.obs import overlay_som
      ...
      obs, info = env.reset()
      screenshot_with_som = overlay_som(obs["screenshot"], obs["extra_element_properties"], fontsize=12, linewidth=2, tag_margin=2)
- new high-level actions upload_file and mouse_upload_file
- new field "extra_element_properties" in each observation. Contains a dict with bid keys, which gives the extra properties computed by browsergym for every element with a bid on the current page. Example:

      {
          "23": {
              "visibility": 0.6,  # float between 0 and 1
              "bbox": [56, 345, 12, 20],  # [x, y, width, height]
              "clickable": True,  # boolean
              "set_of_marks": False,  # boolean
          }
      }
- new set_of_marks property (computed with JS tag browsergym_set_of_marks), following the WebVoyager implementation (boolean 0 or 1, whether the element should be part of the set-of-marks overlay)
- new clickable property, extracted from Chrome's DOMSnapshot's isClickable
- new info fields "action_exec_start", "action_exec_timeout" and "action_exec_stop" after each env.step() call, useful for video editing
- new resizeable_window parameter in BrowserEnv to switch between setting the viewport size via Chrome (previous behavior, resizeable window and viewport) or via Playwright (new default behavior, viewport is not resizeable)
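As a rough sketch of how the "extra_element_properties" field might be consumed, the snippet below picks out elements that are clickable and mostly visible, and computes a click point from the bbox. The helper names, the 0.5 threshold and the sample dict are assumptions for illustration; real values would come from obs["extra_element_properties"].

```python
def visible_clickable_bids(extra_element_properties, min_visibility=0.5):
    """Return the bids of elements that are clickable and sufficiently visible.

    Hypothetical helper; min_visibility is an arbitrary illustrative threshold.
    """
    return [
        bid
        for bid, props in extra_element_properties.items()
        if props.get("clickable") and (props.get("visibility") or 0.0) >= min_visibility
    ]


def bbox_center(bbox):
    """Center point of a [x, y, width, height] bounding box."""
    x, y, width, height = bbox
    return (x + width / 2, y + height / 2)


# Hand-written sample in the documented shape (not real browser output).
sample = {
    "23": {"visibility": 0.6, "bbox": [56, 345, 12, 20], "clickable": True, "set_of_marks": False},
    "24": {"visibility": 0.0, "bbox": [0, 900, 40, 20], "clickable": True, "set_of_marks": False},
    "a53": {"visibility": 1.0, "bbox": [10, 10, 80, 30], "clickable": False, "set_of_marks": True},
}

bids = visible_clickable_bids(sample)
center = bbox_center(sample["23"]["bbox"])
```

With this sample, only bid "23" passes both checks: "24" is clickable but not visible, and "a53" is visible but not clickable.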
Breaking changes
- changed visibility tag in JS from browsergym_is_in_viewport (boolean 0 or 1) to browsergym_visibility_ratio (value between 0.0 and 1.0), extracted as the visibility extra property (see new features)
- BrowserEnv parameters viewport (viewport size), slow_mo (pause between playwright calls) and timeout (default playwright timeout) are now provided by the task. They can still be set in the environment's constructor to override the value provided by the task, which will display a warning.
- each task inheriting AbstractBrowserTask must now take a seed at instantiation (in constructor), instead of via the task.setup() method. This is also where each task should decide its desired browser setting by setting its attributes task.viewport, task.slow_mo and task.timeout (see point above)
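The interplay between task-provided settings and constructor overrides described above could look roughly like the sketch below. It is self-contained and does not use BrowserGym: `ExampleTask` is a hypothetical stand-in that does not inherit the real AbstractBrowserTask, `resolve_setting` is an invented helper, and all values are illustrative.

```python
import warnings


class ExampleTask:
    """Hypothetical stand-in for a task; not the real AbstractBrowserTask."""

    def __init__(self, seed: int):
        # The seed is now received at instantiation rather than in setup().
        self.seed = seed
        # Browser settings declared by the task itself (illustrative values).
        self.viewport = {"width": 1280, "height": 720}
        self.slow_mo = 500
        self.timeout = 5000


def resolve_setting(name, task_value, env_value=None):
    """Use the task's value unless the environment explicitly overrides it."""
    if env_value is None:
        return task_value
    warnings.warn(f"Overriding the task's {name} ({task_value!r}) with {env_value!r}.")
    return env_value


task = ExampleTask(seed=0)
timeout = resolve_setting("timeout", task.timeout)              # task value, no warning
slow_mo = resolve_setting("slow_mo", task.slow_mo, env_value=0)  # override, emits a warning
```

Using `None` as the "not set" marker lets an explicit override of `0` still count as an override.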
Refactors
- bid-based high-level actions fail faster (500 ms)
- shorter nested bids with alphabetical bids for iframes (21-53 -> a53)
- fix mouse display position in demo mode (absolute -> fixed)
- modern chat theme
- refactored coordinate computation using Chrome's DOMSnapshot instead of JS, should be more robust to edge cases
- refactored visibility computation using the IntersectionObserver API, should be more robust to edge cases
- more robust frame marking, supports edge cases such as sandboxed iframes, and pdf viewers in <embed> tags
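For intuition about what a visibility ratio between 0.0 and 1.0 measures, the sketch below computes the fraction of an element's bounding box that falls inside the viewport. This is only the geometric idea behind an intersection ratio, written in Python with a hypothetical helper name; BrowserGym computes visibility in the browser, and its result may differ (for example when ancestors clip the element).

```python
def visibility_ratio(bbox, viewport_width, viewport_height):
    """Fraction of an element's area that lies inside the viewport.

    bbox is [x, y, width, height] in viewport coordinates.
    Hypothetical helper for illustration only.
    """
    x, y, width, height = bbox
    if width <= 0 or height <= 0:
        return 0.0
    # Clip the box against the viewport edges on each axis.
    visible_width = max(0.0, min(x + width, viewport_width) - max(x, 0))
    visible_height = max(0.0, min(y + height, viewport_height) - max(y, 0))
    return (visible_width * visible_height) / (width * height)


fully_inside = visibility_ratio([0, 0, 100, 100], 1280, 720)
half_cut_off = visibility_ratio([1230, 0, 100, 100], 1280, 720)
off_screen = visibility_ratio([2000, 0, 10, 10], 1280, 720)
```

An element entirely inside the viewport yields 1.0, one entirely outside yields 0.0, and an element straddling an edge yields the visible share of its area.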
browsergym-miniwob
- fixed goal conversion to text in task browsergym/miniwob/click-menu-2
v0.1.0rc7
version bump
v0.1.0rc6
rc6
v0.1.0rc5
version bump
v0.1.0rc4
version bump
v0.1.0rc3
version bump
v0.1.0rc2
rc2
v0.1.0rc1
version bump
v0.1.0rc0
version bump
