
Commit 7c0791d

v0.2.1 (#8)
* fix problematic task configs
* version bump
* missing author
* black config
* README update
* unit tests fix
* pypi workflow update
* README update with dashboard tasks
1 parent 66b7078 commit 7c0791d


11 files changed: +8171 -8108 lines changed


.github/workflows/pypi.yml

Lines changed: 3 additions & 2 deletions
@@ -48,10 +48,11 @@ jobs:
       uses: pypa/gh-action-pypi-publish@release/v1
 
   github-release:
-    name: Sign with Sigstore and upload them to GitHub Release
+    name: Sign packages with Sigstore and upload them to GitHub Release
     needs:
     - publish-to-pypi
     runs-on: ubuntu-latest
+
     permissions:
       contents: write # IMPORTANT: mandatory for making GitHub Releases
       id-token: write # IMPORTANT: mandatory for sigstore
@@ -64,7 +65,7 @@ jobs:
         path: dist/
 
     - name: Sign the dists with Sigstore
-      uses: sigstore/gh-action-sigstore-python@v1.2.3
+      uses: sigstore/gh-action-sigstore-python@v2.1.1
       with:
         inputs: >-
           ./dist/*.tar.gz

.github/workflows/unit_tests.yml

Lines changed: 26 additions & 24 deletions
@@ -34,38 +34,39 @@ jobs:
         run: black . --check
 
   browsergym-workarena-fast:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-latest
 
-    defaults:
-      run:
-        shell: bash -l {0}
+    defaults:
+      run:
+        shell: bash -l {0}
 
-    steps:
+    steps:
 
-      - name: Checkout Repository
-        uses: actions/checkout@v4
+      - name: Checkout Repository
+        uses: actions/checkout@v4
 
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: '3.10'
-          cache: 'pip' # caching pip dependencies
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.10'
+          cache: 'pip' # caching pip dependencies
 
-      - name: Pip install
-        run: pip install -r requirements.txt
+      - name: Pip install
+        working-directory: ./dev
+        run: pip install -r requirements.txt
 
-      - name: Pip list
-        run: pip list
+      - name: Pip list
+        run: pip list
 
-      - name: Install Playwright
-        run: playwright install --with-deps
+      - name: Install Playwright
+        run: playwright install --with-deps
 
-      - name: Run non-slow browsergym-workarena Unit Tests
-        env:
-          SNOW_INSTANCE_URL: ${{ secrets.SNOW_INSTANCE_URL }}
-          SNOW_INSTANCE_UNAME: ${{ secrets.SNOW_INSTANCE_UNAME }}
-          SNOW_INSTANCE_PWD: ${{ secrets.SNOW_INSTANCE_PWD }}
-        run: pytest -n 5 --durations=10 -m 'not slow and not pricy' --slowmo 1000 -v tests
+      - name: Run non-slow browsergym-workarena Unit Tests
+        env:
+          SNOW_INSTANCE_URL: ${{ secrets.SNOW_INSTANCE_URL }}
+          SNOW_INSTANCE_UNAME: ${{ secrets.SNOW_INSTANCE_UNAME }}
+          SNOW_INSTANCE_PWD: ${{ secrets.SNOW_INSTANCE_PWD }}
+        run: pytest -n 5 --durations=10 -m 'not slow and not pricy' --slowmo 1000 -v tests
 
   browsergym-workarena-slow:
     runs-on: ubuntu-latest
@@ -86,6 +87,7 @@ jobs:
           cache: 'pip' # caching pip dependencies
 
       - name: Pip install
+        working-directory: ./dev
         run: pip install -r requirements.txt
 
       - name: Pip list

README.md

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -11,11 +11,11 @@ WorkArena is included in [BrowserGym](https://github.com/ServiceNow/BrowserGym),
1111
https://github.com/ServiceNow/WorkArena/assets/2374980/68640f09-7d6f-4eb1-b556-c294a6afef70
1212

1313
## ⚠️ Pre-Release warning ⚠️
14-
Please note that the WorkArena benchmark is still undergoing minor bug fixes and updates, which may cause discrepancies with results reported in our latest arXiv preprint. We plan to release soon a stable version of WorkArena v0.1.0 with enhanced stability, and a final version v1.0.0 with a new suite of tasks.
14+
Please note that the WorkArena benchmark is still undergoing minor bug fixes and updates, which may cause discrepancies with results reported in our latest arXiv preprint. We plan to release soon a stable version of WorkArena with enhanced stability, and a final version v1.0.0 with a new suite of tasks.
1515

1616
## Benchmark Contents
1717

18-
At the moment, WorkArena includes `18,050` task instances drawn from `29` tasks that cover the main components of the ServiceNow user interface. The following videos show an agent built on `GPT-4-vision` interacting with every such component. As emphasized by our results, this benchmark is not solved and thus, the performance of the agent is not always on point.
18+
At the moment, WorkArena includes `18,050` task instances drawn from `33` tasks that cover the main components of the ServiceNow user interface. The following videos show an agent built on `GPT-4-vision` interacting with every such component. As emphasized by our results, this benchmark is not solved and thus, the performance of the agent is not always on point.
1919

2020
### Knowledge Bases
2121

@@ -51,14 +51,20 @@ https://github.com/ServiceNow/WorkArena/assets/1726818/7538b3ef-d39b-4978-b9ea-8
5151

5252
https://github.com/ServiceNow/WorkArena/assets/1726818/ca26dfaf-2358-4418-855f-80e482435e6e
5353

54+
### Dashboards
55+
56+
**Goal:** The agent must extract information from a dashboard.
57+
58+
59+
5460
## Getting Started
5561

5662
To setup WorkArena, you will need to get your own ServiceNow instance, install our Python package, and upload some data to your instance. Follow the steps below to achieve this.
5763

5864
### a) Create a ServiceNow Developer Instance
5965

6066
1. Go to https://developer.servicenow.com/ and create an account.
61-
2. Click on `Request an instance` and select the `Utah` release (initializing the instance will take a few minutes)
67+
2. Click on `Request an instance` and select the `Washington` release (initializing the instance will take a few minutes)
6268
3. Once the instance is ready, you should see your instance URL and credentials. If not, click _Return to the Developer Portal_, then navigate to _Manage instance password_ and click _Reset instance password_.
6369
4. You should now see your URL and credentials. Based on this information, set the following environment variables:
6470
* `SNOW_INSTANCE_URL`: The URL of your ServiceNow developer instance
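
The README excerpt above is cut off after `SNOW_INSTANCE_URL`; the unit test workflow in this commit also reads `SNOW_INSTANCE_UNAME` and `SNOW_INSTANCE_PWD`. A minimal sketch of a pre-flight check for those three variables follows. The helper name `missing_snow_vars` and the example URL are hypothetical and not part of WorkArena.

```python
import os

# Variable names taken from the workflow's `env:` block in this commit.
REQUIRED_VARS = ("SNOW_INSTANCE_URL", "SNOW_INSTANCE_UNAME", "SNOW_INSTANCE_PWD")


def missing_snow_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real credentials.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Hypothetical environment with only the URL set (placeholder value).
example_env = {"SNOW_INSTANCE_URL": "https://example.service-now.com"}
print(missing_snow_vars(example_env))
# → ['SNOW_INSTANCE_UNAME', 'SNOW_INSTANCE_PWD']
```

Calling `missing_snow_vars()` with no argument checks the real environment, which may be useful before launching the test suite locally.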

dev/environment.yaml

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
name: workarena-dev
2+
3+
channels:
4+
- huggingface
5+
- conda-forge
6+
- defaults
7+
8+
dependencies:
9+
- python>=3.10
10+
- pip
11+
12+
- pip:
13+
- -r requirements.txt

dev/requirements.txt

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
black[jupyter]==24.2.0
2+
blacken-docs
3+
pre-commit
4+
pytest==7.3.2
5+
pytest-xdist
6+
pytest-playwright
7+
tenacity
8+
browsergym-core
9+
-e .. # local package

pyproject.toml

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -11,6 +11,7 @@ authors = [
1111
{name = "Maxime Gasse"},
1212
{name = "Alex Lacoste"},
1313
{name = "Manuel Del Verme"},
14+
{name = "Megh Thakkar"},
1415
]
1516
readme = "README.md"
1617
requires-python = ">3.7"
@@ -39,3 +40,30 @@ files = ["requirements.txt"]
3940

4041
[tool.hatch.build.targets.wheel]
4142
packages = ["src/browsergym"]
43+
44+
[tool.black]
45+
line-length = 100
46+
include = '\.pyi?$'
47+
exclude = '''
48+
/(
49+
\.eggs
50+
| \.git
51+
| \.hg
52+
| \.mypy_cache
53+
| \.nox
54+
| \.tox
55+
| \.venv
56+
| _build
57+
| buck-out
58+
| build
59+
| dist
60+
)/
61+
'''
62+
63+
[tool.pytest.ini_options]
64+
filterwarnings = [
65+
'ignore::UserWarning:gymnasium.*:', # too many "The obs is not within the observation space." warnings.
66+
]
67+
markers = [
68+
"slow: marks tests as slow (deselect with '-m \"not slow\"')",
69+
]

requirements.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
browsergym-core==0.2.0
1+
browsergym-core>=0.2
22
english-words>=2.0.1
33
faker>=24.11.0
44
numpy>=1.14
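
The change above relaxes an exact pin (`==0.2.0`) to a minimum bound (`>=0.2`). The sketch below illustrates the practical difference for plain numeric release strings only. It is a simplified stand-in, not pip's resolver: the helper names are hypothetical, and pre-releases, wildcards, and other PEP 440 forms are not handled (the third-party `packaging` library covers those).

```python
def parse_release(text):
    """Turn a dotted release string such as "0.2.1" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def _padded(a, b):
    """Zero-pad two release tuples to equal length, so "0.2" equals "0.2.0"."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies_minimum(installed, minimum):
    """Approximate a `>=minimum` requirement for numeric releases."""
    a, b = _padded(parse_release(installed), parse_release(minimum))
    return a >= b


def satisfies_pin(installed, pinned):
    """Approximate an `==pinned` requirement for numeric releases."""
    a, b = _padded(parse_release(installed), parse_release(pinned))
    return a == b


# A hypothetical later release passes the new bound but not the old pin.
print(satisfies_pin("0.2.1", "0.2.0"))    # → False
print(satisfies_minimum("0.2.1", "0.2"))  # → True
```

Under the old pin, any patch release of `browsergym-core` other than `0.2.0` would be rejected; under the new bound, `0.2.0` and anything newer is accepted.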

src/browsergym/workarena/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
__version__ = "0.2.0"
1+
__version__ = "0.2.1"
22

33
from browsergym.core.registration import register_task
44
