Browsergym

Latest version: v0.13.3

0.11.2

Bugfixes

* Add incomplete `ExpResult.status` #225


**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.1...v0.11.2

0.11.1

New features

* Set max steps to 30 in webarena / visualwebarena benchmarks #214
* Benchmark dependency graph utilities #220
* Include `nltk.download()` in `prepare_backend()` for webarena / visualwebarena benchmarks #224

Bugfixes

* Rename benchmark after `subset_from_split()` #221
* `ExpArgs.exp_dir` sanitization #222
* `get_step_info()` bugfix #223


**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.0...v0.11.1

0.11.0

New features

**browsergym-experiments**

* New `weblinx` benchmark 🎉 #208 (thanks xhluca)
* New `ExpResults.status()` #219 (thanks recursix)

**browsergym-core**

* New `hide_all_bids` option in `flatten_dom_to_str()` and `flatten_axtree_to_str()` #212 (thanks imenelydiaker)
* Leaner `Unicode()` gym space #218

Bugfixes

* `Benchmark.prepare_backends()` fixes #209

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.10.2...v0.11.0

0.10.2

New features

* New `Benchmark.prepare_backend()` method #204

Bugfixes

* `save_step_info()` bugfix when `obs==None` (truncated episode due to `None` action) #207

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.10.1...v0.10.2

0.10.1

Minor changes

* Train / test splits for WorkArena L2 and L3 tasks #203
* More fine-grained per-benchmark action sets #202

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.10.0...v0.10.1

0.10.0

New features

* New BrowserGym benchmark [AssistantBench](https://assistantbench.github.io/), packaged as `browsergym-assistantbench`. Thanks oriyor! #186
```python
import gymnasium as gym
import browsergym.assistantbench  # importing registers the AssistantBench environments

env = gym.make("browsergym/assistantbench.validation.12")
env = gym.make("browsergym/assistantbench.test.42")
```

* Default train/test splits for all benchmarks
```python
from browsergym.experiments.benchmark import DEFAULT_BENCHMARKS

miniwob = DEFAULT_BENCHMARKS["miniwob"]  # 125 tasks x 5 seeds
miniwob_train = miniwob.subset_from_split("train")  # 62 tasks x 5 seeds
miniwob_test = miniwob.subset_from_split("test")  # 63 tasks x 5 seeds
```
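
The 62 / 63 task counts above are what a half-and-half split of 125 tasks gives when the odd task lands in the test set. The sketch below shows one way such a deterministic split could be computed. `split_tasks`, the hash-based ordering, and the placeholder task names are assumptions made for illustration; this is not BrowserGym's implementation of `subset_from_split()`.

```python
import hashlib


def split_tasks(task_names, train_fraction=0.5):
    """Hypothetical helper: deterministically split task names into train / test."""
    # Rank tasks by a stable hash so the split does not depend on input order.
    ranked = sorted(task_names, key=lambda name: hashlib.sha256(name.encode()).hexdigest())
    # Round down, so with an odd number of tasks the extra one goes to the test set.
    n_train = int(len(ranked) * train_fraction)
    return {"train": sorted(ranked[:n_train]), "test": sorted(ranked[n_train:])}


# Placeholder names, not the real MiniWoB task list.
tasks = [f"miniwob.task-{i}" for i in range(125)]
splits = split_tasks(tasks)
print(len(splits["train"]), len(splits["test"]))  # 62 63
```

Because the ordering depends only on the task names, re-running the split (or passing the tasks in a different order) yields the same train and test sets.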


Breaking Changes

* Various updates and refactors to the new `Benchmark` class #197 #198 #199

Fixes

* Improved experiment logging #182

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.9.0...v0.10.0
