Browsergym

Latest version: v0.13.3

0.13.3

What's Changed

**browsergym-core**

* Optional method `AbstractBrowserTask.get_task_id()` #281
* Fixed `BrowserEnv` parameter `resizeable_window`, now working as expected #281

**browsergym-experiments**

* Metadata column fix for visualwebarena #278

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.13.2...v0.13.3

0.13.2

What's Changed

**browsergym-experiments**

* Experiment traces can now be exported into the [TapeAgents](https://github.com/ServiceNow/TapeAgents) format #238
* Installs `weblinx_browsergym` as a dependency #261
* WA/VWA full instance reset now issues a warning instead of crashing when not properly set up #272
* New debug benchmark `visualwebarena_tiny` #271

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.13.1...v0.13.2

0.13.1

What's Changed

**browsergym-experiments**

* webarena / visualwebarena instance massage after reset #248 #250 #254 #259

**browsergym-core**

* Fixed gym warnings "obs not within observation space" #251
* Trace logging downgraded from `INFO` to `DEBUG` #252
* More robust `env.close()`, can now be used in a `finally` block even after a reset failure #253
* Optional `AbstractBrowserTask.teardown()` method #255
* BrowserGym's `register_task()` now supports both frozen, non-overrideable `task_kwargs` and overrideable `default_task_kwargs` arguments #255
* More robust frame marking #256 #258
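
The difference between frozen and overrideable arguments in `register_task()` can be illustrated with a minimal, self-contained sketch. It does not import BrowserGym: `make_task_factory` and `describe_task` are hypothetical names, and raising `ValueError` when a caller tries to override a frozen argument is an assumption about the behavior, not something these release notes state.

```python
from typing import Any, Callable, Dict, Optional


def make_task_factory(
    task_fn: Callable[..., Any],
    task_kwargs: Optional[Dict[str, Any]] = None,
    default_task_kwargs: Optional[Dict[str, Any]] = None,
) -> Callable[..., Any]:
    """Hypothetical helper: bind frozen kwargs and overrideable defaults to task_fn."""
    frozen = dict(task_kwargs or {})
    defaults = dict(default_task_kwargs or {})

    def factory(**overrides: Any) -> Any:
        clashing = set(overrides) & set(frozen)
        if clashing:
            # Assumption: overriding a frozen argument is treated as an error.
            raise ValueError(f"cannot override frozen arguments: {sorted(clashing)}")
        # Caller-supplied values take precedence over the registered defaults.
        merged = {**defaults, **overrides}
        return task_fn(**frozen, **merged)

    return factory


def describe_task(task_name: str, seed: int = 0, viewport: str = "1280x720") -> Dict[str, Any]:
    """Stand-in for a task constructor; it only echoes its arguments."""
    return {"task_name": task_name, "seed": seed, "viewport": viewport}


factory = make_task_factory(
    describe_task,
    task_kwargs={"task_name": "demo"},   # frozen: callers cannot change this
    default_task_kwargs={"seed": 42},    # default: callers may override this
)

print(factory())         # uses the registered default seed
print(factory(seed=7))   # overrides the default seed

try:
    factory(task_name="other")  # attempts to override a frozen argument
except ValueError as exc:
    print("rejected:", exc)
```

In this sketch the frozen arguments define what the registered task is, while the defaults only supply values a caller may replace when creating an environment.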

**browsergym-assistantbench**

* Refactored AssistantBench mechanism for saving test predictions to JSON files #242

**browsergym-webarena**

* Relaxed `playwright<1.40` restriction #257

**browsergym-visualwebarena**

* Relaxed `playwright<1.40` restriction #257

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.13.0...v0.13.1

0.13.0

What's Changed

**browsergym-core**

* More robust frame marking with lenient last try #245
* Tasks can now choose their own `locale` and `timezone_id` #244

**browsergym-experiments**

* Pre-download WebLINX data in `prepare_backend()` #226
* Increase AssistantBench `max_steps` to 30 #244
* Add `select_option` to webarena / visualwebarena default action set #247

**browsergym-visualwebarena**

* Hide Hugging Face progress bar when downloading the visual evaluation model #241

**browsergym-assistantbench**

* Set `locale="en-US"` and `timezone_id="America/New_York"`

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.12.0...v0.13.0

0.12.0

Bugfixes

**browsergym-experiments**

* Fixes WebLINX task list #235
* Refactors experiment ID generation #236
* Adds VisualWebArena task dependencies #237 #239

**browsergym-visualwebarena**

* Fixes VisualWebArena tasks with visual validation (missing `captioning_fn` in evaluator) #240
* Adds a `torch` dependency (to run the captioning model) #240


**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.3...v0.12.0

0.11.3

Bugfixes

* Fix duplicate `depends_on` in webarena metadata #228

Improvements

* Easier webarena / visualwebarena setup (runs `nltk.download()` at import time) #227
* More robust `full_reset()` for webarena / visualwebarena #230
* Removed ARIA extraction warnings #233
* New benchmark configuration `webarena_tiny` #232

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.2...v0.11.3
