Browsergym

Latest version: v0.13.1

0.13.1

What's Changed

**browsergym-experiments**

* webarena / visualwebarena instance massage after reset 248 250 254 259

**browsergym-core**

* Fixed gym warnings "obs not within observation space" 251
* Trace downgrades from `INFO` to `DEBUG` 252
* More robust `env.close()`, can now be used in a finally block even after reset failure 253
* Optional `AbstractBrowserTask.teardown()` method 255
* Browsergym's `register_task()` now supports both frozen, non-overrideable `task_kwargs` as well as overrideable `default_task_kwargs` arguments 255
* More robust frame marking 256 258
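
The difference between frozen `task_kwargs` and overrideable `default_task_kwargs` can be illustrated with a small standalone sketch. This is not BrowserGym's implementation: `frozen_partial`, `make_task_factory`, and `DemoTask` are hypothetical names introduced only for illustration, and the sketch assumes that overriding a frozen argument should raise an error while overriding a default should simply win.

```python
from functools import partial


def frozen_partial(func, **frozen_kwargs):
    """Bind keyword arguments that callers are not allowed to override."""

    def wrapper(*args, **kwargs):
        clashes = sorted(set(kwargs) & set(frozen_kwargs))
        if clashes:
            raise ValueError(f"cannot override frozen argument(s): {clashes}")
        return func(*args, **kwargs, **frozen_kwargs)

    return wrapper


def make_task_factory(task_class, task_kwargs=None, default_task_kwargs=None):
    """Hypothetical helper: return a callable that builds the task."""
    task_kwargs = dict(task_kwargs or {})
    default_task_kwargs = dict(default_task_kwargs or {})
    overlap = sorted(set(task_kwargs) & set(default_task_kwargs))
    if overlap:
        raise ValueError(f"argument(s) both frozen and default: {overlap}")
    # Defaults are bound with functools.partial, so call-time keywords win.
    return partial(frozen_partial(task_class, **task_kwargs), **default_task_kwargs)


class DemoTask:
    """Stand-in task class, not part of BrowserGym."""

    def __init__(self, seed, start_url="https://example.com", max_items=5):
        self.seed = seed
        self.start_url = start_url
        self.max_items = max_items


factory = make_task_factory(
    DemoTask,
    task_kwargs={"start_url": "https://example.org"},  # frozen
    default_task_kwargs={"max_items": 10},  # overrideable
)

task = factory(seed=0, max_items=3)  # overriding a default is allowed
```

In this sketch, calling `factory(seed=0, start_url="https://other.example")` raises `ValueError`, because `start_url` was frozen when the factory was built.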

**browsergym-assistantbench**

* Refactored AssistantBench mechanism for saving test predictions to JSON files 242

**browsergym-webarena**

* Relaxed playwright<1.40 restriction 257

**browsergym-visualwebarena**

* Relaxed playwright<1.40 restriction 257

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.13.0...v0.13.1

0.13.0

What's Changed

**browsergym-core**

* More robust frame marking with lenient last try 245
* Tasks can now choose their own `locale` and `timezone_id` 244
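
As a rough sketch of what per-task `locale` and `timezone_id` could look like in practice: Playwright's `browser.new_context()` accepts `locale` and `timezone_id` keyword arguments, so a task's preferences can be turned into context options. How BrowserGym actually forwards these values is not shown in these notes. `TaskLocaleSettings` and `context_options` below are hypothetical names, and treating `None` as "keep the browser default" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskLocaleSettings:
    """Hypothetical container for a task's preferred browser settings."""

    locale: Optional[str] = None  # e.g. "en-US"; None is assumed to keep the default
    timezone_id: Optional[str] = None  # e.g. "America/New_York"


def context_options(settings: TaskLocaleSettings) -> dict:
    """Build keyword arguments for a browser context, skipping unset values."""
    options = {"locale": settings.locale, "timezone_id": settings.timezone_id}
    return {key: value for key, value in options.items() if value is not None}


# Values matching the AssistantBench settings listed in this release.
pinned = TaskLocaleSettings(locale="en-US", timezone_id="America/New_York")
unspecified = TaskLocaleSettings()

pinned_options = context_options(pinned)
default_options = context_options(unspecified)
```

With Playwright installed, a dictionary like `pinned_options` could be passed as `browser.new_context(**pinned_options)`. An empty dictionary leaves the browser defaults untouched.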

**browsergym-experiments**

* Pre-download WebLINX data in prepare_backend() 226
* Increase AssistantBench max_steps to 30 244
* Add `select_option` to webarena / visualwebarena default action set 247

**browsergym-visualwebarena**

* Hide huggingface progress bar when downloading the visual evaluation model 241

**browsergym-assistantbench**

* Set `locale="en-US"` and `timezone_id="America/New_York"`

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.12.0...v0.13.0

0.12.0

Bugfixes

**browsergym-experiments**

* Fixes WebLINX task list 235
* Refactors experiment ID generation 236
* Adds VisualWebArena task dependencies 237 239

**browsergym-visualwebarena**

* Fixes VisualWebArena tasks with visual validation (missing captioning_fn in evaluator) 240
* Adds a `torch` dependency (to run the captioning model) 240


**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.3...v0.12.0

0.11.3

Bugfixes

* Fix duplicate depends_on in webarena metadata 228

Improvements

* Easier webarena / visualwebarena setup (runs `nltk.download()` at import time) 227
* More robust `full_reset()` for webarena / visualwebarena 230
* Removed ARIA extraction warnings 233
* New benchmark configuration `webarena_tiny` 232

**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.2...v0.11.3

0.11.2

Bugfixes

* Add incomplete `ExpResult.status` 225


**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.1...v0.11.2

0.11.1

New features

* Set max steps to 30 in webarena / visualwebarena benchmarks 214
* Benchmark dependency graph utilities 220
* Include nltk.download() in prepare_backend() for webarena / visualwebarena benchmarks 224

Bugfixes

* Rename benchmark after subset_from_split() 221
* ExpArgs.exp_dir sanitization 222
* get_step_info() bugfix 223


**Full Changelog**: https://github.com/ServiceNow/BrowserGym/compare/v0.11.0...v0.11.1
