This release introduces two new solvers and improves the solver initialization process.
New Features
* **Policy Iteration Solver**: A new solver implementing Howard's policy iteration algorithm, which can converge more quickly than value iteration on some problems.
* **Semi-Asynchronous Value Iteration Solver**: A variant of value iteration that updates state values in batches, but allows the new values to be used immediately in subsequent batches on the same device instead of waiting until the next iteration.
* **Improved Solver Architecture**: Refactored solver initialization into distinct phases for better organization and extensibility.
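
To illustrate the two algorithms named above, here is a minimal pure-Python sketch on a toy four-state chain. It is not the mdpax implementation: the toy problem, the function names (`step`, `q_value`, `policy_iteration`, `semi_async_value_iteration`) and the `batch_size` parameter are all assumptions made for this example, and the real solvers run on JAX arrays rather than Python lists.

```python
GAMMA = 0.9          # discount factor (illustrative value)
N_STATES = 4         # states 0..3; state 3 is absorbing
ACTIONS = (0, 1)     # 0 = stay, 1 = move right


def step(s, a):
    """Hypothetical deterministic dynamics: return (next_state, reward)."""
    if s == N_STATES - 1 or a == 0:
        return s, 0.0
    nxt = s + 1
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)


def q_value(s, a, values):
    """One-step lookahead value of taking action a in state s."""
    nxt, reward = step(s, a)
    return reward + GAMMA * values[nxt]


def policy_iteration(max_iter=100, eval_tol=1e-10):
    """Howard-style policy iteration: evaluate, then greedily improve."""
    policy = [0] * N_STATES
    values = [0.0] * N_STATES
    for iteration in range(1, max_iter + 1):
        # Policy evaluation: iterate the fixed-policy backup to convergence.
        values = [0.0] * N_STATES
        for _ in range(1000):
            new = [q_value(s, policy[s], values) for s in range(N_STATES)]
            diff = max(abs(n - v) for n, v in zip(new, values))
            values = new
            if diff < eval_tol:
                break
        # Policy improvement: act greedily with respect to the evaluated values.
        new_policy = [
            max(ACTIONS, key=lambda a: q_value(s, a, values))
            for s in range(N_STATES)
        ]
        if new_policy == policy:  # stable policy, so stop
            return policy, values, iteration
        policy = new_policy
    return policy, values, max_iter


def semi_async_value_iteration(batch_size=2, tol=1e-8, max_iter=1000):
    """Value iteration in batches; later batches see earlier batches' updates."""
    values = [0.0] * N_STATES
    for iteration in range(1, max_iter + 1):
        delta = 0.0
        for start in range(0, N_STATES, batch_size):
            batch = range(start, min(start + batch_size, N_STATES))
            # All states in a batch are computed from the same snapshot ...
            new = [max(q_value(s, a, values) for a in ACTIONS) for s in batch]
            # ... and written back at once, before the next batch is computed.
            for s, v in zip(batch, new):
                delta = max(delta, abs(v - values[s]))
                values[s] = v
        if delta < tol:
            return values, iteration
    return values, max_iter


policy, pi_values, _ = policy_iteration()
vi_values, _ = semi_async_value_iteration(batch_size=2)
print("policy iteration policy:", policy)
print("semi-async VI values:", vi_values)
```

In a fully synchronous sweep, every batch would read from the values of the previous iteration. Here, writing each batch back before computing the next is what makes the update semi-asynchronous.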
Documentation
* Updated API documentation with the new solver classes
Breaking Changes
The internal solver initialization process has been restructured:
* Removed: `_setup_solver()` method from the base `Solver` class
* Added: New granular setup methods for different initialization phases
  - `_setup_config`
  - `_setup_batch_processing`
  - `_setup_jax_functions`
  - `_setup_convergence_testing`
  - `_initialize_solver_state_elements`
  - `_setup_additional_components`
Migration Guide
If you have implemented a custom solver using the `Solver` ABC, you will need to replace your `_setup_solver()` implementation with the new phase-specific methods:
1. Move configuration setup to `_setup_config`
2. Move JAX function transformations to `_setup_jax_functions`
3. Move convergence testing setup to `_setup_convergence_testing`
4. Move state initialization to `_initialize_solver_state_elements`
5. Use `_setup_additional_components` for any final processes, such as setting up checkpointing
Note: These changes only affect custom solver implementations. Code using the built-in solvers will continue to work without changes.
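
As a rough illustration of the migration steps above, the sketch below splits a monolithic setup into the phase-specific methods. It uses a stand-in base class, `PhasedSolverSketch`, rather than the real mdpax `Solver`. The constructor arguments, the order in which the phases are called (taken here from the order they are listed in these notes), the attributes set in each phase, and which methods are optional are all assumptions; check the API documentation for the actual signatures.

```python
PHASES = (
    "_setup_config",
    "_setup_batch_processing",
    "_setup_jax_functions",
    "_setup_convergence_testing",
    "_initialize_solver_state_elements",
    "_setup_additional_components",
)


class PhasedSolverSketch:
    """Stand-in for a phase-based base class; NOT the real mdpax Solver."""

    def __init__(self, gamma=0.9, epsilon=1e-6):
        self.gamma = gamma
        self.epsilon = epsilon
        self.phases_run = []
        # Assumed call order: the order the methods appear in the release notes.
        for name in PHASES:
            getattr(self, name)()
            self.phases_run.append(name)

    # Default no-op phases so subclasses override only what they need
    # (whether the real base class provides defaults is an assumption).
    def _setup_config(self): pass
    def _setup_batch_processing(self): pass
    def _setup_jax_functions(self): pass
    def _setup_convergence_testing(self): pass
    def _initialize_solver_state_elements(self): pass
    def _setup_additional_components(self): pass


class MyCustomSolver(PhasedSolverSketch):
    """Hypothetical custom solver after migrating away from _setup_solver()."""

    def _setup_config(self):
        # 1. Configuration that used to sit at the top of _setup_solver().
        self.n_states = 4

    def _setup_jax_functions(self):
        # 2. Placeholder for jit/vmap-transformed update functions.
        self.update_fn = lambda values: [self.gamma * v for v in values]

    def _setup_convergence_testing(self):
        # 3. Convergence criterion.
        self.has_converged = lambda delta: delta < self.epsilon

    def _initialize_solver_state_elements(self):
        # 4. Initial solver state.
        self.values = [0.0] * self.n_states
        self.iteration = 0

    def _setup_additional_components(self):
        # 5. Final steps, such as checkpointing.
        self.checkpointing_enabled = False


solver = MyCustomSolver()
print(solver.phases_run)
```

Keeping each concern in its own method means a subclass can override a single phase, for example the convergence test, without copying the rest of the setup logic.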
**Full Changelog**: https://github.com/joefarrington/mdpax/compare/v0.1.0...v0.2.0