New Features
- New learning algorithms: QLearning, SARSA, Expected SARSA, DoubleQLearning
- Policy Iteration
- Entropy-Regularized Policy Iteration
- Works with Python 3.9
- QuickMDP and QuickTabularMDP constructors
- Construction of TabularMDPs from matrices
- New domains: CliffWalking, GridMDP generic class, Russell & Norvig gridworld example
- Gridworld plotting of action values
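
To give a sense of what policy iteration over a matrix-defined tabular MDP involves, here is a minimal standalone sketch. It deliberately does not use this package's API: the function `policy_iteration`, the nested-list layout `T[s][a][s2]` / `R[s][a][s2]`, and the three-state chain are all assumptions made for illustration, not the library's actual interface.

```python
def policy_iteration(T, R, gamma=0.9, tol=1e-10):
    """Policy iteration for a tabular MDP given as nested lists (hypothetical layout).

    T[s][a][s2] is the probability of moving from s to s2 under action a.
    R[s][a][s2] is the reward received on that transition.
    """
    n_states, n_actions = len(T), len(T[0])

    def q_value(V, s, a):
        # Expected one-step return of taking action a in state s, then following V.
        return sum(T[s][a][s2] * (R[s][a][s2] + gamma * V[s2])
                   for s2 in range(n_states))

    policy = [0] * n_states
    V = [0.0] * n_states
    while True:
        # Policy evaluation: repeat the Bellman expectation backup until it settles.
        while True:
            delta = 0.0
            for s in range(n_states):
                v_new = q_value(V, s, policy[s])
                delta = max(delta, abs(v_new - V[s]))
                V[s] = v_new
            if delta < tol:
                break
        # Policy improvement: switch only to strictly better actions.
        stable = True
        for s in range(n_states):
            best = max(range(n_actions), key=lambda a: q_value(V, s, a))
            if q_value(V, s, best) > q_value(V, s, policy[s]) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Toy three-state chain: action 0 stays put, action 1 moves right.
# State 2 is absorbing; entering it from state 1 pays a reward of 1.
T = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],
]
R = [[[0.0] * 3 for _ in range(2)] for _ in range(3)]
R[1][1][2] = 1.0

policy, V = policy_iteration(T, R, gamma=0.9)
print(policy, V)
```

On this chain the procedure should settle on moving right in states 0 and 1. The strict-improvement check keeps ties from flipping the policy back and forth, which is one simple way to guarantee termination.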