===
- Added block-coordinate Frank-Wolfe (BCFW).
- Added averaging in stochastic gradient descent.
- Added "snakes" dataset.
- Much faster interface to OpenGM.
- Speed improvements in loss-augmented inference.
- Renamed psi to joint_feature: the joint feature function is sometimes also called phi, with psi then referring to the energy, so the old name was ambiguous.
- Removed the GLPK dependency; cvxopt is now used to solve linear programs.