Updated official README and configs
- More detailed instructions (PRs 55, 56)
- Restructured official configs (PR 55)
- Updated FT config for ImageNet (PR 55)
Support detailed training configurations
- Step-wise parameter updates in addition to epoch-wise parameter updates (PR 58)
- Gradient accumulation (PR 58)
- Max gradient norm for gradient clipping (PR 58)
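
To illustrate how the three training options above can interact, here is a minimal, dependency-free sketch of a toy SGD loop. It is not this project's implementation or config schema: the function names and the parameters `grad_accum_step`, `max_grad_norm`, `scheduling_step`, and `lr_decay` are hypothetical, and the linear model is only a stand-in for a real network.

```python
import math


def clip_grad_norm(grads, max_norm):
    """Rescale grads in place so their L2 norm is at most max_norm.

    Returns the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for i in range(len(grads)):
            grads[i] *= scale
    return total_norm


def batch_gradient(params, batch):
    """Gradient of the mean squared error of y = w * x + b over one batch."""
    w, b = params
    n = len(batch)
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in batch) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in batch) / n
    return [grad_w, grad_b]


def train(batches, num_epochs, lr, grad_accum_step=1, max_grad_norm=None,
          lr_decay=0.5, scheduling_step=0):
    """Toy SGD loop; every argument name here is hypothetical.

    scheduling_step > 0: decay lr every `scheduling_step` parameter updates
    (step-wise). scheduling_step == 0: decay lr once per epoch (epoch-wise).
    """
    params = [0.0, 0.0]
    num_updates = 0
    for _ in range(num_epochs):
        # Leftover batches that do not fill a window are dropped in this sketch.
        accum = [0.0, 0.0]
        for batch_idx, batch in enumerate(batches, start=1):
            grads = batch_gradient(params, batch)
            # Average over the accumulation window so that, for equally sized
            # batches, the update matches a single larger batch.
            for i, g in enumerate(grads):
                accum[i] += g / grad_accum_step
            if batch_idx % grad_accum_step != 0:
                continue  # keep accumulating; no parameter update yet
            if max_grad_norm is not None:
                clip_grad_norm(accum, max_grad_norm)
            params = [p - lr * g for p, g in zip(params, accum)]
            accum = [0.0, 0.0]
            num_updates += 1
            if scheduling_step > 0 and num_updates % scheduling_step == 0:
                lr *= lr_decay  # step-wise schedule
        if scheduling_step == 0:
            lr *= lr_decay  # epoch-wise schedule
    return params, num_updates, lr


# Four single-sample batches per epoch drawn from y = 2x + 1.
batches = [[(x, 2.0 * x + 1.0)] for x in (0.0, 1.0, 2.0, 3.0)]
_, updates, _ = train(batches, num_epochs=2, lr=0.1,
                      grad_accum_step=2, max_grad_norm=1.0)
print(updates)  # two parameter updates per epoch
```

In this sketch, clipping is applied to the accumulated gradient just before the parameter update, not to each batch gradient separately. That is one common ordering and is assumed here, not taken from the PRs.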
Bug/Typo fixes
- Bug fixes (PRs 54, 57)
- Typo fixes (PRs 53, 58)