- Support ExaoneForCausalLM
- Fix CohereForCausalLM with better quantization logic.
0.7.0
- Further optimization for running FP8 and INT8 quantization.
- Support automatic search for the calibration dataset batch size when running FMO.
- Support AWQ (Activation-aware Weight Quantization).