
Google Releases Quantization Aware Training for TensorFlow Model Optimization

Google has announced the release of the Quantization Aware Training (QAT) API for its TensorFlow Model Optimization Toolkit. QAT simulates low-precision hardware during neural-network training, incorporating the quantization error into the overall network loss so that training learns parameters that minimize the accuracy impact of post-training quantization.
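The core idea behind QAT is "fake quantization": in the forward pass, each value is rounded to the nearest level representable at low precision and then mapped back to float, so the quantization error flows into the loss. The sketch below (plain Python, hypothetical function name, assumed symmetric range) illustrates the mechanism, not the actual TensorFlow implementation:

```python
def fake_quantize(x, num_bits=8, x_min=-1.0, x_max=1.0):
    """Simulate low-precision storage of a float value.

    Rounds x to the nearest of 2**num_bits - 1 evenly spaced levels
    in [x_min, x_max], then maps it back to float. The difference
    between x and the result is the quantization error that QAT
    exposes to the training loss.
    """
    levels = 2 ** num_bits - 1
    scale = (x_max - x_min) / levels          # step between levels
    x_clipped = min(max(x, x_min), x_max)     # clamp to the range
    q = round((x_clipped - x_min) / scale)    # nearest integer level
    return x_min + q * scale                  # map back to float
```

With 8 bits over [-1, 1] the step size is about 0.0078, so the error introduced for any in-range value is at most half a step.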

Pulkit Bhuwalka, a Google software engineer, gave an overview of the new API at the recent TensorFlow Dev Summit. TensorFlow's mobile and IoT toolkit, TensorFlow Lite, supports post-training quantization of models, which can reduce model size by up to 4x and increase inference speed by up to 1.5x.
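The 4x size reduction comes from storing each weight as an 8-bit integer instead of a 32-bit float, together with a per-tensor scale and zero point that map integers back to real values. The following is a minimal plain-Python sketch of such an affine quantization scheme (hypothetical helper names, not TensorFlow Lite's actual code):

```python
def quantize_tensor(values, num_bits=8):
    """Map floats to integers plus a (scale, zero_point) pair.

    Uses the affine scheme real = scale * (q - zero_point), with q
    stored in signed num_bits integers. The representable range is
    chosen to cover the data and to include 0.0 exactly.
    """
    qmin, qmax = -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1
    lo, hi = min(min(values), 0.0), max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)     # integer that maps to 0.0
    q = [max(qmin, min(qmax, round(v / scale) + zero_point))
         for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from quantized integers."""
    return [scale * (qi - zero_point) for qi in q]
```

Each int8 weight occupies 1 byte versus 4 bytes for float32, which is where the up-to-4x size reduction comes from; the reconstruction error per weight is bounded by half the scale step.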
