The AI Field Guide / Q

Letter Q

1 term, explained without the techno-murk.

Quantization

Shrinking a model by storing its numbers with less precision.

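Quantization can reduce memory use and make inference faster, helping models run on smaller devices. Storing each number in 8 bits instead of 32, for instance, takes about a quarter of the memory. The trade-off is a possible loss of accuracy, or of subtler capabilities that are harder to measure.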
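To make the trade-off concrete, here is a minimal sketch of one simple scheme: rounding a list of floats to 8-bit integers with a single shared scale, then converting back. The function names and the weight values are made up for illustration, and real quantization tools use more refined methods than this.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(v / scale) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


# Made-up example weights, not taken from any real model.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error is the "loss of precision" the entry describes.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("stored integers:", quantized)
print("restored values:", restored)
print("largest rounding error:", max_error)
```

Each restored value lands within half a scale step of the original. Spread across billions of numbers, those small errors are where a loss of accuracy can come from.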

For example

A quantized model fits on a laptop that could not hold the full-precision version.
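The arithmetic behind that example can be sketched as follows. The 7-billion-parameter count is a hypothetical figure chosen for illustration, and the estimate covers only the stored numbers, not the extra memory a model needs while running.

```python
def model_size_gb(num_parameters, bits_per_parameter):
    """Approximate storage for the model's numbers alone, in gigabytes."""
    total_bytes = num_parameters * bits_per_parameter / 8
    return total_bytes / 1e9


# Hypothetical model size, chosen only for illustration.
num_parameters = 7_000_000_000

for bits in (32, 16, 8, 4):
    size = model_size_gb(num_parameters, bits)
    print(f"{bits}-bit: about {size:.1f} GB")
```

Under these assumptions, the same model needs roughly 14 GB at 16 bits per number but about 3.5 GB at 4 bits, which is the difference between not fitting and fitting on many laptops.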