Compact language models are finding homes on phones, laptops, and industrial devices. Because inference runs on the device itself, they offer privacy, offline capability, and predictable latency.

Quantization and distillation techniques continue to improve the quality of these smaller models. Vendors highlight on-device agents as a key growth area this year.