A model distillation guide for VPs of Engineering at scale.
Your AI bill scales even better, in the wrong direction.
Real estate platforms run a small number of AI tasks at very high volume.
The temptation is to negotiate with the model provider.
Most engineering teams have never run a distillation program because the discipline is unfamiliar.
Pick the three tasks with the highest monthly cost, the most repeatable structure, and the most measurable quality. On real estate platforms these almost always turn out to be listing description generation, lead classification, and CMA (comparative market analysis) summarization.
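One way to make the selection concrete is to score each candidate task on the three criteria and rank them. The sketch below is illustrative only: the `Task` fields, the 0-to-1 scales for repeatability and measurability, the multiplicative score, and every dollar figure are assumptions made for the example, not data from a real platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    monthly_cost_usd: float  # hypothetical spend figure
    repeatability: float     # assumed 0..1 scale: how uniform the inputs and outputs are
    measurability: float     # assumed 0..1 scale: how objectively quality can be scored


def rank_tasks(tasks: list[Task], top_n: int = 3) -> list[Task]:
    """Rank tasks by monthly cost weighted by repeatability and measurability."""
    def score(task: Task) -> float:
        return task.monthly_cost_usd * task.repeatability * task.measurability

    return sorted(tasks, key=score, reverse=True)[:top_n]


# Made-up candidates; an expensive but unstructured task should rank low.
candidates = [
    Task("listing_description_generation", 40_000, 0.90, 0.80),
    Task("lead_classification", 25_000, 0.95, 0.95),
    Task("cma_summarization", 30_000, 0.80, 0.70),
    Task("open_ended_chat", 35_000, 0.30, 0.30),
]

top_three = rank_tasks(candidates)
print([task.name for task in top_three])
```

Multiplying the three factors means a task that scores poorly on any one criterion drops out, even if it is expensive. A weighted sum would be an equally reasonable choice.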
The eval set must exist before any training data is generated. It is the ground truth that every distilled model is measured against.
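A minimal sketch of how "eval set first" could be enforced in practice: fingerprint the eval inputs once, then filter any later training candidate that collides with them. The example schema (a dict with an `"input"` key) and the function names are hypothetical.

```python
import hashlib
import json


def fingerprint(example: dict) -> str:
    """Stable hash of an example's input, used to detect overlap."""
    canonical = json.dumps(example["input"], sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def freeze_eval_set(examples: list[dict]) -> frozenset[str]:
    """Record the eval inputs as an immutable set of fingerprints."""
    return frozenset(fingerprint(e) for e in examples)


def exclude_eval_overlap(candidates: list[dict], eval_fingerprints: frozenset[str]) -> list[dict]:
    """Drop training candidates whose input already appears in the eval set."""
    return [c for c in candidates if fingerprint(c) not in eval_fingerprints]


# Made-up examples for illustration.
eval_examples = [
    {"input": "3 bed, 2 bath, renovated kitchen", "expected": "..."},
    {"input": "Studio near transit, top floor", "expected": "..."},
]
frozen = freeze_eval_set(eval_examples)

training_candidates = [
    {"input": "3 bed, 2 bath, renovated kitchen"},  # duplicate of an eval input
    {"input": "4 bed colonial on a corner lot"},
]
clean = exclude_eval_overlap(training_candidates, frozen)
print(len(clean))
```

Exact-match hashing only catches literal duplicates. Near-duplicate detection would need a fuzzier comparison, which is outside this sketch.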
The teacher model then generates training data from additional production-shaped examples. We generate 5 to 10 times as many training examples as there are examples in the eval set.
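The sizing rule and the generation loop can be sketched as follows. The `teacher` argument is a hypothetical callable standing in for whatever client calls the large model; it is stubbed here so the example runs without any network access.

```python
from typing import Callable, Iterable


def training_size_range(eval_size: int) -> tuple[int, int]:
    """Apply the 5x to 10x rule of thumb to an eval set size."""
    return 5 * eval_size, 10 * eval_size


def generate_training_data(
    teacher: Callable[[str], str],
    inputs: Iterable[str],
    target: int,
) -> list[dict]:
    """Label production-shaped inputs with the teacher until the target count is reached."""
    rows: list[dict] = []
    for text in inputs:
        if len(rows) >= target:
            break
        rows.append({"input": text, "output": teacher(text)})
    return rows


def stub_teacher(text: str) -> str:
    """Stand-in for a real teacher model call (assumption for this sketch)."""
    return f"summary of: {text}"


low, high = training_size_range(eval_size=200)
print(low, high)  # 1000 2000

sample_inputs = (f"listing {i}" for i in range(10_000))
rows = generate_training_data(stub_teacher, sample_inputs, target=low)
print(len(rows))  # 1000
```

In a real program the inputs would first be checked against the frozen eval set so that no eval example leaks into training.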
If your AI cost is outrunning your feature growth, distillation is the highest-leverage move on the table.
Customers will not notice a quality drop if the eval set is real and the gate is tuned. Across our distillation programs, customer-noticed quality regression has been zero on every workload that passed the eval gate.
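An eval gate of this kind can be as simple as comparing the distilled model's scores with the teacher's on the same eval set. The sketch below assumes one quality score per eval example on a 0-to-1 scale and a hypothetical `max_drop` tolerance. The actual metric and threshold depend on the task and are not specified here.

```python
from statistics import mean


def passes_eval_gate(
    teacher_scores: list[float],
    student_scores: list[float],
    max_drop: float = 0.01,  # assumed tolerance, to be tuned per workload
) -> bool:
    """Pass only if the student's mean score is within max_drop of the teacher's."""
    if not teacher_scores or len(teacher_scores) != len(student_scores):
        raise ValueError("both models must be scored on the same non-empty eval set")
    return mean(student_scores) >= mean(teacher_scores) - max_drop


# Made-up scores for illustration.
teacher = [0.90, 0.92, 0.88]
good_student = [0.90, 0.91, 0.88]
weak_student = [0.80, 0.85, 0.80]

print(passes_eval_gate(teacher, good_student))  # True
print(passes_eval_gate(teacher, weak_student))  # False
```

A mean-based gate can hide regressions on rare but important inputs, so a stricter variant might also bound the worst per-example drop.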
Distillation requires ML expertise, so we typically embed with your team for the duration of the program. Your engineering team learns the discipline by working through the program alongside us.
Most of the program cost is teacher inference to generate training data. It typically comes to 10 to 20 percent of one quarter's pre-program AI spend and pays back inside two quarters.
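The payback claim implies a break-even savings rate that is easy to check. If the program costs a fraction f of one quarter's spend and must pay back within n quarters, the distilled workloads need to cut quarterly spend by at least f / n. The sketch below works through that arithmetic, assuming for simplicity that pre-program spend would otherwise stay flat.

```python
def min_savings_rate(cost_fraction: float, payback_quarters: float) -> float:
    """Smallest quarterly spend reduction that repays the program in time.

    Assumes flat baseline spend: savings_rate * payback_quarters >= cost_fraction.
    """
    if payback_quarters <= 0:
        raise ValueError("payback_quarters must be positive")
    return cost_fraction / payback_quarters


# Program cost of 10% and 20% of one quarter's spend, two-quarter payback.
for cost_fraction in (0.10, 0.20):
    rate = min_savings_rate(cost_fraction, payback_quarters=2)
    print(f"cost {cost_fraction:.0%} of a quarter -> needs >= {rate:.0%} quarterly savings")
```

Under these assumptions, a 5 to 10 percent reduction in quarterly spend is enough to break even within two quarters. If spend is growing rather than flat, the same percentage reduction pays back sooner.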