Why LinkedIn says prompting was a non-starter — and small models were the breakthrough

LinkedIn is a leader in AI recommendation systems, having developed them over the past 15 years. But building the next-generation recommendation stack for tomorrow’s job seekers required a whole new approach: the company had to look beyond off-the-shelf models to reach the next level of accuracy, latency and efficiency.

“There was no way we would have been able to do this by prompting,” Aaron Berger, vice president of product engineering at LinkedIn, said on a recent episode of the Beyond the Pilot podcast. “We didn’t even try for the next-generation recommender system because we realized it was a non-starter.”

Instead, his team developed a highly detailed product policy document to shape an initial, massive 7-billion-parameter model; this was then distilled into additional teacher and student models optimized down to hundreds of millions of parameters.
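
Distillation of this kind is typically implemented by training the smaller model to match the teacher’s softened output distribution. As a rough illustration only, here is a minimal PyTorch sketch of that loss under generic assumptions, not anything LinkedIn has published:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """Classic knowledge-distillation loss: the student learns to match
    the teacher's temperature-softened output distribution."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student, scaled by T^2 so
    # gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
```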

The approach produced a repeatable cookbook that is now reused across LinkedIn’s AI products.

“Adopting this eval process from start to finish will result in a substantial improvement in quality, the likes of which we probably haven’t seen here at LinkedIn in years,” says Berger.

Why multi-teacher distillation was a breakthrough for LinkedIn

Berger and his team set out to build an LLM that could interpret individual job queries, candidate profiles and job descriptions in real time, reflecting LinkedIn’s product strategy as faithfully as possible.

Working with the company’s product management team, engineers ultimately produced a 20-to-30-page document for scoring job-description and profile pairs “across multiple dimensions.”

“We did many, many iterations on this,” says Berger. That product policy document was then combined with a “golden dataset” containing thousands of query-and-profile pairs. The team fed both into ChatGPT during data generation and experimentation, prompting the model to score the pairs, and ultimately created a very large synthetic dataset to train the 7-billion-parameter teacher model.
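
In outline, that generation step amounts to repeatedly prompting a strong LLM with the policy document and a pair to score. A hedged sketch using the standard OpenAI Python client; the model name, file name, prompt wording and output schema here are illustrative assumptions, not details from the podcast:

```python
import json
from openai import OpenAI  # standard OpenAI Python client

client = OpenAI()
POLICY_DOC = open("product_policy.md").read()  # stand-in for the 20-to-30-page rubric

def score_pair(query: str, profile: str, job_description: str) -> dict:
    """Ask the LLM to score one (query, profile, job) triple against the policy."""
    response = client.chat.completions.create(
        model="gpt-4o",  # stand-in; the article only says "ChatGPT"
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": ("Score the candidate/job pair against this policy:\n"
                         f"{POLICY_DOC}\n"
                         "Return JSON with a 1-5 score per policy dimension.")},
            {"role": "user",
             "content": f"Query: {query}\nProfile: {profile}\nJob: {job_description}"},
        ],
    )
    return json.loads(response.choices[0].message.content)

# Looping this over the golden dataset's thousands of pairs would yield the
# large synthetic corpus used to train the 7-billion-parameter teacher.
```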

However, Berger says it is not enough to push an LLM into production based on product policy alone. “At the end of the day, it’s a recommendation system, and we need to do some amount of click prediction and personalization.”

So, his team used that initial product-policy-focused teacher model to develop a second teacher model oriented toward click prediction. Using both, they distilled a 1.7-billion-parameter student model. Berger says that final student model went through “many, many training rounds” and was optimized “at every point” to minimize quality loss.

This multi-teacher distillation technique allowed the team to “get a lot more similarity” to the original product strategy and “land” click prediction, he says. They were also able to “modularize and componentize” the training process for the student.
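
The podcast doesn’t give implementation details, but one common way to get that kind of modularity is to express multi-teacher distillation as a weighted sum of per-teacher losses, so each teacher’s term can be retuned or swapped without touching the rest of the pipeline. A sketch under those assumptions:

```python
import torch
import torch.nn.functional as F

def multi_teacher_loss(student_logits: torch.Tensor,
                       teacher_logits_list: list[torch.Tensor],
                       weights: list[float],
                       temperature: float = 2.0) -> torch.Tensor:
    """Weighted sum of per-teacher distillation losses. Each teacher
    (e.g. a product-policy teacher and a click-prediction teacher)
    contributes an independent, separately weighted term."""
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    total = torch.zeros((), device=student_logits.device)
    for t_logits, w in zip(teacher_logits_list, weights):
        soft = F.softmax(t_logits / temperature, dim=-1)
        total = total + w * F.kl_div(log_probs, soft, reduction="batchmean")
    return total * temperature ** 2

# e.g. loss = multi_teacher_loss(s, [policy_out, click_out], [0.5, 0.5])
```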

Consider this in the context of a chat agent with two different teacher models: one training the agent on response accuracy, the other on tone and how it should communicate. Berger notes that these are two very different, yet equally important, objectives.

“Combining them now gives you better results, but you can also replicate them independently,” he says. “That was a breakthrough for us.”

Changing the way teams work together

Berger says he can’t overstate the importance of anchoring on a product strategy and an iterative evaluation process.

Achieving a “really, really good product policy” requires translating product managers’ domain expertise into a unified document. Historically, Berger notes, the product management team focused on strategy and user experience, leaving iterative modeling to ML engineers. Now, however, both teams work together to “dial in” and create an aligned teacher model.

“How product managers work with machine learning engineers now is very different from anything we’ve done before,” he says. “This is now basically the blueprint for any AI product we do at LinkedIn.”

Check out the full podcast to learn more about it:

  • How LinkedIn optimized every step of the R&D process to support velocity, driving real results in days or hours instead of weeks;

  • Why teams should develop pipelines for pluggability and experimentation and try out different models to support flexibility;

  • The continuing importance of traditional engineering debugging.

You can also listen and subscribe to Beyond the Pilot on Spotify, Apple or wherever you get your podcasts.


