TL;DR: LinkedIn has launched AI-powered people search after 18 months of development, revealing a technical playbook for scaling generative AI from pilot to billion-user deployment through multi-stage distillation, GPU infrastructure, and pragmatic optimisation.

Three years after ChatGPT’s launch, LinkedIn is finally deploying generative AI people search at scale, not because the technology was unavailable, but because enterprise-grade deployment to 1.3 billion users demands relentless, pragmatic optimisation rather than further technological innovation.

From Keyword Search to Semantic Understanding

LinkedIn’s new system allows natural language queries like “Who is knowledgeable about curing cancer?” The old keyword-based search would have returned only profiles explicitly mentioning “cancer.” The AI-powered system understands semantic relationships, surfacing oncology leaders, genomics researchers, and related experts even when profiles don’t contain the exact search terms.
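
The difference is easy to demonstrate with an off-the-shelf embedding model. The sketch below is illustrative only: the encoder choice and the toy profiles are assumptions, and LinkedIn has not disclosed its actual models or index.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed stand-in encoder

query = "Who is knowledgeable about curing cancer?"
profiles = [
    "Oncology leader running immunotherapy trials",
    "Genomics researcher studying tumour mutation signatures",
    "Front-end engineer building design systems",
]

# Keyword search: no profile contains the literal term "cancer",
# so it returns nothing.
keyword_hits = [p for p in profiles if "cancer" in p.lower()]
print(keyword_hits)  # []

# Semantic search: embed query and profiles, rank by cosine similarity.
vecs = model.encode([query] + profiles, normalize_embeddings=True)
scores = vecs[1:] @ vecs[0]  # cosine similarity (vectors are unit-norm)
for score, profile in sorted(zip(scores, profiles), reverse=True):
    print(f"{score:.2f}  {profile}")  # oncology and genomics rank first
```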

The system balances relevance with utility—prioritising accessible first-degree connections over unreachable world-leading experts, recognising that a “pretty relevant” immediate contact often provides more value than a distant celebrity researcher.
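
A hypothetical blended score illustrates that trade-off. The linear mix, the weight `alpha`, and the inverse-degree reachability discount are all illustrative assumptions, not LinkedIn’s published ranking formula.

```python
def blended_score(relevance: float, connection_degree: int,
                  alpha: float = 0.7) -> float:
    """Hypothetical mix of semantic relevance and an accessibility prior."""
    reachability = 1.0 / connection_degree  # 1st-degree = 1.0, 3rd ≈ 0.33
    return alpha * relevance + (1 - alpha) * reachability

# A pretty relevant first-degree contact outranks a near-perfect but
# distant expert under this weighting:
print(blended_score(relevance=0.75, connection_degree=1))  # 0.825
print(blended_score(relevance=0.99, connection_degree=3))  # ~0.79
```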

The 18-Month Technical Journey

LinkedIn’s engineering team spent six to nine months at an impasse attempting to train a single model that balanced policy adherence against user engagement signals. The breakthrough came from decomposing the problem: distilling a 7-billion-parameter “Product Policy” model into a 1.7-billion-parameter teacher model focused solely on relevance, paired with separate teacher models predicting member actions.
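
A minimal sketch of the distillation step, assuming standard soft-label distillation; the temperature, the loss weighting, and the way the relevance and action teachers are combined are assumptions, since LinkedIn has not published its exact recipe.

```python
import torch.nn.functional as F
from torch import Tensor

def distill_loss(student_logits: Tensor, teacher_logits: Tensor,
                 temperature: float = 2.0) -> Tensor:
    """Match the student to the teacher's temperature-softened distribution."""
    t = temperature
    log_student = F.log_softmax(student_logits / t, dim=-1)
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    # t**2 keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_student, soft_teacher, reduction="batchmean") * t**2

def combined_loss(student_rel, teacher_rel, student_act, teacher_act,
                  w_rel: float = 0.5) -> Tensor:
    """The decomposition in miniature: a relevance teacher and an
    action-prediction teacher each supervise the same student."""
    return (w_rel * distill_loss(student_rel, teacher_rel)
            + (1 - w_rel) * distill_loss(student_act, teacher_act))
```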

The final architecture operates as a two-stage pipeline: an 8-billion-parameter model handles broad retrieval, then a heavily compressed student model performs fine-grained ranking. For people search, the team pruned the student model from 440 million down to 220 million parameters, achieving the required speed with less than 1% relevance loss.
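
In outline, this is a conventional retrieve-then-rank cascade. The function signatures and candidate counts below are illustrative assumptions, not LinkedIn’s internals.

```python
from typing import Callable

def two_stage_search(query: str,
                     retrieve: Callable[[str, int], list[str]],
                     rank: Callable[[str, str], float],
                     k_retrieve: int = 1000,
                     k_final: int = 10) -> list[str]:
    # Stage 1: the large (8B) model casts a wide net over the index.
    candidates = retrieve(query, k_retrieve)
    # Stage 2: the pruned 220M-parameter student scores every
    # (query, candidate) pair for fine-grained ranking.
    ranked = sorted(candidates, key=lambda c: rank(query, c), reverse=True)
    return ranked[:k_final]
```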

Infrastructure Transformation

Scaling to a billion records broke LinkedIn’s CPU-based retrieval stack. The team migrated indexing to GPU infrastructure, a foundational architectural shift that was unnecessary for its earlier job search product but essential to meet the latency requirements of people search.
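
As an illustration of the kind of change involved, the sketch below moves a flat vector index from CPU to GPU with FAISS. FAISS here is an assumption for demonstration; LinkedIn has not named its retrieval stack, and the dimensions and data are stand-ins.

```python
import faiss                      # requires the faiss-gpu build
import numpy as np

d = 768                           # embedding dimension (assumed)
embeddings = np.random.rand(100_000, d).astype("float32")  # stand-in profiles
faiss.normalize_L2(embeddings)    # unit norm so inner product = cosine

cpu_index = faiss.IndexFlatIP(d)  # exact inner-product search
res = faiss.StandardGpuResources()
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)  # move to GPU 0
gpu_index.add(embeddings)

query = np.random.rand(1, d).astype("float32")
faiss.normalize_L2(query)
scores, ids = gpu_index.search(query, 10)  # top-10 nearest profiles
```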

Additional optimisations included training a reinforcement learning model solely to summarise input context, reducing input size 20-fold. Combined with the 220-million-parameter model, these changes delivered a 10x increase in ranking throughput.
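
A sketch of how a learned summariser slots into the ranking path; the interface is an assumption, and LinkedIn has not published the reward design for its RL-trained summariser.

```python
from typing import Callable

def rank_with_compression(query: str, context: str,
                          summarize: Callable[[str, int], str],
                          score: Callable[[str, str], float],
                          ratio: int = 20) -> float:
    """Shrink the ranker's input ~20x before scoring. `summarize` stands in
    for the RL-trained summariser; a natural reward is a summary that
    preserves ranking quality at a fraction of the input size."""
    budget = max(1, len(context) // ratio)
    return score(query, summarize(context, budget))
```

The two optimisations compound: fewer tokens per forward pass and fewer parameters per token together account for the reported 10x throughput gain.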

Pragmatism Over Hype

VP of Product Engineering Erran Berger was emphatic about enterprise reality: the value lies in perfecting recommender systems, not chasing agentic technology. The system is designed as a tool that future agents will use, not as an agent itself.

“Agentic products are only as good as the tools that they use to accomplish tasks for people,” Berger noted. “You can have the world’s best reasoning model, and if you’re trying to use an agent to do people search but the people search engine is not very good, you’re not going to be able to deliver.”

For enterprises building AI roadmaps, LinkedIn’s playbook is clear: win one vertical first, even if it takes 18 months; codify that win into a repeatable process; then optimise relentlessly through pruning, distillation, and creative techniques like RL-trained summarisers.


Source: VentureBeat
