The billion-pound problem with traditional consumer research
Consumer research costs businesses billions annually, yet panel-based approaches carry persistent weaknesses: selection bias, survey fatigue, and demographic skew systematically compromise data quality. For UK SMEs, commissioning even modest panel research represents a significant investment of £5,000–£15,000 per study, with no guarantee that panels accurately represent target markets.
Recent research from PyMC Labs and Colgate-Palmolive, published October 2025, demonstrates a validated alternative: Large Language Models can reproduce human purchase intent distributions, achieving roughly 90% of human test-retest reliability, without requiring any training data whatsoever. This matters because the implementation barrier has dropped dramatically: what once required machine learning expertise and months of data collection now requires thoughtful prompt engineering and hours of validation.
Why direct prompting fails: the semantic similarity breakthrough
The research team tested three approaches for eliciting consumer preferences from LLMs across 57 surveys (9,300 human responses) in the personal care products domain:
Direct Likert Rating (DLR): Asking the model directly for a rating on a 1-5 scale produced unrealistic distributions—responses clustered tightly around ‘3’ (neutral) rather than reflecting the natural variation observed in human panels. This approach failed comprehensively.
Follow-up Likert Rating (FLR): Eliciting textual responses first, then asking the model to self-rate those responses, performed substantially better but still required the model to interpret its own output.
Semantic Similarity Rating (SSR): The breakthrough methodology. Rather than asking models to produce numerical ratings, SSR:
- Elicits detailed textual responses to purchase intent questions
- Generates embedding vectors for those responses
- Compares embeddings to anchor statements representing each Likert point using cosine similarity
- Maps similarity scores to probabilistic rating distributions (a minimal sketch of this pipeline follows below)
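To make the pipeline concrete, here is a minimal sketch. The anchor statements, embedding model choice, and temperature-scaled softmax mapping are all illustrative assumptions, not the paper's exact formulation; any embedding provider would work in place of the one shown.

```python
import numpy as np
from openai import OpenAI  # assumption: any embedding provider works here

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Illustrative anchor statements for the five purchase-intent Likert points.
# Real anchors deserve the same rigour as traditional survey design.
ANCHORS = [
    "I would definitely not buy this product.",
    "I would probably not buy this product.",
    "I might or might not buy this product.",
    "I would probably buy this product.",
    "I would definitely buy this product.",
]

def embed(text: str) -> np.ndarray:
    """Embed text with a generic embeddings endpoint (model is illustrative)."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.asarray(resp.data[0].embedding)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def ssr_distribution(response_text: str, temperature: float = 0.05) -> np.ndarray:
    """Map one free-text purchase-intent response to a probability
    distribution over the 1-5 Likert scale via anchor similarity."""
    v = embed(response_text)
    sims = np.array([cosine(v, embed(a)) for a in ANCHORS])
    # Assumption: a temperature-scaled softmax is one plausible way to turn
    # similarity scores into a rating distribution; the paper's exact
    # mapping may differ.
    logits = sims / temperature
    p = np.exp(logits - logits.max())
    return p / p.sum()
```

Averaging these per-respondent distributions across a simulated panel yields the aggregate Likert distribution that is then compared against human data.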
This approach achieved Kolmogorov-Smirnov (KS) similarity scores exceeding 0.85 across all tested models, reaching 0.88 with GPT-4o, and attained roughly 90% of human test-retest reliability.
The strategic implications: validated synthetic consumers
The research validates three critical capabilities for business decision-makers:
1. Zero-shot viability eliminates training overhead
No historical data required. No fine-tuning. No machine learning expertise needed. Organisations can validate product concepts, test messaging variations, or explore demographic segments using only thoughtfully designed prompts and validation protocols. This transforms AI-driven consumer research from a specialist capability to an accessible strategic tool.
2. Demographics matter: age and income patterns replicate accurately
The research demonstrated that conditioning synthetic consumers on demographic characteristics (age, income level) produced preference patterns statistically consistent with corresponding human subgroups. This enables targeted market exploration without recruiting demographically matched panels—particularly valuable when researching hard-to-reach segments or testing concepts before committing to panel recruitment.
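In practice, demographic conditioning can be as simple as a persona preamble in the elicitation prompt. The template below is hypothetical, not the study's actual wording, and real implementations would draw personas from a target-market sampling frame:

```python
# Hypothetical persona template for conditioning a synthetic consumer;
# the exact wording used in the study is not reproduced here.
PERSONA_TEMPLATE = (
    "You are a {age}-year-old consumer with a household income of "
    "{income_band}. Answer as this person would, in their own words.\n\n"
    "Consider the following product concept:\n{concept}\n\n"
    "How likely would you be to purchase this product, and why?"
)

prompt = PERSONA_TEMPLATE.format(
    age=34,
    income_band="£30,000-£40,000",
    concept="A whitening toothpaste with enzyme-based stain removal.",
)
```

Each persona's free-text answer then feeds the SSR pipeline sketched earlier, so subgroup rating distributions can be compared against the matching human subgroup.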
3. Qualitative feedback comes free: textual responses provide context
Unlike traditional Likert-only surveys, the SSR methodology generates rich qualitative feedback explaining preference patterns. Organisations gain both quantitative distributions for statistical analysis and qualitative context for strategic interpretation—without additional survey design complexity.
Implementation framework: from research to production
Deploying validated LLM-based consumer research requires four components:
Prompt engineering discipline: Designing unbiased purchase intent questions, creating anchor statements for each Likert point, and defining demographic personas requires the same rigour as traditional survey design. Poor prompts produce unreliable results regardless of methodology.
Validation protocol: Every implementation must validate synthetic consumer responses against human baseline data for the specific domain and demographic segments. The research methodology provides a replicable validation framework: recruit small human panels (n=50-100), compare distributions using KS similarity, iterate prompts until achieving >0.80 similarity.
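A sketch of that validation step, assuming KS similarity is defined as one minus the two-sample KS statistic (the paper's exact normalisation may differ, and the panel data here is toy data for illustration):

```python
from scipy.stats import ks_2samp

def ks_similarity(human_ratings, synthetic_ratings) -> float:
    """Compare two samples of 1-5 Likert ratings.
    Returns 1 - KS statistic, so 1.0 means identical distributions."""
    result = ks_2samp(human_ratings, synthetic_ratings)
    return 1.0 - result.statistic

# Toy example: n=60 human baseline vs. n=1,000 synthetic consumers.
human = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5] * 6
synthetic = [4, 4, 3, 5, 2, 5, 4, 3, 4, 4] * 100

score = ks_similarity(human, synthetic)
print(f"KS similarity: {score:.2f}")  # iterate prompts until > 0.80
```

The loop is deliberately cheap: regenerating 1,000 synthetic responses under a revised prompt costs pounds, not thousands, so iterating to the 0.80 threshold is an afternoon's work rather than a re-fielded study.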
Model selection: GPT-4o achieved the highest fidelity (KS=0.88), but Claude 3.5 Sonnet and other frontier models demonstrated comparable performance (KS>0.85). Cost-performance trade-offs vary: GPT-4o costs approximately £0.01 per simulated consumer; Claude 3.5 Sonnet approximately £0.007. For studies requiring 1,000 synthetic consumers, the cost differential (£10 vs £7) rarely justifies compromising fidelity.
Governance considerations: Synthetic consumer data remains synthetic—organisations must not misrepresent simulated responses as human panel data, particularly in regulated sectors or investor communications. Documentation must clearly delineate validated methodologies from exploratory approaches.
Cost-benefit perspective: when synthetic consumers make strategic sense
The research validates LLM-based consumer research as a cost-effective complement, not replacement, for human panels. Consider three scenarios:
Early-stage concept exploration (optimal): Before committing to panel recruitment, organisations can explore 50+ product variations or demographic segments for £50-£200 in API costs. This enables rapid iteration on concepts before investing in human validation.
Geographic expansion (high value): Testing product-market fit in new regions without establishing local panel partnerships reduces time-to-insight from weeks to days. Validate synthetic findings with small human samples (n=50-100) in target markets before full deployment.
Panel augmentation (selective value): Extending existing panel data to under-represented demographics or exploring adjacent segments offers cost savings, but requires careful validation. Never rely solely on synthetic data for final go/no-go decisions.
Risk management: what the research doesn’t validate
The study tested personal care products with established consumer behaviour patterns. Organisations must validate methodology independently when applying to:
- Novel product categories without established preference data
- B2B purchase decisions (which involve organisational dynamics beyond individual preferences)
- Emotionally complex or culturally sensitive product domains
- Regulatory contexts requiring demonstrated human panel data
Additionally, the research team used structured Likert-format questions. Open-ended qualitative research, focus group dynamics, and exploratory interviews remain domains where human insight substantially exceeds synthetic simulation capabilities.
The implementation reality: prompt engineering matters more than model selection
The research demonstrates that methodology dominates model selection: SSR achieved >0.85 KS similarity across multiple frontier models, whilst DLR failed universally regardless of model capability. This shifts implementation focus from “which model?” to “how do we design prompts?”.
UK SMEs implementing validated consumer research require:
- Prompt engineering expertise: Designing unbiased questions, creating representative anchor statements, and defining demographic personas
- Validation capability: Recruiting small human baseline samples, computing KS similarity, and iterating prompts systematically
- Statistical literacy: Interpreting distributions, recognising when synthetic data diverges meaningfully from human patterns
- Integration discipline: Incorporating synthetic insights into existing market research workflows without misrepresenting data provenance
Organisations lacking these capabilities should partner with specialists rather than deploying unvalidated approaches. Poorly designed synthetic consumer research produces confidently wrong insights—more dangerous than acknowledging uncertainty.
Market research transformation: the next 18 months
Validated synthetic consumer methodologies will compress market research timelines and reduce costs, but won’t eliminate human panels. Expect hybrid approaches to emerge:
- Rapid concept screening using synthetic consumers, followed by human validation of finalists
- Demographic augmentation extending panel data to under-represented segments
- International expansion research using validated synthetic methods before establishing local panel partnerships
- Continuous preference monitoring tracking synthetic consumer responses to market events, products, or messaging
Organisations implementing these approaches today gain 12-18 months’ advantage over competitors still relying exclusively on traditional panel methodologies. The research provides a validated starting point, but implementation requires domain-specific adaptation and rigorous validation protocols.
Strategic recommendations for UK SMEs
Based on the validated research findings, we recommend:
Short term (next 3 months):
- Identify one existing market research question suitable for synthetic consumer validation
- Design SSR-based methodology with careful prompt engineering and anchor statement development
- Recruit small human baseline sample (n=50-100) for validation
- Compare synthetic vs. human distributions, iterate prompts until achieving KS similarity >0.80
Medium term (3-12 months):
- Integrate validated synthetic consumer research into existing market research workflows
- Develop domain-specific prompt libraries and anchor statement templates
- Train marketing and product teams on appropriate use cases and limitations
- Document governance standards for synthetic data representation
Long term (12+ months):
- Build continuous preference monitoring capability using validated methodologies
- Expand to adjacent product categories or demographic segments with independent validation
- Develop hybrid research designs combining synthetic exploration with human confirmation
- Evaluate cost savings and time-to-insight improvements quantitatively
What this means for your organisation
The research validates LLM-based consumer research as a strategic capability for organisations willing to invest in rigorous methodology. The barrier isn’t access to models—it’s developing prompt engineering discipline, validation protocols, and integration capabilities.
UK SMEs can implement validated approaches at a fraction of the cost of traditional panel research, but shortcuts compromise reliability: poorly designed studies yield confidently wrong insights that mislead strategy rather than inform it.
The opportunity exists today. The methodologies are validated. The implementation requires expertise, not just access to APIs.
How we can help: Resultsense provides prompt engineering design, validation protocol development, and integration support for organisations implementing AI-driven consumer research. We work with your existing market research teams to develop validated synthetic consumer methodologies specific to your domain, products, and demographic segments.
Related Articles
- Four Design Principles for Human-Centred AI Transformation - Framework for AI deployment balancing innovation with safety and organisational readiness
- SME AI Transformation Strategic Roadmap - Comprehensive guide to structured AI adoption addressing strategy, implementation, and measurement
- Shadow AI is a Demand Signal: Turn Rogue Usage into Enterprise Advantage - Strategic reframing demonstrating prompt engineering and validation disciplines
Further reading
The complete research paper provides detailed methodology, statistical analysis, and validation protocols: LLMs Reproduce Human Purchase Intent via Semantic Similarity Elicitation of Likert Ratings (Benjamin F. Maier et al., PyMC Labs and Colgate-Palmolive, October 2025)
For organisations exploring AI-driven market research capabilities, our services include consumer research methodology assessment and implementation planning.
Strategic analysis by Resultsense. For market research methodology consulting or implementation support, book a consultation.