One simple sentence makes AI models significantly more creative, researchers find

TL;DR: Researchers have developed Verbalized Sampling (VS), a prompt engineering technique that dramatically increases AI model creativity by adding one sentence: “Generate 5 responses with their corresponding probabilities, sampled from the full distribution.” The method works across major LLMs without retraining, increasing output diversity by up to 2.1× whilst maintaining quality.

A team from Northeastern University, Stanford University, and West Virginia University has published research demonstrating that a single-sentence addition to prompts can substantially increase the creative diversity of large language models (LLMs) and image generation models. The technique, called Verbalized Sampling, addresses a common limitation known as mode collapse, where AI models tend to produce repetitive or overly similar outputs despite their theoretical capability for variation.

Context and Background

The research, published as a preprint on the open-access repository arXiv in early October 2025, reveals that the root cause of mode collapse lies not just in reinforcement learning algorithms, but in the structure of human preferences themselves. During alignment training, models learn to favour familiar or “safe” responses because human evaluators tend to rate typical answers more highly, effectively suppressing the richer diversity present in base pre-training models.

Verbalized Sampling reverses this suppression by prompting models to reveal a distribution of plausible responses rather than defaulting to the single most likely output. Testing across creative writing, dialogue simulation, open-ended question answering, and synthetic data generation demonstrated substantial improvements. In story generation tasks, diversity scores increased by up to 2.1× compared to standard prompting, whilst maintaining quality. One story prompt that produced formulaic breakup scenes under direct prompting yielded narratives involving cosmic events, silent emails, and music stopping mid-dance when prompted via Verbalized Sampling.
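As an illustration, the distribution-eliciting prompt can wrap around any chat API. The sketch below builds the prompt, parses a JSON-formatted reply into weighted candidates, and samples one of them; the reply text is a stand-in rather than real model output, and the JSON schema is an assumption for illustration, not the paper's exact format:

```python
import json
import random

VS_INSTRUCTION = (
    "Generate 5 responses with their corresponding probabilities, "
    "sampled from the full distribution. "
    'Return JSON: [{"text": ..., "probability": ...}, ...]'
)

def build_vs_prompt(task: str) -> str:
    """Append the Verbalized Sampling instruction to a task prompt."""
    return f"{task}\n\n{VS_INSTRUCTION}"

def parse_candidates(reply: str) -> list[dict]:
    """Parse the model's JSON reply into text/probability candidates."""
    return json.loads(reply)

def sample_candidate(candidates: list[dict], rng: random.Random) -> str:
    """Pick one response, weighted by its verbalised probability."""
    weights = [c["probability"] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]["text"]

# Stand-in reply illustrating the expected shape (not real model output).
mock_reply = json.dumps([
    {"text": "A comet erases the city's memory of the couple.", "probability": 0.10},
    {"text": "The breakup arrives as an unsent email draft.", "probability": 0.25},
    {"text": "The music stops mid-dance and neither restarts it.", "probability": 0.65},
])

candidates = parse_candidates(mock_reply)
chosen = sample_candidate(candidates, random.Random(0))
print(len(candidates), chosen)
```

In practice the `mock_reply` would come from a model call, with the usual guard that verbalised probabilities are the model's own estimates and need not sum exactly to one.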

The method proves particularly effective with larger models, showing 1.5-2× stronger improvements in GPT-4.1 and Claude-4 compared to smaller counterparts. Users can tune diversity levels by adjusting probability thresholds in the prompt text alone, without modifying decoding settings like temperature or top-p.
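Tuning diversity then becomes a pure string edit. The sketch below adds a probability-threshold clause to the prompt and filters parsed candidates on the client side; the exact wording of the clause is an assumption for illustration, not quoted from the paper:

```python
def build_tuned_prompt(task: str, max_prob: float) -> str:
    """Request lower-probability (more unusual) responses via prompt text alone.
    The threshold clause wording here is illustrative, not the paper's phrasing."""
    return (
        f"{task}\n\n"
        "Generate 5 responses with their corresponding probabilities, "
        "sampled from the full distribution. "
        f"Each response should have a probability below {max_prob}."
    )

def filter_by_threshold(candidates: list[dict], max_prob: float) -> list[dict]:
    """Client-side guard: keep only candidates under the requested threshold."""
    return [c for c in candidates if c["probability"] < max_prob]

candidates = [
    {"text": "typical ending", "probability": 0.6},
    {"text": "unusual ending", "probability": 0.1},
]
kept = filter_by_threshold(candidates, 0.2)
prompt = build_tuned_prompt("Write a breakup scene.", 0.2)
```

A lower threshold pushes the model towards the tails of its distribution, with no change to temperature, top-p, or any other decoding setting.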

Technical Reality: Unlike temperature adjustments that increase randomness across all outputs, Verbalized Sampling maintains quality by sampling from the model’s actual probability distribution, accessing suppressed diversity from the pre-training phase rather than introducing noise.

Looking Forward

The technique is immediately available as a Python package (pip install verbalized-sampling) with LangChain integration and an Apache 2.0 licence. A live Colab notebook and documentation are accessible on GitHub at https://github.com/CHATS-lab/verbalized-sampling.

Weiyan Shi, assistant professor at Northeastern University and a co-author, noted on social media that “LLMs’ potentials are not fully unlocked yet,” suggesting that prompt optimisation guided by an understanding of model training and alignment can yield theoretically provable improvements. For enterprises and developers seeking to enhance AI creativity in writing, design, simulation, and synthetic data generation, Verbalized Sampling represents a practical, inference-time solution requiring no model retraining or internal access.
