AI in This Book

AI tools and methods are integrated throughout this book — not as a separate track alongside “traditional” methods, but woven into the methods themselves. Many methods include AI-augmented variants, and every method’s online How-To section includes copyable AI prompts for prep, execution, and analysis. There are also three methods built specifically around AI capabilities: Synthetic Persona Screening, AI-Moderated Interview at Scale, and the Vibe-Coded Disposable MVP.

Using these tools well requires the same discipline as any other method: a clear hypothesis, an experiment design that can detect “no” answers, and the judgment to distinguish AI-generated hypotheses from validated customer insights.

Common Pitfalls

While each method describes its likely biases and how to avoid them, keep these pitfalls in mind whenever you use an AI-augmented method.

Synthetic Sycophancy

Large language models are trained on human feedback that rewards helpful, agreeable responses. This means synthetic personas are architecturally biased toward saying yes to your ideas. If your AI-generated customer says “that sounds really useful,” that is not a valid signal — it is a default behavior designed to make you feel good.

Real customers can do this too, but they also hedge, deflect, push back, and misunderstand. Every synthetic insight is a hypothesis to test with real humans, not an insight to act upon.

Hallucinated Validation

When AI generates realistic-sounding customer quotes, it creates text that looks like research and enters decision-making as if it were evidence. This is particularly dangerous in team settings: a founder runs a synthetic session, pastes the most compelling “quotes” into a shared doc, and two weeks later someone references them in a pitch deck as real customer feedback. Label AI-generated content at the moment of creation. If you can’t trace a customer insight to a named person and a real conversation date, it’s a hypothesis — not a finding.

The Demo Trap

AI coding tools can produce a working prototype in an afternoon, and a working prototype creates a powerful emotional sense of progress. That feeling is not validation.

A demo is evidence that AI can write code. It is not evidence that customers want the product, will pay for it, or will change their behavior to use it. The hypothesis that matters is whether customers want the product — and the only way to test that is with real users in real conditions.

Scale Without Signal

Running 300 AI-moderated interviews feels like an overwhelming amount of evidence. It isn’t.

Scale amplifies whatever is in the design: signal if the questions are good, noise if they’re leading or the participant pool is off-target. A badly designed study at 300 participants produces confident-sounding noise and is worse than a well-designed study with 10, because the large number creates false confidence in bad data. Run a pilot before you scale.

Anchoring on AI Suggestions

When AI generates a list of customer pain points or solution concepts, that list can become the boundary of your imagination. You design your interview guide around its framing and never surface insights that didn’t appear on the list. The patterns the AI produces reflect its training data — which overrepresents tech-savvy, English-speaking, Western perspectives and underrepresents everyone else. Generate your own hypotheses first. Use AI to augment your thinking, not to replace it.

The Automation Paradox

Faster experiments only help if the experiments are well designed. AI dramatically reduces the cost and time of running experiments, which means founders can run more per week — but if each experiment is poorly designed, more experiments just accumulate false confidence faster. The time saved on execution is wasted if it came from cutting design time. Measure learning velocity, not experiment velocity: how many of your experiments actually changed your strategy? If the answer is close to zero, you’re not learning — you’re confirming.

The Limits of AI

AI is a fast-evolving technology. Its capabilities may one day outstrip human ones, but not quite yet. Until then, it has serious flaws when misused or misunderstood.

AI cannot yet replace a genuine conversation with a real customer, simulate actual willingness to pay, or substitute for observing real behavior.

When in doubt, go talk to someone.

Got something to add? Share with the community.