How do we run lean startup experiments for continuous innovation methods?

When I was studying marketing, I had an arm’s-length list of research techniques like conjoint analysis, surveys, and focus groups. After I read The Four Steps to the Epiphany, there was only one: Get out of the building and talk to customers!

At LUXr, Janice Fraser introduced me to a whole new host of tools to gain insight such as hallway usability testing, contextual inquiry, and mental models.

Add this to lean startup standards like smoke tests and the list gets pretty overwhelming.

Should we run a Pocket Test with Picnic in the Graveyard to follow up? Should we do a Wizard of Oz or a concierge approach? Would you like a lemon twist with that?

So what type of experiment should you run for innovation methods? And when?

If you’d like to cut to the chase, you can download the list and index: Real Startup Book

Qualitative vs. Quantitative



Many, many people have weighed in on which is superior.

Ash Maurya recommends using qualitative research and then validating it with quantitative. Laura Klein’s post When to Listen & When to Measure does the battle the most justice.

As Laura points out in her post, it’s not a question of better. A hammer is not inherently better than a screwdriver. A hammer is better than a screwdriver for hammering in nails.

Any tool can be used for good or evil. You can build a house or you can hit yourself in the thumb.

Generative vs. Evaluative

Generative Research vs. Evaluative Experiments

Janice Fraser introduced me to the distinction of Generative vs. Evaluative techniques for innovation methods:

  • Generative – Research techniques which don’t necessarily start with a hypothesis, but result in many new ideas. e.g. Customer Discovery Interviews
  • Evaluative – Testing a specific hypothesis to get a clear yes or no result. e.g. Landing page tests, a.k.a. smoke tests

This distinction explains why we often get crappy results from experiments conducted for innovation methods. We might run a smoke test with the hypothesis:

Some people will sign up to a “coming soon” landing page featuring 100% compostable shoes.

We advertise via Twitter and come back with a paltry 1% conversion rate to email sign up. Good idea? Bad idea? We confirmed our hypothesis: “some people” did indeed sign up!

But is our conversion rate low because no one is interested? Or because we advertised via the wrong channel?

Does no one want our value proposition? Or does no one understand it?

There are a hundred reasons why we might get a false negative result from this test. There are also quite a few reasons why we might get a false positive!

It’s difficult to interpret tests because often the hypothesis is fundamentally flawed or just vague.

In this case our hypothesis is incredibly vague and flawed.

Some people will sign up to a “coming soon” landing page featuring 100% compostable shoes.

Who are these people? People on Twitter? Who are they following? Are they eco-friendly dads who bike to work? Or are they professional runners who care more about durability than being environmentally friendly?

When our hypothesis is specific and falsifiable, we can run an Evaluative Experiment such as a smoke test.
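
To make “specific and falsifiable” a bit more concrete, here is a minimal Python sketch of a smoke test hypothesis that is written down before the test runs. The class, the field names, the 5% pass bar, and the 200 visitor minimum are all assumptions made up for illustration; they are not figures from a real test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmokeTestHypothesis:
    """A falsifiable smoke test hypothesis. All values used below are illustrative."""
    segment: str           # who we expect to sign up
    channel: str           # where we plan to reach them
    min_conversion: float  # pass/fail bar, fixed before the test runs
    min_visitors: int      # below this, we refuse to draw a conclusion


def evaluate(hypothesis: SmokeTestHypothesis, visitors: int, signups: int) -> str:
    """Return 'validated', 'invalidated', or 'inconclusive'."""
    # Too little traffic (or none at all) means we cannot judge either way.
    if visitors < max(hypothesis.min_visitors, 1):
        return "inconclusive"
    conversion = signups / visitors
    if conversion >= hypothesis.min_conversion:
        return "validated"
    return "invalidated"


hypothesis = SmokeTestHypothesis(
    segment="eco-friendly dads who bike to work",
    channel="Twitter ads",
    min_conversion=0.05,  # assumed threshold
    min_visitors=200,     # assumed minimum sample
)

print(evaluate(hypothesis, visitors=500, signups=5))  # 1% conversion
print(evaluate(hypothesis, visitors=50, signups=5))   # not enough visitors
```

With a segment, a channel, and a bar agreed in advance, the 1% result is a clear “no” for that segment on that channel, rather than a shrug. It still doesn’t tell us why, which is where generative research comes back in.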

When our hypothesis is vague or we don’t even have a hypothesis, we need to do Generative Research such as getting out of the building and talking to potential customers to get new ideas or refine our hypothesis.

Market vs. Product


The other obvious distinction among tools and methods in the Real Startup Book is between Market and Product.

Some methods tell us a lot about customers, their problems, and how to reach them. For example, we can listen to our customers and this will help us understand their situation and what their day-to-day problems are.

Other methods tell us about the product or solution that will help solve that problem. We can do usability testing on a set of wireframes and see if our interface is usable. However, this won’t tell us anything about whether or not anyone will buy it in the first place.

These methods generally don’t overlap.

What type of lean startup experiment should I run?

If we combine the useful distinctions of Generative Research vs. Evaluative Experiments and Market vs. Product, we have four nice boxes which we can use to help us determine what we should do next:

The Lean Startup Playbook - Which test should I run next?

Each of these boxes helps us answer different questions.
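
For those who think better in code, the grid can be sketched as a small lookup table in Python. The method names are taken from the lists further down in this post (abbreviated to three per box); the function name and the two yes/no flags are hypothetical, and real situations are rarely this tidy.

```python
# Methods per box, abbreviated from the lists in this post.
PLAYBOOK = {
    ("generative", "market"): [
        "Customer Discovery Interviews",
        "Contextual inquiry / ethnography",
        "Data mining",
    ],
    ("evaluative", "market"): [
        "Smoke tests",
        "5 second tests",
        "Conjoint Analysis",
    ],
    ("generative", "product"): [
        "Solution interview",
        "Concierge test / Consulting",
        "Picnic in the Graveyard",
    ],
    ("evaluative", "product"): [
        "Usability",
        "Wizard of Oz",
        "Analytics / Dashboards",
    ],
}


def next_methods(has_falsifiable_hypothesis: bool, market_validated: bool) -> list[str]:
    """Pick a box of the 2x2 grid and return its candidate methods."""
    # A specific, falsifiable hypothesis calls for an evaluative experiment;
    # a vague or missing one calls for generative research.
    mode = "evaluative" if has_falsifiable_hypothesis else "generative"
    # Product questions only make sense once the market side is validated.
    focus = "product" if market_validated else "market"
    return PLAYBOOK[(mode, focus)]


print(next_methods(has_falsifiable_hypothesis=False, market_validated=False))
```

Two honest yes/no answers are enough to land in a box. Picking a method within the box still takes judgment.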

Generative Market Research

Generative Market Research for innovation methods

  • Who is our customer?
  • What are their pains?
  • What job needs to be done?
  • Is our customer segment too broad?
  • How do we find them?

If we can’t answer these questions yet, we’re doing what Steve Blank would call Customer Discovery (the first Step to the Epiphany). We need to understand the basis of the problem before testing a solution. If we’re not sure what our hypothesis is, we need to generate ideas.

We could talk to customers and see what’s bothering them (Steve’s advice and always a good idea) or we might try data mining if we happen to have access to a large set of data. We could even do a broad survey with open ended questions.*

Some of these research methods are qualitative (e.g. listening to customers) and some are quantitative (e.g. data mining). That distinction is not important.

Data mining is quantitative but helped identify problems such as food deserts in the United States. We couldn’t have done that by listening to customers! Both tools can discover problems.

Here are some Generative Market Research methods:

  • Customer Discovery Interviews
  • Contextual inquiry / ethnography
  • Data mining
  • Focus groups*
  • Surveys* (open ended)

* Don’t do this. Surveys and focus groups generally suck.

Evaluative Market Experiments

Lean Startup Playbook - Evaluative Market Experiment

  • Are they really willing to pay?
  • How much will they pay?
  • How do we convince them to buy?
  • How much will it cost to sell?
  • Can we scale marketing?

To evaluate a specific hypothesis, we might run a landing page test to see if there is demand. We might run a sales pitch if we were selling a B2B enterprise product. We could even run a conjoint analysis to understand the relative positioning of a few value propositions.

Here are a few Evaluative Market Experiments we can run if we have a clear, falsifiable hypothesis:

  • 5 second tests
  • Comprehension
  • Conjoint Analysis
  • Data mining / market research
  • Surveys* (closed)
  • Smoke tests
    • Video
    • Landing page
    • Sales pitch
    • Pre-sales
    • Flyers
    • Pocket test
    • Event
    • Fake door
    • High bar

* Again, don’t use these. They generally suck.

Warning: Here be False Positives

Customer Development Survey False Positive - Survey.io

Before we move on from here, we must remember: we’re probably wrong. Even if we have tens of thousands of users signed up to our landing page, that doesn’t mean we have a validated problem.

If the customer didn’t have to commit to anything aside from their email address or they misunderstood the value proposition, then those signups don’t signify true customer demand. It just means we make awesome landing pages.


Our Value Proposition is not our product. The Value Proposition is the benefit that the product delivers to that specific customer. We cannot have a validated value proposition without a validated customer segment.

Generative Product Research

Lean Startup Playbook - Generative Product Research

  • How can we solve this problem?
  • What form should this take?
  • How important is the design?
  • What’s the quickest solution?
  • What is the minimum feature set?
  • How should we prioritize?

Once we’ve validated the market and value proposition sufficiently, we need to understand what the solution would look like.

If we truly have a validated Customer with a clear Problem and a Value Proposition, then we can start asking how.

Unfortunately, while our market hypotheses tend to be overly vague, our solution hypotheses tend to be overly specific and way too comprehensive.

The 20 features that are absolutely critical to a comprehensive solution often turn out to be distracting and confusing to the user.

To simplify our solution and help us prioritize which features to build first, we can use methods like concierge testing or solution interviews to help us generate ideas about what our MVP should be.

Here are some methods for Generative Product Research:

  • Solution interview
  • Contextual inquiry / ethnography
  • Demo pitch
  • Concierge test / Consulting
  • Competitor Usability
  • Picnic in the Graveyard

Evaluative Product Experiments

Lean Startup Playbook - Evaluative Product Experiment for innovation methods

  • Is this solution working?
  • Are people using it?
  • Which solution is better?
  • How should we optimize this?
  • What do people like / dislike?
  • Why do they do that?

We probably started off with a clear idea of the product. It’s probably wrong.

Fortunately, there is a list of well-defined tools that have been around for decades to figure that out.

We can do user testing to look for usability issues that might prevent the solution from working. We can A/B test two alternatives to see what works better. We can use a Net Promoter Score survey (one of the few surveys that I like) to see overall satisfaction. All of these Evaluative Product Experiments tell us if our solution is doing the trick.
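
As a rough illustration of the A/B comparison mentioned above, here is a minimal Python sketch using a pooled two-proportion z-test, which is one common way (not the only one) to check whether the difference between two alternatives is more than noise. The function name and the example counts are made up.

```python
import math


def ab_test_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts: 50 of 1000 visitors convert on A, 80 of 1000 on B.
p_value = ab_test_p_value(conv_a=50, n_a=1000, conv_b=80, n_b=1000)
print(f"p-value: {p_value:.4f}")
```

A small p-value suggests the two versions really do perform differently. It says nothing about why, and with tiny sample sizes it will rarely be small at all.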

  • Paper prototypes
  • Clickable prototypes
  • Usability
    • Hallway
    • Live
    • Remote
  • Wizard of Oz
  • Takeaway
  • Functioning products
  • Analytics / Dashboards
  • Surveys*
    • Net Promoter Score
    • Product/Market Fit Survey

* These aren’t quite as bad as most surveys, but be sure you understand them before you use them to measure Product / Market Fit.
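
As an aid to understanding the first of these, here is a minimal Python sketch of the standard Net Promoter Score arithmetic: on a 0 to 10 “how likely are you to recommend us?” scale, it is the percentage of promoters (9 or 10) minus the percentage of detractors (0 through 6). The function name and the sample ratings are made up for illustration.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """NPS = % promoters (9-10) minus % detractors (0-6), from -100 to 100."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100 * (promoters - detractors) / len(ratings)


ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]  # made-up survey responses
print(net_promoter_score(ratings))  # 5 promoters, 3 detractors -> 20.0
```

Note that the score throws away a lot of information: ten lukewarm 7s and five 10s balanced by five 0s both come out to 0.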

The Real Startup Book

The Real Startup Book - Which test should I run next?

One last tip: The Real Startup Book is an arbitrary framework.

Any idiot with an MBA knows how to make a 2×2 grid that will look impressive in a PowerPoint when doing a consulting gig. That doesn’t mean reality fits neatly into those boxes.

A lot of research/experiments will blur the lines. It’s rare that we’ll do generative research without having a hypothesis in the back of our mind about who our customer is. We may inadvertently evaluate (and invalidate) that hypothesis. That’s ok.

This is just a framework to get ourselves headed in the right direction and make it more likely that we use the right tool for the job.

Any tool can be used for good or evil. We can build a house or we can hit ourselves in the thumb.

Choose wisely.

You can download the Real Startup Book for quick reference:

Last note: Contribute!

This book isn’t finished and never will be.

Lots of people have suggested the innovation methods listed, and maybe you can suggest one that isn’t there yet. We’ll list all contributions, and the whole thing is Creative Commons licensed.

Got something to add? Add something in the comments or let me know.
