Hypothesis Checklist

Writing a Good Hypothesis

Lean startup practices turn project managers, business leaders, and designers into scientists who constantly validate their ideas by running an array of experiments. But experiments can get out of hand and turn perfectly sane people into mad scientists. A sure way to keep our sanity is to start with a strong hypothesis to give the experiment structure.

A strong hypothesis tells us what we are testing and what we expect to get out of the test. By stating our expectations, we set the goal that the experiment must hit to count as a success or a failure. This helps us decide when the experiment should be scrapped and when the idea is ready to be taken to market.

Key elements of writing a good hypothesis:

The change that you are making
The aspect that will change
The success or fail metric
How long we are going to run the test

A hypothesis should end up looking like this:

This new feature will cause a ten percent increase of new users visiting the homepage in three months.

the change: This new feature
the metric: a ten percent increase
the impact: new users visiting the homepage
the timeframe: three months

Let’s break down each aspect:

The change

This is the aspect that we are going to change, launch, or create that is going to affect our overall business or product. It can be as simple as changing the color of a button or as big as launching a new marketing campaign. Make sure that only one aspect is changed at a time; otherwise, there is no way to tell which aspect contributed to the effect.

The impact

This is the expected result of the experiment: if we change x, then we expect y to happen.

The metric

This is a measurement that needs to be hit or surpassed. It can be a fail metric, where the project must pivot to a completely new direction if the experiment does not meet the minimum goal. It can also be a success metric, where the experiment is deemed a success if it hits the goal. Which one to choose depends on whether we want a baseline that tells us when to scrap a project or one that tells us when to launch it.

The timeframe

This is the length of time it takes to run the test. If the timebox is too short, then the amount of data might be too small, or there might not have been enough time for effects to take place. But if the timebox is too long, we are wasting valuable time collecting unnecessary data.
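
The four elements can also be written down as a small record, which makes it harder to leave one out. The Python sketch below is only an illustration: the class name Hypothesis, its field names, and the sentence template are assumptions made for this example, not part of any lean startup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    """Hypothetical container for the four elements described above."""
    change: str     # the one thing being changed, launched, or created
    metric: str     # the measurement that must be hit or surpassed
    impact: str     # the result we expect the change to produce
    timeframe: str  # how long the test will run

    def sentence(self) -> str:
        # Assemble the elements in the order used by the example sentence.
        return (
            f"{self.change} will cause {self.metric} "
            f"of {self.impact} in {self.timeframe}."
        )


example = Hypothesis(
    change="This new feature",
    metric="a ten percent increase",
    impact="new users visiting the homepage",
    timeframe="three months",
)
print(example.sentence())
```

Because every field is required, a draft with no metric or no timeframe fails as soon as the record is created.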

Now let’s use these elements to form a hypothesis in an example scenario.

Say that you are a product manager at a startup that creates a mobile app to help waiters and waitresses keep track of their tips. You have noticed that users who document their tips four times a week have a higher retention rate. You want to see if you can increase the number of times current users use the app within a week.

What are you going to change within the app?

How about adding a notification system so the user can set reminders to ping them at the end of a shift?

What do you want the outcome to be?

You want more users to open the app four or more times in a week.

What is the metric of failure or success?

At this time you have 50,000 monthly users, and 10,000 of them use the app four or more times a week. You want to raise that number from 10,000 to 15,000. That is a 50 percent increase in frequent users, or ten percentage points of your monthly user base (from 20 percent to 30 percent).
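
The arithmetic behind a target like this is easy to get wrong, because a percent increase and a percentage-point increase are different things. The short Python sketch below checks both readings for the figures in this scenario. The function names are made up for this example.

```python
def relative_increase(before: int, after: int) -> float:
    """Percent change relative to the starting value."""
    return (after - before) * 100 / before


def share_point_increase(before: int, after: int, total: int) -> float:
    """Change in the share of the total, in percentage points."""
    return (after - before) * 100 / total


monthly_users = 50_000
frequent_now = 10_000   # users opening the app four or more times a week
frequent_goal = 15_000

print(relative_increase(frequent_now, frequent_goal))                    # 50.0
print(share_point_increase(frequent_now, frequent_goal, monthly_users))  # 10.0
```

Whichever reading you choose, state it in the hypothesis so that everyone measures the same thing.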

How long are you going to run the test?

This always depends on a number of variables within the company, but let’s say that you are at a midsize company that has a little more time to get the correct data. So let’s say three months.

The end hypothesis would be:

If I add a notification feature that lets waiters and waitresses set reminders to record their tips, then the share of monthly users opening the app four or more times a week will rise by ten percentage points over the next three months.

Now let's go through a worksheet that will test if you can figure out the strongest hypothesis for a given scenario. The answers are at the bottom.

Scenario A: You work for a company that rents out toddlers’ clothes. It is a monthly subscription where families get a box of five pieces of clothing, and when the toddler grows out of them, they return the clothes for a new box. The data shows that there might be a correlation between how frequently members send back items and customer retention rates. Your goal is to have members return more boxes. You have decided that you can do this by adding pieces that are seasonal, holiday themed, or super trendy so that the family will need to keep updating the clothes.

1. By adding one piece of special-occasion clothing, you will see a ten percent rise in returned boxes in three months.
2. If you include one special-occasion outfit, a new designer piece, and a seasonal accessory, then you will see a 15 percent increase in returned boxes in the next 12 weeks.
3. When you add three seasonal pieces, families will learn to request more items, and you will see growth in the next two months.
4. By including one trending designer piece, you will see a 15 percent increase in requests for those designers once the experiment is completed.

Scenario B: You already made your millions with the Uber for parrots, so you decided to invest your money into saving the manatees. You designed a tracking app that shows boaters where herds of manatees are sleeping so they don’t run the herds over. You are having a hard time getting the boaters to download the app, so you decide to start advertising. You want to conduct a test to see if a promotion will increase the app’s downloads.

1. If you pair up with dock owners to offer a ten percent discount on monthly docking fees to boaters who download the app, then you will see a ten percent increase in downloads over the next three years.
2. If you give out ten percent coupons to boat rentals for downloading the app, and 15 percent off tack shops, and you advertise around piers, then you will see an increase of 15 percent new downloads in the next three months.
3. By pairing up with ten boat rentals to give a coupon for ten percent off the boat rentals for downloading the app, you will see a five percent increase in downloads over a six month period.
4. When you have a special where someone downloads the app, they get a one-of-a-kind lure at Ted’s tack shop (which has 15 stores in Florida), then you will see a ten percent decrease in manatee deaths over the next five months.

Scenario C: Your labrador retriever is obsessed with a tennis ball and you are tired of throwing the slobbery thing. It inspired you to start a drone company that drops tennis balls and takes funny pictures of the dogs. Your customer-support team has received complaints that it is hard to understand how to download the pictures from the iPhone app. You want to test moving the photos section to various parts of the app.

1. If you add a photos section to the navigation bar, then you will see a five percent increase in new users over a four month period.
2. If you advertise the photo feature in your app, you will have more users and fewer complaints within the next ten weeks.
3. If you add three pages to the onboarding process that explain how to move the photos, then you will get a 25 percent increase in dog pictures.
4. By moving the photo-download option to the home screen, you will receive 50 percent fewer complaints about the photos section in the next three months.

Scenario D: You are so sick of wearing the same outfits that you developed AI software to pick out your clothes every morning. A venture capitalist saw your tweets about it and gave you a million dollars to start the company. You need the AI to address weather conditions when choosing the clothes. You want to run an experiment to test a method for collecting data on what people wear when it is 80 degrees and sunny.

1. If you poll people in popular cities on sunny days, then you will be able to add five percent more data points.
2. If you see what is in clothing stores on sunny days, you will be able to add ten percent more data points to the algorithm in a month.
3. If you send out a survey to ask people what they are wearing when it is 80 degrees and sunny, then you will get a 75 percent answer rate in a week.
4. By matching the dates of Instagram selfies to days that were 80 degrees and sunny, your AI will be able to identify 75 percent of the clothing in three months.

Scenario E: Men’s socks are a great way to jazz up an outfit, so you decided to start a men’s sock e-commerce store. Your customers are not completing the checkout process, and usability tests show that some users question the site’s security. You want to add a small adjustment to the payments page to see if more users complete the checkout process.

1. If you add a password-strength indicator, then more people will create passwords in the next two months.
2. If you add a lock icon next to the credit card information, the completion of the checkout process will increase by 15 percent in three months.
3. If you make the site prettier, the completion of the checkout process will increase by 25 percent in six months.
4. If you add a review page before the confirmation page, then 20 percent of customers will be able to complete the checkout flow in ten minutes.

Answers:

A - 1
B - 3
C - 4
D - 4
E - 2

Learn from your mistakes. Look at the questions you got wrong and see which key element is either missing or vague.

Common Mistakes

There are too many variables. If you are testing multiple things, then you cannot pinpoint which variable caused the results.
There is no achievable metric attached to the hypothesis, so there is no way to know the point at which the experiment succeeds or fails.
The metric is not directly linked to the change being tested. If the result could have been caused by a number of other variables, then you don’t know whether the experiment was the reason for the change.
The timebox is too long or too short. Some experiments are going to take longer, but make sure that the timebox is reasonable for the stage of your company. You don’t want to have an experiment that takes years or extends past your runway.
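
Some of these mistakes can be caught mechanically before an experiment starts. The Python sketch below is one possible way to do that. The Draft record, its field names, and the review function are hypothetical, and the 52-week runway is an assumed figure. It does not check whether the metric is truly linked to the change or whether the timebox is too short, since both still need human judgment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Draft:
    """Hypothetical summary of a drafted hypothesis."""
    changes: List[str]               # everything being changed in the test
    target_percent: Optional[float]  # the success or fail threshold, if any
    weeks: Optional[int]             # planned length of the test
    runway_weeks: int                # weeks of funding left (assumed known)


def review(draft: Draft) -> List[str]:
    """Return the common mistakes that can be spotted mechanically."""
    problems = []
    if len(draft.changes) != 1:
        problems.append("test exactly one change at a time")
    if draft.target_percent is None:
        problems.append("attach a metric that marks success or failure")
    if draft.weeks is None:
        problems.append("set a timebox")
    elif draft.weeks > draft.runway_weeks:
        problems.append("timebox extends past the runway")
    return problems


# Second option of Scenario B: three changes at once, a 15 percent
# target, and roughly 13 weeks (three months).
draft = Draft(
    changes=["boat-rental coupons", "tack-shop discount", "pier advertising"],
    target_percent=15,
    weeks=13,
    runway_weeks=52,  # assumed for illustration
)
print(review(draft))  # ['test exactly one change at a time']
```

An empty list means only that the draft passes these basic checks, not that the hypothesis is a strong one.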

Other Resources