Joyful study notes

Learning the Udacity A/B Testing Course through its Final Project (Part 1/3)

Wrap all the learnings in one real-world project

Joyful Doze
6 min read · Mar 30, 2022


Photo by Hal Gatewood on Unsplash

Table of Contents

Preface

Project Overview

Before getting started: Know your game

Business impact

Focal question and goal

A/B testing feasibility

Next

Preface

One A/B testing learning resource you can’t miss is a free course available on Udacity: A/B Testing by Google. It offers a comprehensive, practical guide to running A/B tests end-to-end, and also covers technical details like the calculation of standard errors and confidence intervals. If you are looking for a summary of this course, this post by Kelly Peng is your go-to. Other good resources are Emma Ding’s post about common pitfalls and her YouTube videos. It’s also worth reading Trustworthy Online Controlled Experiments: A Practical Guide to A/B Testing (one of its authors developed the Udacity course).

If you are more of a learning-by-doing person, this post is for you. I will use the final project of the course, which is based on an actual experiment with real business impact, as a thread connecting all the key points in the course, and demonstrate the steps to solve a typical A/B testing problem, both qualitatively and quantitatively.

This is going to be a 3-part series. In Part 1, I will go through the project description and the preparation work for A/B testing before getting into the weeds. Part 2 will cover experiment design, including metric selection, variability, experiment size, and duration. Part 3 will cover experiment analysis, including running statistical tests and making recommendations.

Project Overview

Long story short, this project is to:

  • Test a new feature (“Free Trial Screener”) on the Udacity website that, when users want to start a 14-day free trial, asks how many hours they can spend on learning per week, and prompts light users (<5 hrs/week) to use the freely available course resources instead of enrolling in a free trial for full access. Note that the screener only suggests an alternative; it does not actually deny access to the free trial.
Example of a free trial screener (Source)
  • The customer activity flow can be summarized by the funnel below, where each successive phase has fewer users than the one before. Adding a free trial screener inserts another filter that will potentially further reduce the number of users moving to the next stage.
Customer activity funnel (Source: Author)
  • The hypothesis is that if users are informed of the commitment before enrolling, they will choose the option that better suits their needs and therefore have a better experience. Concretely, only users who are ready to commit effort will enroll, so course resources are better utilized. Moreover, enrolled users will be more likely to renew and eventually finish the course, since they are willing to devote the time.

The complete project description can be found here.
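To make the funnel concrete, here is a minimal Python sketch that computes stage-to-stage conversion rates. The stage names and counts below are illustrative placeholders roughly in line with the course’s published baseline values, not official numbers.

```python
# Hypothetical funnel counts for illustration only; the real figures
# come from the course's baseline-values spreadsheet.
funnel = [
    ("Visit course overview page", 40000),
    ("Click 'Start free trial'", 3200),
    ("Enroll in free trial", 660),
    ("Remain enrolled past 14 days", 350),
]

def conversion_rates(stages):
    """Return per-stage conversion relative to the previous stage."""
    rates = []
    for (prev_name, prev_n), (name, n) in zip(stages, stages[1:]):
        rates.append((f"{prev_name} -> {name}", n / prev_n))
    return rates

for label, rate in conversion_rates(funnel):
    print(f"{label}: {rate:.1%}")
```

A screener added between “click” and “enroll” would change the third ratio; the question is what it does to the ratios downstream of it.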

Before getting started: Know your game

Don’t rush into experiment design just yet. It’s important to pause and get to know your game. What is this new feature expected to do? How does it impact users and the business? Is A/B testing the proper tool (this project assumes so, but that’s not necessarily true in other cases)? Only when we can confidently answer these questions can we define the focal question and the goal of this project and move on to the next step.

Business impact

Think about who is affected, how they are affected, the pros and cons, and how the change could affect the overall product ecosystem.

A free trial screener will affect all users planning to start a free trial, i.e., users in the second stage of the funnel. Those who are unsure about committing enough time will be discouraged from entering a free trial, let alone buying the course afterwards. Those determined to enroll get to enjoy more resources and a better user experience, but they may have second thoughts: “Maybe the free resources work equally well for me?”

On one hand, this feature could curb impulse purchases, and that is lost revenue (imagine a gym salesperson suggesting you not sign up because you are too busy to come). It could also produce false negatives (users who would have continued after the free trial are flagged as unlikely to) due to the somewhat arbitrary 5-hour threshold. On the other hand, it could effectively reduce the free-trial dropout rate without significantly sacrificing revenue, since the users it turns away are less likely to pay after the free trial anyway.

Focal question and goal

Based on the potential business impact identified above, what should this project focus on?

From the discussion above, a free trial screener will affect user enrollment, course utilization, user experience, and renewal. The goal, then, is to make sure free trials go to users who are more likely to renew and finish the course, without cannibalizing the revenue-generating user base.

Hence the focal question of this project can be framed as whether a free trial screener is effective in achieving this goal, and consequently whether to launch it.

A/B testing feasibility

Although A/B testing is the most powerful tool for establishing causality, it is not suitable for every problem. Specifically:

  • When it’s a foundational change rather than an incremental improvement: A/B testing cannot tell whether a product catalog is good enough, or whether a company should change its logo. Such big changes often trigger novelty effect and change aversion, emotional reactions that make the true impact hard to measure accurately.
  • When the underlying experimental units cannot be well randomized: For example, if for some reason only Firefox users are able to see the new feature, the control and experiment groups will not be randomly assigned.
  • When it’s unethical to conduct A/B testing: Experiments should not impose unjustifiable risks on participants, and should always get informed consent from participants about risks, benefits, and user-identifiable information collection.
  • When impact takes too long to manifest: monitoring long-term effects can be costly, and the longer the window, the more likely confounding factors will contaminate the results.
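As a side note on the randomization point above: a common way to get stable, well-randomized assignment is to hash the user ID together with an experiment-specific salt. The function below is a generic sketch of that technique (function and experiment names are my own, not Udacity’s implementation).

```python
import hashlib

def assign_group(user_id: str, experiment: str = "free_trial_screener") -> str:
    """Deterministically assign a user to control or experiment.

    Hashing (experiment name + user_id) gives a stable 50/50 split:
    the same user always sees the same variant, and using the
    experiment name as a salt keeps assignments independent across
    different experiments.
    """
    digest = hashlib.md5(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # 0-99
    return "experiment" if bucket < 50 else "control"

# Sanity check: over many users the split should be close to 50/50.
n = 10_000
share = sum(assign_group(f"user{i}") == "experiment" for i in range(n)) / n
print(f"experiment share: {share:.3f}")
```

If the observed split deviates far from 50/50 in production, that is a sample ratio mismatch and a sign the randomization is broken.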

When A/B testing is not feasible, other methods can be considered:

  • Causal inference: E.g. difference-in-differences models, instrumental variables, propensity score matching, etc.
  • Focus group: gather a small group of users to review and provide feedback on a new product
  • User survey: collect user feedback and experience through designing and distributing surveys
  • Retrospective study: E.g. panel OLS regression
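To illustrate the first alternative, here is a toy difference-in-differences calculation (all conversion rates below are made up for illustration): the control group’s change over time serves as a proxy for what would have happened to the treated group without the screener.

```python
# Hypothetical conversion rates before/after launching the screener.
treat_pre, treat_post = 0.20, 0.26   # group exposed to the screener
ctrl_pre, ctrl_post = 0.21, 0.24     # comparable unexposed group

# The control group's change captures background trends (seasonality,
# marketing pushes, etc.); subtracting it isolates the screener's effect
# under the parallel-trends assumption.
did = (treat_post - treat_pre) - (ctrl_post - ctrl_pre)
print(f"estimated effect of the screener: {did:+.2%}")
```

In practice this is done with a regression (e.g. OLS with group, period, and interaction terms) so that standard errors and covariates can be handled properly; the arithmetic above is just the core idea.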

In this case, the screener is an additive feature on top of the existing UI, and its impact on conversion can be evaluated as soon as the 14-day free trial ends. There is no ethical concern, as no sensitive user information needs to be collected, and Udacity has a large enough user base to randomly split into two groups and run well-powered tests. Therefore, this is a good use case for A/B testing.

Next…

Now that we fully understand the project, its focal question and goal, and have confirmed that A/B testing is the way to go, the next steps are designing and analyzing a controlled experiment.

Please go to Part 2 of the series (Coming soon!) for steps on experiment design, and Part 3 (Coming soon!) for experiment analysis. The fun’s just getting started!

If you like this article, please give me a clap. I write about life reflection, career development, and new things I’ve learned. You can also find me on LinkedIn.



Written by Joyful Doze

Data scientist @ Meta, Toastmaster | This is my notepad for life reflection, career development, and new things I’ve learned. ”I think, therefore I am.”
