TMH #9: Experimentation with Matt Gershoff

Released October 19, 2021

Matt Gershoff of Conductrics sits down with me to discuss experimentation, A/B testing, statistical inference, and how to help maintain an experimentation program in an organization.

After the “drought” of the last three episodes, you can finally relax as it’s not just me doing a solo episode. And what an amazing guest I managed to persuade to join me!

Matt Gershoff of Conductrics is a good friend and an eloquent speaker. Above all, he’s so incredibly knowledgeable about experimentation and about statistical inference in general.

All the work we do with data is a subset of statistical inference. We collect data to draw conclusions, which we hope to extrapolate to experiences and intuitions that we haven’t (yet) measured. Often this is done by means of passive analysis, which is closer to the traditional “analytics” work done in organizations. Data is collected, and it is mined for insight and for feeding into new hypothesis generation.

The other approach is that of experimentation, where a business question is first formulated, then data is collected specifically to answer this question, and finally the results are analyzed to judge whether there’s (sufficient) evidence to answer the question.
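To make that last step a bit more concrete, here is a minimal sketch of how the results of a simple two-variant test could be analyzed with a frequentist two-proportion z-test. This is my own illustration, not something taken from the episode, and it is not how Conductrics or any other tool does it. The function name and the visitor and conversion counts are hypothetical, invented purely for the example.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates.

    Hypothetical helper for illustration; assumes samples large enough
    for the normal approximation to be reasonable.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Made-up counts: 5,000 visitors per variant, 200 vs. 250 conversions
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

The p-value here only says how surprising a difference at least this large would be if the two variants actually performed the same. Whether that counts as "sufficient" evidence depends on the error rates and sample size you committed to before collecting the data.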

Matt walks us through the ins and outs of techniques such as A/B testing, the frequentist and Bayesian approaches, and how it’s really up to organizations and the individuals within them to make the right calls about how they approach data. No tool or service delivers automatic gratification.

Having said that, I would be remiss not to point out how amazing Conductrics is as an experimentation platform. It promotes transparency and open-source approaches over black boxes and magically produced results. It’s very developer-friendly, being designed and developed with powerful APIs in mind first and foremost, although it does provide a very intuitive user interface for running more traditional A/B test setups.

We didn’t have enough time to discuss the Conductrics approach, which is why I’m comfortable sharing this little unsolicited (and definitely not paid-for) shout-out to what is my favorite experimentation platform out there!

Anyway, listen to the podcast to find out what trams in Helsinki have to do with understanding the p-value in the frequentist approach.

Listen to the episode using the player or find it in your favorite podcast service.

Topics

00:00:00 – Introduction
00:08:07 – Matt shares his thoughts on experimentation and the scientific method.
00:12:47 – What are A/B tests used for?
00:17:54 – How do you know what type of data to collect, and how much of it, for any given test?
00:23:21 – Should the analytics tools we use guide the users to drawing better and more statistically sound conclusions?
00:31:07 – The analyst should be aware of the circumstances in which the data was collected.
00:35:19 – How to deal with analysts or stakeholders who misinterpret or refuse to accept the results of a test?
00:41:44 – The problem with analyzing the data without having formulated the correct questions before data collection.
00:44:57 – By analyzing the results in different ways, could you change an inconclusive result to a conclusive one?
00:46:50 – Using experiment results for discovery and new hypothesis generation.
00:48:14 – What is the p-value?
00:54:04 – What is the Bayesian approach?
00:56:41 – How important is it to account for errors (e.g. false positives/negatives) in test design?
01:00:16 – If you could make everyone in digital organizations adopt a skill or know-how to help the organization be more sensitive to data and statistical inference, what would that skill be?
01:04:22 – How can people follow what Matt is writing about and sharing online?
01:05:23 – Outro

Notes and references

Conductrics
Neyman-Pearson lemma
T-test
Randomized Controlled Trial (RCT)
Frequentist and Bayesian methods
Nobel Prize in Economic Sciences 2021
Deborah Mayo: Statistical Inference As Severe Testing
HiPPO: Highest Paid Person’s Opinion
Benjamini-Hochberg Procedure
Prior probability in Bayesian statistical inference
Minimum detectable effect
Matt Gershoff (Twitter)
Conductrics Blog
