Does the probability of a basketball player making a second free throw depend on the outcome of the first? In this notebook, I demonstrate the use of IPCluster and PyMC to investigate the “hot hand” phenomenon using Bayesian estimation.

Phrased in the language of probability theory, a hot hand would mean that the probability of making the second free throw differs depending on whether the first attempt was made or missed. With play-by-play data from the NBA 2010-2012 seasons, we can test this hypothesis for individual players.
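To make the hypothesis concrete, here is a minimal sketch of the comparison. It uses a conjugate Beta-Binomial model sampled directly with NumPy, not the PyMC model from the notebook, and the counts are made-up placeholders rather than NBA data. The function name `hot_hand_posterior` and the uniform Beta(1, 1) priors are my assumptions for illustration.

```python
import numpy as np

def hot_hand_posterior(made_after_make, n_after_make,
                       made_after_miss, n_after_miss,
                       draws=20000, seed=0):
    """Posterior draws of P(make 2nd | made 1st) - P(make 2nd | missed 1st).

    Assumes independent uniform Beta(1, 1) priors on both probabilities,
    so each posterior is a Beta distribution that can be sampled directly.
    """
    rng = np.random.default_rng(seed)
    p_after_make = rng.beta(1 + made_after_make,
                            1 + n_after_make - made_after_make, size=draws)
    p_after_miss = rng.beta(1 + made_after_miss,
                            1 + n_after_miss - made_after_miss, size=draws)
    return p_after_make - p_after_miss

# Hypothetical counts for one player (illustration only, not real data):
# 80 of 100 second attempts made after a make, 60 of 100 after a miss.
diff = hot_hand_posterior(80, 100, 60, 100)

# Share of posterior draws in which a first make raises the second-shot odds.
prob_hot = float((diff > 0).mean())
```

A posterior for `diff` concentrated away from zero would be evidence that the second attempt depends on the first; one centered on zero would not.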

Spoiler alert: The existence of a hot hand depends on the player, as hinted at in the example output below.

PyMC uses Markov Chain Monte Carlo (MCMC) sampling to compute Bayesian estimates of the model parameters, which presents an ideal use case for IPCluster: MCMC sampling is an embarrassingly parallel, CPU-limited process that involves moving relatively small amounts of data between nodes.
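The reason this parallelizes so easily is that each player's model can be fit independently, with only a handful of counts going in and a summary coming out. The sketch below shows that shape of computation, under loud assumptions: it uses direct Beta sampling as a stand-in for MCMC, a standard-library thread pool as a stand-in for IPCluster engines, and hypothetical player names and counts. The helpers `fit_player` and `fit_all` are illustrative names, not part of the notebook.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

def fit_player(task):
    """Fit one player's model; depends on nothing outside its own task tuple."""
    name, seed, made_after_make, n_after_make, made_after_miss, n_after_miss = task
    rng = np.random.default_rng(seed)
    # Stand-in for MCMC: sample Beta posteriors under uniform priors.
    p_after_make = rng.beta(1 + made_after_make,
                            1 + n_after_make - made_after_make, size=5000)
    p_after_miss = rng.beta(1 + made_after_miss,
                            1 + n_after_miss - made_after_miss, size=5000)
    # Only a small summary travels back to the caller.
    return name, float(np.mean(p_after_make > p_after_miss))

def fit_all(tasks, map_fn=map):
    """Apply fit_player to every task using any map-like function."""
    return dict(map_fn(fit_player, tasks))

# Hypothetical players and counts (illustration only, not real data).
tasks = [
    ("player_a", 1, 80, 100, 60, 100),
    ("player_b", 2, 70, 100, 70, 100),
]

serial = fit_all(tasks)

# Threads keep this sketch portable; CPU-bound sampling would want
# separate processes or cluster engines instead.
with ThreadPoolExecutor(max_workers=2) as pool:
    parallel = fit_all(tasks, pool.map)
```

Because `fit_all` only needs something that behaves like `map`, the same function could in principle be handed a parallel map from a process pool or a cluster view without changing `fit_player`.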

View the notebook in Wakari.


Sample input and output

game_id         event_idx  event
20110416ATLORL  164        [ATL] Crawford Free Throw 1 of 2 Missed
20110416ATLORL  167        [ATL 34-35] Crawford Free Throw 2 of 2 (5 PTS)

Example input, NBA free throw play-by-play.
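Turning event strings like these into make/miss outcomes might look roughly like the sketch below. The pattern is inferred only from the two sample rows above (a miss ends in "Missed", a make carries a score and a points tally), so it is an assumption about the format rather than the notebook's actual parser, and `parse_free_throw` is a hypothetical helper name.

```python
import re

# Inferred from the two sample rows: "[TEAM]" or "[TEAM score-score]",
# then the player name, then "Free Throw i of n" and a trailing note.
FT_PATTERN = re.compile(
    r"\[(?P<team>[A-Z]{3})(?: \d+-\d+)?\] "
    r"(?P<player>.+?) Free Throw (?P<attempt>\d) of (?P<total>\d)"
    r"(?P<rest>.*)"
)

def parse_free_throw(event):
    """Return a dict describing a free throw event, or None if it is not one."""
    match = FT_PATTERN.match(event.strip())
    if match is None:
        return None
    return {
        "team": match.group("team"),
        "player": match.group("player"),
        "attempt": int(match.group("attempt")),
        "total": int(match.group("total")),
        # Assumption: any free throw not marked "Missed" was made.
        "made": "Missed" not in match.group("rest"),
    }

first = parse_free_throw("[ATL] Crawford Free Throw 1 of 2 Missed")
second = parse_free_throw("[ATL 34-35] Crawford Free Throw 2 of 2 (5 PTS) ")
```

Pairing attempt 1 with attempt 2 within the same `game_id` then yields the conditional counts the model needs.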


Example output, posterior distributions for two different players.

About the Author

Clayton is an interdisciplinary student and researcher, always looking to connect people and ideas from different academic tribes. Much of his current work revolves around taking scientific data and workflows to the cloud in order to inc …
