Does the probability of a basketball player making a second free throw depend on the outcome of the first? In this notebook, I demonstrate the use of IPCluster and PyMC to investigate the “hot hand” phenomenon using Bayesian estimation.

Phrased in the language of probability theory, the existence of a hot hand would imply a difference in the probability of making a second free throw conditional on the outcome of the first attempt. With data from the NBA 2010-2012 seasons, we can test this hypothesis for individual players.
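To make the hypothesis concrete: it compares p_make = P(second made | first made) with p_miss = P(second made | first missed). The following is a minimal sketch of that comparison, not the notebook's PyMC model. It assumes independent Beta(1, 1) priors and samples the conjugate Beta posteriors directly with NumPy. The function name and the counts are hypothetical placeholders, not NBA data.

```python
import numpy as np

def hot_hand_posterior(made_after_make, n_after_make,
                       made_after_miss, n_after_miss,
                       draws=20000, seed=0):
    """Sample the posterior of p_make - p_miss under Beta(1, 1) priors."""
    rng = np.random.default_rng(seed)
    # Conjugate update: Beta(1 + successes, 1 + failures) for each condition.
    p_make = rng.beta(1 + made_after_make,
                      1 + n_after_make - made_after_make, size=draws)
    p_miss = rng.beta(1 + made_after_miss,
                      1 + n_after_miss - made_after_miss, size=draws)
    return p_make - p_miss

# Made-up counts for one player (illustrative only).
diff = hot_hand_posterior(made_after_make=80, n_after_make=100,
                          made_after_miss=30, n_after_miss=50)
print("Posterior mean of p_make - p_miss:", diff.mean())
print("P(p_make > p_miss | data):", (diff > 0).mean())
```

A posterior for the difference that sits well away from zero would be evidence that the second attempt depends on the first for that player.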

Spoiler alert: The existence of a hot hand depends on the player, as hinted at in the example output below.

PyMC uses Markov chain Monte Carlo (MCMC) sampling to compute Bayesian estimates of the model parameters, which presents an ideal use case for IPCluster: MCMC sampling is an embarrassingly parallel, CPU-limited process that involves moving relatively small amounts of data between nodes.
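If each player's sampling run is treated as independent of the others, the workload reduces to a map over self-contained tasks. The sketch below shows that structure only. `sample_player` is a hypothetical stand-in that draws directly from a Beta posterior instead of running MCMC, and the built-in `map` stands in for the cluster. On a running IPCluster, a view's `map_sync` from the ipyparallel package could presumably be passed as `map_fn`, but that wiring is an assumption and is not exercised here.

```python
def sample_player(task):
    """Hypothetical per-player job; returns (name, posterior mean)."""
    # Import inside the function so the task is self-contained
    # if it is shipped to a remote worker.
    import random

    name, made, attempts, seed = task
    rng = random.Random(seed)  # independent random stream per task
    draws = [rng.betavariate(1 + made, 1 + attempts - made)
             for _ in range(2000)]
    return name, sum(draws) / len(draws)

def run_all(tasks, map_fn=map):
    """Apply sample_player to every task; tasks share no state."""
    return dict(map_fn(sample_player, tasks))

# Made-up counts (name, made, attempts, seed); illustrative only.
tasks = [("Player A", 80, 100, 1), ("Player B", 30, 50, 2)]
results = run_all(tasks)
print(results)
```

Each task carries only a few numbers in and a small summary out, which matches the low data-movement profile described above.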

View the notebook in Wakari.

 

Sample input and output

game_id        event_idx event
20110416ATLORL    164    [ATL] Crawford Free Throw 1 of 2 Missed
20110416ATLORL    167    [ATL 34-35] Crawford Free Throw 2 of 2 (5 PTS) 

Example input, NBA free throw play-by-play.
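Event strings like the two sample rows can be turned into structured records before modeling. This is a minimal parsing sketch. The pattern is inferred from those two rows alone, and it assumes that any free throw not marked "Missed" was made. The real play-by-play feed may contain other variants, and `parse_free_throw` is a hypothetical helper rather than code from the notebook.

```python
import re

# Pattern inferred from the two sample rows only; real feeds may vary.
FT_PATTERN = re.compile(
    r"^\[(?P<team>[A-Z]{3})(?: \d+-\d+)?\]\s+"
    r"(?P<player>.+?) Free Throw (?P<attempt>\d) of (?P<total>\d)"
    r"(?P<rest>.*)$"
)

def parse_free_throw(event):
    """Return a dict describing one free-throw event, or None if no match."""
    m = FT_PATTERN.match(event.strip())
    if m is None:
        return None
    return {
        "team": m.group("team"),
        "player": m.group("player"),
        "attempt": int(m.group("attempt")),
        "total": int(m.group("total")),
        # Assumption: a made shot is any free throw not marked "Missed".
        "made": "Missed" not in m.group("rest"),
    }

first = parse_free_throw("[ATL] Crawford Free Throw 1 of 2 Missed")
second = parse_free_throw("[ATL 34-35] Crawford Free Throw 2 of 2 (5 PTS) ")
print(first)
print(second)
```

Pairing attempt 1 with attempt 2 for the same `game_id` and player would then give the conditional counts the model needs.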

Example output, posterior distributions for two different players.