The Anaconda team headed to Cleveland for PyCon 2018, the largest annual gathering for the community using and developing the open-source Python programming language.
On Wednesday, May 9, Data Scientist Tom Augspurger, Software Developer Jim Crist, and Software Engineer Martin Durant presented a tutorial on Parallel Data Analysis with Dask. The libraries that power data analysis in Python are essentially limited to a single CPU core and to datasets that fit in RAM. Attendees saw how Dask can parallelize their workflows while they continue to write what looks like normal Python, NumPy, or pandas code.
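To give a flavor of what "looks like normal NumPy code" can mean, here is a minimal sketch. It is not taken from the tutorial materials; the array size and the chunk size are arbitrary values chosen for illustration. It computes the same mean of squares twice, once with plain NumPy and once with a chunked Dask array.

```python
import numpy as np
import dask.array as da

# Plain NumPy: the whole array lives in memory and is processed eagerly.
x_np = np.arange(1_000_000, dtype="float64")
np_mean = (x_np ** 2).mean()

# Dask: wrap the same data in a chunked array (chunk size is an arbitrary choice).
x_da = da.from_array(x_np, chunks=100_000)

# The expression reads like NumPy, but it only builds a task graph here.
lazy_mean = (x_da ** 2).mean()

# compute() runs the graph; independent chunks can be processed in parallel.
da_mean = lazy_mean.compute()

print(np.isclose(np_mean, da_mean))
```

The only Dask-specific steps are wrapping the data in chunks and calling `compute()`; the arithmetic in between is written exactly as it would be for NumPy.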
On Saturday, May 12, Lead Dask Developer Matthew Rocklin discussed Democratizing Distributed Computing with Dask and JupyterHub. Matt presented a case study on using JupyterHub, XArray, Dask, and Kubernetes together in an operational setting. He demonstrated how to build and deploy a running system, which the audience then used to access distributed computing resources.
Tutorial: Parallel Data Analysis with Dask
Presenters: Tom Augspurger, Jim Crist, and Martin Durant
Wednesday, May 9
Talk: Democratizing Distributed Computing with Dask and JupyterHub
Presenter: Matt Rocklin
Saturday, May 12