Don’t miss Dask core contributor Matt Rocklin’s talk, Dask: Flexible Analytic Computing for Python, on Wednesday, May 24, at 11:15am, followed directly by Meet the Expert with Matthew Rocklin at 12:05pm at the O’Reilly booth – Table A.
TALK: Dask: Flexible Analytic Computing for Python, Matt Rocklin (@mrocklin)
The Python data science ecosystem (NumPy, pandas, and scikit-learn) is efficient and intuitive for advanced analytics workloads. Unfortunately, these tools are restricted to data that fits in memory and to computation on a single core. Dask is a parallel computing library that complements the Python ecosystem by providing a distributed, parallel framework for high-performance task scheduling.
Dask now parallelizes Python libraries such as NumPy, pandas, and parts of scikit-learn, as well as more custom algorithms. This work was done in collaboration with those core development communities and gives Python users a seamless big data experience for data analysis and complex analytics.
SESSION: Meet the Expert with Matt Rocklin of Anaconda
Matthew will explain how to parallelize Python data science workflows with NumPy, pandas, and scikit-learn across a cluster with Dask or other parallel computing tools.