
Strata Data Conference

Don’t miss Dask core contributor Matt Rocklin’s talk, Dask: Flexible Analytic Computing for Python, on Wednesday, May 24, at 11:15am, followed directly by Meet the Expert with Matthew Rocklin at 12:05pm at the O’Reilly booth, Table A.

TALK: Dask: Flexible Analytic Computing for Python, Matt Rocklin (@mrocklin)

The Python data science ecosystem (NumPy, pandas, and scikit-learn) is efficient and intuitive for advanced analytics workloads. Unfortunately, these tools are restricted to data that fits into memory and to computation that runs on a single core. Dask is a parallel computing library that complements the Python ecosystem by providing a distributed parallel framework for high-performance task scheduling.

Dask now parallelizes NumPy, pandas, parts of scikit-learn, and other more custom algorithms. This work was done in collaboration with those libraries' core development communities and has led to a seamless big data experience for Python users doing data analysis and complex analytics.
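To make the idea of chunked task scheduling concrete, here is a minimal sketch that uses only the Python standard library. It is not Dask's API or implementation. The helper names (`chunked`, `partial_stats`, `parallel_mean`) and the chunk size are assumptions chosen for illustration. It shows the general pattern the abstract describes: split the data into pieces, schedule one task per piece, then combine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    return [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]


def partial_stats(chunk):
    """Task run once per chunk: return (sum, count) for that chunk."""
    return sum(chunk), len(chunk)


def parallel_mean(values, chunk_size=1000, max_workers=4):
    """Compute a mean by scheduling one task per chunk, then combining.

    Hypothetical helper for illustration only; assumes values is non-empty.
    """
    chunks = chunked(values, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each chunk becomes an independent task handed to the pool.
        partials = list(pool.map(partial_stats, chunks))
    # Combine the per-chunk results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


data = list(range(10_000))
result = parallel_mean(data)
print(result)
```

Because each task only sees one chunk, the same pattern can in principle be applied to data that is read piece by piece rather than held in memory at once. A thread pool running pure Python code is limited by the interpreter lock, so this sketch illustrates the structure of the computation rather than a real speedup.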

See the full talk abstract here. 

SESSION: Meet the Expert with Matt Rocklin of Anaconda

Matthew will explain how to parallelize Python data science workflows with NumPy, pandas, and scikit-learn across a cluster with Dask or other parallel computing tools.

See more details here.