Presentation topics include scaling Python workloads; solving problems faced by enterprise data science teams; exploring the architecture and current applications of dask
New York—September 21—Anaconda, the most popular Python data science platform provider, today announced that several company experts will present two sessions and one tutorial at the Strata Data Conference on September 26 and 27 at the Javits Center in New York City. The conference, formerly known as Strata + Hadoop World, is designed to bring together big data’s most influential business decision makers, strategists, architects, developers and analysts to shape the future of their businesses and technologies.
Peter Wang, co-founder and CTO of Anaconda, will highlight the challenges data scientists face when sharing their work with other groups in the enterprise, such as business analysts, IT teams and developers. In a session titled “Data science beyond the sandbox,” Peter will explain how to overcome these challenges and enable managers and analysts to consume the results of data science analysis. The session draws on the company’s recently released Anaconda Enterprise 5, an enterprise data science platform built with the entire organization and development cycle in mind.
Anaconda’s Matthew Rocklin, computational scientist, and Ben Zaitlen, developer, will host a tutorial titled “Scaling Python data analysis.” The tutorial will examine how developers, data scientists and researchers can parallelize and scale Python workloads to multicore machines and multimachine clusters using a variety of tools, including the standard libraries, Spark and dask. Through guided exercises in Jupyter notebooks, attendees will gain hands-on experience with parallel computing tools and learn how to choose the right tool for the job.
Finally, Matthew will present a second session, “Dask: Flexible parallelism in Python for advanced analytics,” intended for data scientists and algorithm researchers. He will discuss how dask can intuitively scale existing and novel Python advanced analytic workloads to distributed systems and avoid the limitations of NumPy, pandas and scikit-learn, tools in the Python data science ecosystem that are restricted to data that fits in memory and to computation on a single core. He will also cover the basic architecture of dask, the classes of applications in which it is commonly useful, and how it fits into the broader Hadoop ecosystem. Matthew will also take part in a Q&A during a “Meet the Expert” session.
WHO: Peter Wang, co-founder and CTO, Anaconda
WHAT: Data science beyond the sandbox
WHEN: September 27, 4:35 p.m. ET
WHERE: 1E 06
WHO:
–Matthew Rocklin, computational scientist, Anaconda
–Ben Zaitlen, developer, Anaconda
WHAT: Scaling Python data analysis
WHEN: September 26, 9:00 a.m. ET
WHERE: 1E 15/16
WHO: Matthew Rocklin, computational scientist, Anaconda
WHAT: Dask: Flexible parallelism in Python for advanced analytics
WHEN: September 27, 2:55 p.m. ET
WHERE: 1A 08/10
WHAT: Meet the Expert with Matthew Rocklin
WHEN: September 28, 11:20 a.m. ET
WHERE: O’Reilly booth (Table A)
About Anaconda, Inc.
With over 4.5 million users, Anaconda is the world’s most popular Python data science platform. Anaconda, Inc. continues to lead open source projects like Anaconda, NumPy and SciPy that form the foundation of modern data science.