April 13, 2016

Python is the fastest-growing Open Data Science language & is used more than 50% of the time to extract value from Big Data in Spark. However, both PySpark & SparkR serialize data when interacting with Spark, which lengthens the time-to-value from your Big Data. What if there were a way to leverage the entire Python ecosystem & get high performance without refactoring your Hadoop-based data science investments?

Anaconda, the leading Open Data Science Platform, delivers high performance Python for Hadoop. On April 13th, Dr. Kristopher Overholt & Dr. Matthew Rocklin of Continuum Analytics present a webinar on High Performance Hadoop with Python.  

 

Scale Up & Scale Out with Anaconda

 
Anaconda, the leading Open Data Science Platform, delivers high performance Python for Hadoop. You can apply your existing Python-based data science investments to your existing Hadoop or HPC clusters. Anaconda bypasses the typical Hadoop performance issues, builds on existing high performance scientific and array-based computing in Python, and now uses Dask, the powerful parallel execution framework, to deliver fast results on any enterprise Hadoop distribution, such as Cloudera & Hortonworks.
 
On April 13th, Dr. Kristopher Overholt & Dr. Matthew Rocklin of Continuum Analytics will present a webinar on High Performance Hadoop with Python.  
 
In this webinar, you’ll learn to:
  • Analyze NYC taxi data through distributed DataFrames on a cluster on HDFS

  • Create interactive distributed visualizations of global temperature data

  • Distribute in-memory natural language processing & interactive queries on text data in HDFS

  • Wrap and parallelize existing legacy code on custom file formats 

Join us on April 13th for the webinar & learn to scale up & out with Anaconda.

