2018 Anaconda State of Data Science Report Released

 

We at Anaconda greatly value our data science community and are always striving to learn more about how you are using our products and how we can improve your overall experience.

With this goal in mind, we recently launched our first Anaconda State of Data Science Survey to gain a better understanding of what users accomplish with Anaconda, what they think about it, and the data sources, visualization, and scale-out approaches they are using. The survey, which ran from March 22 to April 30, 2018, resulted in 4,218 responses with a 100 percent survey completion rate.

In addition to giving us key insights into how to improve our products, our resulting report reveals current trends in data science and machine learning within the Anaconda community.

The State of Data Science

The Anaconda State of Data Science is strong. With 2 to 2.5 million downloads per month during January to March 2018, Anaconda is easily the most popular Python distribution, with a growing R following. Key findings of the survey include:

  • The use of cloud-native technologies such as Docker containers and Kubernetes in data science is growing at the expense of traditional Big Data (Hadoop/Spark).
  • Google Cloud’s data services outrank those of Amazon Web Services (AWS) and Microsoft Azure. Although Google Cloud is the third largest cloud provider, its focus on data services is paying off with the Anaconda community.
  • Anaconda is gaining popularity with software developers (15%), in addition to data scientists (16%) and academics (16%).
  • Matplotlib continues to enjoy its first-mover advantage in visualization, sweeping the category, but visualization is a highly crowded space with many strong competitors, both open source and commercial. Plotly, Tableau, Microsoft Power BI, and Tibco Spotfire are all strong commercial competitors to Matplotlib and to other open source projects such as ggplot, Bokeh, D3, and Altair.
  • It matters a lot to our users that Anaconda is free, but not so much that it is open source. Being free was ranked the most important attribute, while open source licensing ranked second to last.

To learn more about the 2018 State of Data Science, we invite you to download the full report here.

We’d like to thank all the respondents for taking the time to complete our survey and help us gain insights into the state of data science in 2018. Let’s do it again next year!

