2021 State of Data Science

The 2021 State of Data Science report looks at how data science as a field is growing, overall adoption trends across commercial environments and academic institutions, and what students can do to prepare for the future.

For this year’s online survey, we received more than 4,200 responses from individuals using data science and machine learning tools in more than 140 countries.


Did the COVID-19 pandemic impact your organization’s investment in data science?

COVID-19 had a trickle-down effect on virtually every industry. Healthcare, government, financial institutions, and others all needed to act quickly on data and find solutions to new problems. Additionally, when asked how involved their role is in business decisions, 14% of respondents said “all” decisions rely on insights interpreted by them or their team, and 39% said “many” business decisions rely on them. While work remains to bring data scientists fully into the fold, it’s encouraging to see that organizations recognize their value, which might be why the field avoided a sharp decrease in investment.

In commercial organizations, 50% said their data science investment stayed the same or increased during the pandemic, while 37% saw a decrease.


What is your sentiment toward automation or AutoML, the process of automating tasks involved in applying machine learning to real-world problems, in data science?

A common theme in the news today is that automation is taking over and will eventually replace human workers. However, the survey results show that automation is welcomed in the data science sector and is viewed not as a competitor but as a complementary tool for practitioners.

55% of respondents hope to see more automation and AutoML in data science, while only 4% are concerned about how automation will impact data science.


Does your employer encourage you and your team to contribute to open-source projects?

Using and contributing to open-source software (Python and R libraries such as pandas and NumPy) is a key differentiator of the most innovative organizations. By using open-source software, organizations can save tremendous amounts of time and resources.

The majority of respondents said that their employers are empowering them to contribute to open source through an increase in funding related to open-source project development.


How often do you use the following languages?

63% of respondents said they always or frequently use Python, making it the most popular language included in this year’s survey. In addition, 71% of educators are teaching Python, and 88% of students reported being taught Python in preparation to enter the data science/ML field.

Python is increasingly popular among data scientists, researchers, students, and professionals worldwide.


As we prepare for what’s next in the field, we’ve outlined themes for enterprises to focus on.

As data roles continue to grow and expand within enterprises, it’s critical to understand the day-to-day of a data professional, the tools and languages needed for success, and how organizations can ensure data literacy to make the most out of their data-focused teams. In addition to answering hot-button questions about bias, explainability, automation, and more, this report shares opportunities for enterprise adoption, focus areas for universities, and how individuals can prepare for careers in the field.