Enterprises Need to Think Differently about Data Science. Here’s How.

 


Companies that are data-science literate make and communicate decisions on the basis of real data models, and not merely instinct or tradition. They welcome new data science technologies as opportunities for potential innovation, rather than adopting a stance of suspicion and resistance. As enterprises rush to embrace data science, they are discovering two important truths that lie at the foundation of developing data fluency across the enterprise.

First, effective data science should not be treated like just another business process, and cannot be operationalized assembly-line style.

Data science — as the name suggests — is a mode of inquiry and exploration similar to “real” science. 

Just as a physicist uses math to reason about the natural world, data scientists harness mathematical and computational tools to reason about the business world. 

As in the real sciences, some successful experiments may yield a null result; others may yield accurate models that are impossible to use in practice. While no one celebrates these kinds of outcomes, any business that wants to embrace data science must develop an organizational tolerance for “failure” as a natural, inevitable cost of even the most effective data science.

Since traditional business analytics (e.g., generating reports and dashboards) is relatively procedural in nature, there is a tendency to regard data science as a similar type of activity and to manage it accordingly. However, data science is not procedural, because it rests on the irreducible human elements of creative exploration and intuition. This makes the entire process much more difficult to operate and manage. Effective data science also requires knowledge and experience that cut across traditional departments.

Second, data scientists are not traditional technologists who fit neatly within any of the traditional silos. They are a diverse group of hybrid practitioners, and this should be leveraged as an opportunity, rather than treated as an aberration to be managed away.

Modern data scientists come from diverse backgrounds. They combine subject matter expertise with applied statistics and highly specialized coding skills (quite different from the usual software developer skills). This unique combination of skills and technology needs doesn’t fit neatly into any traditional corporate silo.

A typical corporate immune response is to compartmentalize everything, even at the potential cost of divvying up the integrated functions that make data science effective. This is sometimes exacerbated by a corporate mindset that treats any type of programming activity as “software development”, even though a data scientist building models with Python code is doing an entirely different kind of work than a developer building a Java business application.

Furthermore, as we discovered in the development of our 2019 State of Data Science report, people in a variety of fields are learning data science to apply it to their current roles. In a few years, the skills we currently call “data science” will be widely distributed among many people whose job titles may not be “data scientist”. Organizations that embrace collaboration across traditional lines as part of their digital transformation will be much better prepared for that eventuality. Businesses that don’t learn how to embrace the hybrid, inter-departmental nature of data science will be fundamentally stuck in the past.

How Do I Make My Business Ready for Data Science?

Here are my suggestions to start incorporating data science into your organization in a more realistic way:

Conduct an honest assessment of your internal data science readiness. Data-driven decision making is one component of a larger, global “digital transformation” movement in the business world. The challenges of enabling effective data science are not transient ones, but rather steps along a much bigger journey (often, towards AI and alongside cloud initiatives). Make an honest assessment of your organization’s commitment to this journey, and “right-size” your data science initiatives accordingly to fit the risk and budget profile. For full-scale data science enablement to succeed, your organization must fully embrace data-driven digital transformation.

No one can sell you a big red “Predict!” button. Data science requires deep subject matter expertise and familiarity with the unique aspects of your business. Even if you engage consultants to kickstart a data science initiative, in time, your internal teams must learn how to maintain and evolve the models and pipelines that they have built. Effective data science affects how you think about core concerns in your business; there is almost never an off-the-shelf prediction system that just works for everyone.

Make friends in IT. The computational needs of data scientists are relatively novel for many corporate IT teams: hundreds of open source libraries; languages like Python and R; massive memory requirements; GPU and vector accelerators. In order for data science teams to succeed, they need the support and assistance of allies in IT.

Unfortunately, corporate IT groups tend to have a bad reputation as a “tower of No”, and it’s easy for data science leaders (especially those new to the enterprise) to grow cynical about their counterparts in IT. We’ve seen data science leaders and IT productively meet in a middle ground by implementing Anaconda Enterprise as an enterprise-grade platform for collaboration, deployment, security, and governance.

You can start small. Although the space of data science and machine learning may seem vast and intimidating, remember you can always start small. Choose one truly painful problem with clear business drivers, and get a quick win to help build buy-in and trust. Early on, it’s more important to definitively solve a clear, well-understood problem than to produce ambiguous results on a large, ambitious problem.

As companies truly embrace data science, they will experience shifts well beyond the org chart, extending into culture, norms, and ways of thinking. The change is worth the effort: while standard assembly-line business analytics is about counting measurements, data science gives you the insight to build better rulers.

