Why Power Efficiency Is Key to AI Innovation, and What It Means for Hardware and Software Makers
Apr 07, 2022
By Stanley Seibert
AI adoption is continuing to grow, with 56% of respondents in a 2021 survey reporting AI has been adopted within at least one business function, up from 50% the previous year. In parallel, AI itself continues to progress rapidly, with ever-larger, ever-faster models that have seemingly insatiable appetites for compute power and training data. For a long time, the focus has been on going bigger at all costs: larger data sets, bigger models, and larger deployments. The constant demand for scale has been met with a combination of the traditional Moore’s Law increases in chip density, the transition to more specialized hardware (like GPUs), and the use of brute force in increasingly large compute clusters.
However, the often-overlooked factor that makes the expansion of AI/ML usage possible is power efficiency. Most computing environments, whether mobile devices, laptops, workstations, or data centers, have long since reached the limits of their power budgets. This means that future improvements in scale and speed must come from doing more with the same amount of energy. This focus on efficiency will lead to a new era of close collaboration among hardware and software designers, as well as a more modular approach to analysis workflows for data scientists.
How We Arrived at This Point
Historically, the high capital costs of servers meant that flexibility and performance across a wide range of tasks were the hallmarks of a solid system that would be useful for many years. However, this flexibility did come with some tradeoffs: because the hardware needed to accommodate a variety of tasks, it couldn’t be tailored for maximum efficiency for particular workloads. This was less of an issue when the models most people ran were on a smaller scale, but in the age of billion-parameter models and trillions of data points, it’s less than ideal.
As a result, we started to see specialization of hardware to specific use cases, such as GPUs and TPUs. This approach does increase efficiency, but only if a suitable ecosystem of software supports the hardware. And with the rapid proliferation of specialized, purpose-built chips and servers, making software and hardware play nicely together has gotten more complicated in recent years. Mobile-focused chip architectures, like ARM, are starting to make significant inroads into laptop, desktop, and server markets; new entrants like the Apple M1 are arriving on the scene; and in 2020 alone, venture capitalists poured $1.8B into U.S. chip startups. This is a product market poised to have more new entrants than ever before; even in the face of inevitable consolidation, it’s clear that demand for specialized hardware won’t slow down any time soon.
So how can data scientists take advantage of all these new choices and innovations without being burdened by compatibility challenges as they try to run various workloads efficiently?
A Closer Relationship Between Software and Hardware Yields Greater Efficiency
As AI/ML tasks continue to be deployed more widely in the industry, we’ll begin to see a strong push towards more power-efficient architectures and co-design of software and hardware to work together more effectively. What does this look like in practice?
As hardware advances, it will be crucial that hardware makers invest in the software development experience, too. For the data science space in particular, this includes outreach to open-source communities to allow for a two-way flow of information: hardware makers can better understand the needs and workflows of practitioners, and software developers can see the direction hardware innovation is headed and create supportive architectures that can be optimized for new and evolving chips. Data science is an interesting use case because the community is so open-source driven, which means there may not always be an obvious point of contact for a hardware vendor to reach out to. But the effort to connect is well worthwhile, both for new chips and servers to be adopted and for open-source projects to continue to be broadly compatible.
A More Modular Approach to Workflows Maximizes Opportunities for Optimizations
The onus for creating an efficient data science experience doesn’t only lie with hardware and software developers. Data scientists, too, need to adjust their working habits in order to reap the benefits of hardware innovation while minimizing overhead. In particular, they should take a more modular approach to their analysis, so as to take advantage of the hardware that most efficiently supports each element of their workflow. This will also allow data scientists to more easily mix and match hardware and software as new options become available over time. A monolithic software approach is conceptually appealing, but will limit data scientists’ ability to benefit from the innovation that’s coming.
To implement a modular workflow, data scientists should ask themselves “What is this tool best suited for?” when considering each hardware platform and software package. Building a mental model of which hardware and software packages work well in specific situations will save data scientists and MLOps folks a lot of time and effort. This is challenging work, to be sure, especially with an ever-increasing number of options as new entrants arrive on the scene. But the efficiency gains can be significant when you find a great hardware/software combination for your specific use case.
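As a rough illustration of what a modular workflow could look like in code, the sketch below treats each step as an independent function behind one narrow interface and records how long each step takes. The stage names and functions are hypothetical and use only the Python standard library; a real workflow would likely plug in library-backed implementations.

```python
import time
from typing import Callable, Dict, List, Tuple

# A stage is any callable that takes data and returns data. Keeping this
# interface narrow is what lets one implementation be swapped for another.
Stage = Callable[[List[float]], List[float]]


def scale_pure_python(values: List[float]) -> List[float]:
    """Hypothetical baseline stage: rescale values to the 0-1 range."""
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero for constant input
    return [(v - low) / span for v in values]


def clip_outliers(values: List[float]) -> List[float]:
    """Hypothetical second stage: clamp values into the 0.1-0.9 band."""
    return [min(max(v, 0.1), 0.9) for v in values]


def run_pipeline(
    stages: List[Tuple[str, Stage]], data: List[float]
) -> Tuple[List[float], Dict[str, float]]:
    """Run stages in order and record wall-clock seconds spent in each."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        data = stage(data)
        timings[name] = time.perf_counter() - start
    return data, timings


pipeline = [("scale", scale_pure_python), ("clip", clip_outliers)]
result, timings = run_pipeline(pipeline, [3.0, 1.0, 2.0, 5.0])
print(result)   # [0.5, 0.1, 0.25, 0.9]
print(timings)  # per-stage seconds; values vary from run to run
```

Because every stage shares the same signature, trying a different implementation for one step (for example, one that wraps an accelerator-backed library) means replacing a single entry in `pipeline` while the other stages stay untouched. The per-stage timings give a first hint about which step is worth that effort.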
Making Modular Workflows More Practical
The biggest factor in making modular analysis workflows feasible and easy to use will be continued integration among the various hardware platforms and software packages. The more that these two, thought of as separate entities for so long, can work in harmony, the easier it will be to mix and match for maximum efficiency. Data scientists can also make the modular approach a bit easier to implement by identifying the most critical parts of their workflow and focusing on matching the best hardware with the best software for those tasks. This is where the biggest gains will come from; there’s not much benefit to speeding up a task that only takes 5% of your overall workflow runtime.
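The 5% point can be made concrete with Amdahl's law, which gives the overall speedup when only one part of a workflow gets faster. The sketch below is a minimal illustration; the function name and the 70% comparison stage are assumptions chosen for the example, not figures from any real workload.

```python
def overall_speedup(fraction: float, stage_speedup: float) -> float:
    """Amdahl's law: whole-workflow speedup when only one stage gets faster.

    fraction      -- share of total runtime spent in the stage (0 to 1)
    stage_speedup -- how many times faster that stage becomes (> 0)
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if stage_speedup <= 0:
        raise ValueError("stage_speedup must be positive")
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


# Hypothetical comparison: a 5% stage versus a 70% stage, each made 10x faster.
small_stage = overall_speedup(0.05, 10.0)
large_stage = overall_speedup(0.70, 10.0)
print(f"10x on a 5% stage:  {small_stage:.2f}x overall")   # 1.05x
print(f"10x on a 70% stage: {large_stage:.2f}x overall")   # 2.70x
```

Even an arbitrarily large speedup on the 5% stage cannot push the whole workflow past 1 / 0.95, or roughly 1.05x, which is why matching hardware to the dominant stages matters most.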
Unfortunately, there’s no silver-bullet hardware solution today that is optimized for all data science workflows, and there probably never will be. But just as they unlocked massive-scale compute power for ordinary users, cloud providers could play an important role here by making it easier for data scientists to move between different specialized chips as they run their workloads. Of course, the success of that approach will depend on the ability of software and hardware designers to keep up with each other and build complementary, compatible systems.
A Bright Future Ahead
It will take work from all parties to realize maximum efficiencies in AI/ML workflows, but advances in the past two years have already shown that it’s possible to create specialized hardware with high performance and low power consumption for these types of tasks. As data scientists begin to focus more on power efficiency, the connection and interplay between hardware and software will become more apparent and important. The challenge will be in bringing a critical mass of software packages to new platforms. The hardware makers and software communities that can work together to meet this challenge stand to gain significant market share among data scientists, who can use their technology to power a new wave of AI/ML innovation.