Continuing our goal of shipping a new Numba release every month, we are celebrating September with Numba version 0.14! Along with a number of bug fixes, it brings support for many new features, including:
- Support for nearly all of the NumPy math functions (including comparison, logical, bitwise and some previously missing float functions) in nopython mode.
- The NumPy datetime64 and timedelta64 dtypes are supported in nopython mode with NumPy 1.7 and later.
- Support for NumPy math functions on complex numbers in nopython mode.
- ndarray.sum() is supported in nopython mode.
- Improved warnings and error messages.
- Support for NumPy record arrays on the GPU.
See the full changelog for more details.
Along with the new release, we will soon be publishing a couple of blog posts discussing where Numba is today, offering tips on how best to use it, and showing worked examples of how it speeds up particular algorithms in Python.
Where is Numba today?
As a Python compiler, Numba is part of a robust and diverse ecosystem of tools like Cython, PyPy, ShedSkin, Nuitka, Parakeet, Pythran, and Pyston, among others. Numba tackles the challenge of improving Python performance with a unique combination of design choices and features:
- It is a JIT compiler for Python functions using the LLVM compiler library to produce machine code at runtime.
- It does not require an external compiler toolchain, like Visual Studio or gcc, to be available during code generation. (A compiler is required to compile the Numba package from source code.)
- It works within the standard CPython runtime. Numba is not a replacement Python interpreter.
- It does not compile your entire program, only the functions you indicate.
- It interoperates well with the scientific Python stack, including NumPy arrays and direct calls to C functions.
- Thanks to the broad industry support for LLVM, Numba can generate code for the CPU or GPU. (Calling Numba-generated GPU code requires a slightly different API.)
- It supports Python 2.6, 2.7, 3.3 and 3.4 on Linux 32/64-bit, Windows 32/64-bit and OS X 64-bit operating systems.
Today, Numba provides a complete compiler chain from Python bytecode to machine code, including: a type inference engine, a translator from the stack machine model used by the Python interpreter to a register machine model, and an interface to generate and compile LLVM IR to machine code.
We have implemented machine-native versions of a variety of commonly used data types and control structures in Python, focusing on the use cases found most frequently in scientific computing: manipulation of NumPy arrays and math-heavy algorithms. We also open sourced our compiler target for CUDA earlier this year, allowing CUDA kernels to be written directly in Python and compiled for execution on NVIDIA GPUs.
The Future of Numba
We want to see Numba become the standard way to embed generic compilation for CPUs and GPUs inside Python programs. To achieve this goal, we are pushing Numba forward on three fronts.
Improve the User Experience
The first and foremost priority for the Numba team is to expand the range of Python code that we can accelerate. Growing beyond our current core use cases, we want to add support for additional types, such as bytes (str in Python 2), bytearray, Decimal, and others. Better support for control structures such as iterators, exceptions and recursion will also help Numba understand more idiomatic Python styles.
We also want to make it easier to debug Numba-compiled code. Along with better compiler warning messages and feedback, new features in LLVM will eventually allow us to embed debug information into compiled functions.
Better Support for High Level Interfaces
Building on a solid compiler base, our second priority is supporting high level APIs for computation. Numba already supports creation of custom NumPy ufuncs and CUDA kernels, but we want to do more. We are currently evaluating options for optimizing and compiling array expressions, but also want to support other parallel coding styles, such as the “kernel”-based approach popularized by CUDA and OpenCL.
Our goal is to enable concise code to also be fast code, and to promote abstractions that allow Numba users to more easily switch targets from the CPU to the GPU without massive code rewrites.
Evolve the Compiler Internals
Our third priority is to continue to improve the compiler technology and the performance of our generated code, taking advantage of new developments in the LLVM ecosystem. Future versions of Numba will include the ability to inline user functions, greater use of CPU vector instructions in more situations, and improved compilation speeds. We also would like to add support for other LLVM-supported architectures, such as ARM, AMD GPUs and the Intel Xeon Phi.
As the Numba internals mature, we want to document a standard API for extending Numba to new types and compilation targets so that third party developers can customize Numba for their application needs without having to modify the Numba source code itself.
Numba has grown rapidly as a Python JIT compiler, supporting a wide range of scientific computing use cases on a variety of hardware platforms and operating systems. However, this is only the beginning for Numba. We have plans to continue expanding both its capabilities and its performance. The next year will be exciting to watch!
Next week, we’ll give some tips for how to use Numba in your projects and how to maximize your benefit from it.