We are pleased to announce the release of Numba 0.13. This release continues to improve on the 0.12 refactor and contains numerous bug and regression fixes. We’re also excited about the introduction of CUDA support, which has long been available in NumbaPro. Numba 0.13 now has preliminary capabilities for writing CUDA kernels in Python using the cuda.jit and cuda.autojit decorators. NumbaPro will continue to have more advanced CUDA support like vectorize, guvectorize, and CUDA Library bindings.
Get It Now!
If you are using Anaconda, you can install with conda:
conda install numba
Alternatively, you can install with pip:
pip install numba
To use the new CUDA features, you’ll also need to install cudatoolkit through conda:
conda install cudatoolkit
You’ll also need an NVIDIA CUDA-enabled GPU with compute capability 2.0 or above.
Additional tools, CUDA-C/C++ examples, and the developer driver are available with the CUDA toolkit from NVIDIA.
CUDA is a parallel programming model and platform created by NVIDIA for executing many tasks concurrently on the GPU rather than sequentially on the CPU. CUDA support in Numba is still preliminary and will be fleshed out more in the coming weeks and months. For now, Numba provides an entry point to CUDA programming via the cuda.jit and cuda.autojit decorators. The CUDA jit decorators translate Python functions into PTX code, which then executes on CUDA-enabled GPUs. To show the CUDA jit decorators in action, here is the dependable Mandelbrot example:
This example is from the ‘examples’ directory in the Numba GitHub repo, with the normal jit decorators replaced with CUDA jit decorators. For the ‘create_fractal’ function, the cuda.autojit decorator is used, which will attempt to automatically infer the function argument types. Note that the output array is passed in, rather than created and returned, since CUDA kernel functions cannot have return values. For the ‘mandel’ function, the cuda.jit decorator is used, which allows you to specify the argument types yourself. Note the ‘device=True’ flag, which tells Numba that this function is a device function that can only be called from another device function or from a CUDA kernel function. Note also that this function returns a value, which is allowed for device functions. For more information on CUDA support in Numba, visit the Numba 0.13 documentation.
For more information on the CUDA programming model, visit the NVIDIA CUDA-C Programming Guide.
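To make the kernel and device-function split described above concrete, here is a minimal CPU-only sketch in plain Python. It is not the listing from the Numba repo: the decorators appear only as comments so the code runs without a GPU, a list of lists stands in for the NumPy output array, and the function bodies are our assumption of what a typical Mandelbrot example looks like.

```python
# CPU-only sketch; decorators are shown as comments, not applied.

# On the GPU this would carry cuda.jit with device=True and explicit
# argument types. Device functions may return a value.
def mandel(x, y, max_iters):
    """Return the iteration at which the point x + yj escapes,
    or max_iters if it never does."""
    c = complex(x, y)
    z = 0.0j
    for i in range(max_iters):
        z = z * z + c
        if (z.real * z.real + z.imag * z.imag) >= 4:
            return i
    return max_iters


# On the GPU this would carry cuda.autojit, which infers argument types.
# Kernels cannot return values, so results are written into `image`.
def create_fractal(min_x, max_x, min_y, max_y, image, iters):
    height = len(image)
    width = len(image[0])
    pixel_size_x = (max_x - min_x) / width
    pixel_size_y = (max_y - min_y) / height
    for x in range(width):
        real = min_x + x * pixel_size_x
        for y in range(height):
            imag = min_y + y * pixel_size_y
            image[y][x] = mandel(real, imag, iters)


# The caller allocates the output and passes it in.
image = [[0] * 8 for _ in range(4)]
create_fractal(-2.0, 1.0, -1.0, 1.0, image, 20)
```

The caller owns the output buffer, which mirrors the constraint that a CUDA kernel can only communicate results through its arguments, while the helper that plays the device-function role is free to return its result directly.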
On the Horizon…
The 0.12 refactor introduced a few more regressions than we would have liked, but the Numba team is working hard to fix these regressions while continuing to add new features and performance improvements. The new codebase has already proven to be easier to work with and modify, and has resulted in improved import and compile times in most cases. In the coming weeks and months, watch for complete ufunc and math support in nopython mode, deferred array expressions built on a new NumPy-compatible array type, reintroduction of jit classes for building new Numba types in Python, better C++ integration, caching, continued performance improvements, and much more. If there is anything you’d like to see in Numba, or something in Numba that you’re having trouble with, join the discussion on the Numba mailing list or submit a bug report or feature request through GitHub.