Utilizing the New Compilers in Anaconda Distribution 5

Part of what made the recent release of Anaconda Distribution 5 so exciting was our switch from OS-provided compiler tools to our own Anaconda toolsets. This change has allowed us to make major steps forward in the capabilities of our compilers, specifically regarding security and performance. In this post, we’ll show you how to use these tools so that you can reap the benefits of utilizing our new compilers.

Compiler packages

Compilers have always been something that you installed using your system tools, for example by running yum install gcc or by installing Xcode. Now, however, Anaconda Distribution 5 offers conda packages for compilers on Linux and macOS. Compared with the compiler packages that were available earlier (gcc on both macOS and Linux), the new packages are split up into separate compilers:

Linux:

  • gcc_linux-64
  • gxx_linux-64
  • gfortran_linux-64

macOS:

  • clang_osx-64
  • clangxx_osx-64
  • gfortran_osx-64

Note that all of these package names end in a platform identifier. This is because all compiler packages are specific to the platform on which they run, as well as the platform for which they output code.
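
The naming convention can be illustrated with a short Python sketch. The helper name split_package_name is hypothetical and exists only for this example; it splits a package name at the first underscore into the compiler part and the platform identifier.

```python
# Package names as listed in this post.
COMPILER_PACKAGES = [
    "gcc_linux-64",
    "gxx_linux-64",
    "gfortran_linux-64",
    "clang_osx-64",
    "clangxx_osx-64",
    "gfortran_osx-64",
]


def split_package_name(package):
    """Split a name such as 'gcc_linux-64' into (compiler, platform).

    Hypothetical helper for illustration; not part of conda.
    """
    compiler, sep, platform = package.partition("_")
    if not sep:
        raise ValueError(f"no platform identifier in {package!r}")
    return compiler, platform


for name in COMPILER_PACKAGES:
    print(split_package_name(name))
```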

Using new compiler packages

The new compiler packages can be installed with conda, but they’re a bit tricky to use. Because they are designed with (pseudo)cross-compiling in mind, all of the executables in a compiler package are “prefixed”: instead of gcc, you have something like x86_64-conda_cos6-linux-gnu-gcc. This full compiler name shows up in the build logs, which makes it clear what platform you’re building code for and makes it hard to accidentally use the wrong compiler.

The downside is that make, cmake, and other such programs cannot find a plain gcc on their own, so the compiler packages set many environment variables that tell these build tools what to use. The variables are set by conda activate.d scripts, installed with the compiler packages in CONDA_PREFIX/etc/conda/activate.d, so it is essential that you activate any environment in which you want to use the compilers. Conda-build performs this activation for you, so no additional effort is necessary there. As a side note, you can activate the root environment by typing source activate root.
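
As an illustration of how a build script might pick up these settings, here is a minimal Python sketch. It assumes that the activation scripts export the conventional CC variable (and CXX for C++), which this post does not spell out. The function name find_compiler and the gcc fallback are hypothetical choices made for the example.

```python
import os
import shutil


def find_compiler(var="CC", fallback="gcc", env=None):
    """Return (name, path) for the compiler named by an environment variable.

    Assumption: an activated environment exports `var` (for example CC)
    with the prefixed compiler name. If the variable is unset or empty,
    `fallback` is used instead. `path` is None when the executable is not
    found on PATH.
    """
    env = os.environ if env is None else env
    name = env.get(var) or fallback
    path = shutil.which(name, path=env.get("PATH"))
    return name, path


if __name__ == "__main__":
    for variable, default in (("CC", "gcc"), ("CXX", "g++")):
        name, path = find_compiler(variable, default)
        print(f"{variable}: {name} -> {path or 'not found on PATH'}")
```

Run inside an activated environment, this should report the prefixed compiler names; run outside one, it falls back to whatever the system provides.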

macOS SDK

There is one additional step for using the Mac compilers. You must have the macOS 10.9 SDK available. Unfortunately, we cannot bundle this as part of our package due to licensing constraints. We know of two current ways of obtaining the macOS 10.9 SDK:

  1. https://github.com/devernay/xcodelegacy
  2. https://github.com/phracker/MacOSX-SDKs

We generally install this SDK to /opt/MacOSX10.9.sdk. Wherever you install it, you need an entry in whatever conda_build_config.yaml file you’ll be using. For example:


    CONDA_BUILD_SYSROOT:
      - /opt/MacOSX10.9.sdk  # [osx]

At Anaconda, we have this in a centralized conda_build_config.yaml at the root of our recipe repo. Since this is where we issue build commands from, it is used for all recipes. More on the conda_build_config.yaml search order can be found at: https://conda.io/docs/user-guide/tasks/build-packages/variants.html#creating-conda-build-variant-config-files.

Backward compatibility

As a compatibility shim for people who want to use our latest Anaconda Distribution packages, but aren’t ready to switch to the new compilers, we needed to build some extra complexity into the Python packages. The latest Python package builds now have a default _sysconfigdata file that keeps the default compilers as the ones (gcc, g++, etc.) provided by the host system. This hopefully keeps legacy recipes working.

Python packages also include an alternative _sysconfigdata file that refers to the new compilers, which match the compilers used to compile Python itself. The compiler packages set an environment variable (_PYTHON_SYSCONFIGDATA_NAME) that tells Python which _sysconfigdata file to use. This variable is set at activation time through the activate hooks described previously. Again, this new _sysconfigdata customization scheme is added in recent versions of the Python package: you should be aware that if you’re not using conda-build to play with the new compilers, you may need to update to a newer Python package so that you have the correct _sysconfigdata files.
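
To check which configuration a given interpreter is using, a small Python sketch like the following may help. It reads the _PYTHON_SYSCONFIGDATA_NAME variable mentioned above and asks the standard sysconfig module for the compilers recorded in the active _sysconfigdata file. The function name describe_sysconfig and the "<default>" placeholder are hypothetical.

```python
import os
import sysconfig


def describe_sysconfig(env=None):
    """Report the sysconfigdata selector and the compilers Python recorded.

    The CC and CXX values come from the running interpreter's build-time
    configuration, so they reflect the environment the interpreter was
    started in. They may be None on platforms without a Makefile-based
    build, such as Windows.
    """
    env = os.environ if env is None else env
    return {
        # "<default>" is a placeholder meaning the variable is not set.
        "sysconfigdata_name": env.get("_PYTHON_SYSCONFIGDATA_NAME", "<default>"),
        "CC": sysconfig.get_config_var("CC"),
        "CXX": sysconfig.get_config_var("CXX"),
    }


if __name__ == "__main__":
    for key, value in describe_sysconfig().items():
        print(f"{key}: {value}")
```

If the reported CC is a plain gcc even though a compiler package is installed, the environment has probably not been activated, or the Python package predates the new _sysconfigdata scheme.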

New compilers and conda-build 3

These new compiler packages and conda-build 3 were designed to work together. Conda-build 3 defines a special jinja2 function, compiler(), to make it easy to dynamically specify compiler packages on many platforms. The compiler function takes at least one argument, the language of the compiler to use:


    requirements:
      build:
        - {{ compiler('c') }}
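
To make the idea concrete, the following Python sketch models the kind of lookup that compiler() performs. It is not conda-build's implementation: the table uses only the package names listed in this post, the pairing of language names ('c', 'cxx', 'fortran') with those packages is an assumption, and the function name resolve_compiler is hypothetical.

```python
# Assumed pairing of language names with the package prefixes from this post.
DEFAULT_COMPILERS = {
    "linux-64": {"c": "gcc", "cxx": "gxx", "fortran": "gfortran"},
    "osx-64": {"c": "clang", "cxx": "clangxx", "fortran": "gfortran"},
}


def resolve_compiler(language, platform):
    """Return the compiler package name for a language on a platform.

    Rough model only: look up the package prefix, then append the
    platform identifier, as in 'gcc' + '_' + 'linux-64'.
    """
    try:
        prefix = DEFAULT_COMPILERS[platform][language]
    except KeyError:
        raise ValueError(
            f"no compiler known for {language!r} on {platform!r}"
        ) from None
    return f"{prefix}_{platform}"


if __name__ == "__main__":
    print(resolve_compiler("c", "linux-64"))
    print(resolve_compiler("cxx", "osx-64"))
```

The point of the indirection is that the recipe names only the language, and the platform-specific package is chosen at render time.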
 

To write recipes that are “cross-capable”—recipes that can be used to produce packages for a different platform than the one running conda-build—you may also need to use the new “host” section in the requirements section:


    requirements:
      build:
        - {{ compiler('c') }}
      host:
        - zlib
 

As a general rule of thumb, compilers and other build tools should go in the build section, and everything else (shared libraries, python, python libraries) should go in the host section.

But wait, there’s more! How to customize compilers

To keep this post a manageable length, we’ve omitted information about how to customize compilers to your liking. You can find more information on our docs page, https://conda.io/docs/user-guide/tasks/build-packages/compiler-tools.html#customizing-the-compilers
