Install the Anaconda parcel
The following procedure describes how to install the Anaconda parcel on a CDH cluster using Cloudera Manager. The Anaconda parcel provides a static installation of Anaconda, based on Python 2.7, that can be used with Python and PySpark jobs on the cluster.

- In the Cloudera Manager Admin Console, in the top navigation bar, click the Parcels icon.
- At the top right of the parcels page, click the Edit Settings button.
- In the Remote Parcel Repository URLs section, click the plus symbol, and then add the following URL for the Anaconda parcel: https://repo.anaconda.com/pkgs/misc/parcels/
- At the top of the page, click the Save Changes button.
- In the top navigation bar, click the Parcels icon to return to the list of available parcels. The latest available version of the Anaconda parcel should appear in the list.
- To the right of the Anaconda parcel listing, click the Download button.
- After the parcel is downloaded, click the Distribute button to distribute the parcel to all of the cluster nodes.
- After the parcel is distributed, click the Activate button to activate the parcel on all of the cluster nodes.
- When prompted, confirm the activation.
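
Once the steps above are complete, a quick check on a cluster node can confirm that the parcel is in place. The following is a minimal sketch, not part of the official procedure. It assumes that the activated parcel is linked at /opt/cloudera/parcels/Anaconda, which is the usual default parcel directory but may differ on your cluster. The function name check_anaconda_parcel is hypothetical.

```shell
# Assumption: default location of the activated Anaconda parcel.
# Adjust this path if your cluster uses a different parcel directory.
PARCEL_DIR="/opt/cloudera/parcels/Anaconda"

# Hypothetical helper: succeeds if the given directory contains an
# executable bin/python, and prints that interpreter's version.
check_anaconda_parcel() {
    if [ -x "$1/bin/python" ]; then
        echo "Anaconda parcel found: $1"
        "$1/bin/python" --version
    else
        echo "Anaconda parcel not found at $1" >&2
        return 1
    fi
}
```

To use it, run `check_anaconda_parcel "$PARCEL_DIR"` on any node where the parcel was distributed. A nonzero exit status suggests that the parcel is not activated on that node or is located elsewhere.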

After the parcel is activated, it can be used with Python and PySpark jobs on the cluster. For PySpark jobs, set the PYSPARK_PYTHON environment variable so that it refers to the location of Anaconda.
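
As an illustration of setting PYSPARK_PYTHON, the sketch below wraps spark-submit in a small shell function. It assumes that the Anaconda interpreter is at /opt/cloudera/parcels/Anaconda/bin/python (the usual default parcel location, which may differ on your cluster) and that spark-submit is on the PATH. The function name submit_with_anaconda and the script name pyspark_script.py are hypothetical.

```shell
# Assumption: Python interpreter inside the activated Anaconda parcel.
ANACONDA_PYTHON="/opt/cloudera/parcels/Anaconda/bin/python"

# Hypothetical helper: runs spark-submit with PYSPARK_PYTHON set only
# for that one command, so the rest of the shell session is unchanged.
submit_with_anaconda() {
    PYSPARK_PYTHON="$ANACONDA_PYTHON" spark-submit "$@"
}
```

For example, `submit_with_anaconda pyspark_script.py` submits the script with Anaconda as the PySpark interpreter. The equivalent single command is `PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python spark-submit pyspark_script.py`, entered all on one line.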

The repository URL shown above provides the most recent version of the
Anaconda parcel. To install an older version of the Anaconda parcel, add
https://repo.anaconda.com/pkgs/misc/parcels/archive/ to the Remote Parcel Repository URLs in Cloudera
Manager, and then follow the steps above with your desired version of the Anaconda
parcel.

