Pivotal Knowledge Base

How To Install Python 3.x for Spark

Environment

 Product      Version
 Pivotal HDP  2.3/2.4
 Spark        1.2.2
 Python       2.x, 3.x

Purpose

Spark on HDP is delivered with Python 2.x. To use Python 3.x instead, follow the steps in this article.

Procedure

1. Install zlib-devel, openssl-devel, tcl, tcl-devel, tcllib, tk and tk-devel:

yum install zlib-devel 
yum install openssl-devel
yum install tcl tcl-devel tcllib tk tk-devel
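
Before building, it may help to confirm that the packages from step 1 are actually present. The following is a minimal sketch, assuming an RPM-based system where rpm -q is available; missing_packages is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: print each package that rpm does not report as installed.
missing_packages() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}

# Packages from step 1; no output means all of them are installed.
missing_packages zlib-devel openssl-devel tcl tcl-devel tcllib tk tk-devel
```

Any package name printed by the last command still needs to be installed with yum.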

2. Download the latest release of Python 3.x, for example https://www.python.org/ftp/python/3.5.2/Python-3.5.2.tar.xz.

3. Extract the Python tar file:

tar xf Python-3.5.2.tar.xz

4. Change into the extracted directory, build and install Python, then link the new interpreter into /usr/bin:

cd Python-3.5.2
./configure --prefix=/opt/python3
make
make install
ln -s /opt/python3/bin/python3 /usr/bin/python3
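
Once the build from step 4 is installed, a quick sanity check can confirm that the interpreter runs and that the modules depending on the step 1 packages were compiled. This is a sketch under assumptions: check_python3 is a hypothetical helper, the default path matches the --prefix used above, and the module list (zlib, ssl, tkinter) is a guess at what those packages are meant to enable.

```shell
# Hypothetical helper: return 0 if the given interpreter is Python 3 and can
# import the modules that rely on the -devel packages from step 1.
check_python3() {
    py="$1"
    # Return 1 if the interpreter is missing or is not Python 3.
    "$py" -c 'import sys; sys.exit(0 if sys.version_info[0] == 3 else 1)' || return 1
    # Return 2 if an optional module was not built (assumed module list).
    "$py" -c 'import zlib, ssl, tkinter' || return 2
    return 0
}

# Default matches the --prefix from step 4; override by setting PY.
PY="${PY:-/opt/python3/bin/python3}"
if check_python3 "$PY"; then
    echo "$PY looks usable"
else
    echo "$PY failed the check"
fi
```

If the second import fails, the usual cause is that a -devel package was missing when configure ran, in which case installing it and rebuilding should help.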

5. In Ambari, add or modify the following lines in Ambari / Configs / Advanced / spark-env.sh (HDP_VERSION should match the HDP build installed on your cluster):

export HDP_VERSION=2.4.2.0-258 
export PYTHONHOME=/opt/python3
export PYSPARK_PYTHON=python3
export PYTHONPATH=/opt/python3/lib/python3.5 
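
Because PYSPARK_PYTHON is set to the bare name python3, Spark has to find that name on the PATH, which is what the /usr/bin/python3 symlink from step 4 provides. The sketch below checks that resolution from a shell where the spark-env.sh variables are exported; check_pyspark_python is a hypothetical helper, and running it on each node that launches PySpark processes is an assumption about how the cluster is used.

```shell
# Hypothetical helper: print the full path that PYSPARK_PYTHON resolves to.
# Returns 1 if the variable is unset, 2 if the name is not on the PATH.
check_pyspark_python() {
    if [ -z "${PYSPARK_PYTHON:-}" ]; then
        echo "PYSPARK_PYTHON is not set" >&2
        return 1
    fi
    if ! command -v "$PYSPARK_PYTHON" >/dev/null 2>&1; then
        echo "$PYSPARK_PYTHON not found on PATH" >&2
        return 2
    fi
    command -v "$PYSPARK_PYTHON"
}

check_pyspark_python || echo "Check the spark-env.sh settings from step 5"
```

With the settings above, the expected result is a path such as /usr/bin/python3; a different path suggests another python3 appears earlier on the PATH.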
