Building TensorFlow Transform

aborkar-ibm edited this page Dec 4, 2020 · 23 revisions

These instructions describe how to build TensorFlow Transform version 0.22.0 on Linux on IBM Z for the following distributions:

  • Ubuntu (18.04, 20.04)

General Notes:

  • Use a standard (non-root) user for the steps below unless otherwise specified.
  • These instructions refer to a directory /<source_root>/. This is a temporary writable directory that you can place anywhere you like.

Step 1: Build and Install TensorFlow Transform v0.22.0

1.1) Build using script

To build TensorFlow Transform manually instead, go to Step 1.2.

Use the following commands to build TensorFlow Transform with the build script. Make sure wget is installed first.

wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/TensorflowTransform/0.22.0/build_tensorflow_transform.sh

# Build TensorFlow Transform (add the -t option to also run the tests)
bash build_tensorflow_transform.sh

If the build completes successfully, go to Step 2. If it fails, check the logs for details or go to Step 1.2 to follow the manual build steps.

1.2) Install the dependencies

export SOURCE_ROOT=/<source_root>/
  • Ubuntu (18.04)
 sudo apt-get update
 sudo apt-get install -y build-essential libffi-dev libjemalloc-dev libboost-dev libboost-filesystem-dev libboost-system-dev libboost-regex-dev autoconf flex bison
  • Ubuntu (20.04)
 sudo apt-get update
 sudo apt-get install -y build-essential cmake libffi-dev libjemalloc-dev libboost-dev libboost-filesystem-dev libboost-system-dev libboost-regex-dev autoconf flex bison
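
After installing the packages above, it can help to confirm that the command-line build tools are on the PATH before starting the longer builds. The following is a minimal sketch, not part of the official instructions. The tool list is an assumption derived from the apt packages named above (build-essential is assumed to provide gcc, g++ and make), and the -dev library packages are not covered because they do not install executables. On Ubuntu 20.04, "cmake" could be appended to the list.

```python
import shutil

# Executables assumed to come from the apt packages listed above.
BUILD_TOOLS = ["gcc", "g++", "make", "autoconf", "flex", "bison"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools(BUILD_TOOLS)
print("Missing build tools:", ", ".join(missing) if missing else "none")
```

If anything is reported as missing, rerunning the apt-get install command for your distribution is the likely fix.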

1.3) Build and Install TensorFlow

  • Instructions for building TensorFlow can be found here.

1.4) Build and Install Apache Arrow 0.16.0

  • Build CMake 3.16.3 (for 18.04 only)
 cd $SOURCE_ROOT
 wget https://cmake.org/files/v3.16/cmake-3.16.3.tar.gz
 tar -xzf cmake-3.16.3.tar.gz
 cd cmake-3.16.3
 ./bootstrap --prefix=/usr
 make
 sudo make install
  • Download source code
 cd $SOURCE_ROOT
 git clone https://github.com/apache/arrow.git
 cd arrow
 git checkout apache-arrow-0.16.0
  • Build and install Arrow C++ library
 cd $SOURCE_ROOT/arrow/cpp
 mkdir release
 cd release
 export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
 cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
    -DCMAKE_INSTALL_LIBDIR=lib \
    -DARROW_PARQUET=ON \
    -DARROW_PYTHON=ON \
    -DCMAKE_BUILD_TYPE=Release \
    ..
 make -j4
 sudo make install
  • Install or update the required Python packages
 # Ubuntu 18.04 only: remove enum34
 sudo pip3 uninstall -y enum34
 sudo pip3 install 'avro-python3==1.9.1' 'setuptools>=41.0.0' 'Cython>=0.29' 'httplib2<0.18.0,>=0.8' 'tensorflow-serving-api==2.2.0'
  • Build and install pyarrow library
 cd $SOURCE_ROOT/arrow/python
 export ARROW_BUILD_TYPE='release' && export PYARROW_WITH_PARQUET=1
 python3 setup.py build_ext --build-type=$ARROW_BUILD_TYPE --bundle-arrow-cpp bdist_wheel
 sudo pip3 install dist/*.whl
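
Because the wheel above is built with PYARROW_WITH_PARQUET=1, a quick way to check the result is to round-trip a small table through a Parquet file. The following is a minimal sketch under that assumption. The function name and the sample data are illustrative only, and the function reports "unavailable" rather than raising if pyarrow cannot be imported.

```python
import os
import tempfile


def parquet_round_trip():
    """Write a tiny table to Parquet and read it back.

    Returns "ok", "mismatch", or a string starting with "unavailable".
    """
    try:
        import pyarrow as pa
        import pyarrow.parquet as pq
    except ImportError as exc:
        return "unavailable: {}".format(exc)

    data = {"id": [1, 2, 3], "name": ["a", "b", "c"]}
    table = pa.Table.from_pydict(data)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "check.parquet")
        pq.write_table(table, path)
        restored = pq.read_table(path)
    # Compare plain Python values so schema metadata does not matter.
    return "ok" if restored.to_pydict() == data else "mismatch"


print("pyarrow Parquet check:", parquet_round_trip())
```

A result other than "ok" suggests that the Arrow C++ library or the pyarrow wheel was built without Parquet support or is not on the library path.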

1.5) Build and Install tfx-bsl 0.22.1

  • Download source code
 cd $SOURCE_ROOT
 git clone https://github.com/tensorflow/tfx-bsl.git
 cd tfx-bsl
 git checkout v0.22.1
  • Build and install
 ./configure.sh
 bazel run -c opt tfx_bsl:build_pip_package
 sudo pip3 install dist/*.whl

1.6) Install TensorFlow Transform from binary

  sudo pip3 install tensorflow-transform==0.22.0

1.7) Install TensorFlow Transform from source (optional)

TensorFlow Transform can also be built and installed from source. This step is optional for normal use but required if you intend to run the test cases in Step 3.

  • Download source code
 cd $SOURCE_ROOT
 git clone https://github.com/tensorflow/transform.git
 cd transform
 git checkout v0.22.0
  • Apply the following patch
 export PATCH_URL="https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/TensorflowTransform/0.22.0/patch/tft.patch"
 curl -o tft.patch $PATCH_URL
 git apply --ignore-whitespace tft.patch
  • Build and install
 sudo python3 setup.py install

Note: If a particular version of a Python package is required during installation, run sudo pip3 install '<package-name>==<version>' to install it.
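
To see which pinned packages might need the note above, the installed versions can be compared against the versions named in these instructions. The following is a minimal sketch. The EXPECTED mapping is an assumption collected from the exact pins and tags used in Step 1, and a source-built wheel may report a slightly different version string than its tag. Range requirements such as 'setuptools>=41.0.0' are left out on purpose.

```python
def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        from importlib import metadata  # Python 3.8 and later
    except ImportError:
        import pkg_resources  # fallback for Python 3.6 and 3.7
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


# Exact versions named in Step 1 (assumed to match what pip reports).
EXPECTED = {
    "pyarrow": "0.16.0",
    "tfx-bsl": "0.22.1",
    "tensorflow-transform": "0.22.0",
    "avro-python3": "1.9.1",
    "tensorflow-serving-api": "2.2.0",
}


def report(expected, lookup=installed_version):
    """Return (name, wanted, found, status) tuples for each package."""
    rows = []
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            status = "missing"
        elif found == wanted:
            status = "ok"
        else:
            status = "mismatch"
        rows.append((name, wanted, found, status))
    return rows


for name, wanted, found, status in report(EXPECTED):
    print("{:<24} wanted {:<8} found {:<10} {}".format(name, wanted, str(found), status))
```

Any row marked "missing" or "mismatch" is a candidate for the sudo pip3 install '<package-name>==<version>' command described in the note.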

Step 2: Verify TensorFlow Transform (Optional)

  • Run TensorFlow Transform from the command line

     $ cd $SOURCE_ROOT
     $ /usr/bin/python3
      >>> import tensorflow as tf
      >>> import tensorflow_transform as tft
      >>> tft.version.__version__
      '0.22.0'
      >>>
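
The interactive check above can also be run as a script, which is convenient after an unattended build. The following is a minimal sketch. The helper name check_module_version is hypothetical, and the attribute path version.__version__ is taken from the session shown above.

```python
import importlib


def check_module_version(module_name, expected, attr_path="__version__"):
    """Import a module and compare a dotted version attribute to `expected`.

    Returns a (success, detail) tuple instead of raising.
    """
    try:
        obj = importlib.import_module(module_name)
    except ImportError as exc:
        return False, "import failed: {}".format(exc)
    for part in attr_path.split("."):
        obj = getattr(obj, part, None)
        if obj is None:
            return False, "attribute {!r} not found".format(attr_path)
    if obj != expected:
        return False, "found {!r}, expected {!r}".format(obj, expected)
    return True, "found {!r}".format(obj)


ok, detail = check_module_version(
    "tensorflow_transform", "0.22.0", attr_path="version.__version__"
)
print("tensorflow_transform:", "OK" if ok else "FAILED", "-", detail)
```

Saved to a file and run with /usr/bin/python3, this prints a single status line for the same version attribute that the interactive session inspects.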

Step 3: Execute Test Suite (Optional)

  • Run the complete test suite from the TensorFlow Transform source directory (see Step 1.7)

    cd $SOURCE_ROOT/transform
    python3 -m unittest discover -v -p '*_test.py'

All tests should pass successfully.

References:

  • https://www.tensorflow.org/tfx/transform/api_docs/python/tft
  • https://github.com/tensorflow/transform