Skip to content

Building TensorFlow

aborkar-ibm edited this page Sep 6, 2022 · 50 revisions

Building TensorFlow

The instructions provided below specify the steps to build TensorFlow version 2.9.1 on Linux on IBM Z for the following distributions:

  • Ubuntu (18.04, 20.04, 22.04)

General Notes:

  • When following the steps below please use a standard permission user unless otherwise specified.
  • A directory /<source_root>/ will be referred to in these instructions, this is a temporary writable directory anywhere you'd like to place it.

Step 1: Build and Install TensorFlow v2.9.1

1.1) Build using script

If you want to build TensorFlow using manual steps, go to STEP 1.2.

Use the following commands to build TensorFlow using the build script. Please make sure you have wget installed.

wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.9.1/build_tensorflow.sh

# Build Tensorflow
bash build_tensorflow.sh    [Provide -t option for executing build with tests]

If the build completes successfully, go to STEP 2. In case of error, check logs for more details or go to STEP 1.2 to follow manual build steps.

1.2) Install the dependencies

export SOURCE_ROOT=/<source_root>/
  • Ubuntu 18.04

    sudo apt-get update
    sudo apt-get install wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran libopenblas-dev curl -y
    wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Python3/3.10.4/build_python3.sh
    bash build_python3.sh -y
    sudo update-alternatives --install /usr/local/bin/python python /usr/local/bin/python3 40
    sudo ldconfig
    sudo pip3 install --upgrade pip
    sudo pip3 install --no-cache-dir numpy==1.22.3 wheel scipy portpicker protobuf==3.13.0 packaging
    sudo pip3 install keras_preprocessing --no-deps
  • Ubuntu 20.04

    sudo apt-get update
    sudo apt-get install wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran curl -y
    sudo ldconfig
    sudo pip3 install --upgrade pip
    sudo pip3 install --no-cache-dir numpy==1.22.3 wheel scipy==1.6.3 portpicker protobuf==3.13.0 packaging
    sudo pip3 install keras_preprocessing --no-deps
  • Ubuntu 22.04

    sudo apt-get update
    sudo apt-get install wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran gcc-9* g++-9* curl -y
    sudo ldconfig
    sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 60 --slave /usr/bin/g++ g++ /usr/bin/g++-9
    sudo update-alternatives --auto gcc
    sudo pip3 install --upgrade pip
    sudo pip3 install --no-cache-dir numpy==1.22.3 wheel scipy==1.7.2 portpicker protobuf==3.13.0 packaging
    sudo pip3 install keras_preprocessing --no-deps
  • Ensure /usr/bin/python points to Python3 to build TensorFlow in a Python3 environment (On Ubuntu (20.04, 22.04))

    sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 40
  • Install grpcio

    export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True
    sudo -E pip3 install grpcio
  • Build Bazel v5.1.1

    wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Bazel/5.1.1/build_bazel.sh
    bash build_bazel.sh -y

    Note: Bazel community does not support Ubuntu 20.04 and Ubuntu 22.04 officially but build_bazel.sh script can still be used to build Bazel on Ubuntu 20.04 and Ubuntu 22.04.

1.3) Build TensorFlow

  • Download source code

    cd $SOURCE_ROOT
    git clone https://github.com/tensorflow/tensorflow
    cd tensorflow
    git checkout v2.9.1
    curl -o build_patch.diff https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.9.1/patch/build_patch.diff
    git apply --ignore-whitespace build_patch.diff
    
    sed -i 's/float_t/float/g' tensorflow/stream_executor/tpu/c_api_decl.h  # Only on Ubuntu (18.04, 20.04)
  • Configure

      yes "" | ./configure || true
  • Build TensorFlow

      bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" build  --define=tensorflow_mkldnn_contraction_kernel=0 --define tflite_with_xnnpack=false //tensorflow/tools/pip_package:build_pip_package

    Note:
    1. TensorFlow build is resource intensive operation. If build continues to fail try increasing the swap space and reduce the number of concurrent jobs by specifying --jobs=n in the build command above, where n is the number of concurrent jobs.

    2. Building TensorFlow from source can use a lot of RAM. If your system is memory-constrained, limit Bazel's RAM usage with: --local_ram_resources=2048.

1.4) Build tensorflow_io_gcs_filesystem and libclang wheel

  • Build tensorflow_io_gcs_filesystem wheel
    sudo pip install --upgrade pip
    cd $SOURCE_ROOT
    git clone https://github.com/tensorflow/io.git
    cd io/
    git checkout v0.23.1
    python3 setup.py -q bdist_wheel --project tensorflow_io_gcs_filesystem          
    cd dist
    sudo pip3 install ./tensorflow_io_gcs_filesystem-0.23.1-cp*-cp*-linux_s390x.whl
  • Build libclang wheel
    cd $SOURCE_ROOT
    sudo apt-get install -y cmake
    git clone https://github.com/llvm/llvm-project
    cd llvm-project
    git checkout llvmorg-9.0.1
    mkdir -p build
    cd build
    cmake ../llvm -DLLVM_ENABLE_PROJECTS=clang -DBUILD_SHARED_LIBS=OFF -DLLVM_ENABLE_TERMINFO=OFF -DLLVM_TARGETS_TO_BUILD=SystemZ -DCMAKE_BUILD_TYPE=MinSizeRel -DCMAKE_CXX_FLAGS_MINSIZEREL="-Os -DNDEBUG -static-libgcc -static-libstdc++ -s"
    make libclang -j$(nproc)
    cd $SOURCE_ROOT
    git clone https://github.com/sighingnow/libclang.git
    cd libclang
    cp $SOURCE_ROOT/llvm-project/build/lib/libclang.so.9 native/
    cp $SOURCE_ROOT/llvm-project/build/lib/libclang.so native/
    python3 setup.py -q bdist_wheel 
    cd dist
    sudo pip3 install libclang-*-py2.py3-none-any.whl

1.5) Build and Install TensorFlow wheel

cd $SOURCE_ROOT/tensorflow
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_wheel

sudo ln -s /usr/include/locale.h /usr/include/xlocale.h (On Ubuntu 18.04 only)

sudo pip3 install /tmp/tensorflow_wheel/tensorflow-2.9.1-cp*-linux_s390x.whl

Step 2: Verify TensorFlow (Optional)

  • Run TensorFlow from command Line and check the installed version
     $ cd $SOURCE_ROOT
     $ python -c "import tensorflow as tf; print(tf.__version__)"
       2.9.1
    
     $ /usr/bin/python3
      >>> import tensorflow as tf
      >>> tf.add(1, 2).numpy()
      3
      >>> hello = tf.constant('Hello, TensorFlow!')
      >>> hello.numpy()
      b'Hello, TensorFlow!'
      >>>

Step 3: Execute Test Suite (Optional)

  • Run complete testsuite

    cd $SOURCE_ROOT/tensorflow
    # Apply minor BE specific fixes to some tests
    curl -o test_patch.diff https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.9.1/patch/test_patch.diff
    git apply --ignore-whitespace test_patch.diff
    
    bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test --test_tag_filters=-gpu,-tpu,-benchmark-test,-v1only,-no_oss,-oss_serial -k --test_timeout 300,450,1200,3600 --build_tests_only --test_output=errors --define=tensorflow_mkldnn_contraction_kernel=0 --define tflite_with_xnnpack=false  -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/lite/... -//tensorflow/core/platform/cloud/...

    Note: //tensorflow/core/platform/cloud skipped due to BoringSSL, refer #14039 for details.

  • Run individual test

    bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test //tensorflow/<module_name>:<testcase_name>

    For example,

    bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test //tensorflow/python/kernel_tests:topk_op_test

    Note:
    1. There are some known limitations in TensorFlow Lite such as reading FlatBuffers tflite model file or quantization process on s390x and related tests would fail, Refer this for more details.

    2. Few test cases from //tensorflow/java/... fail on s390x and x86, refer to this for more details.

    3. Few tests from core, python and compiler module fail due to lack of certain CPU operations in SystemZ LLVM backend. They can pass by adding --copt=-O -c opt to bazel test command. Additionally few tests from //tensorflow/core/kernels/mlir_generated require --jit=true which can be set as below:

    diff --git a/tensorflow/core/kernels/mlir_generated/build_defs.bzl b/tensorflow/core/kernels/mlir_generated/build_defs.bzl
    index f5bcd7c2e2c..a0f806b2986 100644
    --- a/tensorflow/core/kernels/mlir_generated/build_defs.bzl
    +++ b/tensorflow/core/kernels/mlir_generated/build_defs.bzl
    @@ -342,7 +342,8 @@ def _gen_kernel_library(
    		 "--cpu_codegen=true" if enable_cpu else "--arch={}".format(gpu_arch_option),
    		 "--tile_sizes=%s" % typed_tile_size,
    		 "--enable_ftz=%s" % (type == "f32"),
    -            ]
    +                "--jit=true",
    +           ]
    	     if typed_unroll_factors:
    		 test_args.append("--unroll_factors=%s" % typed_unroll_factors)
    	     native.sh_test(

    4. Below mentioned test cases expect ICU encoding data to be present for big endian format. If needed, this data can be manually generated on s390x for test cases to pass.
    //tensorflow/python/kernel_tests:unicode_decode_op_test
    //tensorflow/python/kernel_tests:unicode_transcode_op_test

    5. Test case //tensorflow/python/tools:saved_model_cli_test fails due to incorrect target_triple value for s390x. Refer to this for more details.

References:

Clone this wiki locally