Building TensorFlow

The instructions provided below specify the steps to build TensorFlow version 2.9.1 on Linux on IBM Z for the following distributions:

Ubuntu (18.04, 20.04, 22.04)

General Notes:

When following the steps below please use a standard permission user unless otherwise specified.
A directory /<source_root>/ will be referred to in these instructions, this is a temporary writable directory anywhere you'd like to place it.

Step 1: Build and Install TensorFlow v2.9.1

1.1) Build using script

If you want to build TensorFlow using manual steps, go to STEP 1.2.

Use the following commands to build TensorFlow using the build script. Please make sure you have wget installed.

wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.9.1/build_tensorflow.sh

# Build Tensorflow
bash build_tensorflow.sh    [Provide -t option for executing build with tests]

If the build completes successfully, go to STEP 2. In case of error, check logs for more details or go to STEP 1.2 to follow manual build steps.

1.2) Install the dependencies

export SOURCE_ROOT=/<source_root>/

Ubuntu 18.04

sudo apt-get update
sudo apt-get install wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran libopenblas-dev curl -y
wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Python3/3.10.4/build_python3.sh
bash build_python3.sh -y
sudo update-alternatives --install /usr/local/bin/python python /usr/local/bin/python3 40
sudo ldconfig
sudo pip3 install --upgrade pip
sudo pip3 install --no-cache-dir numpy==1.22.3 wheel scipy portpicker protobuf==3.13.0 packaging
sudo pip3 install keras_preprocessing --no-deps

Ubuntu 20.04

sudo apt-get update
sudo apt-get install wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran curl -y
sudo ldconfig
sudo pip3 install --upgrade pip
sudo pip3 install --no-cache-dir numpy==1.22.3 wheel scipy==1.6.3 portpicker protobuf==3.13.0 packaging
sudo pip3 install keras_preprocessing --no-deps

Ubuntu 22.04

sudo apt-get update
sudo apt-get install wget git unzip zip python3-dev python3-pip openjdk-11-jdk pkg-config libhdf5-dev libssl-dev libblas-dev liblapack-dev gfortran gcc-9* g++-9* curl -y
sudo ldconfig
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-9 60 --slave /usr/bin/g++ g++ /usr/bin/g++-9
sudo update-alternatives --auto gcc
sudo pip3 install --upgrade pip
sudo pip3 install --no-cache-dir numpy==1.22.3 wheel scipy==1.7.2 portpicker protobuf==3.13.0 packaging
sudo pip3 install keras_preprocessing --no-deps

Ensure /usr/bin/python points to Python3 to build TensorFlow in a Python3 environment (On Ubuntu (20.04, 22.04))
```
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 40
```

Install grpcio

export GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True
sudo -E pip3 install grpcio

Build Bazel v5.1.1
```
wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Bazel/5.1.1/build_bazel.sh
bash build_bazel.sh -y
```
Note: Bazel community does not support Ubuntu 20.04 and Ubuntu 22.04 officially but build_bazel.sh script can still be used to build Bazel on Ubuntu 20.04 and Ubuntu 22.04.

1.3) Build TensorFlow

Download source code

cd $SOURCE_ROOT
git clone https://github.com/tensorflow/tensorflow
cd tensorflow
git checkout v2.9.1
curl -o build_patch.diff https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.9.1/patch/build_patch.diff
git apply --ignore-whitespace build_patch.diff

sed -i 's/float_t/float/g' tensorflow/stream_executor/tpu/c_api_decl.h  # Only on Ubuntu (18.04, 20.04)

Configure
```
  yes "" | ./configure || true
```
Build TensorFlow
```
  bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" build  --define=tensorflow_mkldnn_contraction_kernel=0 --define tflite_with_xnnpack=false //tensorflow/tools/pip_package:build_pip_package
```
Note:
1. TensorFlow build is resource intensive operation. If build continues to fail try increasing the swap space and reduce the number of concurrent jobs by specifying --jobs=n in the build command above, where n is the number of concurrent jobs.

2. Building TensorFlow from source can use a lot of RAM. If your system is memory-constrained, limit Bazel's RAM usage with: --local_ram_resources=2048.

1.4) Build tensorflow_io_gcs_filesystem and libclang wheel

Build tensorflow_io_gcs_filesystem wheel

sudo pip install --upgrade pip
cd $SOURCE_ROOT
git clone https://github.com/tensorflow/io.git
cd io/
git checkout v0.23.1
python3 setup.py -q bdist_wheel --project tensorflow_io_gcs_filesystem          
cd dist
sudo pip3 install ./tensorflow_io_gcs_filesystem-0.23.1-cp*-cp*-linux_s390x.whl

Build libclang wheel

cd $SOURCE_ROOT
sudo apt-get install -y cmake
git clone https://github.com/llvm/llvm-project
cd llvm-project
git checkout llvmorg-9.0.1
mkdir -p build
cd build
cmake ../llvm -DLLVM_ENABLE_PROJECTS=clang -DBUILD_SHARED_LIBS=OFF -DLLVM_ENABLE_TERMINFO=OFF -DLLVM_TARGETS_TO_BUILD=SystemZ -DCMAKE_BUILD_TYPE=MinSizeRel -DCMAKE_CXX_FLAGS_MINSIZEREL="-Os -DNDEBUG -static-libgcc -static-libstdc++ -s"
make libclang -j$(nproc)
cd $SOURCE_ROOT
git clone https://github.com/sighingnow/libclang.git
cd libclang
cp $SOURCE_ROOT/llvm-project/build/lib/libclang.so.9 native/
cp $SOURCE_ROOT/llvm-project/build/lib/libclang.so native/
python3 setup.py -q bdist_wheel 
cd dist
sudo pip3 install libclang-*-py2.py3-none-any.whl

1.5) Build and Install TensorFlow wheel

cd $SOURCE_ROOT/tensorflow
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_wheel

sudo ln -s /usr/include/locale.h /usr/include/xlocale.h (On Ubuntu 18.04 only)

sudo pip3 install /tmp/tensorflow_wheel/tensorflow-2.9.1-cp*-linux_s390x.whl

Step 2: Verify TensorFlow (Optional)

Run TensorFlow from command Line and check the installed version

 $ cd $SOURCE_ROOT
 $ python -c "import tensorflow as tf; print(tf.__version__)"
   2.9.1

 $ /usr/bin/python3
  >>> import tensorflow as tf
  >>> tf.add(1, 2).numpy()
  3
  >>> hello = tf.constant('Hello, TensorFlow!')
  >>> hello.numpy()
  b'Hello, TensorFlow!'
  >>>

Step 3: Execute Test Suite (Optional)

Run complete testsuite

cd $SOURCE_ROOT/tensorflow
# Apply minor BE specific fixes to some tests
curl -o test_patch.diff https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.9.1/patch/test_patch.diff
git apply --ignore-whitespace test_patch.diff

bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test --test_tag_filters=-gpu,-tpu,-benchmark-test,-v1only,-no_oss,-oss_serial -k --test_timeout 300,450,1200,3600 --build_tests_only --test_output=errors --define=tensorflow_mkldnn_contraction_kernel=0 --define tflite_with_xnnpack=false  -- //tensorflow/... -//tensorflow/compiler/... -//tensorflow/lite/... -//tensorflow/core/platform/cloud/...

Note: //tensorflow/core/platform/cloud skipped due to BoringSSL, refer #14039 for details.

Run individual test
```
bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test //tensorflow/<module_name>:<testcase_name>
```
For example,
```
bazel --host_jvm_args="-Xms1024m" --host_jvm_args="-Xmx2048m" test //tensorflow/python/kernel_tests:topk_op_test
```
Note:
1. There are some known limitations in TensorFlow Lite such as reading FlatBuffers tflite model file or quantization process on s390x and related tests would fail, Refer this for more details.

2. Few test cases from //tensorflow/java/... fail on s390x and x86, refer to this for more details.

3. Few tests from core, python and compiler module fail due to lack of certain CPU operations in SystemZ LLVM backend. They can pass by adding --copt=-O -c opt to bazel test command. Additionally few tests from //tensorflow/core/kernels/mlir_generated require --jit=true which can be set as below:
```
diff --git a/tensorflow/core/kernels/mlir_generated/build_defs.bzl b/tensorflow/core/kernels/mlir_generated/build_defs.bzl
index f5bcd7c2e2c..a0f806b2986 100644
--- a/tensorflow/core/kernels/mlir_generated/build_defs.bzl
+++ b/tensorflow/core/kernels/mlir_generated/build_defs.bzl
@@ -342,7 +342,8 @@ def _gen_kernel_library(
		 "--cpu_codegen=true" if enable_cpu else "--arch={}".format(gpu_arch_option),
		 "--tile_sizes=%s" % typed_tile_size,
		 "--enable_ftz=%s" % (type == "f32"),
-            ]
+                "--jit=true",
+           ]
	     if typed_unroll_factors:
		 test_args.append("--unroll_factors=%s" % typed_unroll_factors)
	     native.sh_test(
```
4. Below mentioned test cases expect ICU encoding data to be present for big endian format. If needed, this data can be manually generated on s390x for test cases to pass.
//tensorflow/python/kernel_tests:unicode_decode_op_test
//tensorflow/python/kernel_tests:unicode_transcode_op_test

5. Test case //tensorflow/python/tools:saved_model_cli_test fails due to incorrect target_triple value for s390x. Refer to this for more details.

References:

The information provided in this article is accurate at the time of writing, but on-going development in the open-source projects involved may make the information incorrect or obsolete. Please open issue or contact us on IBM Z Community if you have any questions or feedback.

Building TensorFlow

Building TensorFlow

General Notes:

Step 1: Build and Install TensorFlow v2.9.1

1.1) Build using script

1.2) Install the dependencies

1.3) Build TensorFlow

1.4) Build tensorflow_io_gcs_filesystem and libclang wheel

1.5) Build and Install TensorFlow wheel

Step 2: Verify TensorFlow (Optional)

Step 3: Execute Test Suite (Optional)

References:

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!