Building TensorFlow Transform
The instructions provided below specify the steps to build TensorFlow Transform version 1.16.0 on Linux on IBM Z for the following distributions:
- Ubuntu (22.04, 24.04)
- When following the steps below, please use a standard-permission user unless otherwise specified.
- A directory /<source_root>/ will be referred to in these instructions. This is a temporary writable directory that you can place anywhere you like.
If you want to build TensorFlow Transform using manual steps, go to STEP 1.2.
Use the following commands to build TensorFlow Transform using the build script. Please make sure you have wget installed.
wget -q https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/TensorflowTransform/1.16.0/build_tensorflow_transform.sh
# Build TensorFlow Transform
bash build_tensorflow_transform.sh
Options: provide -t to run the build with tests, and -p to choose the Python version from {3.9, 3.10, 3.11}. If -p is not specified, the script uses the distro-provided Python version (i.e., Python 3.11).
If the build completes successfully, go to STEP 2. If an error occurs, check the logs for more details or go to STEP 1.2 to follow the manual build steps.
export SOURCE_ROOT=/<source_root>/
export PYTHON_VERSION=<python_version> #Choose Python version from {3.9, 3.10, 3.11}
export PATCH_URL="https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/TensorflowTransform/1.16.0/patch"
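Because the later steps depend on PYTHON_VERSION, it can help to validate the value before starting a long build. The following is a minimal sketch; the helper name is_supported_python and the sample value are illustrative assumptions, not part of the official build script.

```shell
# Hypothetical helper: succeed only for the Python versions listed above.
is_supported_python() {
    case "$1" in
        3.9|3.10|3.11) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo with a sample value; in a real session, pass "$PYTHON_VERSION" instead.
candidate="3.10"
if is_supported_python "$candidate"; then
    echo "Python $candidate is supported"
else
    echo "Python $candidate is not supported; choose 3.9, 3.10 or 3.11" >&2
fi
```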
-
Ubuntu 22.04
sudo apt-get update
sudo apt-get install -y build-essential cargo curl git cmake libopenblas-dev libgeos-dev
-
Ubuntu 24.04
sudo apt-get update
sudo apt-get install -y build-essential cargo curl git cmake gcc-11 g++-11 libopenblas-dev libgeos-dev
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 60
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-11 60
-
The instructions for building TensorFlow 2.18.0 can be found here.
-
Use the following commands to build TensorFlow 2.18.0 with required python version:
cd $SOURCE_ROOT
wget -O build_tensorflow.sh https://raw.githubusercontent.com/linux-on-ibm-z/scripts/master/Tensorflow/2.18.0/build_tensorflow.sh
bash build_tensorflow.sh -p $PYTHON_VERSION -y
-
Download source code
cd $SOURCE_ROOT
git clone -b apache-arrow-10.0.1 --depth 1 https://github.com/apache/arrow.git
-
Build and install Arrow C++ library
cd $SOURCE_ROOT/arrow/cpp
mkdir release
cd release
cmake -DCMAKE_INSTALL_PREFIX=/usr/local \
      -DARROW_PARQUET=ON \
      -DARROW_PYTHON=ON \
      -DCMAKE_BUILD_TYPE=Release \
      ..
make -j$(nproc)
sudo make install
export LD_LIBRARY_PATH=/usr/local/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
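The LD_LIBRARY_PATH export above uses the ${VAR:+...} expansion so that a ":" separator is added only when the variable already holds a value. The sketch below shows that behaviour in isolation; prepend_dir is a hypothetical helper written for this illustration.

```shell
# Hypothetical helper: print NEW_DIR prepended to an existing path list,
# adding the ":" separator only when the existing list is non-empty.
prepend_dir() {
    new_dir=$1
    existing=$2
    printf '%s\n' "${new_dir}${existing:+:${existing}}"
}

prepend_dir /usr/local/lib ""          # prints /usr/local/lib
prepend_dir /usr/local/lib /opt/lib    # prints /usr/local/lib:/opt/lib
```

Skipping the separator for an empty value avoids a trailing ":" in the variable, which the dynamic linker would otherwise interpret as the current directory.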
-
Build and install pyarrow library
cd $SOURCE_ROOT/arrow/python
curl -o pyarrow.diff ${PATCH_URL}/pyarrow.diff
git apply pyarrow.diff
export PYARROW_WITH_PARQUET=1
export PYARROW_PARALLEL=4
sed -i '2d' requirements-build.txt
sed -i '2a oldest-supported-numpy>=0.14; python_version<'\''3.9'\''' requirements-build.txt
sed -i '3a numpy<2.0.0,>=1.26.0; python_version>='\''3.9'\''' requirements-build.txt
pip3 install -r requirements-build.txt
python setup.py build_ext bdist_wheel
pip3 install dist/*.whl
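The three sed commands above edit requirements-build.txt by line number: the first deletes line 2, and the next two append the numpy constraints after lines 2 and 3 of the resulting file. The sketch below replays the same edits on a throwaway file so the effect can be seen. The function name apply_numpy_pins and the four placeholder lines are assumptions for illustration and do not reflect the real file contents. It relies on GNU sed, as shipped with Ubuntu.

```shell
# Apply the same three line-addressed edits to the file given as $1.
# Relies on GNU sed (in-place -i and the one-line form of the "a" command).
apply_numpy_pins() {
    sed -i '2d' "$1"
    sed -i '2a oldest-supported-numpy>=0.14; python_version<'\''3.9'\''' "$1"
    sed -i '3a numpy<2.0.0,>=1.26.0; python_version>='\''3.9'\''' "$1"
}

# Placeholder file: these four lines are NOT the real requirements-build.txt.
demo_file=$(mktemp)
printf '%s\n' 'line-one' 'line-two' 'line-three' 'line-four' >"$demo_file"
apply_numpy_pins "$demo_file"
cat "$demo_file"
rm -f "$demo_file"
```

On the placeholder file, the original second line is removed and the two numpy constraints end up on lines 3 and 4, between line-three and line-four.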
-
Build and install Apache Beam
cd $SOURCE_ROOT
GRPC_PYTHON_BUILD_SYSTEM_OPENSSL=True pip3 install 'apache-beam[gcp]'==2.60.0
-
Download source code
cd $SOURCE_ROOT
git clone -b v1.16.1 --depth 1 https://github.com/tensorflow/tfx-bsl.git
-
Build and install tfx-bsl
pip3 install --upgrade pip
sudo update-alternatives --install /usr/local/bin/pip3 pip3 /usr/local/bin/pip${PYTHON_VERSION} 50
curl -o tfx-bsl.diff ${PATCH_URL}/tfx-bsl.diff
cd $SOURCE_ROOT/tfx-bsl
git apply ../tfx-bsl.diff
sudo touch /usr/local/include/immintrin.h
sed -i "179s/.*/ default=\'>=2.16,<2.19\',/" setup.py
export BAZEL_HTTP_TIMEOUT=300
python3 setup.py bdist_wheel
pip3 install dist/*.whl
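The sed command above overwrites line 179 of setup.py regardless of what that line contains, so a quick look at the line before editing can catch a source tree that differs from the expected one. The sketch below shows one way to do that; line_contains is a hypothetical helper and the demo file is a placeholder, not the real setup.py.

```shell
# Hypothetical helper: succeed only if line LINE_NO of FILE contains TEXT.
line_contains() {
    file=$1
    line_no=$2
    text=$3
    sed -n "${line_no}p" "$file" | grep -qF -- "$text"
}

# Demo on a throwaway file with placeholder contents. In practice the call
# would look like:
#   line_contains setup.py 179 'default=' || echo "line 179 is not the expected line" >&2
demo_file=$(mktemp)
printf '%s\n' 'first' "    default='PLACEHOLDER'," 'third' >"$demo_file"
if line_contains "$demo_file" 2 'default='; then
    echo "line 2 looks like the version constraint"
fi
rm -f "$demo_file"
```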
It is also possible to build and install TensorFlow Transform manually. This step is required if you intend to run the test cases as in Step 3.
-
Install Keras
pip3 install tf-keras
-
Download source code
cd $SOURCE_ROOT
git clone -b v1.16.0 --depth 1 https://github.com/tensorflow/transform.git
-
Build and install
cd $SOURCE_ROOT/transform
sed -i "55s/.*/ default=\'>=2.16,<2.19\',/" setup.py
python3 setup.py install --user
Note: If a specific version of any other Python package is required during installation, run sudo pip3 install '<package-name>==<version>' to install it.
-
Run TensorFlow Transform from the command line
$ cd $SOURCE_ROOT
$ python3
>>> import tensorflow as tf
>>> import tensorflow_transform as tft
>>> tft.version.__version__
'1.16.0'
>>>
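The same version check can also be run non-interactively, for example at the end of a scripted build. The following is a minimal sketch; check_version is a hypothetical helper, and the commented call assumes TensorFlow Transform is already installed for python3.

```shell
# Hypothetical helper: compare an expected version string with the reported one.
check_version() {
    expected=$1
    actual=$2
    if [ "$expected" = "$actual" ]; then
        echo "OK: TensorFlow Transform $actual"
    else
        echo "Version mismatch: expected $expected, got ${actual:-nothing}" >&2
        return 1
    fi
}

# Typical call once the package is installed (left commented out here):
#   check_version 1.16.0 "$(python3 -c 'import tensorflow_transform as tft; print(tft.version.__version__)')"
check_version 1.16.0 1.16.0   # prints: OK: TensorFlow Transform 1.16.0
```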
-
Follow instructions in this tutorial to use TensorFlow Transform to preprocess data.
-
Run the complete test suite
cd $SOURCE_ROOT/transform
python3 -m unittest discover -v -p '*_test.py'
-
Run a single test case (for example, BeamImplTest.testHandleBatchError)
cd $SOURCE_ROOT/transform
python3 -m unittest -v tensorflow_transform/beam/impl_test.py -k BeamImplTest.testHandleBatchError
Note: Test case BeamImplTest.testNumericAnalyzersWithCompositeInputssparse_elementwise_tf.float64 fails intermittently on both s390x and Intel, but passes when rerun individually.
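Since an intermittent failure is expected to pass on an individual rerun, a small retry wrapper can automate that rerun. This is only a sketch: retry is a hypothetical helper, and the retry count in the commented example is an arbitrary choice.

```shell
# Hypothetical helper: run a command up to MAX_TRIES times,
# stopping at the first success.
retry() {
    max_tries=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt of $max_tries failed" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example usage (commented out because it needs the built source tree):
#   cd $SOURCE_ROOT/transform
#   retry 3 python3 -m unittest -v tensorflow_transform/beam/impl_test.py -k BeamImplTest.testHandleBatchError
```

A test that still fails after every attempt is more likely a real regression than a flaky run, so the wrapper returns a non-zero status in that case.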
The information provided in this article is accurate at the time of writing, but ongoing development in the open-source projects involved may make the information incorrect or obsolete. Please open an issue or contact us on the IBM Z Community if you have any questions or feedback.