Building word2vec
The instructions below describe how to build word2vec version 0.1c on Linux on IBM Z for the following distributions:
- RHEL (6.9, 7.3, 7.4, 7.5)
- SLES (11 SP4, 12 SP2, 12 SP3)
- Ubuntu (16.04, 17.10, 18.04)
General notes:
- When following the steps below, use a standard (non-root) user unless otherwise specified.
- A directory /<source_root>/ will be referred to in these instructions. This is a temporary writable directory that you can place anywhere you like.
- Install standard utilities, packages and platform specific dependencies
- RHEL (6.9, 7.3, 7.4, 7.5)
sudo yum install -y git gcc make wget tar unzip
- SLES (11 SP4, 12 SP2, 12 SP3)
sudo zypper install -y git gcc make wget tar unzip
- Ubuntu (16.04, 17.10, 18.04)
sudo apt-get update
sudo apt-get install -y git gcc make wget tar unzip
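The three variants above differ only in the package manager. As a rough sketch, the choice can be scripted from the `ID` field of `/etc/os-release`. The function names `install_cmd_for` and `detect_distro_id` are hypothetical, and the sketch assumes the `ID` values `rhel`, `sles` and `ubuntu`. Older releases such as RHEL 6.9 do not ship `/etc/os-release`, so on those systems pass the ID by hand.

```shell
#!/bin/sh
# Print the dependency install command for a distribution ID.
# Assumption: IDs follow /etc/os-release conventions (rhel, sles, ubuntu).
install_cmd_for() {
    pkgs="git gcc make wget tar unzip"
    case "$1" in
        rhel)   echo "sudo yum install -y $pkgs" ;;
        sles)   echo "sudo zypper install -y $pkgs" ;;
        ubuntu) echo "sudo apt-get update && sudo apt-get install -y $pkgs" ;;
        *)      echo "unsupported distribution: $1" >&2; return 1 ;;
    esac
}

# Read the ID field from an os-release style file (default: /etc/os-release).
detect_distro_id() {
    file="${1:-/etc/os-release}"
    [ -r "$file" ] || return 1
    # Keep only the value of the ID= line and strip optional double quotes.
    sed -n 's/^ID=//p' "$file" | tr -d '"'
}

# Example (prints the command, does not run it):
#   install_cmd_for "$(detect_distro_id)"
```

The functions only print the command, so it can be reviewed before it is run with elevated privileges.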
- Create a working directory and download word2vec source code
mkdir /<source_root>/
cd /<source_root>/
wget https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/word2vec/source-archive.zip
unzip source-archive.zip
- Build word2vec
cd word2vec/trunk
make CFLAGS="-lm -pthread -O3 -Wall -funroll-loops"
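After `make` finishes, it can be useful to confirm that the expected programs were produced. The sketch below is one way to do that. The helper name `check_built` is hypothetical, and the list of binaries is an assumption based on the targets in the upstream makefile (`word2vec`, `word2phrase`, `distance`, `word-analogy`, `compute-accuracy`).

```shell
#!/bin/sh
# Report any expected word2vec program that is missing or not executable
# in the given directory. Returns 0 if all are present, 1 otherwise.
check_built() {
    dir="$1"
    missing=0
    for bin in word2vec word2phrase distance word-analogy compute-accuracy; do
        if [ ! -x "$dir/$bin" ]; then
            echo "missing: $bin"
            missing=1
        fi
    done
    return "$missing"
}

# Example, run from word2vec/trunk after the build:
#   check_built . && echo "build looks complete"
```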
- Set environment variables
export PATH=$PATH:/<source_root>/word2vec/trunk
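The `export` above only affects the current shell session, and running it repeatedly appends the same directory again. A minimal sketch of an idempotent variant follows. The function name `path_append` is hypothetical and not part of word2vec.

```shell
#!/bin/sh
# Append a directory to PATH only if it is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already on PATH, nothing to do
        *) PATH="$PATH:$1" ;;     # otherwise add it at the end
    esac
    export PATH
}

# Example, using the placeholder directory from these instructions:
#   path_append "/<source_root>/word2vec/trunk"
```

To keep the setting across sessions, the same call can be placed in a shell startup file such as `~/.bashrc`.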
- Test word2vec using demo scripts
./demo-word.sh
./demo-phrases.sh
Note: When prompted, enter a word from the test corpus as input (e.g. france) to get the words with the closest vectors as output.
- Run word2vec binary
word2vec
Note: The word2vec tool takes a text corpus as input and produces the word vectors as output.
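Run without arguments, `word2vec` prints its usage text. As an illustration of a training run, the sketch below wraps one invocation in a function with a dry-run switch. The function name `w2v_train`, the `DRY_RUN` variable and the file names are hypothetical. The option values are assumptions modelled on the demo script, so check them against the usage text printed by the binary you built.

```shell
#!/bin/sh
# Train word vectors from a text corpus.
#   $1 = input corpus, $2 = output vector file, $3 = thread count (default 4)
# With DRY_RUN set, the command line is printed instead of executed.
w2v_train() {
    ${DRY_RUN:+echo} word2vec -train "$1" -output "$2" \
        -cbow 1 -size 200 -window 8 -negative 25 -hs 0 \
        -sample 1e-4 -threads "${3:-4}" -binary 1 -iter 15
}

# Preview the command for a hypothetical corpus without running it.
( DRY_RUN=1; w2v_train corpus.txt vectors.bin )
```

Calling `w2v_train corpus.txt vectors.bin` without `DRY_RUN` runs the training, provided `word2vec` is on the `PATH`.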
The information provided in this article is accurate at the time of writing, but ongoing development in the open-source projects involved may make the information incorrect or obsolete. Please open an issue or contact us on IBM Z Community if you have any questions or feedback.