To install Hadoop and set the required system variables, follow: https://github.com/AdityaSinghRathore/HadoopLinuxMint/blob/master/main.pdf
- First, start the Hadoop cluster.
$ start-all.sh
Or, start HDFS first and then YARN:
$ start-dfs.sh
$ start-yarn.sh
- Write the mapper in Python (see the code files above).
$ touch mapper.py
$ vim mapper.py
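As a rough illustration of what a streaming mapper can look like, here is a minimal sketch that assumes a word-count job; the actual mapper.py in the code files above may do something different. The names `map_line` and `run_mapper` are hypothetical. The mapper emits one `word<TAB>1` line per token, since Hadoop Streaming by default treats the text before the first tab as the key.

```python
import io
import sys


def map_line(line):
    """Return one (word, 1) pair per whitespace-separated token in the line."""
    return [(word, 1) for word in line.split()]


def run_mapper(stream, out):
    """Write every pair as 'word<TAB>count', one pair per output line."""
    for line in stream:
        for word, count in map_line(line):
            out.write(f"{word}\t{count}\n")


if __name__ == "__main__":
    # Demo on an in-memory sample so the sketch runs on its own.
    # In an actual mapper.py this call would be run_mapper(sys.stdin, sys.stdout).
    run_mapper(io.StringIO("hello hadoop\nhello world\n"), sys.stdout)
```

Keeping the per-line logic in its own function makes it easy to check without a cluster or even a pipe.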
- Create a sample text file.
$ touch sample.txt
$ vim sample.txt
- Test the mapper on the sample file.
$ cat sample.txt | python mapper.py
- Create and write the reducer.
$ touch reducer.py
$ vim reducer.py
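A matching reducer can be sketched as follows, again assuming a word-count job and tab-separated `word<TAB>count` input; the real reducer.py in the code files above may differ. The names `parse_pairs`, `reduce_counts` and `run_reducer` are hypothetical. The sketch relies on the input being sorted by key, which is how Hadoop delivers mapper output to a reducer.

```python
import io
import sys
from itertools import groupby


def parse_pairs(stream):
    """Yield (word, count) from 'word<TAB>count' lines, skipping malformed ones."""
    for line in stream:
        word, sep, count = line.rstrip("\n").partition("\t")
        if not sep or not count.strip().isdigit():
            continue  # no tab or non-numeric count: ignore the line
        yield word, int(count)


def reduce_counts(pairs):
    """Sum the counts of consecutive pairs that share a word (input must be sorted by word)."""
    for word, group in groupby(pairs, key=lambda pair: pair[0]):
        yield word, sum(count for _, count in group)


def run_reducer(stream, out):
    """Write one 'word<TAB>total' line per distinct word."""
    for word, total in reduce_counts(parse_pairs(stream)):
        out.write(f"{word}\t{total}\n")


if __name__ == "__main__":
    # Demo on an in-memory, already sorted sample so the sketch runs on its own.
    # In an actual reducer.py this call would be run_reducer(sys.stdin, sys.stdout).
    run_reducer(io.StringIO("hadoop\t1\nhello\t1\nhello\t1\nworld\t1\n"), sys.stdout)
```

When the script reads `sys.stdin`, the sort phase can be imitated locally by piping the mapper output through `sort` before the reducer.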
- SSH to localhost and create the /user/wce/input directory in HDFS.
$ ssh localhost
$ hadoop fs -mkdir -p /user/wce/input
- Copy sample.txt to the /user/wce/input directory in HDFS.
$ hadoop fs -put sample.txt /user/wce/input











