Hadoop is an open-source framework written in Java for complex, high-volume computation. Today's industry data grows along the 3 Vs (Volume, Velocity, and Variety), which makes such data difficult to analyze and interpret. Hadoop's distributed, highly fault-tolerant file system (HDFS) is a solution for this 3V data expansion, and MapReduce is the programming platform for analyzing the data stored in HDFS.
Today, we will discuss the steps for a simple installation to get Hadoop up and running on a CentOS server machine.
Step 1: Installing Java
Hadoop requires Java 1.6 or a higher version. Check whether Java is already installed, and if not, install it using the command below.
[root@localhost ~]$ sudo yum install java-1.7.0-openjdk

Output:
......
Dependency Installed:
  giflib.x86_64 0:4.1.6-3.1.el6
  jpackage-utils.noarch 0:1.7.5-3.14.el6
  pcsc-lite-libs.x86_64 0:1.5.2-15.el6
  ttmkfdir.x86_64 0:3.0.9-32.1.el6
  tzdata-java.noarch 0:2015f-1.el6
  xorg-x11-fonts-Type1.noarch 0:7.2-11.el6
Complete!

[root@localhost ~]$ java -version

Output:
java version "1.7.0_85"
OpenJDK Runtime Environment (rhel-2.6.1.3.el6_7-x86_64 u85-b01)
OpenJDK 64-Bit Server VM (build 24.85-b03, mixed mode)
Step 2: Create a dedicated Hadoop user
We recommend creating a dedicated (non-root) user for the Hadoop installation, so that the Hadoop processes are isolated from the rest of the system.