Cluster setup in HBase
Before starting the HBase cluster
To configure HBase, you need a running Hadoop cluster, which provides the storage for HBase (HBase stores its data in HDFS). Please refer to Installing and configuring hadoop cluster. Also make sure that the user name and the path where HBase is installed are the same on all machines. In my case the user is hduser.
These are the steps to set up and run an HBase cluster. We built the HBase cluster using three Ubuntu machines. A distributed HBase depends on a running ZooKeeper cluster. We are using the default ZooKeeper cluster, which is managed by HBase.
There are basically three types of node.
1. HBase Master:- The HBase Master is responsible for assigning regions to the HBase RegionServers and for monitoring the health of each RegionServer.
2. ZooKeeper:- For any distributed application, ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
3. HBase RegionServer:- The RegionServer is responsible for handling client read and write requests. It communicates with the HBase Master to get a list of regions to serve and to tell the master that it is alive.
In our example, one machine in the cluster is designated as HBase Master and ZooKeeper. The rest of the machines in the cluster act as RegionServers.
INSTALLING AND CONFIGURING HBASE MASTER
1. Download hbase-1.1.2-bin.tar.gz from http://www.apache.org/dyn/closer.cgi/hbase/ and extract it into a directory on your computer. This path is referred to below as $HBASE_INSTALL_DIR.
2. Edit the file /etc/hosts on the master machine and add the following lines.
192.168.35.16 bsw-HbaseMaster bsw-HbaseMaster
192.168.35.17 bsw-data1
192.168.35.25 bsw-data2
Here bsw-data1 and bsw-data2 are the machines where the region servers run, and bsw-HbaseMaster is the machine where the HBase master runs. The HBase master and the Hadoop NameNode (the master machine of the Hadoop cluster) are configured on the same machine.
Note: Run the command "ping bsw-HbaseMaster" to check that the name bsw-HbaseMaster resolves to the machine's actual IP address and not to a localhost address.
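The name-resolution check can also be scripted. The following is a minimal sketch, assuming a Linux machine where getent is available (as on Ubuntu); the function names is_loopback and check_host are made up for this example. It looks up each cluster host name and flags any name that does not resolve or that resolves to a loopback address such as 127.0.1.1.

```shell
# Succeeds if the given address is a loopback address (127.x.x.x or ::1).
is_loopback() {
  case "$1" in
    127.*|::1) return 0 ;;
    *) return 1 ;;
  esac
}

# Looks up one host name (through /etc/hosts or DNS) and reports the result.
check_host() {
  ip=$(getent hosts "$1" | awk '{ print $1; exit }')
  if [ -z "$ip" ]; then
    echo "$1: does not resolve"
    return 1
  elif is_loopback "$ip"; then
    echo "$1: resolves to loopback address $ip"
    return 1
  fi
  echo "$1: $ip"
}

status=0
for host in bsw-HbaseMaster bsw-data1 bsw-data2; do
  check_host "$host" || status=1
done
echo "status=$status"
```

A final status of 0 means every name resolved to a non-loopback address on the machine where the script ran; it does not prove that the address is the right one, so the output is still worth comparing against the /etc/hosts entries.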
3. Configure passwordless login from bsw-HbaseMaster to all regionserver machines.
Execute the following commands on the bsw-HbaseMaster machine.
$ ssh-keygen -t rsa
$ scp .ssh/id_rsa.pub hduser@bsw-data1:.ssh/authorized_keys
$ scp .ssh/id_rsa.pub hduser@bsw-data2:.ssh/authorized_keys
4. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and set JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
5. Open the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties.
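<property>
  <name>hbase.master</name>
  <value>bsw-HbaseMaster:60000</value>
  <description>The host and port that the HBase master runs at.</description>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://bsw-HbaseMaster:9000/hadoop-datastore</value>
  <description>The directory shared by region servers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>Possible values are
    false: standalone and pseudo-distributed setups with managed Zookeeper
    true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>bsw-HbaseMaster</value>
</property>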
Note: In our example, ZooKeeper and the HBase master both run on the same machine.
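A mistyped host or port in hbase-site.xml is easy to miss, so it can help to read individual values back from the file. The following is a rough sketch, not an XML parser: the helper name get_prop is made up for this example, and it assumes that each <name> and <value> element sits on a single line with the name first. The demo runs against a throwaway file that wraps two of the properties above in a <configuration> element.

```shell
# Print the <value> of the named property from a Hadoop-style XML config file.
# Assumes each <name> and <value> element sits on one line, name first.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Demo on a throwaway file; on a real node pass
# $HBASE_INSTALL_DIR/conf/hbase-site.xml instead.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://bsw-HbaseMaster:9000/hadoop-datastore</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>bsw-HbaseMaster</value>
  </property>
</configuration>
EOF

get_prop "$demo_conf" hbase.rootdir
get_prop "$demo_conf" hbase.zookeeper.quorum
```

Running the same two lookups on the master and on each regionserver gives a quick way to confirm that all machines agree on the HDFS location and the ZooKeeper quorum.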
6. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and uncomment the following line:
export HBASE_MANAGES_ZK=true
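A line that is still commented out in hbase-env.sh has no effect, which is easy to overlook. As a small sketch (the helper name has_export is made up for this example, and the demo uses a throwaway file rather than the real hbase-env.sh), the check below reports whether a variable has an uncommented export line.

```shell
# Succeeds if the file contains an uncommented "export NAME=..." line.
has_export() {
  grep -Eq "^[[:space:]]*export[[:space:]]+$2=" "$1"
}

# Demo file in which HBASE_MANAGES_ZK has been left commented out on purpose.
demo_env=$(mktemp)
cat > "$demo_env" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
# export HBASE_MANAGES_ZK=true
EOF

for name in JAVA_HOME HBASE_MANAGES_ZK; do
  if has_export "$demo_env" "$name"; then
    echo "$name is set"
  else
    echo "$name is missing or still commented out"
  fi
done
```

On a real node, pass $HBASE_INSTALL_DIR/conf/hbase-env.sh as the first argument instead of the demo file.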
7. Open the file $HBASE_INSTALL_DIR/conf/regionservers and add all the regionserver machine names.
bsw-data1
bsw-data2
bsw-HbaseMaster
Note: Add bsw-HbaseMaster machine name only if you are running a regionserver on bsw-HbaseMaster machine.
INSTALLING AND CONFIGURING HBASE REGIONSERVER
1. Download hbase-1.1.2-bin.tar.gz from http://www.apache.org/dyn/closer.cgi/hbase/ and extract it into a directory on your computer. This path is referred to below as $HBASE_INSTALL_DIR.
2. Edit the file /etc/hosts on the hbase-regionserver machine and add the following lines.
192.168.35.16 bsw-HbaseMaster bsw-HbaseMaster
Note: In my case, bsw-HbaseMaster and the Hadoop NameNode are running on the same machine.
Note: Run the command "ping bsw-HbaseMaster" to check that the name bsw-HbaseMaster resolves to the machine's actual IP address and not to a localhost address.
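3. Configure passwordless login from bsw-data1 and bsw-data2 to the bsw-HbaseMaster machine.
Execute the following commands on the bsw-data1 and bsw-data2 machines.
$ ssh-keygen -t rsa
$ scp .ssh/id_rsa.pub hduser@bsw-HbaseMaster:.ssh/authorized_keys2
Note: scp replaces the destination file, so when both machines run this command the second key overwrites the first. Append the key to the file instead if both keys must be kept.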
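One way to add a key without replacing the ones already present is to append it only when it is missing. The sketch below is meant to run on the machine that receives the key, after the public key file has been copied there under some other name; the function name append_key and the demo keys are made up for illustration, and the demo works in a temporary directory rather than in a real .ssh directory.

```shell
# Append the public key in $1 to the authorized-keys file $2
# unless an identical line is already there.
append_key() {
  mkdir -p "$(dirname "$2")"
  touch "$2"
  chmod 600 "$2"
  # -F fixed strings, -x whole-line match, -f read the pattern from the key file
  if ! grep -qxF -f "$1" "$2"; then
    cat "$1" >> "$2"
  fi
}

# Demo with two fake keys in a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\n' 'ssh-rsa AAAAB3EXAMPLE1 hduser@bsw-data1' > "$demo_dir/data1.pub"
printf '%s\n' 'ssh-rsa AAAAB3EXAMPLE2 hduser@bsw-data2' > "$demo_dir/data2.pub"

append_key "$demo_dir/data1.pub" "$demo_dir/authorized_keys2"
append_key "$demo_dir/data2.pub" "$demo_dir/authorized_keys2"
append_key "$demo_dir/data1.pub" "$demo_dir/authorized_keys2"   # no-op, already present
cat "$demo_dir/authorized_keys2"
```

Where OpenSSH's ssh-copy-id is installed, running "ssh-copy-id hduser@bsw-HbaseMaster" on each regionserver is a common alternative that also appends rather than overwrites; it writes to authorized_keys rather than authorized_keys2.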
4. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and set JAVA_HOME.
export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_25
Note: If you are using OpenJDK, give the path of the OpenJDK installation instead.
5. Open the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following properties.
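<property>
  <name>hbase.master</name>
  <value>bsw-HbaseMaster:60000</value>
  <description>The host and port that the HBase master runs at.</description>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://bsw-HbaseMaster:9000/hadoop-datastore</value>
  <description>The directory shared by region servers.</description>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
  <description>Possible values are
    false: standalone and pseudo-distributed setups with managed Zookeeper
    true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
  </description>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>bsw-HbaseMaster</value>
</property>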
6. Open the file $HBASE_INSTALL_DIR/conf/hbase-env.sh and uncomment the following line:
export HBASE_MANAGES_ZK=true
Note: The above steps are required on all the DataNodes in the Hadoop cluster.
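Because the hbase-env.sh and hbase-site.xml settings are the same on the master and on the regionservers, one alternative to editing them by hand on every machine is to copy the master's conf directory out. The sketch below only prints the commands it would run, so nothing is changed until the output has been reviewed. It assumes rsync is installed on all machines and that passwordless SSH from the master already works; the function name print_sync_commands and the default path /usr/local/hbase are made up for this example.

```shell
HBASE_INSTALL_DIR=${HBASE_INSTALL_DIR:-/usr/local/hbase}   # hypothetical default path

# Print one rsync command per host listed in the regionservers file $1,
# skipping blank lines, comment lines, and the host named in $2 (the master).
print_sync_commands() {
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*|"$2") continue ;;
    esac
    echo "rsync -az $HBASE_INSTALL_DIR/conf/ hduser@$host:$HBASE_INSTALL_DIR/conf/"
  done < "$1"
}

# Demo with a throwaway regionservers file; on the master pass
# $HBASE_INSTALL_DIR/conf/regionservers instead.
demo_rs=$(mktemp)
printf '%s\n' bsw-data1 bsw-data2 bsw-HbaseMaster > "$demo_rs"
print_sync_commands "$demo_rs" bsw-HbaseMaster
```

Once the printed commands look right, the output can be piped to sh to run them. The /etc/hosts and SSH key steps still have to be done on each machine.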
START AND STOP HBASE CLUSTER
1. Starting the HBase cluster:-
We need to start the daemons only on the bsw-HbaseMaster machine; it will start the daemons on all regionserver machines. Execute the following command to start the HBase cluster.
$HBASE_INSTALL_DIR/bin/start-hbase.sh
Note: At this point, the following Java processes should be running on the HBase master machine.
hduser@bsw-HbaseMaster:$ jps
14143 Jps
14007 HQuorumPeer
14066 HMaster
and the following Java processes should be running on each regionserver machine.
23026 HRegionServer
23171 Jps
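The jps check can be automated so that a missing daemon stands out. This is a small sketch; the function name check_daemons is made up for this example, and the demo feeds it the sample master output from this post so that it runs without a cluster.

```shell
# Report whether each expected daemon name appears in the given jps output.
# Usage: check_daemons "<jps output>" NAME [NAME...]
check_daemons() {
  jps_output=$1
  shift
  missing=0
  for daemon in "$@"; do
    # jps prints "<pid> <name>", so compare the second field exactly.
    if printf '%s\n' "$jps_output" |
        awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
      echo "running: $daemon"
    else
      echo "MISSING: $daemon"
      missing=1
    fi
  done
  return "$missing"
}

# Sample output from the master; on a live node use "$(jps)" instead.
master_sample='14143 Jps
14007 HQuorumPeer
14066 HMaster'

check_daemons "$master_sample" HQuorumPeer HMaster
```

On a regionserver the equivalent call would be check_daemons "$(jps)" HRegionServer.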
2. Starting the HBase shell:-
$HBASE_INSTALL_DIR/bin/hbase shell
HBase Shell; enter 'help' for list of supported commands.
Version: 0.20.6, r965666, Mon Jul 19 16:54:48 PDT 203
hbase(main):001:0>
Now create a table in HBase.
hbase(main):001:0> create 't1','f1'
0 row(s) in 1.2910 seconds
hbase(main):002:0>
Note: If the table is created successfully, then everything is running fine.
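The table-creation check can be extended into a small write and read round trip by feeding commands to the HBase shell on standard input. This is a sketch under assumptions: the table name smoke_test, the function name smoke_script, and the default path /usr/local/hbase are made up for this example. If the hbase launcher is not found at the expected path, the script only prints the shell commands it would have run.

```shell
HBASE_BIN="${HBASE_INSTALL_DIR:-/usr/local/hbase}/bin/hbase"   # hypothetical default path

# HBase shell commands for a create / put / get / drop round trip.
smoke_script() {
  cat <<'EOF'
create 'smoke_test', 'f1'
put 'smoke_test', 'row1', 'f1:c1', 'value1'
get 'smoke_test', 'row1'
disable 'smoke_test'
drop 'smoke_test'
EOF
}

if [ -x "$HBASE_BIN" ]; then
  # Pipe the commands into the shell; it exits at end of input.
  smoke_script | "$HBASE_BIN" shell
else
  echo "hbase not found at $HBASE_BIN; commands that would be run:"
  smoke_script
fi
```

If the get command shows value1 for row1, the master, ZooKeeper, and at least one regionserver are all accepting requests. The table is disabled and dropped at the end so the check leaves nothing behind.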
3. Stopping the HBase cluster:-
Execute the following command on the HBase master machine to stop the HBase cluster.
$HBASE_INSTALL_DIR/bin/stop-hbase.sh