Storm installation on a single machine
In this post we will see how to install Twitter Storm on a single Linux machine. You will need the following:
- Linux machine
- Java 6 installed on the machine.
- Python 2.6 installed on the machine.
Storm installation can be separated into three parts as follows.
Zookeeper Cluster Installation: Zookeeper is the coordinator for the Storm cluster. The interaction between nimbus and the worker nodes goes through Zookeeper.
Storm Client Installation: The Storm client is required for topology management on the Storm cluster, i.e. for submitting a topology to the cluster, killing a topology that is already running, and so on.
Storm Cluster Installation: The Storm cluster is where the actual topology runs.
Here we install all three parts on a single machine: a standalone Zookeeper cluster (a cluster with a single server), a Storm cluster with a master node (nimbus) and a single worker node (supervisor), and the Storm client.
The Storm client does not need Storm's native dependencies (ZeroMQ and JZMQ), but the master and worker nodes of a Storm cluster do. Because the client and the cluster share one machine here, we install Storm together with its native dependencies, so the same Storm installation serves as the client and as a cluster with one master node and one worker node.
First we will see how to install standalone Zookeeper on the machine.
Zookeeper Cluster Installation:
Setting up a Zookeeper server in standalone mode is straightforward. The server is contained in a single JAR file, so installation consists of creating a configuration.
You can follow instructions from here.
Obtain the zookeeper setup at some location.
Setup can be downloaded from this location:
Once you have downloaded the version, untar it at some location on the machine. Now you need to configure the Zookeeper server before starting it.
To configure the Zookeeper server, create a configuration file named ‘zoo.cfg’ in the ‘conf’ folder of the untarred Zookeeper setup.
Write the following into the ‘zoo.cfg’ file.
Change the value of dataDir to specify an existing (empty to start with) directory. Here are the meanings for each of the fields:
tickTime: the basic time unit in milliseconds used by Zookeeper. It is used for heartbeats, and the minimum session timeout is twice the tickTime.
dataDir: the location to store the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database.
clientPort: the port on which to listen for client connections.
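As a minimal sketch, the three settings described above could be written to ‘conf/zoo.cfg’ with a short script like the one below. The values 2000 and 2181 are the ones commonly used in Zookeeper's standalone examples; the ZK_HOME and ZK_DATA_DIR locations are assumptions made for illustration, so adjust them to your machine.

```shell
#!/bin/sh
# Assumption: ZK_HOME points at the untarred Zookeeper directory.
ZK_HOME="${ZK_HOME:-$HOME/zookeeper}"
# Assumption: any existing, initially empty directory can serve as dataDir.
ZK_DATA_DIR="${ZK_DATA_DIR:-$HOME/zookeeper-data}"

# Make sure the conf folder and the data directory exist.
mkdir -p "$ZK_HOME/conf" "$ZK_DATA_DIR"

# Write the three standalone-mode settings into zoo.cfg.
cat > "$ZK_HOME/conf/zoo.cfg" <<EOF
tickTime=2000
dataDir=$ZK_DATA_DIR
clientPort=2181
EOF
```

If you choose a different clientPort, remember to point Storm at the same port later.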
Now your Zookeeper cluster is ready to start.
Storm Cluster Installation:
As mentioned above, on a single machine the same Storm installation acts as both client and cluster, so we will install Storm together with its required native dependencies.
Fetch a Storm release from this location using Git.
git clone https://github.com/nathanmarz/storm.git
Once you have downloaded Storm, keep it at some location on the machine. Next, configure Storm by modifying its configuration file, ‘storm.yaml’, which is in the ‘conf’ folder of the Storm root folder.
Write the following into the ‘storm.yaml’ file.
Change the value of storm.local.dir to specify an existing (empty to start with) directory.
Also copy the modified ‘storm.yaml’ file to ‘~/.storm/storm.yaml’. This is very important: create the ‘.storm’ folder in your home directory and copy the modified ‘storm.yaml’ file from the ‘conf’ folder into it.
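The configuration and copy steps above can be sketched as a short script. This is an illustrative single-machine configuration, not necessarily the exact file the post intended: the keys are standard Storm settings, while the STORM_HOME and STORM_LOCAL_DIR locations and the two supervisor ports are assumptions.

```shell
#!/bin/sh
# Assumption: STORM_HOME points at the Storm root folder.
STORM_HOME="${STORM_HOME:-$HOME/storm}"
# Assumption: any existing, initially empty directory can serve as storm.local.dir.
STORM_LOCAL_DIR="${STORM_LOCAL_DIR:-$HOME/storm-local}"

# Make sure the conf folder, the local dir and ~/.storm exist.
mkdir -p "$STORM_HOME/conf" "$STORM_LOCAL_DIR" "$HOME/.storm"

# Zookeeper, nimbus and the supervisor all run on localhost here.
cat > "$STORM_HOME/conf/storm.yaml" <<EOF
storm.zookeeper.servers:
    - "localhost"
nimbus.host: "localhost"
storm.local.dir: "$STORM_LOCAL_DIR"
supervisor.slots.ports:
    - 6700
    - 6701
EOF

# The Storm client reads its configuration from ~/.storm/storm.yaml.
cp "$STORM_HOME/conf/storm.yaml" "$HOME/.storm/storm.yaml"
```

Each port listed under supervisor.slots.ports allows one worker process on the machine, so two ports are usually plenty for a single-machine setup.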
Now we will see how to install Storm native dependencies.
Install native dependencies:
Download and installation commands for ZeroMQ 2.1.7:
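Obtain the ZeroMQ 2.1.7 source archive (zeromq-2.1.7.tar.gz), then unpack, build, and install it using the following commands.
tar -xzf zeromq-2.1.7.tar.gz
cd zeromq-2.1.7
./configure && make
sudo make install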
Download and installation commands for JZMQ:
Obtain JZMQ, then build and install it using the following commands.
git clone https://github.com/nathanmarz/jzmq.git
cd jzmq
./autogen.sh && ./configure && make
sudo make install
Now we will see how to start Storm Cluster and submit the topology using the Storm Client.
First start the Zookeeper cluster.
To start the Zookeeper server, go to the ‘bin’ directory of the Zookeeper installation and execute the following command.
sh zkServer.sh start
Second, start the Storm cluster by starting the master and worker nodes.
Start the master node, i.e. nimbus.
To start nimbus, go to the ‘bin’ directory of the Storm installation and execute the following command in a separate command line window.
storm nimbus
Start the worker node, i.e. supervisor.
To start the supervisor, go to the ‘bin’ directory of the Storm installation and execute the following command in a separate command line window.
storm supervisor
Third, upload the topology using the Storm client.
To upload a topology to the Storm cluster, go to the ‘bin’ directory of the Storm installation and execute the following command in a separate command line window.
storm jar <path-to-topology-jar> <class-with-the-main> <arg1> <arg2> <argN>
<path-to-topology-jar>: the complete path to the compiled jar that contains your topology code and all of its libraries.
<class-with-the-main>: the class in the jar file whose main method calls StormSubmitter.
<arg1> <arg2> <argN>: the remaining arguments are passed as parameters to that main method.
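To make the placeholders concrete, here is a hedged sketch of a submission script. The jar path, the class name, and the topology name are all hypothetical values invented for illustration, and the script assumes the ‘storm’ command is on your PATH; if it is not, or the jar does not exist yet, it only prints the command it would have run.

```shell
#!/bin/sh
# All three values are hypothetical placeholders; substitute your own.
TOPOLOGY_JAR="${TOPOLOGY_JAR:-$HOME/wordcount/target/wordcount.jar}"
MAIN_CLASS="${MAIN_CLASS:-com.example.WordCountTopology}"
TOPOLOGY_NAME="${TOPOLOGY_NAME:-wordcount}"

# Submit only when the storm client and the jar are both available.
if command -v storm >/dev/null 2>&1 && [ -f "$TOPOLOGY_JAR" ]; then
    # The topology name is passed through as <arg1> to the main method.
    storm jar "$TOPOLOGY_JAR" "$MAIN_CLASS" "$TOPOLOGY_NAME"
else
    echo "storm client or jar not found; would run:"
    echo "storm jar $TOPOLOGY_JAR $MAIN_CLASS $TOPOLOGY_NAME"
fi
```

Passing the topology name as the first argument is a common convention rather than a requirement; what the arguments mean is up to your main method. A running topology can later be stopped with ‘storm kill’ followed by the name it was submitted under.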