
Executing a sample Storm topology – local mode
This section assumes that you have gone through the prerequisites and installed the required components.
WordCount topology from the Storm-starter project
To understand the components described in the previous section, let's download the Storm-starter project and execute a sample topology:
- The Storm-starter project can be downloaded using the following Git command:
Linux-command-Prompt $ sudo git clone git://github.com/apache/incubator-storm.git && cd incubator-storm/examples/storm-starter
- Next, you need to import the project into your Eclipse workspace:
- Start Eclipse.
- Click on the File menu and select the Import wizard.
- From the Import wizard, select Existing Maven Projects.
- Select pom.xml in the Storm-starter project and specify it as <download-folder>/starter/incubator-storm/examples/storm-starter.
- Once the project has been successfully imported, the Eclipse folder structure will look like the following screenshot:
- Execute the topology using the run command and you should be able to see the output as shown in the following screenshot:
To understand the functioning of the topology, let's take a look at the code and understand the flow and functioning of each component in the topology:
// Instantiates the new builder object
TopologyBuilder builder = new TopologyBuilder();
// Adds a new spout of type "RandomSentenceSpout" with a parallelism hint of 5
builder.setSpout("spout", new RandomSentenceSpout(), 5);
Starting with the main function in the WordCountTopology.java class, we find the TopologyBuilder object called builder. This is important to understand because this class provides the template for defining the topology: it exposes the API to configure and wire various spouts and bolts into a topology, which is ultimately a Thrift structure.
In the preceding code snippet, we created a TopologyBuilder object and used the template to perform the following:
- setSpout – RandomSentenceSpout: This generates random sentences. Note that we are using a property called parallelism hint, which is set to 5 here. This property identifies how many instances of this component will be spawned when the topology is submitted. In our example, we will have five instances of the spout.
- setBolt: We use this method to add two bolts to the topology: SplitSentenceBolt, which splits the sentence into words, and WordCountBolt, which counts the words.
- Other noteworthy items in the preceding code snippet are shuffleGrouping and fieldsGrouping; we shall discuss these in detail in the next chapter. For now, understand that these are the components that control the routing of tuples to the various bolts in the topology.
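To build some intuition for the two groupings before the next chapter, the following is a minimal, self-contained sketch in plain Java. It is a toy model and does not use the Storm API: the class GroupingSketch and all of its methods are hypothetical names introduced only for illustration, round-robin stands in for Storm's randomized shuffle, and hash-modulo routing is an assumed simplification of how a fields grouping keeps equal values together. The numTasks argument plays the role of the parallelism hint.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of tuple routing between components; not Storm's implementation. */
final class GroupingSketch {

    /** Shuffle-style routing: spread tuples evenly, ignoring their content. */
    static int shuffleTask(int tupleIndex, int numTasks) {
        // Round-robin is an assumed stand-in for randomized shuffling.
        return tupleIndex % numTasks;
    }

    /** Fields-style routing: equal field values always map to the same task. */
    static int fieldsTask(String fieldValue, int numTasks) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(fieldValue.hashCode(), numTasks);
    }

    /** What a sentence-splitting bolt does: one sentence in, many words out. */
    static List<String> split(String sentence) {
        return Arrays.asList(sentence.trim().split("\\s+"));
    }

    /** Routes each word to a counting task by its value; returns one count table per task. */
    static List<Map<String, Integer>> countByTask(List<String> sentences, int numTasks) {
        List<Map<String, Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            tasks.add(new HashMap<>());
        }
        for (String sentence : sentences) {
            for (String word : split(sentence)) {
                // Because routing depends only on the word, each word is counted in one place.
                tasks.get(fieldsTask(word, numTasks)).merge(word, 1, Integer::sum);
            }
        }
        return tasks;
    }
}
```

Calling GroupingSketch.countByTask with a few sentences and numTasks set to 3 yields three count tables in which any given word appears in exactly one table. This is why routing by field value suits the counting step: if words were spread with shuffleTask instead, the count for a single word could end up split across several tasks.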