Apache Spark | Run first spark program

In this post we will run a sample Spark program. If you want to set up Spark locally, go to this post. In the last post we created a worker (slave) node with 2 cores and 2 GB of memory. The master uses this worker node to run your Spark jobs.

Spark installations provide some sample programs to run on a cluster. They can be found at the location below.
We will run the JavaWordCount program locally on the Spark cluster.


Spark programs can be submitted to a cluster using the ./bin/spark-submit script. More information is available on the official website.

Place a sample words file in any directory (/opt/spark/spark-poc/words.txt for this example). Now run the command below.
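A minimal sketch of such a command, wrapped in a small helper so it can be reviewed before it is run. The install directory (/opt/spark), the master URL (spark://localhost:7077, where 7077 is the default standalone master port) and the jar file name are assumptions for illustration. In recent distributions the examples jar sits under examples/jars and its real name includes the Scala and Spark version numbers, so adjust APP_JAR to match your installation. The helper name submit_wordcount and the DRY_RUN switch are hypothetical and are not part of Spark.

```shell
#!/bin/sh
# Sketch only: the values below are assumptions, adjust them to your setup.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"                # assumed install directory
MASTER_URL="${MASTER_URL:-spark://localhost:7077}"    # assumed standalone master URL
MAIN_CLASS="org.apache.spark.examples.JavaWordCount"  # main class of the sample program
# Hypothetical jar name; the real file name carries Scala and Spark versions.
APP_JAR="${APP_JAR:-$SPARK_HOME/examples/jars/spark-examples.jar}"
INPUT_FILE="/opt/spark/spark-poc/words.txt"           # sample input file from this post

# Prints the spark-submit command when DRY_RUN=1 (the default here)
# and executes it when DRY_RUN=0.
submit_wordcount() {
  set -- "$SPARK_HOME/bin/spark-submit" \
    --class "$MAIN_CLASS" \
    --master "$MASTER_URL" \
    "$APP_JAR" \
    "$INPUT_FILE"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

submit_wordcount
```

Once the printed command looks right for your installation, set DRY_RUN=0 in the environment and run the script again to actually submit the job.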

As you can see, spark-submit takes the following arguments.

--class [main-class of spark program]
--master [master node url]
[app-jar]
[app command-line arguments]

Check the Spark master UI (served on port 8080 by default) to see the status of the Spark program.
