
Showing posts from August, 2018

Apache Spark | Run first spark program

In this post we are going to run a sample Spark program. If you want to set up Spark locally, go to this post. In the last post we created a worker (slave) node with 2 cores and 2 GB of memory. The master will use this worker node to run your Spark jobs. The Spark installation ships with some sample programs that can be run on a cluster. These can be found at the location below. We will run the JavaWordCount program locally on the Spark cluster.
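For readers who have not seen a word count job before, the plain-Python sketch below shows the kind of result such a program produces: one count per distinct word. It does not use Spark at all. The function name `word_count` and the sample lines are illustrative assumptions, and the bundled JavaWordCount example may tokenize its input differently.

```python
from collections import Counter


def word_count(lines):
    """Return a dict mapping each word to the number of times it appears.

    Hypothetical helper for illustration only; this is not Spark code.
    """
    counts = Counter()
    for line in lines:
        # str.split() with no argument splits on any whitespace
        # and drops empty tokens.
        counts.update(line.split())
    return dict(counts)


if __name__ == "__main__":
    sample = ["to be or not to be", "that is the question"]
    for word, n in sorted(word_count(sample).items()):
        print(f"{word}: {n}")
```

On a cluster, Spark distributes the same idea: each worker counts the words in its share of the input, and the partial counts are then combined per word.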

Apache Spark | Setup Spark in local

In this post we are going to set up Apache Spark on an Ubuntu machine. Apache Spark is a high-performance engine for big data processing, covering both batch and streaming workloads. The Spark project claims speeds of up to 100x over Hadoop MapReduce for in-memory workloads. Spark provides APIs for Java, Scala, Python, and R. Step 1: Download the Spark release from the website. This will download spark-2.3.1-bin-hadoop2.7.tgz to your local machine. Extract the files from the archive as below.
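The extraction command itself is not shown in this excerpt. As one possible way to do it, the sketch below unpacks a gzip-compressed tar archive with Python's standard `tarfile` module. The helper name `extract_archive` and the destination directory are assumptions made for illustration, not the steps from the original post.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tgz archive into dest_dir and return the top-level names.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" opens a gzip-compressed tar archive for reading.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


# Example call, assuming the archive sits in the current directory:
# extract_archive("spark-2.3.1-bin-hadoop2.7.tgz", "spark")
```

Only extract archives from a source you trust, since a tar file can contain entries that write outside the destination directory on older Python versions.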