Different Modes of Submitting a Spark Job on YARN
If you run Spark on YARN, you have probably noticed that a job can be run on the cluster in more than one way. This post explores those options. A Spark job can be submitted to YARN in two modes:

1. yarn-client
2. yarn-cluster

Yarn-Client : When a Spark job is submitted to YARN in yarn-client mode, the Spark driver runs on the client machine, that is, the machine from which you ran spark-submit. The driver is the controller of the job: it schedules tasks and coordinates the executors, while the executors themselves still run in YARN containers on the cluster. Because the driver runs locally, the job's logs appear on the client machine's console while the job runs. The spark-submit process stays in the foreground, so that terminal session is occupied until the job completes. You can submit a job in this mode as follows:

spark-submit --master yarn-client --class com.mycomp.Example ./test-code.jar

Here --master yarn-client is the important parameter that needs to be added, and --class names the application's main class. On Spark 2.0 and later the yarn-client master is deprecated, and the equivalent is --master yarn --deploy-mode client. You can also use nohup and divert all logs to a log file, which frees the terminal while the job still runs in client mode.

Yarn-Cluster : Yarn-cluster...