Zeppelin access to remote Spark&Yarn cluster

MichaelYoung
Dear all,

My question may already have been answered, but after searching the mailing list I still could not find a step-by-step guide on how to make Zeppelin access a remote Spark and YARN cluster.

My request is:
Spark and YARN are installed and deployed on a remote cluster in a data center, and I want to install Zeppelin on my Mac laptop. I'd like to configure and run Spark jobs from Zeppelin on my laptop. Is this already supported? If so, how should I configure it?

Thanks so much in advance!

Re: Zeppelin access to remote Spark&Yarn cluster

moon
Administrator
Hi,

The easiest way is to install Spark on your laptop and configure it to use the remote YARN cluster in your data center. Then verify the setup with bin/spark-shell.
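
For the configuration step, Spark's YARN client finds the cluster through the Hadoop client configuration directory named by HADOOP_CONF_DIR or YARN_CONF_DIR. Below is a minimal sketch of a pre-flight check to run before bin/spark-shell. The helper name and the list of expected files are assumptions for illustration; your cluster may need additional files (for example hdfs-site.xml).

```python
import os
from pathlib import Path

# Files assumed to be needed on the laptop; copy them from the cluster.
EXPECTED_FILES = ("core-site.xml", "yarn-site.xml")


def missing_hadoop_conf(env=None):
    """Return the expected config files that are absent from the conf dir.

    Looks at HADOOP_CONF_DIR first, then YARN_CONF_DIR. If neither is set,
    every expected file is reported as missing.
    """
    env = os.environ if env is None else env
    conf_dir = env.get("HADOOP_CONF_DIR") or env.get("YARN_CONF_DIR")
    if not conf_dir:
        return list(EXPECTED_FILES)
    return [name for name in EXPECTED_FILES
            if not (Path(conf_dir) / name).is_file()]


if __name__ == "__main__":
    missing = missing_hadoop_conf()
    print("missing:", missing if missing else "none")
```

If the check reports nothing missing and bin/spark-shell still cannot reach the ResourceManager, the addresses inside yarn-site.xml are the next thing to look at.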

Then install Zeppelin and export the SPARK_HOME environment variable so that it points to your Spark installation. You'll also need to set the 'master' property to yarn-client in the Zeppelin GUI (interpreter setting page).
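
As a small illustration of the environment setup, the sketch below renders the export lines you might append to conf/zeppelin-env.sh. The function name and both paths are placeholders, not part of Zeppelin. Note also that from Spark 2.0 onward the yarn-client master value is deprecated in favour of master yarn with client deploy mode, so the exact value to enter may depend on your Spark version.

```python
import shlex


def zeppelin_env_lines(spark_home, hadoop_conf_dir):
    """Render export lines suitable for appending to conf/zeppelin-env.sh."""
    return [
        "export SPARK_HOME=" + shlex.quote(spark_home),
        "export HADOOP_CONF_DIR=" + shlex.quote(hadoop_conf_dir),
    ]


if __name__ == "__main__":
    # Both paths are placeholders; substitute your own locations.
    for line in zeppelin_env_lines("/opt/spark", "/opt/hadoop-conf"):
        print(line)
```

Quoting with shlex.quote keeps the lines valid even if a path contains spaces, which is common on a Mac.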

That's it.

One thing to consider: your Spark workers will connect to your laptop to load compiled classes from the Scala REPL. So you'll need to make sure your network/firewall configuration allows every node in your cluster to reach your laptop. Using a VPN would be one way to do this securely.
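
One way to test this network requirement is to run a small TCP probe from a cluster node against the laptop. The sketch below assumes you have pinned the driver's ports (Spark exposes spark.driver.port and spark.blockManager.port for this; both are random by default, and other ports may be involved depending on the Spark version). The host name and port numbers are placeholders.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder values: the laptop's address as seen from the cluster,
    # and the ports fixed via spark.driver.port / spark.blockManager.port.
    laptop = "laptop.example.com"
    for port in (40000, 40001):
        print(port, "reachable" if can_connect(laptop, port) else "blocked")
```

The probe only succeeds while something is listening on the laptop, so run it while a Spark shell or Zeppelin Spark interpreter is active.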

Alternatively, install a Livy server on your cluster and use the Livy interpreter [1] to connect. Then the nodes in the cluster don't need to connect to your laptop.
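
As a rough illustration of why Livy removes that requirement: the laptop only talks to Livy's REST endpoint over HTTP, while the Spark driver runs on the cluster side. The sketch below builds (but does not send) the request that creates an interactive session via POST /sessions. The server URL is a placeholder, and 8998 is Livy's default port. In Zeppelin, the Livy interpreter handles this for you once its zeppelin.livy.url property points at the server.

```python
import json
import urllib.request


def build_session_request(livy_url, kind="spark"):
    """Build the POST /sessions request that asks Livy for a new session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        livy_url.rstrip("/") + "/sessions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host; replace with your Livy server's address.
    req = build_session_request("http://livy.example.com:8998")
    print(req.get_method(), req.full_url)
    # To actually create the session: urllib.request.urlopen(req)
```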

Best,
moon


Re: Zeppelin access to remote Spark&Yarn cluster

Vinay Shukla
Michael,

Besides what Moon suggested, the next thing to consider is security. If the remote cluster in the data center has Kerberos authentication enabled, you will need to configure Zeppelin and the Livy interpreter for Kerberos and identity propagation so that the job runs as the end user.

Thx,
Vinay

 
