Newbie question... switching between local and external spark engines within the notebook?


Newbie question... switching between local and external spark engines within the notebook?

Michael Segel
Hi,

I know you can configure Zeppelin to either use a local copy of Spark or point to the IP address of a Spark master in the configuration files.

I was wondering whether you could do this from within the notebook.

That way I could use the same notebook to run one paragraph on the local machine, have the next paragraph run on a different cluster, and maybe a third run elsewhere too.


A related question: I have a paragraph that handles all of the dependencies for setting up my notebook (%spark.dep).
Since this has to run before the Spark engine/interpreter starts, is there a way to shut down Spark, run this paragraph, and restart Spark from within the notebook?
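
For reference, the dependency paragraph looks roughly like this (the artifact coordinate below is just a placeholder):

  %spark.dep
  // clear any previously loaded dependencies, then load an artifact from Maven Central
  z.reset()
  z.load("org.apache.commons:commons-csv:1.5")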

TIA

-Mike

Re: Newbie question... switching between local and external spark engines within the notebook?

moon
Administrator
Hi Mike,

You can always create multiple interpreter settings in the Interpreter menu,
for example 'spark-local' and 'spark-cluster'.

You can then bind multiple interpreter settings in the interpreter binding menu of each notebook.

Then you can write one paragraph with '%spark-local' and the next paragraph with '%spark-cluster'.
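
For example, assuming the two settings above are bound to the notebook, something like this should work (sc.master is just a quick way to check which master each paragraph is talking to):

  %spark-local
  // runs on the local Spark instance
  sc.master    // e.g. "local[*]"

  %spark-cluster
  // runs against the external cluster
  sc.master    // e.g. "spark://master-host:7077"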

Currently, there's no official way to shut down the Spark interpreter from inside the notebook. One possible workaround is to send a REST API request from inside the notebook (using Python or Scala, etc.) to restart the interpreter.
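
As a rough Scala sketch (assuming Zeppelin runs on localhost:8080, and the setting id below is made up; you can look up the real id of your Spark interpreter setting via GET /api/interpreter/setting):

  import java.net.{HttpURLConnection, URL}

  // id of the interpreter setting to restart -- look yours up first, this one is a placeholder
  val settingId = "2CKEKWY8Z"
  val url = new URL(s"http://localhost:8080/api/interpreter/setting/restart/$settingId")
  val conn = url.openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("PUT")
  println(s"restart returned HTTP ${conn.getResponseCode}")
  conn.disconnect()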


Thanks,
moon
