Spark context configs

Spark context configs

Jeff Steinmetz

What is the approach to set custom properties dynamically for the conf within a notebook?

I understand Zeppelin already sets up a SparkContext for you. Once you have a SparkContext, it can't be changed.

I may be missing something that the documentation doesn't mention. Perhaps it is something to do with %spark.

Specifically, a native Scala Elasticsearch Spark source would look like:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("sampleapp").setMaster("local")
conf.set("es.index.auto.create", "true")
conf.set("es.nodes", "192.168.51.50")
val sc = new SparkContext(conf)

There may be times when you want to connect to a different IP for a different cluster.


Re: Spark context configs

Alexander Bezzubov-2
Hi,

thank you for your interest and welcome to Zeppelin community!

The simplest way to set these properties for Spark is through Interpreter properties.

In "Interpreters" menu pick one that is bided to you notebook (spark or spark-local, or, create a new one) and set them w/ '+' sign. Interpreter will be restarted automatically on 'save' of these settings.

This way the lazily auto-injected context `sc` will have all these properties set for you on the first run of the bound notebook.
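As a quick sanity check, a minimal sketch (using the property names from your example; nothing Zeppelin-specific beyond the injected `sc`) is to read the settings back in a notebook paragraph:

```
%spark
// Read back properties that the interpreter settings should have injected.
// "es.nodes" and the "es." prefix are just the example keys from the question above.
println(sc.getConf.getOption("es.nodes").getOrElse("not set"))
sc.getConf.getAll.filter(_._1.startsWith("es.")).foreach(println)
```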

You are right: once configured, a SparkContext cannot be changed, so for different clusters the best way may be either to have different interpreters (contexts), or, if you are using the awesome elasticsearch-hadoop, something like this would do the job:

```
import org.elasticsearch.spark._
import org.elasticsearch.hadoop.cfg.ConfigurationOptions

// Node address decided at run time, e.g. per cluster.
var esAddr = ""

// `out` is an existing RDD being written to Elasticsearch; the connection
// settings travel with the write, not with the shared SparkContext.
out.saveToEs(Map(
  ConfigurationOptions.ES_NODES -> esAddr,
  ConfigurationOptions.ES_RESOURCE -> ....  // target index/type
))
```
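As a rough sketch of how that keeps one shared SparkContext while targeting several clusters (the RDD contents, addresses, and index/type below are made-up placeholders), the per-write config can simply vary the node address:

```
import org.elasticsearch.spark._
import org.elasticsearch.hadoop.cfg.ConfigurationOptions

// Placeholder data; in practice this would be whatever RDD you want to index.
val docs = sc.makeRDD(Seq(Map("id" -> 1, "msg" -> "hello")))

// Write the same RDD to two different clusters without touching the SparkContext.
for (addr <- Seq("192.168.51.50", "192.168.51.60")) {
  docs.saveToEs(Map(
    ConfigurationOptions.ES_NODES    -> addr,
    ConfigurationOptions.ES_RESOURCE -> "sample/docs"  // placeholder index/type
  ))
}
```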

Alex.
