Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Sourav Mazumder
Hi,

When I try to run the Spark interpreter in yarn-cluster mode from a remote machine, I always get an error saying to use spark-submit rather than SparkContext.

My Zeppelin process runs on a separate machine, remote to the YARN cluster.

Any idea why this error occurs?

Regards,
Sourav

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

moon
Administrator
Which version of Zeppelin are you using?

The master branch uses the spark-submit command when SPARK_HOME is defined in conf/zeppelin-env.sh.

If you're not on the master branch, I recommend trying it with SPARK_HOME defined.
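
For example, a minimal conf/zeppelin-env.sh could contain just the line below (the path is a placeholder; point it at your actual Spark installation):

export SPARK_HOME=/path/to/spark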

Hope this helps,
moon

Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Sourav Mazumder
Hi Moon,

I'm using 0.6-SNAPSHOT, which I built from the latest GitHub source.

I tried setting SPARK_HOME in zeppelin-env.sh. By adding some debug statements, I could also see that control reaches the appropriate if-else branch in interpreter.sh.
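
For context, the branch I mean behaves roughly like the sketch below (a paraphrase of what interpreter.sh does, not the literal script; RUNNER is my own name for the variable):

if [[ -n "${SPARK_HOME}" ]]; then
  # SPARK_HOME set: launch the interpreter through spark-submit
  RUNNER="${SPARK_HOME}/bin/spark-submit"
else
  # no SPARK_HOME: fall back to launching a plain JVM
  RUNNER="java"
fi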

But I still get the same error, as follows:

org.apache.spark.SparkException: Detected yarn-cluster mode, but isn't running on a cluster. Deployment to YARN is not supported directly by SparkContext. Please use spark-submit.
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:378)
	at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:339)
	at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:149)
	at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:465)
	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.open(ClassloaderInterpreter.java:74)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:68)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:92)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:276)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:170)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:118)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Let me know if you need any other details to figure out what is going on.

Regards,
Sourav


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

ÐΞ€ρ@Ҝ (๏̯͡๏)
Can you share a screenshot of your Spark interpreter settings from the Zeppelin web interface?

I have the exact same deployment structure, and it runs fine with the right set of configurations.

--
Deepak


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Sourav Mazumder
Hi Deepu,

Here you go.

Regards,
Sourav



 
Properties

name                           value
args
master                         yarn-cluster
spark.app.name                 Zeppelin
spark.cores.max
spark.executor.memory          512m
spark.home
spark.yarn.jar                 /usr/iop/current/spark-thriftserver/lib/spark-assembly.jar
zeppelin.dep.localrepo         local-repo
zeppelin.pyspark.python        python
zeppelin.spark.concurrentSQL   false
zeppelin.spark.maxResult       1000
zeppelin.spark.useHiveContext  true



Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

ÐΞ€ρ@Ҝ (๏̯͡๏)
Is Spark installed on your Zeppelin machine?

I would try these:

master      yarn-client
spark.home  the Spark installation home directory on your Zeppelin server

Looking at spark.yarn.jar, I see Spark is installed at /usr/iop/current/spark-thriftserver/. But why is it thriftserver? (I do not know what that is.)

I have Spark installed (unzipped) on the Zeppelin machine at /usr/hdp/2.3.1.0-2574/spark/spark/ (it can be any location), and spark.yarn.jar set to /usr/hdp/2.3.1.0-2574/spark/spark/lib/spark-assembly-1.4.1-hadoop2.6.0.jar.
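
On my machine the corresponding zeppelin-env.sh entry is simply the line below (the path is from my HDP install; yours will differ):

export SPARK_HOME=/usr/hdp/2.3.1.0-2574/spark/spark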





--
Deepak


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Sourav Mazumder
Yes, Spark is installed on the machine where Zeppelin is running.

The location of spark.yarn.jar is very similar to yours. I'm using IOP as the distribution, and the directory naming convention is specific to IOP, which differs from HDP.

And yes, the setup works perfectly fine when I set master to yarn-client with the same settings for SPARK_HOME, HADOOP_CONF_DIR, and HADOOP_CLIENT.

Regards,
Sourav




Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

ÐΞ€ρ@Ҝ (๏̯͡๏)
Do you have these settings configured in zeppelin-env.sh?

export JAVA_HOME=/usr/src/jdk1.7.0_79/

export HADOOP_CONF_DIR=/etc/hadoop/conf

Most likely you do, since you're able to run with yarn-client.

It looks like the issue is not being able to run the driver program on the cluster.
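
If an application does get submitted to YARN before failing, it's worth pulling the driver-side logs with something like the following (the application id comes from the ResourceManager UI):

yarn logs -applicationId <application_id>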


--
Deepak


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Sourav Mazumder
Yes, I have them set up appropriately.

Where I'm lost is that I can see the interpreter is running spark-submit, but at some point it switches to creating a SparkContext.

Maybe, as you rightly mentioned, some permission issue prevents it from running the driver on the YARN cluster. But I'm not able to figure out what that issue or required configuration is.

Regards,
Sourav




Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

ÐΞ€ρ@Ҝ (๏̯͡๏)
Did you try a test job with yarn-cluster (outside Zeppelin)?

--
Deepak


Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

Sourav Mazumder
I could execute the following without any issue:

spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples.jar 10

Regards,
Sourav




Re: Running Zeppelin remotely to submit Spark job in Yarn Cluster mode

moon
Administrator
If you're using the master branch, I recommend exporting only SPARK_HOME in conf/zeppelin-env.sh.
Then Zeppelin will use the spark-submit command to run the Spark interpreter, and that is supposed to work exactly the same as when you run a job with spark-submit directly.
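
One way to confirm that Zeppelin is really going through spark-submit is to look at the interpreter process after running a paragraph; something like the command below should show a spark-submit-launched process rather than a plain java one:

ps aux | grep spark-submit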

Thanks,
moon