org.apache.spark.SparkException: Could not parse Master URL: 'yarn'


org.apache.spark.SparkException: Could not parse Master URL: 'yarn'

Ben Vogan
Hello all,

I am trying to install Zeppelin 0.7.1 on my CDH 5.7 Cluster.  I have been following the instructions here:


I copied the zeppelin-env.sh.template into zeppelin-env.sh and made the following changes:
export JAVA_HOME=/usr/java/latest
export MASTER=yarn-client

export ZEPPELIN_LOG_DIR=/var/log/services/zeppelin
export ZEPPELIN_PID_DIR=/services/zeppelin/data
export ZEPPELIN_WAR_TEMPDIR=/services/zeppelin/data/jetty_tmp
export ZEPPELIN_NOTEBOOK_DIR=/services/zeppelin/data/notebooks
export ZEPPELIN_NOTEBOOK_PUBLIC=true

export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export HADOOP_CONF_DIR=/etc/spark/conf/yarn-conf
export PYSPARK_PYTHON=/usr/lib/python

I then start Zeppelin and hit the UI in my browser and create a spark note:

%spark
sqlContext.sql("select 1+1").collect().foreach(println)

And I get this error:

org.apache.spark.SparkException: Could not parse Master URL: 'yarn'
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2746)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:533)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext_1(SparkInterpreter.java:484)
at org.apache.zeppelin.spark.SparkInterpreter.createSparkContext(SparkInterpreter.java:382)
at org.apache.zeppelin.spark.SparkInterpreter.getSparkContext(SparkInterpreter.java:146)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:828)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:483)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

I specified "yarn-client" as indicated by the instructions, so I'm not sure where "yarn" is coming from.  My spark-defaults.conf sets spark.master=yarn-client as well.
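The master strings a Spark 1.x context accepts can be mimicked with a simple pattern check (illustrative only, and not exhaustive; the real parsing lives in SparkContext.createTaskScheduler). A bare "yarn" is only understood by Spark 2.x, which is why 1.x throws the parse error:

```shell
# Sketch: which master strings a Spark 1.x SparkContext would accept.
is_valid_spark1_master() {
  case "$1" in
    local|local\[*\]|spark://*|mesos://*|yarn-client|yarn-cluster) echo valid ;;
    *) echo invalid ;;
  esac
}
is_valid_spark1_master yarn          # → invalid
is_valid_spark1_master yarn-client   # → valid
```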

Help would be greatly appreciated.

Thanks,
--
BENJAMIN VOGAN | Data Platform Team Lead


Re: org.apache.spark.SparkException: Could not parse Master URL: 'yarn'

Chaoran Yu
I suspect this is due to not setting SPARK_EXECUTOR_URI.

I've run Zeppelin with Spark on Mesos and hit a similar exception where Zeppelin was not able to parse the master URL, which is “mesos://leader.mesos:5050” in my case. It turned out my SPARK_EXECUTOR_URI pointed at a Spark distribution that was not built for Mesos.

After changing it to a Mesos-compatible build, the exception was gone.

In your case, you might want to look at this page: http://archive-primary.cloudera.com/cdh5/cdh/5/
So I guess something like http://archive-primary.cloudera.com/cdh5/cdh/5/spark-1.6.0-cdh5.7.6.tar.gz should work as a value for SPARK_EXECUTOR_URI.
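If you go that route, the setting would live alongside your other exports in zeppelin-env.sh. A minimal sketch, not verified on CDH, and the exact tarball name is only my guess from that archive listing:

```shell
# Point Zeppelin's Spark executors at a CDH-built distribution.
# Pick the tarball that actually matches your CDH version.
export SPARK_EXECUTOR_URI=http://archive-primary.cloudera.com/cdh5/cdh/5/spark-1.6.0-cdh5.7.6.tar.gz
```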

--
Chaoran Yu

On Apr 12, 2017, at 4:16 PM, Ben Vogan <[hidden email]> wrote:


Re: org.apache.spark.SparkException: Could not parse Master URL: 'yarn'

Ben Vogan
I discovered that interpreter.json had "master": "yarn", and that setting takes precedence over what is in the zeppelin-env.sh file.  Changing it to yarn-client resolved my issue.
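For anyone else hitting this, a minimal sketch of the fix. The JSON here is a stand-in for the real file, which lives under Zeppelin's conf directory (not /tmp), and the sed expression assumes this exact layout; restart the Spark interpreter afterwards:

```shell
# Stand-in for conf/interpreter.json: the spark interpreter's "master" property.
cat > /tmp/interpreter-snippet.json <<'EOF'
{"properties": {"master": "yarn"}}
EOF
# Change the value to yarn-client.
sed -i 's/"master": "yarn"/"master": "yarn-client"/' /tmp/interpreter-snippet.json
grep '"master"' /tmp/interpreter-snippet.json   # → {"properties": {"master": "yarn-client"}}
```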

--Ben

On Wed, Apr 12, 2017 at 2:39 PM, Chaoran Yu <[hidden email]> wrote: