How do I configure R interpreter in Zeppelin?

Shan Potti
Hello Group!

I'm trying to leverage various R functions in Zeppelin, but I'm having trouble figuring out how to configure the Spark interpreter and the SPARK_HOME variable.

I'm going by this documentation for now, and specifically have issues with the following steps:

  1. To run R code and visualize plots in Apache Zeppelin, you will need R on your master node (or your dev laptop).

    For CentOS: yum install R R-devel libcurl-devel openssl-devel
    For Ubuntu: apt-get install r-base

How do I figure out the master node and install the R interpreter? Novice user here.


2. To run Zeppelin with the R Interpreter, the SPARK_HOME environment variable must be set. The best way to do this is by editing conf/zeppelin-env.sh. If it is not set, the R Interpreter will not be able to interface with Spark. You should also copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml. That will ensure that Zeppelin sees the R Interpreter the first time it starts up.

I have no idea how to do step 2 either.

I'd appreciate your help. If there is a video you can point me to that walks through these steps, that would be fantabulous.

Thanks! Shan


--
Shan S. Potti,

Re: How do I configure R interpreter in Zeppelin?

moon
Administrator
If you don't have a Spark cluster, you don't need to do step 2.
After step 1, the %spark.r interpreter should work.

If you do have a Spark cluster, export the SPARK_HOME environment variable in conf/zeppelin-env.sh; that should be enough to make it work.
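
For example, a minimal sketch of that edit (the Spark path below is only an example; point it at wherever Spark is installed on the host running Zeppelin):

    # conf/zeppelin-env.sh  (copy conf/zeppelin-env.sh.template to this name if it doesn't exist yet)
    export SPARK_HOME=/usr/lib/spark    # example path; adjust to your installation

The documentation step you quoted also suggests copying the site config once, roughly:

    cp conf/zeppelin-site.xml.template conf/zeppelin-site.xml

Restart Zeppelin afterwards so the new environment is picked up.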

Hope this helps.

Thanks,
moon


Re: How do I configure R interpreter in Zeppelin?

Shan Potti
Hi Moon,

Thanks for responding. Exporting SPARK_HOME is exactly where I have a problem. I'm using a Zeppelin notebook with Spark on EMR clusters in an AWS account in the cloud. I'm not the master account holder for that AWS account; I'm guessing I have a client account with limited access. Can I still do it?

If yes, can you explain where and how I should do the shell scripting to export the variable? Can I do this in the notebook itself by starting a paragraph with %sh, or do I need to do something else?
If you can share any video, that would be great. I should mention that I'm a novice user just getting started with Big Data.

Sharing more info for better context.

Here's my AWS account detail type:

[inline screenshot: AWS account detail type]

Thanks for your help.

Shan

Re: How do I configure R interpreter in Zeppelin?

moon
Administrator
AFAIK, the Amazon EMR service has an option that launches Zeppelin (preconfigured) with Spark. Are you using the Zeppelin provided by EMR, or are you setting up Zeppelin separately?

Thanks,
moon

Re: How do I configure R interpreter in Zeppelin?

Shan Potti
I'm not 100% sure, as I haven't set it up myself, but it looks like I'm using Zeppelin preconfigured with Spark. I've also taken a snapshot of the Spark interpreter configuration that I have access to in Zeppelin. This interpreter comes with SQL and Python integration, and I'm trying to figure out how to use R with it.

Re: How do I configure R interpreter in Zeppelin?

moon
Administrator

The easiest way to figure out what your environment needs is:

1. Run SPARK_HOME/bin/sparkR in a shell on the same host where Zeppelin is going to run and make sure it works.
2. Try %spark.r in Zeppelin with SPARK_HOME configured. It should normally work if step 1 works without problems; otherwise, take a look at the error message and the error log for more information.
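
As a rough sketch of step 1 in a shell on that host (the Spark path here is just an example):

    export SPARK_HOME=/usr/lib/spark   # example; use the actual Spark install location
    $SPARK_HOME/bin/sparkR             # should start a SparkR shell; quit with q()

If that works, step 2 is just running a %spark.r paragraph in Zeppelin with the same SPARK_HOME set in conf/zeppelin-env.sh.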

Thanks,
moon

Re: How do I configure R interpreter in Zeppelin?

Ruslan Dautkhanov
Hi moon soo Lee,

Cloudera's Spark doesn't have $SPARK_HOME/bin/sparkR.
Would Zeppelin still enable its SparkR interpreter in that case?

Built Zeppelin using 

$ mvn clean package -DskipTests -Pspark-2.1 -Ppyspark -Dhadoop.version=2.6.0-cdh5.10.1 -Phadoop-2.6 -Pyarn -Pr -Pvendor-repo -Pscala-2.10 -pl '!...,!...' -e

. . .
[INFO] Zeppelin: R Interpreter ............................ SUCCESS [01:01 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:28 min

Nevertheless, none of the R-related interpreters show up.

This includes the latest Zeppelin snapshot and was the same on previous Zeppelin releases, so something must be missing on our side.

R is installed on the servers that run Zeppelin (and the Spark driver, since it runs in yarn-client mode).

I guess either the build options above are wrong or there is another dependency I missed.
conf/zeppelin-site.xml has the R-related interpreters listed [1], but none of them show up once Zeppelin starts up.

Any ideas?


Thank you,
Ruslan


[1]

<property>
  <name>zeppelin.interpreters</name>
  <value>org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.rinterpreter.RRepl,org.apache.zeppelin.rinterpreter.KnitR,org.apache.zeppelin.spark.SparkRInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.file.HDFSFileInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,,org.apache.zeppelin.python.PythonInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.jdbc.JDBCInterpreter,org.apache.zeppelin.kylin.KylinInterpreter,org.apache.zeppelin.elasticsearch.ElasticsearchInterpreter,org.apache.zeppelin.scalding.ScaldingInterpreter,org.apache.zeppelin.alluxio.AlluxioInterpreter,org.apache.zeppelin.hbase.HbaseInterpreter,org.apache.zeppelin.livy.LivySparkInterpreter,org.apache.zeppelin.livy.LivyPySparkInterpreter,org.apache.zeppelin.livy.LivySparkRInterpreter,org.apache.zeppelin.livy.LivySparkSQLInterpreter,org.apache.zeppelin.bigquery.BigQueryInterpreter</value>
  <description>Comma separated interpreter configurations. First interpreter become a default</description>
</property>




--
Ruslan Dautkhanov

Re: How do I configure R interpreter in Zeppelin?

moon
Administrator
Zeppelin includes two R interpreter implementations.

One used to be activated by -Psparkr, the other by -Pr.
Since https://github.com/apache/zeppelin/pull/2215, -Psparkr is activated by default. If you're trying to use SparkR, the -Psparkr implementation (activated by default on the master branch) is probably the one you're more interested in.

So you can just try using the %spark.r prefix.
Let me know if it works for you.
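
For reference, a sketch of the earlier build command with the sparkr profile in place of -Pr (the -pl module exclusions are omitted here, and on current master -Psparkr is already on by default, so listing it explicitly only matters on older branches):

    mvn clean package -DskipTests -Pspark-2.1 -Psparkr -Ppyspark \
        -Dhadoop.version=2.6.0-cdh5.10.1 -Phadoop-2.6 -Pyarn -Pvendor-repo -Pscala-2.10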

Thanks,
moon

Re: How do I configure R interpreter in Zeppelin?

Ruslan Dautkhanov
Thanks for the feedback.

%spark.r
print("Hello World!")
throws the exception shown in [2].

Understood; I'll try removing -Pr and rebuilding Zeppelin. Yep, I used a fresh master snapshot.
(I haven't seen anything in the Maven build logs that would indicate a problem around the R interpreter.)
I will update this email thread with the result after rebuilding Zeppelin without -Pr.


[2]

spark.r interpreter not found
org.apache.zeppelin.interpreter.InterpreterException: spark.r interpreter not found
    at org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:417)
    at org.apache.zeppelin.notebook.Note.run(Note.java:620)
    at org.apache.zeppelin.socket.NotebookServer.persistAndExecuteSingleParagraph(NotebookServer.java:1781)
    at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:1741)
    at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:288)
    at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
    at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)



--
Ruslan Dautkhanov

Re: How do I configure R interpreter in Zeppelin?

Ruslan Dautkhanov
All my attempts to enable the R/SparkR interpreters have successfully failed.

I tried both the sparkr and r profiles. Attached is one of the build logs.

What am I missing? It should be something simple.



--
Ruslan Dautkhanov

On Wed, Apr 26, 2017 at 3:05 PM, Ruslan Dautkhanov <[hidden email]> wrote:
Thanks for feedback.

%spark.r
print("Hello World!")
 throws exception [2].

Understood - I'll try to remove -Pr and rebuild Zeppelin. Yep, I used a fresh master snapshot.
( I have't seen anything in maven build logs that could indicate a problem around R interpreter)
Will update this email thread with result after rebuilding Zeppelin without -Pr 


[2]

<div class="m_-4977098973434043844gmail-tableDisplay m_-4977098973434043844gmail-ng-scope" src="https://ci3.googleusercontent.com/proxy/irzw3YOipoBAb07JrmBLtswn2pSGbEuKwpXfgtkPsB1YeTtjJjk79-E7vUb-SO4IDRbrjkrt32rNpiq_v8237-iHQto4uDN2aYc76Nu6Erqyo93OgRNUIg=s0-d-e1-ft#http://&#39;app/notebook/paragraph/result/result.html?v=1493186403588&#39;" style="box-sizing:border-box;margin-top:2px;color:rgb(33,33,33);font-family:&quot;helvetica neue&quot;,helvetica,arial,sans-serif;font-size:14px">
spark.r interpreter not found
org.apache.zeppelin.interpreter.InterpreterException: spark.r interpreter not found at org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:417) at org.apache.zeppelin.notebook.Note.run(Note.java:620) at org.apache.zeppelin.socket.NotebookServer.persistAndExecuteSingleParagraph(NotebookServer.java:1781) at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:1741) at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:288) at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59) at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128) at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69) at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65) at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)



--
Ruslan Dautkhanov

On Wed, Apr 26, 2017 at 2:13 PM, moon soo Lee <[hidden email]> wrote:
Zeppelin includes two R interpreter implementations.

One used to activated by -Psparkr the other -Pr.
Since https://github.com/apache/zeppelin/pull/2215, -Psparkr is activated by default. And if you're trying to use sparkR, -Psparkr (activated by default in master branch) is implementation you might be more interested.

So you can just try use with %spark.r prefix. 
Let me know if it works for you.

Thanks,
moon

On Wed, Apr 26, 2017 at 12:11 AM Ruslan Dautkhanov <[hidden email]> wrote:
Hi moon soo Lee,

Cloudera's Spark doesn't have $SPARK_HOME/bin/sparkR
Would Zeppelin still enable its sparkR interpreter then?

Built Zeppelin using 

$ mvn clean package -DskipTests -Pspark-2.1 -Ppyspark -Dhadoop.version=2.6.0-cdh5.10.1 -Phadoop-2.6 -Pyarn -Pr -Pvendor-repo -Pscala-2.10 -pl '!...,!...' -e

. . .
[INFO] Zeppelin: R Interpreter ............................ SUCCESS [01:01 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11:28 min

None of the R-related interpreters show up nevertheless.

This is including latest Zeppelin snapshot and was the same on previous releases of Zeppelin.
So something is missing on our side.

are installed on the servers that runs Zeppelin (and Spark driver as it is yarn-client). 

I guess either above build options are wrong or there is another dependency I missed.
conf/zeppelin-site.xml has R related interpreters mentioned - [1] but none of them
show up once Zeppelin starts up. 

Any ideas?


Thank you,
Ruslan


[1]

<property>
  <name>zeppelin.interpreters</name>
  <value>org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.rinterpreter.RRepl,org.apache.zeppelin.rinterpreter.KnitR,org.apache.zeppelin.spark.SparkRInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.angular.AngularInterpreter,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.file.HDFSFileInterpreter,org.apache.zeppelin.flink.FlinkInterpreter,,org.apache.zeppelin.python.PythonInterpreter,org.apache.zeppelin.lens.LensInterpreter,org.apache.zeppelin.ignite.IgniteInterpreter,org.apache.zeppelin.ignite.IgniteSqlInterpreter,org.apache.zeppelin.cassandra.CassandraInterpreter,org.apache.zeppelin.geode.GeodeOqlInterpreter,org.apache.zeppelin.postgresql.PostgreSqlInterpreter,org.apache.zeppelin.jdbc.JDBCInterpreter,org.apache.zeppelin.kylin.KylinInterpreter,org.apache.zeppelin.elasticsearch.ElasticsearchInterpreter,org.apache.zeppelin.scalding.ScaldingInterpreter,org.apache.zeppelin.alluxio.AlluxioInterpreter,org.apache.zeppelin.hbase.HbaseInterpreter,org.apache.zeppelin.livy.LivySparkInterpreter,org.apache.zeppelin.livy.LivyPySparkInterpreter,org.apache.zeppelin.livy.LivySparkRInterpreter,org.apache.zeppelin.livy.LivySparkSQLInterpreter,org.apache.zeppelin.bigquery.BigQueryInterpreter</value>
  <description>Comma separated interpreter configurations. First interpreter become a default</description>
</property>
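One way to sanity-check the build output is sketched below; this is a hedged sketch, run from the built Zeppelin distribution directory, and the layout may differ in your checkout:

    ls interpreter/                        # expect a spark/ directory (and r/ when the -Pr profile produced one)
    grep -i sparkr conf/interpreter.json   # only present after Zeppelin has started at least once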




--
Ruslan Dautkhanov

On Sun, Mar 19, 2017 at 1:07 PM, moon soo Lee <[hidden email]> wrote:

The easiest way to figure out what your environment needs is:

1. Run SPARK_HOME/bin/sparkR in your shell and make sure it works on the same host where Zeppelin is going to run.
2. Try %spark.r in Zeppelin with SPARK_HOME configured. Normally it should work when 1) works without problems; otherwise, look at the error message and error log for more information (a quick sketch of both checks follows below).
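A hedged sketch of those two checks; the SPARK_HOME path is only an example, so substitute your actual Spark installation:

    export SPARK_HOME=/usr/lib/spark                                   # example path only
    "$SPARK_HOME/bin/sparkR"                                           # step 1: a SparkR session should start on the Zeppelin host
    echo "export SPARK_HOME=/usr/lib/spark" >> conf/zeppelin-env.sh    # step 2: run from the Zeppelin dir, then restart Zeppelin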

Thanks,
moon



On Sat, Mar 18, 2017 at 8:47 PM Shanmukha Sreenivas Potti <[hidden email]> wrote:

I'm not 100% sure, as I haven't set it up myself, but it looks like I'm using Zeppelin preconfigured with Spark, and I've taken a snapshot of the Spark interpreter configuration that I have access to in Zeppelin. This interpreter comes with SQL and Python integration, and I'm trying to figure out how to use R as well.

On Sat, Mar 18, 2017 at 8:06 PM, moon soo Lee <[hidden email]> wrote:
AFAIK, Amazon EMR service has an option that launches Zeppelin (preconfigured) with Spark. Do you use Zeppelin provided by EMR or are you setting up Zeppelin separately?

Thanks,
moon

On Sat, Mar 18, 2017 at 4:13 PM Shanmukha Sreenivas Potti <[hidden email]> wrote:
​​
Hi Moon,

Thanks for responding. Exporting Spark_home is exactly where I have a problem. I'm using Zeppelin notebook with Spark on EMR clusters from an AWS account on cloud. I'm not the master account holder for that AWS account but I'm guessing I'm a client account with limited access probably. Can I still do it?

If yes, can you explain where and how I should do that shell scripting to export the variable? Can I do it in the notebook itself by starting the paragraph with %sh, or do I need to do something else?
If you can share any video, that would be great. Please note that I'm a novice user just getting started with Big Data.
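(Not from the thread itself: an export inside a %sh paragraph only lives for that paragraph's shell process, so it will not configure the Spark interpreter. If you do have SSH access to the EMR master node, one possible approach is sketched below; the paths and restart command are assumptions based on typical EMR defaults, so verify them on your cluster.)

    # hedged sketch for EMR: /etc/zeppelin/conf and /usr/lib/spark are common EMR defaults, not guaranteed
    sudo sh -c 'echo "export SPARK_HOME=/usr/lib/spark" >> /etc/zeppelin/conf/zeppelin-env.sh'
    sudo stop zeppelin && sudo start zeppelin    # restart mechanism varies by EMR release (may be systemctl instead)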

Sharing more info for better context.

Here's my AWS account detail type:


Thanks for your help.

Shan

On Sat, Mar 18, 2017 at 8:39 AM, moon soo Lee <[hidden email]> wrote:
If you don't have spark cluster, then you don't need to do 2).
After 1) %spark.r interpreter should work.

If you do have spark cluster, export SPARK_HOME env variable in conf/zeppelin-env.sh, that should be enough make it work.

Hope this helps.

Thanks,
moon

On Fri, Mar 17, 2017 at 2:41 PM Shanmukha Sreenivas Potti <[hidden email]> wrote:
Hello Group!

I'm trying to leverage various R functions in Zeppelin but am having challenges in figuring out how to configure the Spark interpreter/ SPARK_HOME variable.

I'm going by this documentation for now, and specifically have issues with the following steps:

  1. To run R code and visualize plots in Apache Zeppelin, you will need R on your master node (or your dev laptop).

    For Centos: yum install R R-devel libcurl-devel openssl-devel For Ubuntu: apt-get install r-base

How do I figure out the master node and install the R interpreter? Novice user here.


2. To run Zeppelin with the R Interpreter, the SPARK_HOME environment variable must be set. The best way to do this is by editing conf/zeppelin-env.sh. If it is not set, the R Interpreter will not be able to interface with Spark. You should also copy conf/zeppelin-site.xml.template to conf/zeppelin-site.xml. That will ensure that Zeppelin sees the R Interpreter the first time it starts up.

No idea as to how to do step 2 either.

Appreciate your help. If there is a video that you can point me to that talks about these steps, that would be fantabulous.

Thanks! Shan


-- 
Shan S. Potti,




-- 
Shan S. Potti,
737-333-1952





zeppelib-build.log.gz (33K) Download Attachment
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How do I configure R interpreter in Zeppelin?

Paul Brenner
Not sure if it will help, but there are some R-related steps we include in our checklist when building Zeppelin from source.

  1. Install R:  

    sudo yum install R R-devel
  2. Install the R evaluate package and other R packages (a quick check that these load is sketched after this list): 

    sudo R -e "install.packages('evaluate', repos = 'http://cran.us.r-project.org')"
    sudo R -e "install.packages('devtools', repos = 'http://cran.us.r-project.org')"
    sudo R -e "install.packages('knitr', repos = 'http://cran.us.r-project.org')"
    sudo R -e "install.packages('ggplot2', repos = 'http://cran.us.r-project.org')"
    sudo R -e "install.packages(c('devtools','mplot', 'googleVis'), repos = 'http://cran.us.r-project.org'); require(devtools); install_github('ramnathv/rCharts')"
    sudo R -e "install.packages('cowplot', repos = 'http://cran.us.r-project.org')"

  3. Then our build command is:

    ./dev/change_scala_version.sh 2.11
    mvn clean package -DskipTests -Pspark-2.1.0 -Dhadoop.version=2.6.0-cdh5.8.2 -Phadoop-2.6 -Pyarn -Ppyspark -Psparkr -Pr -Pscala-2.11 -Pscalding -Pvendor-repo
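A quick, hedged check that the packages from step 2 actually load (the package names are the ones listed above):

    R -e 'print(sapply(c("evaluate","knitr","ggplot2","devtools"), requireNamespace, quietly = TRUE))'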

Paul Brenner
DATA SCIENTIST
(217) 390-3033  

PlaceIQ:Location Data Accuracy

On Thu, Apr 27, 2017 at 4:54 PM Ruslan Dautkhanov <[hidden email]> wrote:
All my attempts to enable the R/sparkR interpreters have failed.

Tried the sparkr and r profiles. Attached one of the build logs. 

What am I missing? It should be something simple. 
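One hedged way to skim the attached log for the R-related modules (filename as attached to this thread):

    zcat zeppelib-build.log.gz | grep -iE 'Zeppelin: R Interpreter|sparkr'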



--
Ruslan Dautkhanov

On Wed, Apr 26, 2017 at 3:05 PM, Ruslan Dautkhanov <[hidden email]> wrote:
Thanks for feedback.

%spark.r
print("Hello World!")
 throws exception [2].

Understood - I'll try to remove -Pr and rebuild Zeppelin. Yep, I used a fresh master snapshot.
(I haven't seen anything in the Maven build logs that would indicate a problem around the R interpreter.)
Will update this email thread with the result after rebuilding Zeppelin without -Pr.


[2]

spark.r interpreter not found
org.apache.zeppelin.interpreter.InterpreterException: spark.r interpreter not found at org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:417) at org.apache.zeppelin.notebook.Note.run(Note.java:620) at org.apache.zeppelin.socket.NotebookServer.persistAndExecuteSingleParagraph(NotebookServer.java:1781) at org.apache.zeppelin.socket.NotebookServer.runParagraph(NotebookServer.java:1741) at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:288) at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59) at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128) at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69) at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65) at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)



--
Ruslan Dautkhanov





Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: How do I configure R interpreter in Zeppelin?

Jongyoul Lee
Hi Ruslan,

Can you check whether conf/interpreter.json exists? If it does, Zeppelin doesn't re-initialize from the interpreter directory even if you build it again. After building, you have to remove conf/interpreter.json and start Zeppelin again.
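A hedged sketch of that reset, run from your Zeppelin installation directory (the backup step is my own addition):

    cp conf/interpreter.json conf/interpreter.json.bak    # optional backup (skip if the file does not exist yet)
    rm -f conf/interpreter.json
    bin/zeppelin-daemon.sh restart                        # standard control script in a Zeppelin installation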





--
이종열, Jongyoul Lee, 李宗烈