Installing Apache Zeppelin in a different machine.

7 messages
Installing Apache Zeppelin in a different machine.

Pablo Torre
Hi guys,

I was wondering if you could help me with this scenario: I am using Amazon AWS and have an EMR Hadoop cluster running. I want to install and configure Apache Zeppelin on a different machine (an EC2 instance), and use Zeppelin to visualize data that is available in HDFS on the cluster.

Can I configure Zeppelin on a different machine? Do I need to install Apache Spark on any machine?

I appreciate your help.

Thanks.


--
Pablo Torre.
Freelance software engineer and Ruby on Rails developer.
Oleiros (Coruña)
Personal site

Re: Installing Apache Zeppelin in a different machine.

moon
Administrator
Hi,

Yes, of course. Zeppelin can run on a different machine.
It is recommended to install Spark on the machine that runs Zeppelin and to point to the Spark installation path in conf/zeppelin-env.sh using the SPARK_HOME environment variable (in the case of 0.6.0-SNAPSHOT).
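As a minimal sketch, assuming Spark has been unpacked under /opt/spark on the Zeppelin machine (that path is an example, not something from this thread), the relevant line in conf/zeppelin-env.sh would be:

```shell
# conf/zeppelin-env.sh on the Zeppelin machine
# /opt/spark is an assumed example path -- use your actual Spark install dir
export SPARK_HOME=/opt/spark
```

With SPARK_HOME set, Zeppelin launches its Spark interpreter against that installation's binaries and configuration rather than its embedded Spark.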

One thing you need to take care of is that your Spark workers need to connect to the Spark driver, i.e. Zeppelin. So you'll need to configure your EC2 instance to be able to communicate with the EMR cluster, and vice versa.
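One way to sketch the two-way connectivity with the AWS CLI, using hypothetical security group IDs (sg-zeppelin and sg-emr are placeholders; the wide port range reflects the fact that the Spark driver listens on ephemeral ports for executor callbacks):

```shell
# Let the EMR cluster's security group reach the Zeppelin instance
# (driver callback ports are ephemeral, hence the wide TCP range):
aws ec2 authorize-security-group-ingress \
  --group-id sg-zeppelin \
  --protocol tcp --port 0-65535 \
  --source-group sg-emr

# And the reverse direction, so Zeppelin can reach the cluster:
aws ec2 authorize-security-group-ingress \
  --group-id sg-emr \
  --protocol tcp --port 0-65535 \
  --source-group sg-zeppelin
```

In practice you may want to narrow the port ranges once you know which ports your Spark and Hadoop versions actually use.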

Hope this helps.
moon



Re: Installing Apache Zeppelin in a different machine.

Pablo Torre
Thanks for your help! I will try it, and I will let you know if it works for me!




Re: Installing Apache Zeppelin in a different machine.

Pablo Torre
Moon, I have one question. Since I am going to run Zeppelin on a different machine, shouldn't I configure the following environment variable in conf/zeppelin-env.sh?

export HADOOP_CONF_DIR="path to the Hadoop conf directory on the cluster"


Thanks!


Re: Installing Apache Zeppelin in a different machine.

moon
Administrator
It depends on the version you use.

You'll need to export HADOOP_CONF_DIR if you're using the 0.5.0 version.
If you're on 0.6.0-SNAPSHOT, it's recommended to export SPARK_HOME so that Zeppelin uses the same configuration as your Spark installation.
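For the 0.5.0 case, one way to sketch this (the hostname emr-master, the hadoop user, and all paths are assumptions for illustration) is to copy the cluster's Hadoop configuration to the Zeppelin machine and point HADOOP_CONF_DIR at the local copy:

```shell
# Copy the Hadoop client config from the EMR master node
# (hadoop@emr-master and both paths are example values):
scp -r hadoop@emr-master:/etc/hadoop/conf /home/zeppelin/hadoop-conf

# Then in conf/zeppelin-env.sh:
export HADOOP_CONF_DIR=/home/zeppelin/hadoop-conf
```

The copied core-site.xml and yarn-site.xml are what tell Zeppelin where the cluster's NameNode and ResourceManager live.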

Best,
moon


Re: Installing Apache Zeppelin in a different machine.

Pablo Torre
OK, then if, for instance, I want to access HDFS using Hive, the only thing I need to do is give the Zeppelin machine permission to access the cluster and create an interpreter in Zeppelin, right?


Re: Installing Apache Zeppelin in a different machine.

moon
Administrator
Basically right. More precisely, it depends on how the individual interpreter you use is implemented. For example, the Hive interpreter shipped with Zeppelin uses the hive-jdbc driver to connect to HiveServer2, so in this case Zeppelin only needs to be able to connect to the node where HiveServer2 runs, not the entire cluster.
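A quick way to sanity-check that reachability from the Zeppelin machine, sketched with an assumed hostname (emr-master) and the HiveServer2 default Thrift port 10000:

```shell
# beeline ships with Hive; the host, user, and database are example values.
beeline -u jdbc:hive2://emr-master:10000/default -n hadoop -e "SHOW TABLES;"
```

If that connects, the same jdbc:hive2://... URL is what you would give the Hive interpreter in Zeppelin's interpreter settings.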

Best,
moon
