restarting zeppelin on EMR causing exceptions

Richard Xin
I was making config changes to Zeppelin (0.7.2) on EMR (release label emr-5.7.0) and struggled with restarting the Zeppelin server. I finally narrowed it down to the steps below, which reproduce the problem on any newly created EMR cluster:
1) Provision an EMR cluster (1 master node + 2 core nodes) with Spark, Hadoop, Hive and Zeppelin
2) ssh to the master node
3) curl http://ec2-34-209-164-132.us-west-2.compute.amazonaws.com:8890/#/
4) sudo /usr/lib/zeppelin/bin/zeppelin-daemon.sh restart
5) Repeat step 3
You will see this error:
<h2>HTTP ERROR: 503</h2>
<p>Problem accessing /. Reason:
<pre>    Service Unavailable</pre></p>

6) tail the Zeppelin log under /var/log/zeppelin/
You will see the following error:
WARN [2017-07-19 21:34:57,425] ({main} AbstractLifeCycle.java[setFailed]:212) - FAILED org.eclipse.jetty.server.Server@4ed38226: java.net.BindException: Address already in use
java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind0(Native Method)
    at sun.nio.ch.Net.bind(Net.java:433)
    at sun.nio.ch.Net.bind(Net.java:425)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
    at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:321)
    at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
    at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:236)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.eclipse.jetty.server.Server.doStart(Server.java:366)
    at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
    at org.apache.zeppelin.server.ZeppelinServer.main(ZeppelinServer.java:189)
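
"Address already in use" means some process was still bound to the port when the new server tried to start. A minimal sketch for checking which process holds the port, assuming the `ss` utility is installed on the master node and that Zeppelin listens on 8890 (the port in the curl URL above); `port_listener` is a hypothetical helper name, not part of Zeppelin or EMR:

```shell
#!/bin/sh
# Sketch: show the process, if any, listening on Zeppelin's port.
# Assumptions: `ss` is available; 8890 is taken from the curl URL in the steps above.

port_listener() {
  port="${1:-8890}"
  # -l listening sockets only, -t TCP, -n numeric ports, -p owning process.
  # sudo is used so that processes owned by other users are shown as well.
  sudo ss -ltnp "sport = :${port}"
}
```

Calling `port_listener` after the failed restart should list the process that still owns the port, if there is one; `port_listener 8080` checks a different port.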

moon
Administrator
Hi,

Can you try restarting Zeppelin as the 'zeppelin' user?

sudo -u zeppelin /usr/lib/zeppelin/bin/zeppelin-daemon.sh restart

Thanks,
moon


Richard Xin
No, that does not work:

[ec2-user@ip-xxx-xxx-xx-xxx ~]$ sudo -u zeppelin /usr/lib/zeppelin/bin/zeppelin-daemon.sh restart
find: failed to restore initial working directory: Permission denied
Zeppelin is not running
Zeppelin start                                             [  OK  ]
Zeppelin process died                                      [FAILED]



Steven Kirtzic
Hey Moon,

Can you remove me from this mailing list? Thanks,

-Steven


Jonathan
EMR uses upstart to manage all daemons/services running on the cluster, so the best way to restart Zeppelin on EMR is to use "sudo stop zeppelin; sudo start zeppelin". (Note: normally you'd use "sudo restart", but this doesn't actually work due to the way that upstart is configured for all of these processes on EMR.)

~ Jonathan
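
The stop/start sequence above can be sketched as a small script that also waits for the web UI to answer again. This is only a sketch: the function names, `ZEPPELIN_URL`, the retry count and the delay are assumptions for illustration, not EMR or Zeppelin defaults.

```shell
#!/bin/sh
# Sketch: restart Zeppelin through upstart, then poll until the web UI responds.
# Assumption: the UI is reachable from the master node at ZEPPELIN_URL.

ZEPPELIN_URL="${ZEPPELIN_URL:-http://localhost:8890/}"

restart_zeppelin() {
  # Separate stop and start, because "sudo restart" is reported above not to work on EMR.
  sudo stop zeppelin
  sudo start zeppelin
}

wait_for_zeppelin() {
  # Plain (global) variables are used because POSIX sh has no `local`.
  tries="${1:-30}"
  delay="${2:-2}"
  attempt=0
  status=""
  while [ "$attempt" -lt "$tries" ]; do
    # Discard the body and print only the HTTP status code.
    status="$(curl -s -o /dev/null -w '%{http_code}' "$ZEPPELIN_URL")"
    if [ "$status" = "200" ]; then
      echo "Zeppelin is up"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "Zeppelin did not answer with 200 (last status: $status)" >&2
  return 1
}
```

A typical use would be `restart_zeppelin && wait_for_zeppelin`; a non-zero exit status from `wait_for_zeppelin` suggests looking at the log under /var/log/zeppelin/.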


Richard Xin
Thanks, Jonathan.
Zeppelin's native restart command (sudo zeppelin-daemon.sh restart) doesn't work on EMR. With it I saw the error "channel 7: open failed: connect failed: Connection refused"; it seems that it doesn't stop all dependent processes cleanly.

After running the native "sudo zeppelin-daemon.sh restart" on EMR, the cluster is not easy to recover: I tried to kill -9 the Zeppelin process and restart it using sudo start, but it still did not work.
sudo stop/start zeppelin has no such issues and works as expected.

-Richard



