Limit on multiple concurrent interpreters / isolated notebooks?


blaubaer
Hi,

We are running Zeppelin (0.5) on our YARN-managed cluster. To allow for multiple concurrent users without sharing the Spark context, we simply set up one interpreter for every user. This works pretty well; however, at some point we seem to hit the limit of how many concurrent (Spark) interpreters the Zeppelin (daemon) service can handle. Now, with the "new" (0.6) feature of isolated notebooks, this topic should come up for other users as well.

So I was wondering: what are your experiences with multiple concurrent interpreters? What factors determine how many concurrent interpreters can run (apart from the cluster resources needed to actually start them)? For us, it seems that 2-3 are OK and 4-5 get critical, but that also seems to depend on the load of the jobs.

Thx.

Re: Limit on multiple concurrent interpreters / isolated notebooks?

Jianfeng (Jeff) Zhang

Yes, this would be a critical performance issue for the multi-user case,
because Zeppelin currently supports only yarn-client mode, which means the
driver JVM runs on the same host as the Zeppelin server. So the number of
concurrent users depends on the memory you configure for each driver and on
how much memory the machine has.
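To make the sizing argument a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not anything Zeppelin provides: the function name, the 10% JVM overhead on top of the driver heap, and the memory reserved for the OS and the Zeppelin server are all assumptions for illustration, and the numbers in the example are made up.

```python
def max_concurrent_drivers(host_mem_gib, reserved_gib, driver_heap_gib,
                           overhead_fraction=0.1):
    """Estimate how many Spark driver JVMs fit on the Zeppelin host.

    host_mem_gib      -- total RAM of the machine running the Zeppelin server
    reserved_gib      -- RAM set aside for the OS and the Zeppelin server (assumed)
    driver_heap_gib   -- heap configured per driver (e.g. via spark.driver.memory)
    overhead_fraction -- assumed non-heap JVM overhead per driver (hypothetical 10%)
    """
    if driver_heap_gib <= 0:
        raise ValueError("driver_heap_gib must be positive")
    available = host_mem_gib - reserved_gib
    if available <= 0:
        return 0
    per_driver = driver_heap_gib * (1 + overhead_fraction)
    # Only whole driver processes count, so round down.
    return int(available // per_driver)


if __name__ == "__main__":
    # Hypothetical host: 16 GiB RAM, 4 GiB reserved for OS + Zeppelin server.
    for heap in (2, 4):
        n = max_concurrent_drivers(16, 4, heap)
        print(f"{heap} GiB driver heap -> about {n} concurrent drivers")
```

The point is only that, in yarn-client mode, the limit scales with host memory divided by per-driver memory, regardless of how large the YARN cluster is.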

In the long term, though, we should support yarn-cluster mode. Here's a
ticket and a wiki page for this:

https://issues.apache.org/jira/browse/ZEPPELIN-1377

https://cwiki.apache.org/confluence/display/ZEPPELIN/Cluster+Manager+Proposal





Best Regards,
Jeff Zhang





On 12/14/16, 11:27 PM, "blaubaer" <[hidden email]> wrote:

>[...]


Re: Limit on multiple concurrent interpreters / isolated notebooks?

blaubaer

So that concerns the drivers, which clearly all have to fit on one single machine. But what about the Zeppelin server service itself? Are there any bottlenecks in how many concurrent driver processes it can handle?

-----------------------------
René Pfitzner
Lead Data Scientists

Neue Zürcher Zeitung

Sent from mobile. Please excuse brevity.

From: Jianfeng (Jeff) Zhang [via Apache Zeppelin Users (incubating) mailing list]
Sent: Thursday, December 15, 3:31 AM
Subject: Re: Limit on multiple concurrent interpreters / isolated notebooks?
To: Pfitzner René

[...]

Re: Limit on multiple concurrent interpreters / isolated notebooks?

Jianfeng (Jeff) Zhang

IMO, before zeppelin-server hits a bottleneck, the drivers would eat up your machine's memory.
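One way to check this on your own host is to compare the resident memory of the interpreter (driver) processes with that of the server process. Below is a minimal Python sketch that sums the RSS column from `ps -eo rss,args` style output. The sample text is made up, and matching on the class names `RemoteInterpreterServer` and `ZeppelinServer` in the command line is an assumption about how the processes show up on your machine.

```python
def total_rss_mib(ps_output, marker):
    """Sum resident memory (MiB) of processes whose command line contains marker.

    ps_output is text in the shape of `ps -eo rss,args` on Linux,
    where the first column is RSS in KiB.
    """
    total_kib = 0
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        rss, args = parts
        if not rss.isdigit():
            continue  # skips the header line
        if marker in args:
            total_kib += int(rss)
    return total_kib / 1024


# Made-up sample; real command lines are much longer.
SAMPLE = """\
    RSS COMMAND
2097152 java -cp ... org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 40123
1048576 java -cp ... org.apache.zeppelin.server.ZeppelinServer
3145728 java -cp ... org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer 40124
"""

if __name__ == "__main__":
    print("interpreters:", total_rss_mib(SAMPLE, "RemoteInterpreterServer"), "MiB")
    print("server:", total_rss_mib(SAMPLE, "ZeppelinServer"), "MiB")
```

To use it on a live host, you would feed it the real `ps` output instead of `SAMPLE`, for example via `subprocess.run(["ps", "-eo", "rss,args"], capture_output=True, text=True).stdout`.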




Best Regards,
Jeff Zhang


From: blaubaer <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Thursday, December 15, 2016 at 3:43 PM
To: "[hidden email]" <[hidden email]>
Subject: Re: Limit on multiple concurrent interpreters / isolated notebooks?

[...]
Sent from the Apache Zeppelin Users (incubating) mailing list archive at Nabble.com.