Zeppelin in multi user environment


IT CTO
Hi,
We are in the process of testing Zeppelin as our investigation platform inside the organization.
One of the first questions raised was about multi-user environments: currently, as I see it, all users run against the same Zeppelin server and have access to all notebooks.

What do other people do about that?
Does the roadmap include a multi-tenant solution for Zeppelin? Security?

Eran 

Re: Zeppelin in multi user environment

Ophir Cohen
Actually, it's a bit more than that:
even the variables are shared across notebooks!

I think that NFLabs has a commercial version that supports groups and users.
In my organisation we are looking at a few solutions for that.
One of them is running separate instances, maybe even on the same machine.
I'm going to test it soon, but you are right: currently it's a problem.

BTW,
running separate Zeppelin instances isn't such a bad idea: they can all submit to the same cluster, so you still get the efficiency of the YARN resource manager (assuming you are using YARN).
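
As a rough illustration of running several instances on one machine, the sketch below generates per-instance settings so that instances do not collide on ports or directories. It assumes the settings are supplied through the environment variables read from conf/zeppelin-env.sh (ZEPPELIN_PORT, ZEPPELIN_NOTEBOOK_DIR, ZEPPELIN_LOG_DIR, ZEPPELIN_PID_DIR). The tenant names, the /srv/zeppelin root and the port stride are hypothetical choices, not anything stated in this thread.

```python
def instance_env(tenant, index, base_port=8080, stride=10, root="/srv/zeppelin"):
    """Return environment settings for one tenant's Zeppelin instance.

    A stride larger than 1 leaves a gap between instances in case an
    instance also binds a neighbouring port (an assumption, adjust as needed).
    """
    home = f"{root}/{tenant}"
    return {
        "ZEPPELIN_PORT": str(base_port + stride * index),
        "ZEPPELIN_NOTEBOOK_DIR": f"{home}/notebook",
        "ZEPPELIN_LOG_DIR": f"{home}/logs",
        "ZEPPELIN_PID_DIR": f"{home}/run",
    }


def render_env(env):
    """Render the settings as lines suitable for a zeppelin-env.sh file."""
    return "".join(f'export {name}="{value}"\n' for name, value in env.items())


# Hypothetical tenants: one instance per customer or department.
tenants = ["customer-a", "customer-b"]
envs = {tenant: instance_env(tenant, i) for i, tenant in enumerate(tenants)}

print(render_env(envs["customer-b"]))
```

Each instance would then be started with its own configuration directory, so that notebooks, logs and PID files stay separate while every instance still submits jobs to the same cluster.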

On Sun, Jun 28, 2015 at 10:00 AM, IT CTO <[hidden email]> wrote:
Hi,
We are in the process of testing Zeppelin as our investigation platform inside the organization.
One of the first questions raised was about multi-user environments: currently, as I see it, all users run against the same Zeppelin server and have access to all notebooks.

What do other people do about that?
Does the roadmap include a multi-tenant solution for Zeppelin? Security?

Eran 

Re: Zeppelin in multi user environment

IT CTO
Thanks Ophir!
That means I would have to wrap Zeppelin in my own site, which launches a Zeppelin server on behalf of every requesting user. This is an option, but I want to avoid it.
Please share whatever you come across during this journey.
Eran 

On Sun, Jun 28, 2015 at 12:09 PM Ophir Cohen <[hidden email]> wrote:
Actually, it's a bit more than that:
even the variables are shared across notebooks!

I think that NFLabs has a commercial version that supports groups and users.
In my organisation we are looking at a few solutions for that.
One of them is running separate instances, maybe even on the same machine.
I'm going to test it soon, but you are right: currently it's a problem.

BTW,
running separate Zeppelin instances isn't such a bad idea: they can all submit to the same cluster, so you still get the efficiency of the YARN resource manager (assuming you are using YARN).

Re: Zeppelin in multi user environment

Ophir Cohen
I'll do so.
Actually, I didn't mean one instance for every user but one for every customer (each of which can have many end users in my business). I can see the advantage of sharing the same environment within a group or department, but then again, it depends on your business and your needs...
I'll keep everybody updated.

On Sun, Jun 28, 2015 at 12:54 PM, IT CTO <[hidden email]> wrote:
Thanks Ophir!
That means I would have to wrap Zeppelin in my own site, which launches a Zeppelin server on behalf of every requesting user. This is an option, but I want to avoid it.
Please share whatever you come across during this journey.
Eran 

Re: Zeppelin in multi user environment

Eric Charles
In reply to this post by IT CTO
There is also https://github.com/apache/incubator-zeppelin/pull/53, which proposes adding Shiro security (user authentication on the web part). This does not address what Ophir mentions (separate environments for e.g. Spark interpreters, to avoid variables being shared across simultaneously authenticated users).

My company (Datalayer) has also developed a multi-user extension to Zeppelin that addresses both web and interpreter user-environment separation.

To achieve this, we had to change the interpreter API to propagate the authenticated user to the interpreters.
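
To make the idea of interpreter-level separation concrete, here is a purely conceptual sketch. It is not Datalayer's extension and not Zeppelin's interpreter API: the class and method names are hypothetical. It only shows that once the authenticated user is passed down to the component that executes code, each user can be given a separate variable namespace.

```python
class PerUserInterpreter:
    """Toy interpreter that keeps a separate variable namespace per user."""

    def __init__(self):
        # Maps a user name to the dict holding that user's variables.
        self._namespaces = {}

    def run(self, user, code):
        # The user is passed in explicitly; this stands in for propagating
        # the authenticated user from the web layer to the interpreter.
        namespace = self._namespaces.setdefault(user, {})
        exec(code, namespace)
        return namespace


interp = PerUserInterpreter()
interp.run("alice", "x = 1")
interp.run("bob", "y = 2")

# Variables defined by bob are not visible in alice's namespace.
alice_sees_y = "y" in interp.run("alice", "pass")
print(alice_sees_y)
```

Without the user argument there would be a single shared namespace, which corresponds to the situation described earlier in the thread where variables leak across notebooks.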

Re: Zeppelin in multi user environment

moon
Administrator
Hi,

Here's what I know about multi-tenancy for Zeppelin.

A. Reverse proxy + Zeppelin on Docker.

Set up a reverse proxy that handles authentication and redirects each user to the proper Zeppelin instance running in a Docker container.
I have seen many companies already using Zeppelin this way.

My company (NFLabs) also uses this approach for one of its internal clusters, and is now preparing to open-source the tools that help set up and use this type of environment.

As far as I know, NFLabs has no plan to make a commercial package of Zeppelin with more features (such as a security-enabled Zeppelin) than the Apache version. The one commercial service NFLabs offers is a collaboration/sharing service for Zeppelin notebooks with access control (like GitHub for Git).


B. Shiro security (pull request 53).

This enables a dedicated notebook space for each user.
I like the approach and it really makes sense.

There are a couple of issues I can think of:
  - the compiler context is shared among users
  - a user can still read other users' notebooks directly from the filesystem
  - users are not distinguished at the interpreter level
  - restarting Zeppelin is required in many cases, which impacts all connected users

Therefore, it can be used for basic authentication, but it needs more work for a multi-tenant environment.

So I'd say A is more like what's possible now, and B is more like future work.
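
The routing step of approach A can be sketched roughly as follows. This is only an illustration under stated assumptions, not the NFLabs tooling: the class name, the launch callback and the backend URL scheme are hypothetical, and a real setup would authenticate the request and start a Docker container where this sketch uses a placeholder.

```python
from typing import Callable, Dict, Optional


class InstanceRouter:
    """Maps each authenticated user to that user's own Zeppelin backend."""

    def __init__(self, launch: Callable[[str], str]):
        self._launch = launch                 # starts an instance, returns its URL
        self._backends: Dict[str, str] = {}   # user name -> backend URL

    def backend_for(self, user: Optional[str]) -> Optional[str]:
        if not user:
            # Not authenticated: there is nothing to route to.
            return None
        if user not in self._backends:
            # First request from this user: start a dedicated instance.
            self._backends[user] = self._launch(user)
        return self._backends[user]


def fake_launch(user: str) -> str:
    # Placeholder for "start a Docker container and return its address".
    return f"http://zeppelin-{user}:8080"


router = InstanceRouter(fake_launch)
print(router.backend_for("eran"))   # http://zeppelin-eran:8080
print(router.backend_for(None))     # None
```

Because every user is sent to a separate instance, notebooks and interpreter state are isolated by process boundaries rather than by anything inside Zeppelin itself.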

Thanks,
moon


On Sun, Jun 28, 2015 at 3:09 AM Eric Charles <[hidden email]> wrote:
There is also https://github.com/apache/incubator-zeppelin/pull/53, which proposes adding Shiro security (user authentication on the web part). This does not address what Ophir mentions (separate environments for e.g. Spark interpreters, to avoid variables being shared across simultaneously authenticated users).

My company (Datalayer) has also developed a multi-user extension to Zeppelin that addresses both web and interpreter user-environment separation.

To achieve this, we had to change the interpreter API to propagate the authenticated user to the interpreters.

Re: Zeppelin in multi user environment

IT CTO
Thanks!
That clarifies the issue...
Can you share what NFLabs is doing in open source?
Eran 

On Sun, Jun 28, 2015 at 10:10 PM moon soo Lee <[hidden email]> wrote:
Hi,

Here's what I know about multi-tenancy for Zeppelin.

A. Reverse proxy + Zeppelin on Docker.

Set up a reverse proxy that handles authentication and redirects each user to the proper Zeppelin instance running in a Docker container.
I have seen many companies already using Zeppelin this way.

My company (NFLabs) also uses this approach for one of its internal clusters, and is now preparing to open-source the tools that help set up and use this type of environment.

As far as I know, NFLabs has no plan to make a commercial package of Zeppelin with more features (such as a security-enabled Zeppelin) than the Apache version. The one commercial service NFLabs offers is a collaboration/sharing service for Zeppelin notebooks with access control (like GitHub for Git).


B. Shiro security (pull request 53).

This enables a dedicated notebook space for each user.
I like the approach and it really makes sense.

There are a couple of issues I can think of:
  - the compiler context is shared among users
  - a user can still read other users' notebooks directly from the filesystem
  - users are not distinguished at the interpreter level
  - restarting Zeppelin is required in many cases, which impacts all connected users

Therefore, it can be used for basic authentication, but it needs more work for a multi-tenant environment.

So I'd say A is more like what's possible now, and B is more like future work.

Thanks,
moon


Re: Zeppelin in multi user environment

Alexander Bezzubov
Hi,

Thank you for asking.

Indeed, as Moon mentioned, we are working on releasing a standalone tool:
a reverse proxy capable of launching a separate Docker container per user
for the chosen Spark/Hadoop version, implementing architecture A from above.


--
Alex

On Mon, Jun 29, 2015 at 3:37 PM, IT CTO <[hidden email]> wrote:

> Thanks!
> That clarifies the issue...
> Can you share what NFLabs is doing in open source?
> Eran

--
Kind regards,
Alexander.

Re: Zeppelin in multi user environment

John Omernik
Alex - 

How are you addressing YARN's need for dynamic ports to be reachable on the yarn-client, so that the application master can connect back to it? I've run into an issue running Docker on Mesos in this setup: the containers fail because the application master tries to connect to the container, but I can't know the ports before the Spark instance starts. I am stumped on that one...
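
One direction that is sometimes tried for this kind of problem, offered here only as a sketch and not as an answer given in this thread, is to pin the ports Spark would otherwise choose at random and publish exactly those ports from the container. The sketch assumes the Spark properties spark.driver.port and spark.blockManager.port and Docker's -p host:container option. The port numbers are arbitrary, other Spark versions may need additional port properties, and whether this resolves the Mesos case described above is not confirmed.

```python
def pinned_port_conf(driver_port, block_manager_port):
    """Build --conf arguments that fix ports Spark would otherwise pick randomly."""
    props = {
        "spark.driver.port": driver_port,
        "spark.blockManager.port": block_manager_port,
    }
    args = []
    for key, value in props.items():
        args += ["--conf", f"{key}={value}"]
    return args


def docker_publish_args(ports):
    """Build docker run arguments that publish each port under the same number."""
    args = []
    for port in ports:
        args += ["-p", f"{port}:{port}"]
    return args


# Arbitrary example ports; each container would need its own non-overlapping pair.
ports = [40000, 40001]
spark_args = pinned_port_conf(*ports)
docker_args = docker_publish_args(ports)

print(" ".join(spark_args))
print(" ".join(docker_args))
```

With fixed ports the values are known before the Spark instance starts, so they can be published when the container is created. The driver would still have to advertise an address that the application master can reach.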



On Mon, Jun 29, 2015 at 9:11 PM, Alexander Bezzubov <[hidden email]> wrote:
Hi,

Thank you for asking.

Indeed, as Moon mentioned, we are working on releasing a standalone tool:
a reverse proxy capable of launching a separate Docker container per user
for the chosen Spark/Hadoop version, implementing architecture A from above.


--
Alex
