On-demand per-user cluster for Zeppelin-Spark?

c1291
This post has NOT been accepted by the mailing list yet.
We use Cloudera to deploy a Zeppelin-Spark-YARN-HDFS cluster. Right now there is only one instance of Zeppelin and one of Spark, so whatever runs in one Spark notebook affects every user. For instance, if we stop the Spark context in one user's notebook, it stops for all other users' notebooks too. I've seen that Zeppelin has an option to isolate interpreters, but is there a way to provide each user with their own 'cluster' on demand? Maybe by using Docker, building an image with Zeppelin and Spark for each user, and limiting each container to the resources allotted to that user's cluster? I'm quite lost as to how to implement this, or whether it's even possible, but my ideal scenario would be something like what Databricks offers: each user can have their own cluster, with all resources isolated from other users.
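To make the Docker idea a bit more concrete, here is a rough sketch of what I have in mind, not something I have working. It only builds the `docker run` command for one container per user, using Docker's `--cpus` and `--memory` flags as the resource caps. The helper name, the users, the host ports, and the image tag `apache/zeppelin:0.7.3` are all placeholders I made up for illustration, and I'm assuming Zeppelin listens on its default port 8080 inside the container.

```python
import shlex


def zeppelin_run_command(user, host_port, cpus=2.0, memory="4g",
                         image="apache/zeppelin:0.7.3"):
    """Build a `docker run` command for one user's private Zeppelin container.

    Hypothetical helper: the image tag and the default limits are assumptions.
    """
    return [
        "docker", "run", "--detach",
        "--name", f"zeppelin-{user}",       # one container per user
        "--cpus", str(cpus),                # cap on CPU cores for the container
        "--memory", memory,                 # cap on container memory
        "--publish", f"{host_port}:8080",   # assumes Zeppelin's default port 8080
        image,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for offset, user in enumerate(["alice", "bob"]):
        print(shlex.join(zeppelin_run_command(user, 8081 + offset)))
```

As far as I can tell, limits like these would only cap Zeppelin itself and a Spark running in local mode inside the container. If the notebooks still submit jobs to the shared YARN cluster, the executors would presumably have to be limited separately, for example with per-user YARN queues.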

Thanks in advance!