Is there a limit on the maximum number of interpreter processes?


Belousov Maksim Eduardovich

Hello, users!

 

Our analysts run notes with the following interpreters: markdown, one or two jdbc, and pyspark. The interpreters are instantiated Per User in isolated processes and Per Note in isolated processes.

 

The analysts complain that sometimes paragraphs aren't processed and stay in the 'Pending' status.

We noticed that this happens when the number of running interpreter processes reaches about 90-100.
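
That count is plausible given the binding modes: with isolation both per user and per note, the process count multiplies, e.g. (illustrative round numbers only) 10 analysts each working in 3 notes with 3 per-note isolated interpreters gives 10 × 3 × 3 = 90 interpreter JVMs.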

If an admin restarts one of the popular interpreters (which kills some of the interpreter processes), the pending paragraphs switch to 'Running'.

 

We don't see any load on the Zeppelin server while paragraphs are pending: RAM is sufficient and iowait is ~0.

We also can't find any parameter that would limit the maximum number of interpreter processes.

 

 

Has anyone faced the same problem? How can it be solved?

 

 

Thanks,

Maksim Belousov

 


Re: Is there a limit on the maximum number of interpreter processes?

Jianfeng (Jeff) Zhang

Which interpreter is pending? If you run the spark interpreter in yarn-client mode, it may stay pending because of YARN resource capacity.

If it is pending, check the logs first.
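
(In a default install the per-interpreter logs usually sit under $ZEPPELIN_HOME/logs, in files named like zeppelin-interpreter-<interpreter>-<user>-<host>.log.)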



Best Regards,
Jeff Zhang




RE: Is there a limit on the maximum number of interpreter processes?

Belousov Maksim Eduardovich

> Which interpreter is pending?

At some point no paragraph runs with any interpreter; they all remain in the 'Pending' state.

We use local Spark instances in the spark interpreter, so YARN capacity is not the issue.

The logs don't contain any errors.

 


Maksim Belousov
Architect

Reporting and Data Marts Department

Data Warehousing and Reporting Division
Tel.: +7 495 648-10-00, ext. 2271

 



RE: Is there a limit on the maximum number of interpreter processes?

Belousov Maksim Eduardovich

I found out that there is a limit on the number of schedulers, hard-coded in SchedulerFactory.java [1]:

 

"executor = ExecutorFactory.singleton().createOrGet("SchedulerFactory", 100);"

 

It can be tested as follows (see the sketch after these steps):

1. Set a small thread count for the SchedulerFactory executor, for example 16.
2. Run notes with interpreters in isolated mode per user and per note.
3. Watch paragraphs stay in 'Pending' once a dozen or so interpreter processes have started.
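
The same starvation can be reproduced outside Zeppelin with a plain fixed-size pool. This is only a minimal sketch of the mechanism, not Zeppelin code; the pool size of 4 stands in for the hard-coded 100:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    // Minimal sketch, not Zeppelin code: each long-lived task models one
    // interpreter's scheduler, which holds a worker thread for the life of
    // the interpreter process.
    public class SchedulerStarvationDemo {
        public static void main(String[] args) throws InterruptedException {
            final int poolSize = 4; // stand-in for the hard-coded 100
            ExecutorService pool = Executors.newFixedThreadPool(poolSize);

            for (int i = 0; i < poolSize; i++) {
                final int id = i;
                pool.submit(() -> {
                    System.out.println("scheduler " + id + " is running");
                    try {
                        Thread.sleep(Long.MAX_VALUE); // never frees its thread
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }

            // The (poolSize + 1)-th task is accepted but never starts: it
            // sits in the queue, exactly like a paragraph stuck in 'Pending'.
            pool.submit(() -> System.out.println("never printed"));

            Thread.sleep(1000L);
            System.out.println("queued task is still pending; shutting down");
            pool.shutdownNow();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

Restarting an interpreter is the analogue of shutdownNow() here: it frees worker threads, which is why paragraphs jump to 'Running' after an admin restarts a popular interpreter.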

 

So there is no limit on the total number of interpreter processes, but there is a limit on the number of schedulers.

A scheduler is created inside each interpreter. If a limit is needed at all, it would be better to cap the number of interpreter processes directly; a sketch of that idea follows below.
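
A hypothetical sketch of such a cap, built on a plain Semaphore; none of these class or method names exist in Zeppelin, they only illustrate failing fast instead of queueing silently:

    import java.util.concurrent.Semaphore;

    // Hypothetical sketch only; no such class exists in Zeppelin. It caps
    // interpreter processes directly and fails fast, instead of letting
    // paragraphs queue behind an exhausted scheduler pool.
    public class InterpreterProcessLimiter {
        private final Semaphore slots;

        public InterpreterProcessLimiter(int maxProcesses) {
            this.slots = new Semaphore(maxProcesses);
        }

        // Would be called before launching a new interpreter process.
        public void acquireSlot() {
            if (!slots.tryAcquire()) {
                // Surface the limit to the user instead of a silent 'Pending'.
                throw new IllegalStateException(
                        "Interpreter process limit reached; close or restart idle interpreters");
            }
        }

        // Would be called when an interpreter process exits or is restarted.
        public void releaseSlot() {
            slots.release();
        }
    }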

 

Is this limit on the number of schedulers useful?

 

 

1. https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/scheduler/SchedulerFactory.java

 


Maksim Belousov

 
