Zeppelin Spark Streaming Twitter Stuck

Chaoran Yu
Hello,

Has anybody gotten the Spark Streaming Twitter example to work in Zeppelin? When I start the streaming context with ssc.start(), the Zeppelin paragraph appears to start but then gets stuck. The top right corner of the paragraph says “RUNNING 0%”.

I think this is a problem with Spark Streaming + Zeppelin rather than with the Twitter example in particular, because I’ve tried my own simple streaming tests and got the same result: stuck in “RUNNING 0%” status forever.

I also tried to stop the streaming context with ssc.stop() in a new paragraph, but it won’t execute, i.e. it remains in “PENDING” status. In fact, no new code will execute in any new paragraph. I had to restart Zeppelin to get out of this situation. The Zeppelin logs didn’t reveal any errors either.

Could anyone help me here?

Thank you,
Chaoran Yu

Re: Zeppelin Spark Streaming Twitter Stuck

Raffaele S
You might have to add the relevant artifacts manually on the interpreter page.

Raffaele



2017-03-30 4:12 GMT+02:00 Chaoran Yu <[hidden email]>:

Re: Zeppelin Spark Streaming Twitter Stuck

Chaoran Yu

I think I’ve added the required artifacts in the interpreter dependency settings.
If any artifacts were missing, wouldn’t I see errors either in the Zeppelin notebook UI or in the /logs folder? But I didn’t. For example, if I had missed a Twitter artifact, I would have gotten a ClassNotFound error. Instead, what I saw was that the notebook started running but never made any progress.


On Mar 30, 2017, at 3:59 AM, Raffaele S <[hidden email]> wrote:


Re: Zeppelin Spark Streaming Twitter Stuck

kant kodali
@Chaoran Yu Yeah, I don't think it's a dependency issue. You wouldn't be able to call the methods if you were missing dependencies.

I am in a similar boat: I am trying to get Spark Streaming and Zeppelin to work, except that I have my own indirect receiver (not the direct stream). That Twitter example is pretty old. I am using Spark 2.1.0. I wonder if I should call streamingContext.awaitTermination from Zeppelin? I only run the following, and streamingContext is stopped every time after these lines execute. I am still debugging to see what is going on and can let you know once I have something working.

jsonDStream.foreachRDD(rdd => println(rdd.count()));
streamingContext.start();
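
On the awaitTermination question, here is a minimal sketch of how start() and a bounded wait can fit together in one paragraph. start() returns as soon as the background scheduler is running, while awaitTermination() blocks until the context stops; awaitTerminationOrTimeout() blocks for a bounded time instead. The names ssc and queue, the queue-backed test source, and the 10-second timeout are assumptions for illustration, and sc is assumed to be the SparkContext that Zeppelin's Spark interpreter provides. It is not a diagnosis of the stop described above.

```scala
import org.apache.spark.streaming.{Seconds, StreamingContext}
import scala.collection.mutable

// Assumption: `sc` is the SparkContext supplied by Zeppelin's Spark interpreter.
val ssc = new StreamingContext(sc, Seconds(1))

// Hypothetical test source: a queue of RDDs stands in for a real receiver.
val queue = mutable.Queue(sc.parallelize(1 to 100))
val stream = ssc.queueStream(queue)

// Output operations must be registered before start() is called.
stream.foreachRDD(rdd => println(rdd.count()))

// start() does not block; batches are scheduled on background threads.
ssc.start()

// Block this paragraph for at most 10 seconds instead of forever.
ssc.awaitTerminationOrTimeout(10000L)
```

With a bounded wait the paragraph finishes on its own, so later paragraphs are not left waiting behind a call that never returns.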

On Thu, Mar 30, 2017 at 6:47 AM, Chaoran Yu <[hidden email]> wrote:


Re: Zeppelin Spark Streaming Twitter Stuck

kant kodali
@Chaoran Yu I finally got it working. Here is my code. I usually code in Java, but I tried to convert it to Scala below.

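import spark.implicits._
import org.apache.spark.SparkContext
import org.apache.spark.streaming._

// Hello and receiver are user-defined and not shown here.
// Note: sc.getConf returns a copy of the configuration, so setJars here
// does not alter the SparkContext that is already running.
val sparkConf = sc.getConf
sparkConf.setJars(SparkContext.jarOfClass(classOf[Hello]).toSeq)

// Batch interval of 1000 ms.
val streamingContext = new StreamingContext(sc, Milliseconds(1000))
val jsonDStream = streamingContext.receiverStream(receiver) // indirect receiver

// Re-register each batch as a temp view so that %sql can query it.
jsonDStream.foreachRDD { rdd =>
    val jsonDF = spark.read.json(rdd)
    jsonDF.createOrReplaceTempView("jsondf")
}
streamingContext.start()

// In a separate Zeppelin paragraph:
%sql select * from jsondf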
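
A possible cleanup step, sketched under the assumption that a streaming context is still active in the interpreter's JVM. StreamingContext.getActive() returns the running context if there is one, and passing stopSparkContext = false leaves the shared sc available to other paragraphs. Whether this paragraph can run at all depends on the interpreter not being blocked by an earlier one.

```scala
import org.apache.spark.streaming.StreamingContext

// Look up the currently active StreamingContext, if any, and stop it.
StreamingContext.getActive().foreach { active =>
  // stopSparkContext = false keeps Zeppelin's shared SparkContext alive;
  // stopGracefully = true lets already-received data finish processing.
  active.stop(stopSparkContext = false, stopGracefully = true)
}
```

A stopped StreamingContext cannot be restarted, so a new one has to be created before streaming again.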



On Sun, Apr 23, 2017 at 12:11 PM, kant kodali <[hidden email]> wrote: