Does Zeppelin 0.7.1 work with Spark Structured Streaming 2.1.1?


kant kodali
Hi All,

I have the following code 

StreamingQuery query = df2.writeStream().outputMode("complete").queryName("foo").option("truncate", "false").format("console").start();
query.awaitTermination();

and it works fine. However, when I change it to the code below, I get the output only once. I tried running %spark.sql select * from foo over and over again, but the results are not updated. With the console format above it works perfectly fine and I can see updates on each batch. Should I be doing something else for the memory sink?

StreamingQuery query = df2.writeStream().outputMode("complete").queryName("foo").option("truncate", "false").format("memory").start();
%spark.sql select * from foo

Thanks!


Re: Does Zeppelin 0.7.1 work with Spark Structured Streaming 2.1.1?

kant kodali
I am looking for something more like what is shown in this video, where the graphs update themselves automatically. Is that possible in Zeppelin?

https://www.youtube.com/watch?v=IJmFTXvUZgY 

You can watch it from 9:20 


On Fri, May 19, 2017 at 9:21 AM, kant kodali <[hidden email]> wrote:

Re: Does Zeppelin 0.7.1 work with Spark Structured Streaming 2.1.1?

Felix Cheung
I think you can have one notebook for setup, calling the source and sink as in your code, and then another notebook that runs %sql to read from the memory temp view and output to a visualization.

This way the second notebook can serve as a dashboard and run (i.e., refresh automatically) on a schedule.
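
The refresh idea described above can be illustrated with a minimal, Spark-free sketch. It only models the pattern and is not Zeppelin's or Spark's implementation: the class `PollingSketch`, the helper `poll`, and the in-memory stand-in for the table `foo` are all hypothetical names introduced for illustration. The assumption is that a memory sink in complete mode behaves like a table whose contents are replaced as batches arrive, so a reader only sees new data when it runs its query again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class PollingSketch {

    /**
     * Re-runs the given query the requested number of times, pausing
     * intervalMillis between runs, and returns every snapshot in order.
     */
    public static <T> List<T> poll(Supplier<T> query, int times, long intervalMillis) {
        List<T> snapshots = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            snapshots.add(query.get());
            if (i < times - 1) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop polling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return snapshots;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the memory-sink table "foo":
        // it always holds the latest complete result.
        AtomicReference<List<String>> foo =
                new AtomicReference<>(Collections.emptyList());

        // Stand-in for the running streaming query (the "setup" notebook):
        // replaces the table contents every 100 ms.
        AtomicInteger batch = new AtomicInteger();
        ScheduledExecutorService writer = Executors.newSingleThreadScheduledExecutor();
        writer.scheduleAtFixedRate(
                () -> foo.set(Arrays.asList("batch=" + batch.incrementAndGet())),
                0, 100, TimeUnit.MILLISECONDS);

        // Stand-in for the scheduled "dashboard" notebook:
        // re-reads the table every 150 ms and prints what it saw.
        try {
            for (List<String> snapshot : poll(foo::get, 4, 150)) {
                System.out.println(snapshot);
            }
        } finally {
            writer.shutdownNow();
        }
    }
}
```

In a real notebook the supplier would correspond to re-running `select * from foo`, and the loop would be replaced by whatever schedule the dashboard notebook runs on. Note also that `query.awaitTermination()` blocks the calling thread, so a setup paragraph that calls it will not return while the query is running.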

As for
df2.writeStream().outputMode("complete").queryName("foo").option("truncate","false").format("console").start();

You can't use queryName with format console - this is not supported by Spark Structured Streaming.

_____________________________
From: kant kodali <[hidden email]>
Sent: Friday, May 19, 2017 10:29 AM
Subject: Re: Does Zeppelin 0.7.1 work with Spark Structured Streaming 2.1.1?
To: <[hidden email]>

