%sql counts from multiple columns by checkbox

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

%sql counts from multiple columns by checkbox

Jonathan Davidson
Hello all, first time poster! I tried looking in forums and documentation but nothing seems to suit the specific nature of my question.

I am trying to display counts based on checkbox functionality in Zeppelin.The base table can be pretty simple:
%spark
case class Data(a: Int, b: String)
val df = Seq(
  Data(1, "one"),
  Data(2, "two"),
  Data(3,null)).toDF()
df.collect()
df.registerTempTable("table")

I want counts in SQL:
%sql
SELECT COUNT(a),COUNT(b) FROM table
--gives 3 and 2, respectively

I would think to use 
SELECT COUNT(${checkbox:test=a,a|b}) FROM table
However, the problem is that it doesn't scale to multiple checked values at the same time. If you use more than one, it comes out as the equivalent of 
SELECT COUNT(a,b) FROM ds2
which returns one value of 2; I want two values 3 and 2 as stated above

Basically, I've tried turning the values into COUNT(a) and COUNT(b) but it gets confused:
SELECT ${checkbox(whatever):test=COUNT(a)|COUNT(b),COUNT(a)|COUNT(b)} FROM ds2
cannot recognize input near 'FROM' 'ds2' '<EOF>' in select clause; line 1 pos 8

I've also tried simple manipulation like adding stuff in parenthesis after ${checkbox...
SELECT ${checkbox(whatever):test=a|b,a|b} FROM ds2
cannot resolve 'awhateverb' given input columns: [a, b]; line 1 pos 7

Does anyone have advice?

Thanks,
Jonathan
Loading...