Re: sqlcontext not available in pyspark interpreter


moon
Administrator
In pyspark, the injected variable is currently named 'sqlc', not 'sqlContext'.
So for now, the workaround is simply:
sqlContext = sqlc

I see no good reason not to use sqlContext as the variable name (while keeping sqlc too, for backward compatibility).
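
If you want the same notebook to keep working whether or not the interpreter already provides 'sqlContext', a slightly more defensive version of the workaround could look like the sketch below. The helper name alias_sql_context is made up for illustration and is not part of Zeppelin or Spark; the only assumption about Zeppelin is the one stated above, namely that 'sqlc' is injected into the pyspark interpreter's globals.

```python
def alias_sql_context(namespace):
    """Bind 'sqlContext' to the injected 'sqlc' when only 'sqlc' is present.

    'namespace' is any dict-like mapping of variable names, for example the
    result of globals() inside a %pyspark paragraph. This is a hypothetical
    helper for illustration, not a Zeppelin API.
    """
    if "sqlContext" not in namespace and "sqlc" in namespace:
        namespace["sqlContext"] = namespace["sqlc"]
    # Returns None if neither name is defined in the namespace.
    return namespace.get("sqlContext")


# Inside a %pyspark paragraph, this would be applied to the interpreter globals:
sqlContext = alias_sql_context(globals())
```

An existing 'sqlContext' is left untouched, so the paragraph should stay harmless once the variable name is injected directly.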


Best,
moon

On Thu, Jun 25, 2015 at 6:30 AM Dafne van Kuppevelt <[hidden email]> wrote:
Hi,
 
I run into the problem that the 'global' sqlContext variable is not available in the pyspark interpreter.
 
If I have, for example, the following code:
%pyspark
df = sqlContext.createDataFrame(...)
 
I get the error:
(<type 'exceptions.NameError'>, NameError("name 'sqlContext' is not defined",)
 
When I add the sqlContext explicitly:
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
 
the DataFrame is created, but if I register it as a (temp) table, it is not available in the sql interpreter (or in the SQLContext in Scala)!
 
If I do the same thing in Scala it works fine, for example if I run the example notebook with the 'bank' table.
 
Some info about my environment:
I'm running Spark in yarn-client mode, and the spark.home and zeppelin.pyspark.python properties of the interpreter are set to Spark 1.3 and Python 2.7, respectively.
 
Thanks in advance for your help,
 
Dafne