How to execute spark-submit on Note

How to execute spark-submit on Note

小野圭二
Hi all,

I searched the mailing-list archive for this topic but could not find a clear solution, so I am posting it again (perhaps).

I am using Zeppelin 0.8.0 and have installed Spark 2.2 at a separate path, just to check my test program. I then wrote a very simple Python sample to work out the procedure:

1. The code works fine in a note in Zeppelin.
2. The same code, with SparkContext initialization added, works fine on Spark via 'spark-submit'.
3. I tried to execute "2" from a note in Zeppelin as follows (yes, the "spark" interpreter is bound to the note):
        %spark-submit <program name with full path>
          -> interpreter not found error
4. I set 'SPARK_SUBMIT_OPTIONS' in zeppelin-env.sh as described in the docs, e.g.:
        export SPARK_SUBMIT_OPTIONS='--packages com.databricks:spark-csv_2.10:1.2.0'
5. Then I ran again:
        %spark-submit <program name with full path>
          -> interpreter not found error (same as "3")
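For what it's worth, SPARK_SUBMIT_OPTIONS in zeppelin-env.sh only adds flags to the spark-submit command Zeppelin itself issues when launching the built-in %spark interpreter; it does not register a %spark-submit interpreter, which would explain the "interpreter not found" errors in steps 3 and 5. A sketch of the config fragment, with a hypothetical SPARK_HOME path:

```shell
# zeppelin-env.sh (config fragment)
# These settings are used when Zeppelin launches its own Spark
# interpreter via spark-submit; they do not create a %spark-submit
# magic for notes.
export SPARK_HOME=/opt/spark-2.2   # hypothetical install path
export SPARK_SUBMIT_OPTIONS='--packages com.databricks:spark-csv_2.10:1.2.0'
```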

How can I use spark-submit from a note?
Any advice is appreciated.

-Keiji

Re: How to execute spark-submit on Note

Jianfeng (Jeff) Zhang

I am surprised that you would use %spark-submit; there is no documentation for %spark-submit. If you want to use spark-submit in Zeppelin, you can use %sh.


Best Regards,
Jeff Zhang


From: 小野圭二 <[hidden email]>
Reply-To: "[hidden email]" <[hidden email]>
Date: Tuesday, October 3, 2017 at 12:49 PM
To: "[hidden email]" <[hidden email]>
Subject: How to execute spark-submit on Note


Re: How to execute spark-submit on Note

小野圭二
Thank you for your reply, Jeff

"%sh" ?
"sh" seems like request something execution code.
I tried "%sh", then

%sh <program name with full path>
      %sh bash: <program name>: no permission

I made binary file from .py to .pyc, but the answer was as same.
I am sorry seems like doubting you, but Is "%sh" the resolution?
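The "no permission" message here is bash refusing to execute a file that lacks the execute bit (bash usually reports it as "Permission denied"). A small runnable sketch of the two usual ways around it, using a hypothetical stand-in script: either hand the file to an interpreter, which only needs read access, or add the execute bit.

```shell
# Create a stand-in script at a hypothetical path, without
# worrying about its execute bit.
printf 'echo hello\n' > /tmp/myapp.sh

# An interpreter only needs read access, so this works even
# without the execute bit (spark-submit behaves the same way
# when given a .py file):
sh /tmp/myapp.sh

# Adding the execute bit allows direct execution as well:
chmod +x /tmp/myapp.sh
/tmp/myapp.sh
```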

-Keiji

2017-10-03 17:35 GMT+09:00 Jianfeng (Jeff) Zhang <[hidden email]>:



Re: How to execute spark-submit on Note

Jeff Zhang
%sh is the shell interpreter; you can run spark-submit in it just as you would run it in a shell terminal.
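A note paragraph along these lines might look like the following sketch; the Spark install path, master setting, and script path are all hypothetical placeholders:

```shell
%sh
# Runs under the shell interpreter as the Zeppelin service user,
# exactly as it would in a terminal.
/opt/spark-2.2/bin/spark-submit \
  --master local[2] \
  /home/me/myapp.py
```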

On Tuesday, October 3, 2017 at 4:58 PM, 小野圭二 <[hidden email]> wrote:


Re: How to execute spark-submit on Note

小野圭二
Thank you for your quick reply again, Jeff.

Yes, I know the difference between the execution environments of "%sh" and "spark-submit".
My question was how to execute spark-submit through the shell interpreter; that is, I am looking for a way to execute a binary program from a Zeppelin note, limited to Spark for now.

Zeppelin seems to have several interpreters for running Spark code, such as spark.pyspark and spark.sql, so I was wondering how to do the equivalent with "spark-submit".

I am sorry to take up your time, but I would appreciate it if you could see what I am asking and show me some tips.

-Keiji


2017-10-03 18:30 GMT+09:00 Jeff Zhang <[hidden email]>:



RE: How to execute spark-submit on Note

David Howell

Hi Keiji,

In the paragraph you would write:

%sh
spark-submit myapp.jar ...

The %sh interpreter is a shell and runs as the Zeppelin service user, with whatever permissions that user has. You can run any shell command in it.

That said, this is a fairly unusual way to use Zeppelin, so I'm not sure it is really what you want.

You can also just use the %spark.pyspark interpreter and write your Python Spark code there. The Spark interpreters in Zeppelin already create the SparkContext, SQLContext, and Spark session for you; they are available as sc, sqlContext, and spark. If you have a program that is ready for spark-submit, I would use another tool to schedule and run it, such as cron, Oozie, NiFi, Luigi, or Airflow. Or, to run it manually, just use spark-submit directly from a shell or over ssh.

Dave

 

From: [hidden email]
Sent: Tuesday, 3 October 2017 8:43 PM
To: [hidden email]
Subject: Re: How to execute spark-submit on Note

 




Re: How to execute spark-submit on Note

小野圭二
Hi Dave,

Thank you for your suggestion.
It worked as I expected so far; I did not know "%sh" could be used like that.

Anyhow, let me explain why I want to execute "spark-submit" from a note, to clear up your puzzlement. Yes, I know the basics of Zeppelin, as you explained in your reply, Dave. What I am exploring now is the outlook for the execution environment in Zeppelin. That is, we were considering how to deliver our programs widely to users after developing them collaboratively in Zeppelin. In that case we may not want to disclose our source code to the users, but we do want to keep the execution environment fixed, to avoid any unnecessary issues. I have now succeeded with a script; next I will try a binary.

That is why I posted this question to the mailing list. I have also asked about a similar but different solution in JIRA (#2721).

Once again, thank you, Dave.

-Keiji


2017-10-03 19:12 GMT+09:00 David Howell <[hidden email]>:





RE: How to execute spark-submit on Note

Partridge, Lucas (GE Aviation)

"we were considering how to deliver our programs to users widely after we made a program with collaboration on Zeppelin"

This is a common question/use case in my experience with Zeppelin: "How do we roll out code to everyone once it's been prototyped in Zeppelin?" Our approach is to package it up in jars or Python packages and make those available in the environment. Users can then import it like any other code in their own Zeppelin %spark or %pyspark paragraphs; no %sh is required. Other notebook-based environments, such as Databricks, make this packaging and importing of libraries part of their UI.
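For the Python side, this packaging route could be sketched roughly as below; the project name, layout, and version are hypothetical, and the wheel must be installed wherever the Zeppelin interpreter processes actually run:

```shell
# Build a wheel from the prototyped code (assumes a setup.py and
# the 'wheel' package are present) and install it on the machine
# running Zeppelin's interpreters, so notes can simply
# 'import mylib' from %spark.pyspark paragraphs.
cd /path/to/mylib
python setup.py bdist_wheel
pip install dist/mylib-0.1.0-py3-none-any.whl
```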

 

Thanks, Lucas.

 

From: 小野圭二 [mailto:[hidden email]]
Sent: 04 October 2017 02:24
To: [hidden email]
Subject: EXT: Re: How to execute spark-submit on Note

 



Re: How to execute spark-submit on Note

小野圭二
Thank you for the information, Lucas.
That sounds interesting and is a good tip on how to deliver code.
I think this tip should go into the Zeppelin wiki, if there is one. :-)
And I should have a look at the Databricks notebook anyway.

-Keiji

2017-10-04 17:20 GMT+09:00 Partridge, Lucas (GE Aviation) <[hidden email]>:
