@gm-spacagna

I have tried a different implementation for fixing the compatibility of PySpark3 with Livy 0.4+, as described in #421.

For me the major problem was the 1-to-1 mapping between the specified language and the corresponding Livy kind variable.

Since, starting from Livy 0.4, the supported kinds are "spark", "pyspark" and "sparkr", the choice between pyspark (Python 2) and pyspark3 (Python 3) is given by the config parameters instead, or by the default setting in the Livy server environment.

I have added the field lang to the Session object so that both the SQL query and the remote commands now have the kind and the desired language available as context information. This lets us discriminate between the two Python versions and decide whether or not to encode the results serialized as JSON.
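A minimal sketch of the idea described above, not the code from this PR. The names `LANG_TO_LIVY_KIND`, `Session` and `needs_encoding` are hypothetical, and the rule that only Python 2 results are encoded is an assumption made for illustration:

```python
# Hypothetical sketch: keep both the user's language and the Livy kind
# on the session, instead of deriving one from the other 1-to-1.
LANG_TO_LIVY_KIND = {
    "python": "pyspark",   # Python 2
    "python3": "pyspark",  # Python 3 shares the "pyspark" kind
    "scala": "spark",
    "r": "sparkr",
}


class Session(object):
    def __init__(self, lang):
        if lang not in LANG_TO_LIVY_KIND:
            raise ValueError("Unsupported language: {}".format(lang))
        self.lang = lang                     # what the user asked for
        self.kind = LANG_TO_LIVY_KIND[lang]  # what is sent to Livy

    def needs_encoding(self):
        # Assumption for illustration: only Python 2 output is encoded.
        return self.lang == "python"
```

With this shape, `Session("python")` and `Session("python3")` send the same kind to Livy but can still be told apart when deciding how to handle the JSON results.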

@gm-spacagna
Author

gm-spacagna commented Jan 28, 2019

I think this patch is not robust if you lose the reference to the httpclient instance or if you want to share the same session across clients.
Maybe we can store the language information in the config dictionary that is stored server-side. We could add a new property such as {"sparkmagic.language": "python3"}.
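A rough sketch of what that could look like on the client side. It assumes the session-creation request carries a `conf` dictionary and that a client can read the same dictionary back later; neither is verified here. The helper names `build_session_payload` and `language_from_conf` are hypothetical:

```python
import json

# Property name proposed in the comment above.
LANGUAGE_KEY = "sparkmagic.language"


def build_session_payload(kind, lang, conf=None):
    """Return a session-creation body with the language stored in conf."""
    merged = dict(conf or {})   # copy so the caller's dict is not mutated
    merged[LANGUAGE_KEY] = lang
    return {"kind": kind, "conf": merged}


def language_from_conf(conf, default="python"):
    """Recover the language from a conf dict, falling back to a default."""
    return conf.get(LANGUAGE_KEY, default)


payload = build_session_payload(
    "pyspark", "python3", {"spark.executor.memory": "2g"}
)
print(json.dumps(payload, sort_keys=True))
```

Because the language travels with the session configuration rather than with the client object, a second client attaching to the same session could call `language_from_conf` instead of relying on local state.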

@Tagar

Tagar commented Jan 28, 2019

FYI - in Livy 0.5+ the session kind is actually useless.

See this comment in LIVY-469

Starting from 0.5 session kind is actually useless, we should encourage user to not specify session kind, instead we should set code kind when submitting statement.

So this could be both a notebook-level property (just as a default for newly created cells), and a cell-level property.
For example, this opens up a way for something like this in sparkmagic:

%%spark -l python
# .. some pyspark code.. 
df = spark.sql("select * from universe")
%%spark -l scala 
// .. some Scala code.. 
var df = spark.sql("select * from universe")

on the same (shared) backend Spark session.
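As a sketch of what "set code kind when submitting statement" could mean for the client, the snippet below builds one request body per cell. The field names `code` and `kind` and the language-to-kind mapping are assumptions to be checked against the Livy REST documentation, and `build_statement_payload` is a hypothetical helper:

```python
import json

# Assumption: languages map onto the Livy kinds named earlier in this thread.
LANG_TO_KIND = {"python": "pyspark", "scala": "spark", "r": "sparkr"}


def build_statement_payload(code, lang):
    """Return a statement body that carries its own kind."""
    return {"code": code, "kind": LANG_TO_KIND[lang]}


# Two cells in different languages, destined for the same session.
cells = [
    ("python", 'df = spark.sql("select * from universe")'),
    ("scala", 'var df = spark.sql("select * from universe")'),
]
for lang, code in cells:
    print(json.dumps(build_statement_payload(code, lang), sort_keys=True))
```

The session itself stays language-neutral here; only the individual statements say how they should be interpreted.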

@kanastasiou4signifyd

Q? Any plans for this to be merged?

@bujol12

bujol12 commented Apr 1, 2019

^

Hi guys, any chance of this getting merged soon? I would really appreciate it if sparkmagic became compatible with Livy 0.5, as that is also the default AWS EMR ships with now.

Thanks!

@Tagar

Tagar commented Apr 1, 2019

Btw, Livy 0.6 was released last week https://goo.gl/h64tzY

This fix is still applicable for the 0.6 release.

@gm-spacagna - any chance this can be fixed soon?

Thanks!!

@xelibrion

Hey @apetresc, any chance of merging this PR in the near future? Compatibility with Livy 0.5+ seems to be desired by a fair few people.

This is necessary to test livy compatibility, because newer versions of
livy don't work without Hive support in the Spark session, which isn't
included in the default binary distribution.
@sangramga

@apetresc Can we merge this anytime soon?

@ashkan-leo

Hey @apetresc, any chance of merging this PR?

@Tagar

Tagar commented May 21, 2019

Adding the rest of the jupyter-incubator custodians from the https://github.com/jupyter-incubator page.
cc @jaipreet-s @Carreau @damianavila @jdfreder @parente @rgbkrk

Could somebody help with getting some of the jupyter-incubator/sparkmagic pull requests merged into master?

Sparkmagic has had broken Python 3 support for over a year, and doesn't support the several latest Apache Livy releases. Also, a lot of PRs have not received attention from committers since January.

@lrodgers36

Getting this merged soon would be great if that is possible!

rgbkrk requested a review from aggFTW May 28, 2019 10:15
@rgbkrk
Member

rgbkrk commented May 28, 2019

I currently do not use (nor deploy) sparkmagic for users, though I have been supportive of the folks leading it. If PRs are not being merged and support for Python3 is lacking, we may want to consider the livelihood of the project. I'll see about reaching out to maintainers.

@rgbkrk
Member

rgbkrk commented May 28, 2019

@gm-spacagna -- I want to thank you for making your first contribution to this repository. Before I take any action on this PR I'd like to see what @apetresc thinks, since they were the last to merge code and make a release.

@itamarst
Contributor

I now have the commit bit. Going to review this and see if I can get it (or some other variant) merged.

@itamarst
Contributor

itamarst commented Jun 19, 2019

Notes from skimming the PR and issue:

  1. While it's true that the newer Livy versions support a kind per command, as a first pass it seems useful to just get this working again, and worry about per-command kinds some other time.

  2. Most of the changes involve tracking the language (Python 2 or Python 3) throughout the various parts of the code. However, the only thing the language is used for is deciding how to encode the message. I suspect it's possible to come up with an encoding scheme that works in both Python 2 and Python 3, thus making most of these changes unnecessary (possibly that's what Fix enconding issue for Python3.x #538 does).

  3. Beyond the encoding issue, there remains the issue of updating the Docker configs (which this PR has code for) and ripping out the PYSPARK3 kind.
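To illustrate point 2, here is one possible version-agnostic helper. It is only a sketch under stated assumptions and is not claimed to be what #538 or any other PR does; the name `to_native_str` is hypothetical, and it assumes the value is always some kind of string:

```python
def to_native_str(value, encoding="utf-8"):
    """Return `value` as the interpreter's native `str` type.

    Python 2: unicode is encoded to a byte `str`; `str` passes through.
    Python 3: bytes are decoded to `str`; `str` passes through.
    Non-string inputs are not handled by this sketch.
    """
    if isinstance(value, str):
        return value                   # already native on either version
    if isinstance(value, bytes):
        return value.decode(encoding)  # only reachable on Python 3
    return value.encode(encoding)      # Python 2 unicode


print(to_native_str("caf\u00e9"))
```

Since the same function body runs unchanged on both interpreters, the calling code would not need to know which Python version produced the result.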

@itamarst
Contributor

This is now superseded by #540. Please test #540; if it works for people I will merge it!


Labels

kind:enhancement A new Sparkmagic feature
