Commit 84857b9

corivera authored and kevcunnane committed
Merge in latest changes from jupyter-incubator/sparkmagic (#3)
* Release 0.12.0 (jupyter-incubator#373)
* Make location of config.json file configurable using environment variables (jupyter-incubator#350)
* Make location of config.json file configurable using environment variables
* Update minor version to 0.11.3
* Fix column drop issue when first row has missing value (jupyter-incubator#353)
* Remove extra line
* initial fix of dropping columns
* add unit tests
* revert sql query test change
* revert sql query test change 2
* bump versions
* move outside if
* Adding a working Docker setup for developing sparkmagic (jupyter-incubator#361)
* Adding a working Docker setup for developing sparkmagic. It includes the Jupyter notebook as well as the Livy+Spark endpoint. Documentation is in the README.
* Pre-configure the ~/.sparkmagic/config.json. Now you can just launch a PySpark wrapper kernel and have it work out of the box.
* Add R to Livy container. Also added an R section to example_config.json to make it work out of the box - and I think it's just a good thing to have it anyway, otherwise how would users ever know it was meant to be there?
* Add more detail to the README container section
* Add dev_mode build-arg. Disabled by default. When enabled, builds the container using your local copy of sparkmagic, so that you can test your development changes inside the container.
* Adding missing kernels. Was missing Scala and Python2. Confirmed that Python2 and Python3 are indeed separate environments on the spark container.
* Kerberos authentication support (jupyter-incubator#355)
* Enabled kerberos authentication on sparkmagic and updated test cases.
* Enabled hide and show username/password based on auth_type.
* Updated as per comments.
* Updated documentation for kerberos support
* Added test cases to test backward compatibility of auth in handlers
* Update README.md: change layout and add build status
* Bump version to 0.12.0 (jupyter-incubator#365)
* Remove extra line
* bump version
* Optional coerce (jupyter-incubator#367)
* Remove extra line
* added optional configuration to have optional coercion
* fix circular dependency between conf and utils
* add gcc installation for dev build
* fix parsing bug for coerce value
* fix parsing bug for coerce value 2
* Automatically configure wrapper-kernel endpoints in widget (jupyter-incubator#362)
* Add pre-configured endpoints to endpoint widget automatically
* Fix crash on partially-defined kernel configurations
* Use LANGS_SUPPORTED constant to get list of possible kernel config sections
* Rename is_default attr to implicitly_added
* Adding blank line between imports and class declaration
* Log failure to connect to implicitly-defined endpoints
* Adding comment explaining implicitly_added
* Pass auth parameter through
* Fix hash and auth to include auth parameter (jupyter-incubator#370)
* Fix hash and auth to include auth parameter
* fix endpoint validation
* remove unnecessary commit
* Ability to add custom headers to HTTP calls (jupyter-incubator#371)
* Ability to add custom headers to rest call
* Fix import
* Add basic conf test
* Fix tests
* Add test
* Fix tests
* Fix indent
* Address review comments
* Add custom headers to example config
* Merge master to release (jupyter-incubator#390)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Merge master to release (jupyter-incubator#392)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Try to fix pypi repos (jupyter-incubator#391)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* Merge master to release (jupyter-incubator#394)
* Configurable retry for errors (jupyter-incubator#378)
* Remove extra line
* bumping versions
* configurable retry
* fix string
* Make statement and session waiting more responsive (jupyter-incubator#379)
* Remove extra line
* bumping versions
* make sleeping for sessions an exponential backoff
* fix bug
* Add vscode tasks (jupyter-incubator#383)
* Remove extra line
* bumping versions
* add vscode tasks
* Fix endpoints widget when deleting a session (jupyter-incubator#389)
* Remove extra line
* bumping versions
* add vscode tasks
* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly
* fix tests
* add unit tests
* refresh after cleanup
* Try to fix pypi repos (jupyter-incubator#391)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* Test 2.7.13 environment for pypi push to prod (jupyter-incubator#393)
* Remove extra line
* bumping versions
* add vscode tasks
* try to fix pypi new repos
* try to fix pip push for prod pypi by pinning to later version of python
* bump versions (jupyter-incubator#395)
* Release v0.12.6 (jupyter-incubator#481)
* Add python3 option in %manage_spark magic (jupyter-incubator#427). Fixes jupyter-incubator#420.
* Links fixed in README
* DataError in Pandas moved from core.groupby to core.base (jupyter-incubator#459)
* DataError in Pandas moved from core.groupby to core.base
* maintain backwards compatibility with Pandas 0.22 or lower for DataError
* Bump autoviz version to 0.12.6
* Fix unit test failure caused by un-spec'ed mock which fails traitlet validation (jupyter-incubator#480)
* Fix failing unit tests caused by an un-spec'ed mock in a test which fails traitlet validation
* Bump travis.yml Python3 version to 3.6. Python 3.3 is not only EOL'ed but is now actively unsupported by Tornado, which causes the Travis build to fail again.
* Bumping version numbers for hdijupyterutils and sparkmagic to keep them in sync
* add magic for matplotlib display
* repair
* Patch SparkMagic for latest IPythonKernel compatibility
  * **Description**
    * The IPython interface was updated to return an asyncio.Future rather than a dict from version 5.1.0. This broke SparkMagic as it still expects a dictionary from the output.
    * This change updates the SparkMagic base kernel to expect a Future and block on its result.
    * This also updates the dependencies to call out the new IPython version dependency.
  * **Testing Done**
    * Unit tests added
    * Validating that the kernel connects successfully
    * Validating some basic Spark additional operations on an EMR cluster.
* Fix decode json error at trailing empty line (jupyter-incubator#483)
* Bump version number to 0.12.7
* add a screenshot of an example for display matplot picture
* Fix guaranteed stack trace
* Simplify loop a bit
* We want to be able to interrupt the sleep, so move that outside the try / except
* Add missing session status to session.
* Correct to the correct URL with full list.
* Better tests.
* Switch to Livy 0.6.
* Sketch of removal of PYSPARK3.
* Don't allow selection of Python3, since it's not a separate thing now.
* __repr__ for easier debugging of test failures.
* Start fixing tests.
* Rip out more no-longer-relevant "Python 3" code. Python 3 and Python 2 work again.
* Changelog.
* Add progress bar to sparkmagic/sparkmagic/livyclientlib/command.py. Tested with livy 0.4-0.6, python2 and python3.
* Support Future and non-Future results from ipykernel.
* News entry.
* Unpin ipykernel so it works with Python 2.
* Python 3.7 support.
* Also update requirements.txt.
* Xenial has 3.7.
* from IPython.display import display to silence travis warning
* Couple missing entries.
* Update versions.
* Document release process, as I understand it.
* Correct file name.
* delete obsolete pyspark3kernel (jupyter-incubator#549)
* delete obsolete pyspark3kernel
* Update README.md
* Update setup.py
* Update test_kernels.py
* Remove old kernelspec installation from Dockerfile. This kernel was removed in jupyter-incubator#549 but the Dockerfile still tries to install it, which fails the build. This corrects that.
* Relax constraints even more, and make sure to relax them in duplicate locations.
* Don't assume some pre-populated tables, create a new table from the Docker image's examples.
* Note new feature.
* Additional dependencies for matplotlib to work.
* Add support and documentation for extension use, refactor kernel use.
* Example in pyspark kernel.
* Test for Command's decoding of images.
* Switch to plotly 3.
* Try to switch to standard mechanism Sparkmagic uses for displaying.
* Another entry.
* Add documentation for JupyterLab.
* Prepare for 0.12.9.
* Revert get_session_kind change to be more consistent with upstream repo.
* Remove redundant python3 session test.
* Remove python3 references in livysession.
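The ipykernel compatibility patch described above (accept either an asyncio.Future or a plain dict, and block until a dict is available) can be sketched as follows. This is an illustrative stand-in, not sparkmagic's actual API; `resolve_result` is a hypothetical name.

```python
# Minimal sketch of the compatibility shim: ipykernel >= 5.1 returns an
# asyncio.Future from execution requests, while older versions return a
# plain dict. The wrapper kernel can accept either and block on the
# Future's result, as the patch above describes.
import asyncio

def resolve_result(result, loop=None):
    """Return the reply dict, blocking if `result` is still a Future."""
    if asyncio.isfuture(result):
        loop = loop or asyncio.get_event_loop()
        # Block until the Future resolves, matching the old dict-based flow.
        return loop.run_until_complete(result)
    return result
```

With this shim, code written against the old dict-returning interface keeps working regardless of which ipykernel version produced the result.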
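Two of the commit items above ("make sleeping for sessions an exponential backoff" and "we want to be able to interrupt the sleep, so move that outside the try / except") describe one polling pattern; here is a hedged sketch of it. `poll_status` and `wait_until_idle` are illustrative names, not sparkmagic's real functions.

```python
# Poll session status with exponential backoff, keeping the sleep
# OUTSIDE the try/except so a KeyboardInterrupt raised during the wait
# is not swallowed as a transient poll error.
import time

def wait_until_idle(poll_status, start=0.1, cap=5.0, max_polls=50):
    interval = start
    for _ in range(max_polls):
        try:
            status = poll_status()
        except IOError:
            status = None  # transient error: back off and retry
        if status == "idle":
            return True
        time.sleep(interval)             # interruptible: not inside the try
        interval = min(interval * 2, cap)  # exponential backoff with a cap
    return False
```

The cap keeps long waits from degenerating into minutes-long sleeps, which is what makes the waiting "more responsive" to both completion and Ctrl-C.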
1 parent 4052be9 commit 84857b9

Some content is hidden: large commits have some content hidden by default.

42 files changed (+752, -696 lines)

.travis.yml

Lines changed: 2 additions & 0 deletions
@@ -1,7 +1,9 @@
+dist: xenial
 language: python
 python:
 - '2.7.13'
 - '3.6'
+- '3.7'
 install:
 - pip install six
 - pip install -r hdijupyterutils/requirements.txt -e hdijupyterutils

CHANGELOG.md

Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+# Changelog
+
+## 0.12.9
+
+### Features
+
+* Support server-side rendering of images, so you don't have to ship all the data to the client to do visualization—see the `%matplot` usage in the example notebook. Thanks to wangqiaoshi for the patch.
+* Progress bar for long running queries. Thanks to @juliusvonkohout.
+
+### Bug fixes
+
+* Work correctly with newer versions of the Jupyter notebook. Thanks to Jaipreet Singh for the patch, Eric Dill for testing, and G-Research for sponsoring Itamar Turner-Trauring's time.
+
+### Other changes
+
+* Switch to Plotly 3.
+
+## 0.12.8
+
+### Bug fixes:
+
+* Updated code to work with Livy 0.5 and later, where Python 3 support is not a different kind of session. Thanks to Gianmario Spacagna for contributing some of the code, and G-Research for sponsoring Itamar Turner-Trauring's time.
+* Fixed `AttributeError` on `None`, thanks to Eric Dill.
+* `recovering` session status won't cause a blow up anymore. Thanks to G-Research for sponsoring Itamar Turner-Trauring's time.
+

Dockerfile.jupyter

Lines changed: 5 additions & 1 deletion
@@ -19,6 +19,11 @@ COPY examples /home/jovyan/work
 COPY hdijupyterutils hdijupyterutils/
 COPY autovizwidget autovizwidget/
 COPY sparkmagic sparkmagic/
+
+USER root
+RUN chown -R $NB_USER .
+
+USER $NB_USER
 RUN if [ "$dev_mode" = "true" ]; then \
     cd hdijupyterutils && pip install . && cd ../ && \
     cd autovizwidget && pip install . && cd ../ && \
@@ -31,7 +36,6 @@ RUN sed -i 's/localhost/spark/g' /home/$NB_USER/.sparkmagic/config.json
 RUN jupyter nbextension enable --py --sys-prefix widgetsnbextension
 RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkkernel
 RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/pysparkkernel
-RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/pyspark3kernel
 RUN jupyter-kernelspec install --user $(pip show sparkmagic | grep Location | cut -d" " -f2)/sparkmagic/kernels/sparkrkernel
 RUN jupyter serverextension enable --py sparkmagic
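The kernelspec-install lines in this Dockerfile locate sparkmagic's install directory with `pip show sparkmagic | grep Location | cut -d" " -f2`. A minimal Python equivalent of that parsing, with sample `pip show` output hardcoded (the sample path is an assumption; the real path depends on the image):

```python
# Equivalent of `grep Location | cut -d" " -f2` over pip show output.
SAMPLE = "Name: sparkmagic\nLocation: /opt/conda/lib/python3.7/site-packages\n"

def install_location(pip_show_output):
    """Return the path after 'Location: ', or None if absent."""
    for line in pip_show_output.splitlines():
        if line.startswith("Location:"):
            return line.split(" ", 1)[1]
    return None
```

The returned directory is what the Dockerfile suffixes with `/sparkmagic/kernels/<name>` for each `jupyter-kernelspec install` call.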

Dockerfile.spark

Lines changed: 31 additions & 16 deletions
@@ -1,37 +1,52 @@
-FROM gettyimages/spark:2.1.0-hadoop-2.7
+FROM debian:stretch
 
 RUN apt-get update && apt-get install -yq --no-install-recommends --force-yes \
+    curl \
     git \
-    openjdk-7-jdk \
+    openjdk-8-jdk \
     maven \
-    python2.7 \
-    python3.4 \
+    python2.7 python2.7-setuptools \
+    python3 python3-setuptools \
     r-base \
    r-base-core && \
     rm -rf /var/lib/apt/lists/*
 
+RUN easy_install3 pip py4j
 RUN pip install --upgrade setuptools
 
-ENV LIVY_BUILD_VERSION livy-server-0.3.0
-ENV LIVY_APP_PATH /apps/$LIVY_BUILD_VERSION
-ENV LIVY_BUILD_PATH /apps/build/livy
-ENV PYSPARK_PYTHON python2.7
-ENV PYSPARK3_PYTHON python3.4
+ENV PYTHONHASHSEED 0
+ENV PYTHONIOENCODING UTF-8
+ENV PIP_DISABLE_PIP_VERSION_CHECK 1
 
+ENV SPARK_BUILD_VERSION 2.3.3
+ENV SPARK_HOME /apps/spark-$SPARK_BUILD_VERSION
+ENV SPARK_BUILD_PATH /apps/build/spark
 RUN mkdir -p /apps/build && \
-    cd /apps/build && \
-    git clone https://github.com/cloudera/livy.git && \
+    cd /apps/build && \
+    git clone https://github.com/apache/spark.git spark && \
+    cd $SPARK_BUILD_PATH && \
+    git checkout v$SPARK_BUILD_VERSION && \
+    dev/make-distribution.sh --name spark-$SPARK_BUILD_VERSION -Phive -Phive-thriftserver -Pyarn && \
+    cp -r /apps/build/spark/dist $SPARK_HOME && \
+    rm -rf $SPARK_BUILD_PATH
+
+ENV LIVY_BUILD_VERSION 0.6.0-incubating
+ENV LIVY_APP_PATH /apps/apache-livy-$LIVY_BUILD_VERSION-bin
+ENV LIVY_BUILD_PATH /apps/build/livy
+RUN cd /apps/build && \
+    git clone https://github.com/apache/incubator-livy.git livy && \
     cd $LIVY_BUILD_PATH && \
-    git checkout v0.3.0 && \
-    mvn -DskipTests -Dspark.version=$SPARK_VERSION clean package && \
+    git checkout v$LIVY_BUILD_VERSION-rc2 && \
+    mvn -DskipTests -Dspark.version=$SPARK_BUILD_VERSION clean package && \
     ls -al $LIVY_BUILD_PATH && ls -al $LIVY_BUILD_PATH/assembly && ls -al $LIVY_BUILD_PATH/assembly/target && \
-    unzip $LIVY_BUILD_PATH/assembly/target/$LIVY_BUILD_VERSION.zip -d /apps && \
+    unzip $LIVY_BUILD_PATH/assembly/target/apache-livy-${LIVY_BUILD_VERSION}-bin.zip -d /apps && \
     rm -rf $LIVY_BUILD_PATH && \
     mkdir -p $LIVY_APP_PATH/upload && \
     mkdir -p $LIVY_APP_PATH/logs
 
+RUN pip install matplotlib
+RUN pip install pandas
 
 EXPOSE 8998
 
-CMD ["/apps/livy-server-0.3.0/bin/livy-server"]
-
+CMD $LIVY_APP_PATH/bin/livy-server
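The diff above replaces the old `livy-server-0.3.0` layout with Apache Livy's `apache-livy-<version>-bin` layout, and the new `CMD` is composed from the `ENV` variables rather than hardcoded. Replaying that composition in plain code makes the resulting paths explicit:

```python
# Same string composition the Dockerfile's ENV/CMD lines perform.
LIVY_BUILD_VERSION = "0.6.0-incubating"
LIVY_APP_PATH = "/apps/apache-livy-%s-bin" % LIVY_BUILD_VERSION
LIVY_SERVER = LIVY_APP_PATH + "/bin/livy-server"   # what CMD now runs
```

Because `CMD` references `$LIVY_APP_PATH`, a future Livy version bump only needs to touch `LIVY_BUILD_VERSION`, unlike the old hardcoded `CMD ["/apps/livy-server-0.3.0/bin/livy-server"]`.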

README.md

Lines changed: 12 additions & 7 deletions
@@ -9,14 +9,16 @@ The Sparkmagic project includes a set of magics for interactively running Spark
 
 ![Automatic visualization](screenshots/autoviz.png)
 
+![Server-side visualization](screenshots/matplotlib.png)
+
 ![Help](screenshots/help.png)
 
 ## Features
 
 * Run Spark code in multiple languages against any remote Spark cluster through Livy
 * Automatic SparkContext (`sc`) and HiveContext (`sqlContext`) creation
 * Easily execute SparkSQL queries with the `%%sql` magic
-* Automatic visualization of SQL queries in the PySpark, PySpark3, Spark and SparkR kernels; use an easy visual interface to interactively construct visualizations, no code required
+* Automatic visualization of SQL queries in the PySpark, Spark and SparkR kernels; use an easy visual interface to interactively construct visualizations, no code required
 * Easy access to Spark application information and logs (`%%info` magic)
 * Ability to capture the output of SQL queries as Pandas dataframes to interact with other Python libraries (e.g. matplotlib)
 * Authenticate to Livy via Basic Access authentication or via Kerberos
@@ -43,20 +45,23 @@ See [Pyspark](examples/Pyspark%20Kernel.ipynb) and [Spark](examples/Spark%20Kern
 2. Make sure that ipywidgets is properly installed by running
 
        jupyter nbextension enable --py --sys-prefix widgetsnbextension
-
-3. (Optional) Install the wrapper kernels. Do `pip show sparkmagic` and it will show the path where `sparkmagic` is installed at. `cd` to that location and do:
+
+3. If you're using JupyterLab, you'll need to run another command:
+
+       jupyter labextension install @jupyter-widgets/jupyterlab-manager
+
+4. (Optional) Install the wrapper kernels. Do `pip show sparkmagic` and it will show the path where `sparkmagic` is installed at. `cd` to that location and do:
 
        jupyter-kernelspec install sparkmagic/kernels/sparkkernel
        jupyter-kernelspec install sparkmagic/kernels/pysparkkernel
-       jupyter-kernelspec install sparkmagic/kernels/pyspark3kernel
       jupyter-kernelspec install sparkmagic/kernels/sparkrkernel
 
-4. (Optional) Modify the configuration file at ~/.sparkmagic/config.json. Look at the [example_config.json](sparkmagic/example_config.json)
+5. (Optional) Modify the configuration file at ~/.sparkmagic/config.json. Look at the [example_config.json](sparkmagic/example_config.json)
 
-5. (Optional) Enable the server extension so that clusters can be programatically changed:
+6. (Optional) Enable the server extension so that clusters can be programatically changed:
 
       jupyter serverextension enable --py sparkmagic
 
 ## Authentication Methods
 
 Sparkmagic supports:

RELEASING.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+# How to release
+
+1. Update versions in all three `__init__.py` files.
+2. Merge `master` into `release`.
+3. Tag on `release`.
+
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = '0.12.6'
+__version__ = '0.12.9'

autovizwidget/autovizwidget/plotlygraphs/graphbase.py

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 # Copyright (c) 2015 [email protected]
 # Distributed under the terms of the Modified BSD License.
 
-from plotly.graph_objs import Figure, Data, Layout
+from plotly.graph_objs import Figure, Layout
 from plotly.offline import iplot
 try:
     from pandas.core.base import DataError
@@ -34,7 +34,7 @@ def render(self, df, encoding, output):
 
         with output:
             try:
-                fig = Figure(data=Data(data), layout=layout)
+                fig = Figure(data=data, layout=layout)
                 iplot(fig, show_link=False)
             except TypeError:
                 print("\n\n\nPlease select another set of X and Y axis, because the type of the current axis do\n"

autovizwidget/autovizwidget/plotlygraphs/piegraph.py

Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 # Copyright (c) 2015 [email protected]
 # Distributed under the terms of the Modified BSD License.
 
-from plotly.graph_objs import Pie, Figure, Data
+from plotly.graph_objs import Pie, Figure
 from plotly.offline import iplot
 try:
     from pandas.core.base import DataError
@@ -48,7 +48,7 @@ def render(df, encoding, output):
     else:
         data = [Pie(values=values, labels=labels)]
 
-    fig = Figure(data=Data(data))
+    fig = Figure(data=data)
     iplot(fig, show_link=False)
 
 @staticmethod

autovizwidget/requirements.txt

Lines changed: 3 additions & 3 deletions
@@ -1,5 +1,5 @@
-plotly>=1.10.0,<3.0
-ipywidgets>5.0.0,<8.0
+plotly>=3
+ipywidgets>5.0.0
 hdijupyterutils>=0.6
-notebook>=4.2,<6.0
+notebook>=4.2
 pandas>=0.20.1
