Conversation

@apetresc (Member) commented Jun 2, 2017:

This resolves #359. Basically, it pre-populates the endpoint widgets with any valid configurations found in the ~/.sparkmagic/config.json for any of the wrapper kernels. This just removes a step from the workflow in cases where you've set up config.json for any other kernels.

Tested and works in the Docker environment, and my own deployment 😄
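
For context, the widget reads the per-kernel credential sections of ~/.sparkmagic/config.json. Below is a minimal sketch of what such a section looks like, limited to the keys the new code actually checks (url, username, password); a real config.json contains more settings, and the localhost URL shown is just the stock default discussed later in this thread:

{
  "kernel_python_credentials": {
    "username": "",
    "password": "",
    "url": "http://localhost:8998"
  },
  "kernel_scala_credentials": {
    "username": "",
    "password": "",
    "url": "http://localhost:8998"
  }
}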

@staticmethod
def _get_default_endpoints():
    default_endpoints = set()
    kernel_types = ['python', 'python3', 'scala', 'r']

Contributor:

This list should be in the constants file.

Member Author:

Ah beautiful, thanks!

Member Author:

Fixed
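
For reference, a minimal sketch of the constants-based version; LANGS_SUPPORTED is the name mentioned in the changelog later in this thread, while the module path is an assumption:

# sparkmagic/utils/constants.py (path assumed)
LANGS_SUPPORTED = ['python', 'python3', 'scala', 'r']

# in the widget, instead of a hard-coded list:
for kernel_type in constants.LANGS_SUPPORTED:
    ...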

kernel_types = ['python', 'python3', 'scala', 'r']

for kernel_type in kernel_types:
    endpoint_config = getattr(conf, 'kernel_%s_credentials' % kernel_type)()

Contributor:

The defaults will return localhost:8998 for the URL, making this check always add the endpoints, even when they're not configured. I'm trying to come up with a way to tell for sure whether the user configured the kernel endpoints, and failing to...
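
To illustrate the ambiguity (the returned values are sketched from the stock defaults described above, not taken from a real deployment):

# With nothing configured, the accessor still falls back to the shipped defaults,
# which look identical to a user explicitly pointing a kernel at localhost:
conf.kernel_python_credentials()
# -> {'username': '', 'password': '', 'url': 'http://localhost:8998'}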

Contributor:

I think that's why you added the is_default field. Is that so?

Member Author:

Yes, exactly. I want to treat errors connecting to the endpoint differently based on whether the endpoint was explicitly added, or just added through this default mechanism. If they added it themselves, throwing an error to stdout is okay, but otherwise I think it should just be silently dropped from the list.

Contributor:

Should we call the field implicitly_added?

Member Author:

Renamed

from sparkmagic.controllerwidget.abstractmenuwidget import AbstractMenuWidget

from sparkmagic.livyclientlib.exceptions import HttpClientException

Contributor:

Please add an extra line here (2 lines in between sections)

Member Author:

Whoops, done

except HttpClientException:
    # If we can't reach one of the default endpoints, just skip over it
    # silently
    if not endpoint.is_default:

Contributor:

Please log the error here too.

Member Author:

Okay, I don't mind logging it here too, but I just want to make sure it's a good idea. As you guessed, the problem I'm trying to fix is someone copying in a default example_config.json and getting strange errors about an endpoint they aren't even aware of.

I think the current behavior is good: if they did explicitly configure an endpoint in config.json that cannot be connected to, they silently won't see it in the default endpoint list – then when they explicitly add it, they see the connection error and understand what needs to be fixed. The alternative seems worse.

aggFTW (Contributor) commented Jun 5, 2017:

I think logging would be a good idea for the case when it's silently omitted from the list and the user files an issue saying it's being omitted. We will then be able to look at the log file and explain what's going on. Logging shouldn't interfere with the UX, since it's just writing to a file/logging system.

Member Author:

Fair enough. Added a log to that effect. I hope SparkLogger was the right thing to use here? It seems to just be a simple subclass of the hdijupyterutils Log object, but the name confused me... made me think it somehow logged things on/about the Spark cluster.
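
For readers following along, a minimal sketch of the behaviour being discussed; _probe_endpoint and self.logger are hypothetical stand-ins, not the PR's actual helper names:

def _reachable_endpoints(self, endpoints):
    reachable = []
    for endpoint in endpoints:
        try:
            self._probe_endpoint(endpoint)  # hypothetical connectivity check
            reachable.append(endpoint)
        except HttpClientException as e:
            if endpoint.implicitly_added:
                # The endpoint came from config.json defaults: drop it from the
                # list silently, but record the failure so it can be diagnosed
                # from the log if a user reports the omission.
                self.logger.error(u"Skipping default endpoint %s: %s" % (endpoint.url, e))
            else:
                # The user added this endpoint explicitly, so surface the error.
                raise
    return reachable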

Contributor:

No, that's the right one 👍

        self._max_retries = max_retries

    def should_retry(self, status_code, error, retry_count):
        if None in (status_code, retry_count):

Contributor:

How could this happen?

Member Author:

It was happening to me when connecting to an endpoint that didn't exist. requests.get would return None for status_code, and stdout would show strange, opaque errors about NoneType from linearretrypolicy.py, which isn't very descriptive to the end user. I can fetch a screenshot of the situation if you'd like :)
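
A sketch of the guard in context; the constructor argument and everything after the None check are placeholders for the real retry policy, not the actual implementation:

class LinearRetryPolicy(object):
    def __init__(self, seconds_to_sleep, max_retries):
        self._seconds_to_sleep = seconds_to_sleep
        self._max_retries = max_retries

    def should_retry(self, status_code, error, retry_count):
        # requests can hand back no status code at all when the endpoint does
        # not exist, so refuse to retry instead of crashing on a NoneType
        # comparison further down.
        if None in (status_code, retry_count):
            return False
        return retry_count < self._max_retries  # placeholder retry condition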

Contributor:

Got it. Thanks!

self.username = username
self.password = password
self.authenticate = False if username == '' and password == '' else True
self.is_default = is_default

Contributor:

Please add comments as to when is_default should be true and when it should be false, and what the implications are.

Member Author:

Will do!

Member Author:

Done.
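
For context, roughly what the documented constructor could look like after this change; the signature and comment wording are illustrative, not copied from the merged code:

class Endpoint(object):
    def __init__(self, url, username='', password='', implicitly_added=False):
        self.url = url
        self.username = username
        self.password = password
        self.authenticate = False if username == '' and password == '' else True
        # implicitly_added is True only for endpoints picked up automatically
        # from the wrapper-kernel sections of config.json. Connection failures
        # for those are logged and the endpoint is silently dropped, whereas an
        # endpoint the user added explicitly surfaces the error.
        self.implicitly_added = implicitly_added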

aggFTW (Contributor) left a comment:

Very minor comments. Thanks for the PR! :)

@apetresc (Member Author) commented Jun 5, 2017:

Thank you for the comments. I'm out most of the day so I won't get a chance to actually commit the changes until later tonight, but I will get to it shortly! 😄

@aggFTW (Contributor) commented Jun 5, 2017:

No worries, thanks 😄

for kernel_type in kernel_types:
    endpoint_config = getattr(conf, 'kernel_%s_credentials' % kernel_type)()
    if all([p in endpoint_config for p in ["url", "password", "username"]]) and endpoint_config["url"] != "":
        default_endpoints.add(Endpoint(

Contributor:

Endpoint now has an auth parameter after the Kerberos PR was merged, and the widget should have an extra dropdown now.
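
A sketch of threading the new auth parameter through; the NO_AUTH fallback constant and the keyword-argument names are assumptions:

for kernel_type in constants.LANGS_SUPPORTED:
    endpoint_config = getattr(conf, 'kernel_%s_credentials' % kernel_type)()
    if all([p in endpoint_config for p in ["url", "password", "username"]]) \
            and endpoint_config["url"] != "":
        default_endpoints.add(Endpoint(
            url=endpoint_config["url"],
            username=endpoint_config["username"],
            password=endpoint_config["password"],
            # Older config.json files predate the Kerberos PR, so fall back to
            # no authentication when the key is absent (assumed constant name).
            auth=endpoint_config.get("auth", constants.NO_AUTH),
            implicitly_added=True))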

@aggFTW (Contributor) commented Jun 7, 2017:

Any news on the PR? I'd like to push sparkmagic 0.12 soon with this goodness on it 😃
If not possible, that's fine, and we can always ship it as 0.12.1 or something like that.

@aggFTW (Contributor) commented Jun 14, 2017:

Gentle ping

@apetresc (Member Author):

Sorry again for the delay - was out of town and didn't find time until I got back. All comments addressed, I hope :) Manual tests still look good.

@apetresc (Member Author):

Ah, I will up-merge with master too, one minute...

@apetresc (Member Author):

Upmerged with master. Everything still works, although I've done zero testing with the new Kerberos auth parameter, mainly because I have no idea how Kerberos works 😨 I can dive into it if needed; it should probably be added to the Docker image anyway if it's going to be supported going forward.

aggFTW merged commit 9c61ff9 into jupyter-incubator:master on Jun 20, 2017.

@aggFTW (Contributor) commented Jun 20, 2017:

Thanks!

apetresc deleted the configure-default-endpoint branch on June 21, 2017 at 12:59.
aggFTW added a commit that referenced this pull request Jun 23, 2017
* Make location of config.json file configurable using environment variables (#350)

* Make location of config.json file configurable using environment variables

* Update minor version to 0.11.3

* Fix column drop issue when first row has missing value (#353)

* Remove extra line

* initial fix of dropping columns

* add unit tests

* revert sql query test change

* revert sql query test change 2

* bump versions

* move outside if

* Adding a working Docker setup for developing sparkmagic (#361)

* Adding a working Docker setup for developing sparkmagic

It includes the Jupyter notebook as well as the Livy+Spark endpoint.
Documentation is in the README

* Pre-configure the ~/.sparkmagic/config.json

Now you can just launch a PySpark wrapper kernel and have it work
out of the box.

* Add R to Livy container

Also added an R section to example_config.json to make it work
out of the box - and I think it's just a good thing to have it
anyway, otherwise how would users ever know it was meant to be
there?

* Add more detail to the README container section

* Add dev_mode build-arg.

Disabled by default. When enabled, builds the container using your local
copy of sparkmagic, so that you can test your development changes inside
the container.

* Adding missing kernels

Was missing Scala and Python2. Confirmed that Python2 and
Python3 are indeed separate environments on the spark
container.

* Kerberos authentication support (#355)

* Enabled kerberos authentication on sparkmagic and updated test cases.

* Enabled hide and show username/password based on auth_type.

* Updated as per comments.

* Updated documentation for kerberos support

* Added test cases to test backward compatibility of auth in handlers

* Update README.md

Change layout and add build status

* Bump version to 0.12.0 (#365)

* Remove extra line

* bump version

* Optional coerce (#367)

* Remove extra line

* added optional configuration to have optional coercion

* fix circular dependency between conf and utils

* add gcc installation for dev build

* fix parsing bug for coerce value

* fix parsing bug for coerce value 2

* Automatically configure wrapper-kernel endpoints in widget (#362)

* Add pre-configured endpoints to endpoint widget automatically

* Fix crash on partially-defined kernel configurations

* Use LANGS_SUPPORTED constant to get list of possible kernel config sections

* Rename is_default attr to implicitly_added

* Adding blank line between imports and class declaration

* Log failure to connect to implicitly-defined endpoints

* Adding comment explaining implicitly_added

* Pass auth parameter through

* Fix hash and auth to include auth parameter (#370)

* Fix hash and auth to include auth parameter

* fix endpoint validation

* remove unnecessary commit

* Ability to add custom headers to HTTP calls (#371)

* Ability to add custom headers to REST call

* Fix import

* Add basic conf test

* Fix tests

* Add test

* Fix tests

* Fix indent

* Address review comments

* Add custom headers to example config
aggFTW added a commit that referenced this pull request Jun 27, 2017
* Remove extra line

* Release 0.12.0 (#373)

* bumping versions
kevcunnane pushed a commit to kevcunnane/sparkmagic that referenced this pull request Aug 22, 2019
* Release 0.12.0 (jupyter-incubator#373)

* Merge master to release (jupyter-incubator#390)

* Configurable retry for errors (jupyter-incubator#378)

* Remove extra line

* bumping versions

* configurable retry

* fix string

* Make statement and session waiting more responsive (jupyter-incubator#379)

* Remove extra line

* bumping versions

* make sleeping for sessions an exponential backoff

* fix bug

* Add vscode tasks (jupyter-incubator#383)

* Remove extra line

* bumping versions

* add vscode tasks

* Fix endpoints widget when deleting a session (jupyter-incubator#389)

* Remove extra line

* bumping versions

* add vscode tasks

* fix deleting from endpoint widget, add notebooks to docker file, refresh correctly, populate endpoints correctly

* fix tests

* add unit tests

* refresh after cleanup

* Merge master to release (jupyter-incubator#392)

* Try to fix pypi repos (jupyter-incubator#391)

* try to fix pypi new repos

* Merge master to release (jupyter-incubator#394)

* Test 2.7.13 environment for pypi push to prod (jupyter-incubator#393)

* try to fix pip push for prod pypi by pinning to later version of python

* bump versions (jupyter-incubator#395)

* Release v0.12.6 (jupyter-incubator#481)

* Add python3 option in %manage_spark magic (jupyter-incubator#427)

Fixes jupyter-incubator#420

* Links fixed in README

* DataError in Pandas moved from core.groupby to core.base (jupyter-incubator#459)

* DataError in Pandas moved from core.groupby to core.base

* maintain backwards compatibility with Pandas 0.22 or lower for DataError

* Bump autoviz version to 0.12.6

* Fix unit test failure caused by un-spec'ed mock which fails traitlet validation (jupyter-incubator#480)

* Fix failing unit tests

Caused by an un-spec'ed mock in a test which fails traitlet validation

* Bump travis.yml Python3 version to 3.6

Python 3.3 is not only EOL'ed but is now actively unsupported by Tornado, which causes
the Travis build to fail again.

* Bumping version numbers for hdijupyterutils and sparkmagic to keep them in sync

* add magic for matplotlib display

* repair

* Patch SparkMagic for latest IPythonKernel compatibility

**Description**
* The IPython interface was updated to return an asyncio.Future rather than a dict as of version 5.1.0. This broke SparkMagic because it still expects a dictionary as the output
* This change updates the SparkMagic base kernel to expect a Future and block on its result.
* This also updates the dependencies to call out the new IPython version dependency.

**Testing Done**
* Unit tests added
* Validating that the kernel connects successfully
* Validating some basic Spark additional operations on an EMR cluster.

* Fix decode json error at trailing empty line (jupyter-incubator#483)

* Bump version number to 0.12.7

* add a screenshot of an example for displaying a matplotlib picture

* Fix guaranteed stack trace

* Simplify loop a bit

* We want to be able to interrupt the sleep, so move that outside the try / except

* Add missing session status to session.

* Correct to correct URL with full list.

* Better tests.

* Switch to Livy 0.6.

* Sketch of removal of PYSPARK3.

* Don't allow selection of Python3, since it's not a separate thing now.

* __repr__ for easier debugging of test failures.

* Start fixing tests.

* Rip out more no-longer-relevant "Python 3" code. Python 3 and Python 2 work again.

* Changelog.

* Add progress bar to sparkmagic/sparkmagic/livyclientlib/command.py. Tested with livy 0.4-0.6, python2 and python3

* Support Future and non-Future results from ipykernel.

* News entry.

* Unpin ipykernel so it works with Python 2.

* Python 3.7 support.

* Also update requirements.txt.

* Xenial has 3.7.

* from IPython.display import display to silence travis warning

* Couple missing entries.

* Update versions.

* Document release process, as I understand it.

* Correct file name.

* delete obsolete pyspark3kernel (jupyter-incubator#549)

* delete obsolete pyspark3kernel

* Update README.md

* Update setup.py

* Update test_kernels.py

* Remove old kernelspec installation from Dockerfile

This kernel was removed in jupyter-incubator#549 but the Dockerfile still tries to
install it, which fails the build. This corrects that.

* Relax constraints even more, and make sure to relax them in duplicate locations.

* Don't assume some pre-populated tables, create a new table from the Docker image's examples.

* Note new feature.

* Additional dependencies for matplotlib to work.

* Add support and documentation for extension use, refactor kernel use.

* Example in pyspark kernel.

* Test for Command's decoding of images.

* Switch to plotly 3.

* Try to switch to standard mechanism Sparkmagic uses for displaying.

* Another entry.

* Add documentation for JupyterLab.

* Prepare for 0.12.9.

* Revert get_session_kind change to be more consistent with upstream repo.

* Remove redundant python3 session test.

* Remove python3 references in livysession.

Linked issue (resolved by this PR): Automatically add endpoint for the one configured in kernel_*_credentials (#359)