OAuth implementation #15

Merged
merged 58 commits into from
Sep 14, 2022
Changes from 43 commits
5581ad6
Reformat changelog (#11)
Jun 29, 2022
68a1903
oauth implementation initial work
moderakh Jul 12, 2022
20e888b
responde to code review comments
moderakh Jul 13, 2022
cff24d9
Update src/databricks/sql/auth/authenticators.py
moderakh Jul 13, 2022
d99af2d
Update src/databricks/sql/auth/authenticators.py
moderakh Jul 13, 2022
09986ab
Update src/databricks/sql/auth/authenticators.py
moderakh Jul 13, 2022
a703f58
responded to review comments
moderakh Jul 13, 2022
70975fe
added unit tests for legacy auth providers (PAT, User/Pass)
moderakh Jul 14, 2022
bccf869
added more tests
moderakh Jul 14, 2022
bedc274
replaced click with logging
moderakh Jul 14, 2022
0bde8fe
responded to code review comments
moderakh Jul 14, 2022
a1ffcec
made use of quotes consitent on oauth.py
moderakh Jul 14, 2022
681a64b
addressed f string related comments
moderakh Jul 14, 2022
f8bb7e9
removed client method
moderakh Jul 14, 2022
774c882
responded to review comments
moderakh Jul 14, 2022
4a2ca6f
added requests as an explicit dependency as that's required by oauth
moderakh Jul 14, 2022
61f2ec0
responded to review comments
moderakh Jul 14, 2022
adce3f3
responded to review comments
moderakh Jul 14, 2022
dcec176
cleanup
moderakh Jul 14, 2022
9642692
cleanup
moderakh Jul 15, 2022
964ddd2
added support for persistence
moderakh Aug 24, 2022
b062674
Add e2e tests (#12)
susodapop Jul 15, 2022
1f30744
Indicate that Python 3.10 is not supported (#27)
Aug 1, 2022
e851ea4
Add Developer Certificate of Origin requirement (#13)
Aug 1, 2022
63894d7
Retry attempts that fail due to a connection timeout (#24)
Aug 5, 2022
36c6f4d
Bump to v2.0.3 (#28)
Aug 5, 2022
78015e5
Bump version to 2.0.4-dev (#29)
Aug 10, 2022
cdef1a5
[PECO-197] Support Python 3.10 (#31)
dbaxa Aug 17, 2022
39efd0a
Update changelog and bump to v2.0.4 (#34)
Aug 17, 2022
643425f
Bump to 2.0.5-dev on main (#35)
Aug 19, 2022
69b7b8c
On Pypi, display the "Project Links" sidebar. (#36)
Aug 19, 2022
23cd570
[ES-402013] Close cursors before closing connection (#38)
Aug 23, 2022
353f413
Bump version to 2.0.5 and improve CHANGELOG (#40)
Aug 23, 2022
737fd7e
minor fixes
moderakh Aug 24, 2022
2274c3f
Merge remote-tracking branch 'databricks/main' into PECO-188
moderakh Aug 24, 2022
9e40f18
fixed token refresh and persistent
moderakh Aug 25, 2022
f9cb9f9
updated comment
moderakh Aug 25, 2022
b6e1fde
added type annotation
moderakh Aug 25, 2022
2a69715
addressed code review comment (use python3 api)
moderakh Aug 25, 2022
beb1f64
made http request handler an independent class, removed global arg, r…
moderakh Aug 25, 2022
930cea8
added pull request ci trigger
moderakh Aug 26, 2022
8af7b5f
restructured the code as class
moderakh Aug 26, 2022
dac3c1e
fixed test_thrift_backend.py tests
moderakh Aug 26, 2022
0fc1684
cleanup
moderakh Aug 29, 2022
4fe5a58
Update src/databricks/sql/experimental/oauth_persistence.py
moderakh Aug 29, 2022
8ab4e3f
cleanup
moderakh Aug 29, 2022
fb012b0
cleanup
moderakh Aug 29, 2022
6881226
cleanup
moderakh Aug 29, 2022
3e537d5
cleanup
moderakh Aug 29, 2022
00c403d
cleanup
moderakh Aug 30, 2022
e64df63
Update src/databricks/sql/auth/thrift_http_client.py
moderakh Aug 30, 2022
ef385e9
cleanup
moderakh Aug 31, 2022
27fb3b5
cleanup
moderakh Aug 31, 2022
0a6c455
cleanup
moderakh Aug 31, 2022
16aa44e
moved access_token out of kwargs
moderakh Sep 6, 2022
be19297
added hostname to the persitence api
moderakh Sep 6, 2022
367a3ee
responded to review comments
moderakh Sep 14, 2022
8476673
fix lint
moderakh Sep 14, 2022
2 changes: 1 addition & 1 deletion .github/workflows/code-quality-checks.yml
@@ -1,5 +1,5 @@
 name: Code Quality Checks
-on: [push]
+on: [pull_request, push]
 jobs:
   run-unit-tests:
     runs-on: ubuntu-latest
299 changes: 135 additions & 164 deletions poetry.lock

Large diffs are not rendered by default.

2 changes: 2 additions & 0 deletions pyproject.toml
@@ -13,6 +13,8 @@ python = "^3.7.1"
 thrift = "^0.13.0"
 pandas = "^1.3.0"
 pyarrow = "^9.0.0"
+requests=">2.18.1"
+oauthlib=">=3.1.0"

 [tool.poetry.dev-dependencies]
 pytest = "^7.1.2"
5 changes: 2 additions & 3 deletions src/databricks/sql/__init__.py
@@ -44,7 +44,6 @@ def TimestampFromTicks(ticks):
     return Timestamp(*time.localtime(ticks)[:6])


-def connect(server_hostname, http_path, access_token, **kwargs):
+def connect(server_hostname, http_path, experimental_oauth_persistence=None, **kwargs):
     from .client import Connection

-    return Connection(server_hostname, http_path, access_token, **kwargs)
+    return Connection(server_hostname, http_path, experimental_oauth_persistence, **kwargs)
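The new `experimental_oauth_persistence` argument is handed through to `Connection`. Judging only from the calls made in `authenticators.py` in this PR (`read()`, `persist(token)`, and the token getters `get_access_token()` / `get_refresh_token()`), a persistence object could be sketched roughly as below. `SketchToken` and `InMemoryPersistence` are hypothetical stand-ins invented for illustration; they are not the classes defined in `oauth_persistence.py`, which this diff view does not show.

```python
from typing import Optional


class SketchToken:
    """Hypothetical stand-in for OAuthToken: exposes the two getters the provider calls."""

    def __init__(self, access_token: str, refresh_token: str):
        self._access_token = access_token
        self._refresh_token = refresh_token

    def get_access_token(self) -> str:
        return self._access_token

    def get_refresh_token(self) -> str:
        return self._refresh_token


class InMemoryPersistence:
    """Hypothetical persistence object: keeps only the most recent token, in memory."""

    def __init__(self) -> None:
        self._token: Optional[SketchToken] = None

    def persist(self, token: SketchToken) -> None:
        # Overwrite whatever was stored before; the provider persists after each refresh.
        self._token = token

    def read(self) -> Optional[SketchToken]:
        # None means "nothing saved yet".
        return self._token


store = InMemoryPersistence()
assert store.read() is None
store.persist(SketchToken("access-123", "refresh-456"))
saved = store.read()
print(saved.get_access_token(), saved.get_refresh_token())
```

An in-memory store like this only avoids repeated logins within one process; surviving restarts would need a file-backed or keyring-backed variant.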
21 changes: 21 additions & 0 deletions src/databricks/sql/auth/__init__.py
@@ -0,0 +1,21 @@
# Copyright 2022 Databricks, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"), except
# that the use of services to which certain application programming
# interfaces (each, an "API") connect requires that the user first obtain
# a license for the use of the APIs from Databricks, Inc. ("Databricks"),
# by creating an account at www.databricks.com and agreeing to either (a)
# the Community Edition Terms of Service, (b) the Databricks Terms of
# Service, or (c) another written agreement between Licensee and Databricks
# for the use of the APIs.
#
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
96 changes: 96 additions & 0 deletions src/databricks/sql/auth/auth.py
@@ -0,0 +1,96 @@
# Copyright 2022 Databricks, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"), except
# that the use of services to which certain application programming
# interfaces (each, an "API") connect requires that the user first obtain
# a license for the use of the APIs from Databricks, Inc. ("Databricks"),
# by creating an account at www.databricks.com and agreeing to either (a)
# the Community Edition Terms of Service, (b) the Databricks Terms of
# Service, or (c) another written agreement between Licensee and Databricks
# for the use of the APIs.
#
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from typing import List
from enum import Enum
from databricks.sql.auth.authenticators import CredentialsProvider, \
    AccessTokenAuthProvider, BasicAuthProvider, DatabricksOAuthProvider
from databricks.sql.experimental.oauth_persistence import OAuthPersistence


class AuthType(Enum):
    DATABRICKS_OAUTH = "databricks-oauth"
    # other supported types (access_token, user/pass) can be inferred
    # we can add more types as needed later


class ClientContext:
    def __init__(self,
                 hostname: str,
                 username: str = None,
                 password: str = None,
                 access_token: str = None,
                 auth_type: str = None,
                 oauth_scopes: List[str] = None,
                 oauth_client_id: str = None,
                 oauth_redirect_port_range: List[int] = None,
                 use_cert_as_auth: str = None,
                 tls_client_cert_file: str = None,
                 oauth_persistence=None
                 ):
        self.hostname = hostname
        self.username = username
        self.password = password
        self.access_token = access_token
        self.auth_type = auth_type
        self.oauth_scopes = oauth_scopes
        self.oauth_client_id = oauth_client_id
        self.oauth_redirect_port_range = oauth_redirect_port_range
        self.use_cert_as_auth = use_cert_as_auth
        self.tls_client_cert_file = tls_client_cert_file
        self.oauth_persistence = oauth_persistence


def get_auth_provider(cfg: ClientContext):
    if cfg.auth_type == AuthType.DATABRICKS_OAUTH.value:
        return DatabricksOAuthProvider(cfg.hostname, cfg.oauth_persistence, cfg.oauth_redirect_port_range, cfg.oauth_client_id, cfg.oauth_scopes)
    elif cfg.access_token is not None:
        return AccessTokenAuthProvider(cfg.access_token)
    elif cfg.username is not None and cfg.password is not None:
        return BasicAuthProvider(cfg.username, cfg.password)
    elif cfg.use_cert_as_auth and cfg.tls_client_cert_file:
        # no op authenticator. authentication is performed using ssl certificate outside of headers
        return CredentialsProvider()
    else:
        raise RuntimeError("No valid authentication settings!")


OAUTH_SCOPES = ["sql", "offline_access"]
# TODO: moderakh to be changed once registered on the service side
OAUTH_CLIENT_ID = "databricks-sql-python"
OAUTH_REDIRECT_PORT_RANGE = range(8020, 8025)

def get_python_sql_connector_auth_provider(hostname: str, oauth_persistence: OAuthPersistence = None, **kwargs):
    cfg = ClientContext(hostname=hostname,
                        auth_type=kwargs.get("auth_type"),
                        access_token=kwargs.get("access_token"),
                        username=kwargs.get("_username"),
                        password=kwargs.get("_password"),
                        use_cert_as_auth=kwargs.get("_use_cert_as_auth"),
                        tls_client_cert_file=kwargs.get("_tls_client_cert_file"),
                        oauth_scopes=OAUTH_SCOPES,
                        oauth_client_id=OAUTH_CLIENT_ID,
                        oauth_redirect_port_range=OAUTH_REDIRECT_PORT_RANGE,
                        oauth_persistence=oauth_persistence)
    return get_auth_provider(cfg)
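The branch order in `get_auth_provider` determines which credentials win when several are supplied. The following self-contained sketch mirrors only that ordering so it can be run without the connector installed. `pick_auth_kind` and the string labels it returns are hypothetical and exist purely for illustration; the real function returns provider objects.

```python
from typing import Optional


def pick_auth_kind(auth_type: Optional[str] = None,
                   access_token: Optional[str] = None,
                   username: Optional[str] = None,
                   password: Optional[str] = None,
                   use_cert_as_auth: Optional[str] = None,
                   tls_client_cert_file: Optional[str] = None) -> str:
    """Hypothetical helper that follows the same branch order as get_auth_provider."""
    if auth_type == "databricks-oauth":
        return "oauth"
    elif access_token is not None:
        return "access_token"
    elif username is not None and password is not None:
        return "basic"
    elif use_cert_as_auth and tls_client_cert_file:
        return "tls_client_cert"
    raise RuntimeError("No valid authentication settings!")


# An explicit auth_type takes precedence even if an access token is also passed.
print(pick_auth_kind(auth_type="databricks-oauth", access_token="example-token"))  # oauth
# Without auth_type, an access token takes precedence over username/password.
print(pick_auth_kind(access_token="example-token", username="u", password="p"))  # access_token
print(pick_auth_kind(username="u", password="p"))  # basic
```

One consequence of this ordering is that a caller who sets `auth_type` to `"databricks-oauth"` and also passes an access token gets the OAuth flow, and the token is ignored.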


130 changes: 130 additions & 0 deletions src/databricks/sql/auth/authenticators.py
@@ -0,0 +1,130 @@
# Copyright 2022 Databricks, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"), except
# that the use of services to which certain application programming
# interfaces (each, an "API") connect requires that the user first obtain
# a license for the use of the APIs from Databricks, Inc. ("Databricks"),
# by creating an account at www.databricks.com and agreeing to either (a)
# the Community Edition Terms of Service, (b) the Databricks Terms of
# Service, or (c) another written agreement between Licensee and Databricks
# for the use of the APIs.
#
# You may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import logging
from typing import Dict, List

from databricks.sql.auth.oauth import OAuthManager
import base64


# Private API: this is an evolving interface and it will change in the future.
# Please do not depend on it in your applications.
from databricks.sql.experimental.oauth_persistence import OAuthToken, OAuthPersistence


class CredentialsProvider:
    def add_headers(self, request_headers: Dict[str, str]):
        pass


# Private API: this is an evolving interface and it will change in the future.
# Please do not depend on it in your applications.
class AccessTokenAuthProvider(CredentialsProvider):
    def __init__(self, access_token: str):
        self.__authorization_header_value = "Bearer {}".format(access_token)

    def add_headers(self, request_headers: Dict[str, str]):
        request_headers['Authorization'] = self.__authorization_header_value


# Private API: this is an evolving interface and it will change in the future.
# Please do not depend on it in your applications.
class BasicAuthProvider(CredentialsProvider):
    def __init__(self, username: str, password: str):
        auth_credentials = f"{username}:{password}".encode("UTF-8")
        auth_credentials_base64 = base64.standard_b64encode(auth_credentials).decode("UTF-8")

        self.__authorization_header_value = f"Basic {auth_credentials_base64}"

    def add_headers(self, request_headers: Dict[str, str]):
        request_headers['Authorization'] = self.__authorization_header_value


# Private API: this is an evolving interface and it will change in the future.
# Please do not depend on it in your applications.
class DatabricksOAuthProvider(CredentialsProvider):
    SCOPE_DELIM = ' '

    def __init__(self, hostname: str, oauth_persistence: OAuthPersistence, redirect_port_range: List[int], client_id: str, scopes: List[str]):
        try:
            self.oauth_manager = OAuthManager(port_range=redirect_port_range, client_id=client_id)
            self._hostname = self._normalize_host_name(hostname=hostname)
            self._scopes_as_str = DatabricksOAuthProvider.SCOPE_DELIM.join(scopes)
            self._oauth_persistence = oauth_persistence
            self._client_id = client_id
            self._access_token = None
            self._refresh_token = None
            self._initial_get_token()
        except Exception:
            # exc_info=True records the active exception and its traceback
            logging.error("unexpected error", exc_info=True)
            raise

    def add_headers(self, request_headers: Dict[str, str]):
        self._update_token_if_expired()
        request_headers['Authorization'] = f"Bearer {self._access_token}"

    @staticmethod
    def _normalize_host_name(hostname: str):
        maybe_scheme = "https://" if not hostname.startswith("https://") else ""
        maybe_trailing_slash = "/" if not hostname.endswith("/") else ""
        return f"{maybe_scheme}{hostname}{maybe_trailing_slash}"

    def _initial_get_token(self):
        try:
            if self._access_token is None or self._refresh_token is None:
                if self._oauth_persistence:
                    token = self._oauth_persistence.read()
                    if token:
                        self._access_token = token.get_access_token()
                        self._refresh_token = token.get_refresh_token()

            if self._access_token and self._refresh_token:
                self._update_token_if_expired()
            else:
                (access_token, refresh_token) = self.oauth_manager.get_tokens(
                    hostname=self._hostname,
                    scope=self._scopes_as_str)
                self._access_token = access_token
                self._refresh_token = refresh_token
                # persistence is optional, so only persist when one was supplied
                if self._oauth_persistence:
                    self._oauth_persistence.persist(OAuthToken(access_token, refresh_token))
        except Exception:
            logging.error("unexpected error in oauth initialization", exc_info=True)
            raise

    def _update_token_if_expired(self):
        try:
            (fresh_access_token, fresh_refresh_token, is_refreshed) = self.oauth_manager.check_and_refresh_access_token(
                hostname=self._hostname,
                access_token=self._access_token,
                refresh_token=self._refresh_token)
            if not is_refreshed:
                return
            else:
                self._access_token = fresh_access_token
                self._refresh_token = fresh_refresh_token

                if self._oauth_persistence:
                    token = OAuthToken(self._access_token, self._refresh_token)
                    self._oauth_persistence.persist(token)
        except Exception:
            logging.error("unexpected error in oauth token update", exc_info=True)
            raise
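To make the header-building and host-normalization steps in this file concrete, here is a small standalone sketch that reproduces the same string handling using only the standard library. The free functions `basic_header` and `normalize_host_name` are hypothetical names chosen for this example; in the PR the logic lives inside `BasicAuthProvider.__init__` and `DatabricksOAuthProvider._normalize_host_name`. The hostname is a made-up placeholder.

```python
import base64


def basic_header(username: str, password: str) -> str:
    # Same steps as BasicAuthProvider: "user:password" -> UTF-8 bytes -> base64 -> text.
    raw = f"{username}:{password}".encode("UTF-8")
    return "Basic " + base64.standard_b64encode(raw).decode("UTF-8")


def normalize_host_name(hostname: str) -> str:
    # Same steps as _normalize_host_name: ensure an https:// prefix and a trailing slash.
    maybe_scheme = "https://" if not hostname.startswith("https://") else ""
    maybe_trailing_slash = "/" if not hostname.endswith("/") else ""
    return f"{maybe_scheme}{hostname}{maybe_trailing_slash}"


headers = {}
headers["Authorization"] = basic_header("user", "pass")
print(headers["Authorization"])  # Basic dXNlcjpwYXNz
print(normalize_host_name("example.cloud.databricks.com"))  # https://example.cloud.databricks.com/
```

Note that the normalization only checks for an `https://` prefix, so an input that already starts with `http://` would end up with `https://` prepended in front of it; callers are presumably expected to pass a bare hostname.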