[PECO-205] Add functional examples #52
Merged
10 commits
1128752
Copied in examples from public documentation
624157e
Adapted the unit test tests/e2e/driver_tests::PySQLCoreTestSuite::tes…
0a747f5
Add oauth examples
9234508
Add README for examples.
f90b7fb
Add user_agent set example
b81a222
Add user agent entry to README
abc5882
Format fix for query cancel
af55a7b
Add more detail to examples/README.md
b96ea57
Add segment about how to run examples
f8a81e8
Add clarifying wording that OAuth is experimental with links to see w…
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# `databricks-sql-connector` Example Usage | ||
|
||
We provide example scripts so you can see the connector in action for basic usage. You need a Databricks account to run them. The scripts expect to find your Databricks account credentials in these environment variables: | ||
|
||
- DATABRICKS_SERVER_HOSTNAME | ||
- DATABRICKS_HTTP_PATH | ||
- DATABRICKS_TOKEN | ||
|
||
Follow the quick start in our [README](../README.md) to install `databricks-sql-connector` and see | ||
how to find the hostname, http path, and access token. Note that for the OAuth examples below a | ||
personal access token is not needed. | ||
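Because each script reads these variables with `os.getenv`, a missing variable only surfaces as a `None` value passed to `sql.connect`. A minimal sketch of a pre-flight check follows. The helper name `missing_credentials` is hypothetical and is not part of `databricks-sql-connector`.

```python
import os

# Variables the example scripts read, as listed above.
REQUIRED_VARS = (
    "DATABRICKS_SERVER_HOSTNAME",
    "DATABRICKS_HTTP_PATH",
    "DATABRICKS_TOKEN",
)


def missing_credentials(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper for illustration; not part of databricks-sql-connector.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]
```

Calling `missing_credentials()` before running a script returns an empty list when everything is set. For the OAuth examples, which do not need a token, you could pass `required=REQUIRED_VARS[:2]`.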
|
||
|
||
## How to run an example script | ||
|
||
To run all of these examples you can clone the entire repository to your disk. Or you can use `curl` to fetch an individual script. | ||
|
||
### Clone the repo | ||
1. Clone this repository to your local system | ||
2. Follow the quick start in the [README](../README.md) to install the connector and obtain authentication credentials. | ||
3. `cd examples/` | ||
4. Then run any script using the `python` CLI. For example `python query_execute.py` | ||
|
||
### Fetch with `curl` | ||
|
||
1. Follow the quick start in the [README](../README.md) to install the connector and obtain authentication credentials. | ||
2. Use the GitHub UI to find the URL to the **Raw** version of one of these examples. For example: `https://raw.githubusercontent.com/databricks/databricks-sql-python/main/examples/query_execute.py` | ||
3. `curl` this URL to your local file-system: `curl https://raw.githubusercontent.com/databricks/databricks-sql-python/main/examples/query_execute.py > query_execute.py` | ||
4. Then run the script with the `python` CLI. `python query_execute.py` | ||
## Table of Contents | ||
|
||
- **`query_execute.py`** connects to the `samples` database of your default catalog, runs a small query, and prints the result to the screen. | ||
- **`insert_data.py`** adds a table called `squares` to your default catalog and inserts one hundred rows of example data. Then it fetches this data and prints it to the screen. | ||
- **`query_cancel.py`** shows how to cancel a query, assuming that you can access the `Cursor` executing that query from a different thread. This is necessary because `databricks-sql-connector` does not yet implement an asynchronous API; calling `.execute()` blocks the current thread until execution completes. Therefore, the connector can't cancel queries from the same thread where they began. | ||
- **`interactive_oauth.py`** shows the simplest example of authenticating by OAuth (no need for a PAT generated in the DBSQL UI) while Bring Your Own IDP is in public preview. When you run the script, it opens a browser window so you can authenticate. Afterward, the script fetches some sample data from Databricks and prints it to the screen. For this script, the OAuth token is not persisted, which means you need to authenticate every time you run the script. | ||
- **`persistent_oauth.py`** shows a more advanced example of authenticating by OAuth while Bring Your Own IDP is in public preview. In this case, it shows how to use a subclass of `OAuthPersistence` to reuse an OAuth token across script executions. | ||
- **`set_user_agent.py`** shows how to customize the user agent header used for Thrift commands. In | ||
this example the string `ExamplePartnerTag` will be added to the user agent on every request. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
from databricks import sql | ||
import os | ||
|
||
with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"), | ||
                 http_path = os.getenv("DATABRICKS_HTTP_PATH"), | ||
                 access_token = os.getenv("DATABRICKS_TOKEN")) as connection: | ||
|
||
    with connection.cursor() as cursor: | ||
        cursor.execute("CREATE TABLE IF NOT EXISTS squares (x int, x_squared int)") | ||
|
||
        squares = [(i, i * i) for i in range(100)] | ||
        values = ",".join([f"({x}, {y})" for (x, y) in squares]) | ||
|
||
        cursor.execute(f"INSERT INTO squares VALUES {values}") | ||
|
||
        cursor.execute("SELECT * FROM squares LIMIT 10") | ||
|
||
        result = cursor.fetchall() | ||
|
||
        for row in result: | ||
            print(row) |
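The script above sends all one hundred rows in a single `INSERT` by formatting integers into the statement text. For larger data sets, one option is to split the rows into batches. The sketch below illustrates that idea. `build_insert_statements` and its `batch_size` parameter are hypothetical names, and the string formatting is only reasonable here because every value is coerced to an integer first.

```python
from typing import Iterable, Iterator, List, Tuple


def build_insert_statements(
    table: str, rows: Iterable[Tuple[int, int]], batch_size: int = 25
) -> Iterator[str]:
    """Yield one INSERT statement per batch of integer pairs.

    Hypothetical helper, not part of databricks-sql-connector. Values are
    coerced with int() so only integers reach the SQL text.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for x, y in rows:
        batch.append(f"({int(x)}, {int(y)})")
        if len(batch) == batch_size:
            yield f"INSERT INTO {table} VALUES {','.join(batch)}"
            batch = []
    if batch:  # emit the final, possibly shorter, batch
        yield f"INSERT INTO {table} VALUES {','.join(batch)}"
```

Inside the cursor block this could be used as `for statement in build_insert_statements("squares", squares): cursor.execute(statement)`.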
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
from databricks import sql | ||
import os | ||
|
||
"""Bring Your Own Identity Provider with fine-grained OAuth scopes is currently in public preview on | ||
Databricks in AWS. databricks-sql-connector supports user-to-machine OAuth login, which means the | ||
end user has to be present to log in through a browser window opened by the Python process. You | ||
must enable OAuth in your Databricks account to run this example. More information on how to enable | ||
OAuth in your Databricks account in AWS can be found here: | ||
|
||
https://docs.databricks.com/administration-guide/account-settings-e2/single-sign-on.html | ||
|
||
Prerequisites: | ||
- You have a Databricks account in AWS. | ||
- You have configured OAuth in your Databricks account in AWS using the link above. | ||
- You have installed a browser (Chrome, Firefox, Safari, Internet Explorer, etc.) that will be | ||
accessible on the machine for performing OAuth login. | ||
|
||
This code does not persist the auth token. Hence, after the Python process terminates, the | ||
end user will have to log in again. See examples/persistent_oauth.py to learn about persisting the | ||
token across script executions. | ||
|
||
Bring Your Own Identity Provider is in public preview. The API may change prior to becoming GA. | ||
You can monitor these two links to find out when it will become generally available: | ||
|
||
1. https://docs.databricks.com/administration-guide/account-settings-e2/single-sign-on.html | ||
2. https://docs.databricks.com/dev-tools/python-sql-connector.html | ||
""" | ||
|
||
with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"), | ||
                 http_path = os.getenv("DATABRICKS_HTTP_PATH"), | ||
                 auth_type="databricks-oauth") as connection: | ||
|
||
    for x in range(1, 100): | ||
        cursor = connection.cursor() | ||
        cursor.execute('SELECT 1+1') | ||
        result = cursor.fetchall() | ||
        for row in result: | ||
            print(row) | ||
        cursor.close() | ||
|
||
    connection.close() |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
"""Bring Your Own Identity Provider with fine-grained OAuth scopes is currently in public preview on | ||
Databricks in AWS. databricks-sql-connector supports user-to-machine OAuth login, which means the | ||
end user has to be present to log in through a browser window opened by the Python process. You | ||
must enable OAuth in your Databricks account to run this example. More information on how to enable | ||
OAuth in your Databricks account in AWS can be found here: | ||
|
||
https://docs.databricks.com/administration-guide/account-settings-e2/single-sign-on.html | ||
|
||
Prerequisites: | ||
- You have a Databricks account in AWS. | ||
- You have configured OAuth in your Databricks account in AWS using the link above. | ||
- You have installed a browser (Chrome, Firefox, Safari, Internet Explorer, etc.) that will be | ||
accessible on the machine for performing OAuth login. | ||
|
||
For security, databricks-sql-connector does not persist OAuth tokens automatically. Hence, after | ||
the Python process terminates, the end user will have to log in again. We provide APIs to be | ||
implemented by the end user for persisting the OAuth token. The SampleOAuthPersistence reference | ||
shows which methods you may implement. | ||
|
||
For this example, the DevOnlyFilePersistence class is provided. Do not use this in production. | ||
|
||
Bring Your Own Identity Provider is in public preview. The API may change prior to becoming GA. | ||
You can monitor these two links to find out when it will become generally available: | ||
|
||
1. https://docs.databricks.com/administration-guide/account-settings-e2/single-sign-on.html | ||
2. https://docs.databricks.com/dev-tools/python-sql-connector.html | ||
""" | ||
|
||
import os | ||
from typing import Optional | ||
|
||
from databricks import sql | ||
from databricks.sql.experimental.oauth_persistence import OAuthPersistence, OAuthToken, DevOnlyFilePersistence | ||
|
||
|
||
class SampleOAuthPersistence(OAuthPersistence): | ||
    def persist(self, hostname: str, oauth_token: OAuthToken): | ||
        """To be implemented by the end user to persist in the preferred storage medium. | ||
|
||
        OAuthToken has two properties: | ||
            1. OAuthToken.access_token | ||
            2. OAuthToken.refresh_token | ||
|
||
        Both should be persisted. | ||
        """ | ||
        pass | ||
|
||
    def read(self, hostname: str) -> Optional[OAuthToken]: | ||
        """To be implemented by the end user to fetch token from the preferred storage | ||
|
||
        Fetch the access_token and refresh_token for the given hostname. | ||
        Return OAuthToken(access_token, refresh_token) | ||
        """ | ||
        pass | ||
|
||
with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"), | ||
                 http_path = os.getenv("DATABRICKS_HTTP_PATH"), | ||
                 auth_type="databricks-oauth", | ||
                 experimental_oauth_persistence=DevOnlyFilePersistence("./sample.json")) as connection: | ||
|
||
    for x in range(1, 100): | ||
        cursor = connection.cursor() | ||
        cursor.execute('SELECT 1+1') | ||
        result = cursor.fetchall() | ||
        for row in result: | ||
            print(row) | ||
        cursor.close() | ||
|
||
    connection.close() |
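The two methods of `SampleOAuthPersistence` above are left unimplemented. As a rough sketch of what an implementation could look like, the block below stores token pairs in a JSON file keyed by hostname. To stay runnable without the connector installed, it uses a stand-in `Token` dataclass and a plain class. In real use you would subclass `OAuthPersistence` and return the connector's `OAuthToken`. `JsonFilePersistence` is a hypothetical name, and because it writes tokens to disk in plain text it is not suitable for production.

```python
import json
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """Stand-in for the connector's OAuthToken (assumption for illustration)."""
    access_token: str
    refresh_token: str


class JsonFilePersistence:
    """Hypothetical sketch: keeps one token pair per hostname in a JSON file."""

    def __init__(self, file_path: str):
        self._file_path = file_path

    def _load(self) -> dict:
        # A missing file simply means nothing has been persisted yet.
        if not os.path.exists(self._file_path):
            return {}
        with open(self._file_path, "r", encoding="utf-8") as f:
            return json.load(f)

    def persist(self, hostname: str, oauth_token: Token) -> None:
        # Store both tokens, as the docstring above recommends.
        data = self._load()
        data[hostname] = {
            "access_token": oauth_token.access_token,
            "refresh_token": oauth_token.refresh_token,
        }
        with open(self._file_path, "w", encoding="utf-8") as f:
            json.dump(data, f)

    def read(self, hostname: str) -> Optional[Token]:
        # Return None when no token is stored for this hostname.
        entry = self._load().get(hostname)
        if entry is None:
            return None
        return Token(entry["access_token"], entry["refresh_token"])
```

Keying the file by hostname mirrors the `hostname` argument that both methods receive, so one file can hold tokens for several workspaces.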
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
from databricks import sql | ||
import os, threading, time | ||
|
||
""" | ||
The current operation of a cursor may be cancelled by calling its `.cancel()` method as shown in the example below. | ||
""" | ||
|
||
with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"), | ||
                 http_path = os.getenv("DATABRICKS_HTTP_PATH"), | ||
                 access_token = os.getenv("DATABRICKS_TOKEN")) as connection: | ||
|
||
    with connection.cursor() as cursor: | ||
        def execute_really_long_query(): | ||
            try: | ||
                cursor.execute("SELECT SUM(A.id - B.id) " + | ||
                               "FROM range(1000000000) A CROSS JOIN range(100000000) B " + | ||
                               "GROUP BY (A.id - B.id)") | ||
            except sql.exc.RequestError: | ||
                print("It looks like this query was cancelled.") | ||
|
||
        exec_thread = threading.Thread(target=execute_really_long_query) | ||
|
||
        print("\n Beginning to execute long query") | ||
        exec_thread.start() | ||
|
||
        # Make sure the query has started before cancelling | ||
        print("\n Waiting 15 seconds before canceling", end="", flush=True) | ||
|
||
        seconds_waited = 0 | ||
        while seconds_waited < 15: | ||
            seconds_waited += 1 | ||
            print(".", end="", flush=True) | ||
            time.sleep(1) | ||
|
||
        print("\n Cancelling the cursor's operation. This can take a few seconds.") | ||
        cursor.cancel() | ||
|
||
        print("\n Now checking the cursor status:") | ||
        exec_thread.join(5) | ||
|
||
        assert not exec_thread.is_alive() | ||
        print("\n The previous command was successfully canceled") | ||
|
||
        print("\n Now reusing the cursor to run a separate query.") | ||
|
||
        # We can still execute a new command on the cursor | ||
        cursor.execute("SELECT * FROM range(3)") | ||
|
||
        print("\n Execution was successful. Results appear below:") | ||
|
||
        print(cursor.fetchall()) |
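The wait loop and the `cancel()` call above can also be handed to a `threading.Timer`, which leaves the main thread free to make the blocking `execute()` call itself. This is only a sketch of that variation. `cancel_after` is a hypothetical helper, and `SlowFakeCursor` is a stand-in that mimics a blocking `execute()` and a `cancel()` method so the block runs without a Databricks connection. With a real cursor, the example above suggests that a cancelled `execute()` raises `sql.exc.RequestError` rather than returning a value.

```python
import threading


class SlowFakeCursor:
    """Stand-in for a connector cursor (assumption, for illustration only)."""

    def __init__(self):
        self._cancelled = threading.Event()

    def execute(self, query: str) -> str:
        # Block for up to 5 seconds, or until cancel() is called.
        if self._cancelled.wait(timeout=5):
            return "cancelled"
        return "finished"

    def cancel(self) -> None:
        self._cancelled.set()


def cancel_after(cursor, seconds: float) -> threading.Timer:
    """Schedule cursor.cancel() on a separate thread after `seconds`."""
    timer = threading.Timer(seconds, cursor.cancel)
    timer.daemon = True  # do not keep the process alive just for the timer
    timer.start()
    return timer


cursor = SlowFakeCursor()
timer = cancel_after(cursor, 0.1)
outcome = cursor.execute("SELECT 1")  # blocks until the timer fires
timer.cancel()  # no-op if the timer already ran; prevents a late cancel otherwise
```

The design still respects the constraint described in the README: the cancel request comes from a different thread than the one blocked in `execute()`.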
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
from databricks import sql | ||
import os | ||
|
||
with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"), | ||
                 http_path = os.getenv("DATABRICKS_HTTP_PATH"), | ||
                 access_token = os.getenv("DATABRICKS_TOKEN")) as connection: | ||
|
||
    with connection.cursor() as cursor: | ||
        cursor.execute("SELECT * FROM default.diamonds LIMIT 2") | ||
        result = cursor.fetchall() | ||
|
||
        for row in result: | ||
            print(row) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
from databricks import sql | ||
import os | ||
|
||
with sql.connect(server_hostname = os.getenv("DATABRICKS_SERVER_HOSTNAME"), | ||
                 http_path = os.getenv("DATABRICKS_HTTP_PATH"), | ||
                 access_token = os.getenv("DATABRICKS_TOKEN"), | ||
                 _user_agent_entry="ExamplePartnerTag") as connection: | ||
|
||
    with connection.cursor() as cursor: | ||
        cursor.execute("SELECT * FROM default.diamonds LIMIT 2") | ||
        result = cursor.fetchall() | ||
|
||
        for row in result: | ||
            print(row) |
Could you please also mention that the API (interface) for `OAuthPersistence` is experimental and may evolve over time? I think the package name `databricks.sql.experimental.oauth_persistence` indicates that, but it would be great if we could also say so explicitly.
Updated in f8a81e8
Thanks!