Add Neptune Analytics support #541

Merged 3 commits on Nov 29, 2023

3 changes: 3 additions & 0 deletions ChangeLog.md
@@ -3,6 +3,9 @@
Starting with v1.31.6, this file will contain a record of major features and updates made in each release of graph-notebook.

## Upcoming

## Release 4.0.0 (Nov 29, 2023)
- Added support for Neptune Analytics ([Link to PR](https://github.com/aws/graph-notebook/pull/541))
- Added Air-Routes and EPL sample seed datasets for openCypher ([Link to PR](https://github.com/aws/graph-notebook/pull/540))

## Release 3.9.0 (Oct 9, 2023)

52 changes: 46 additions & 6 deletions README.md
@@ -22,7 +22,7 @@ Instructions for connecting to the following graph databases:
| :-----------------------------: | :---------------------: | :-----------------: |
|[Gremlin Server](#gremlin-server)| property graph | Gremlin |
| [Blazegraph](#blazegraph) | RDF | SPARQL |
-|[Amazon Neptune](#amazon-neptune)| property graph or RDF | Gremlin or SPARQL |
+|[Amazon Neptune](#amazon-neptune)| property graph or RDF | Gremlin, openCypher, or SPARQL |
| [Neo4J](#neo4j) | property graph | Cypher |

We encourage others to contribute configurations they find useful. There is an [`additional-databases`](https://github.com/aws/graph-notebook/blob/main/additional-databases) folder where more information can be found.
@@ -184,6 +184,7 @@ Configuration options can be set using the `%graph_notebook_config` magic comman
| aws_region | The AWS region to use for Amazon Neptune connections | your-region-1 | string |
| host | The host url to form a connection with | localhost | string |
| load_from_s3_arn | The ARN of the S3 bucket to load data from [Amazon Neptune only] | | string |
| neptune_service | The name of the Neptune service for the host url [Amazon Neptune only] | neptune-db | string |
| port | The port to use when creating a connection | 8182 | number |
| proxy_host | The proxy host url to route a connection through [Amazon Neptune only]| | string |
| proxy_port | The proxy port to use when creating proxy connection [Amazon Neptune only] | 8182 | number |
@@ -251,12 +252,15 @@ To setup a new local Blazegraph database for use with the graph notebook, check

### Amazon Neptune

-Change the configuration using `%%graph_notebook_config` and modify the defaults as they apply to your Neptune cluster:
+Change the configuration using `%%graph_notebook_config` and modify the defaults as they apply to your Neptune instance.

#### Neptune DB

``` python
%%graph_notebook_config
{
"host": "your-neptune-endpoint",
"neptune_service": "neptune-db",
"port": 8182,
"auth_mode": "DEFAULT",
"load_from_s3_arn": "",
@@ -266,9 +270,45 @@ Change the configuration using `%%graph_notebook_config` and modify the defaults
}
```

#### Neptune Analytics

``` python
%%graph_notebook_config
{
"host": "your-neptune-endpoint",
"neptune_service": "neptune-graph",
"port": 443,
"auth_mode": "IAM",
"ssl": true,
"ssl_verify": true,
"aws_region": "your-neptune-region"
}
```
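
The two examples above differ mainly in `neptune_service`, `port`, and `auth_mode`. As a rough illustration only, the sketch below assembles the same JSON from a service name. `build_neptune_config` and `SERVICE_DEFAULTS` are hypothetical names that are not part of graph-notebook, and the `us-east-1` default region is an assumption made for the example.

```python
import json

# Port and auth defaults copied from the two README examples above.
# The helper and this mapping are hypothetical, not part of graph-notebook.
SERVICE_DEFAULTS = {
    "neptune-db": {"port": 8182, "auth_mode": "DEFAULT"},
    "neptune-graph": {"port": 443, "auth_mode": "IAM"},
}


def build_neptune_config(host: str, neptune_service: str = "neptune-db",
                         aws_region: str = "us-east-1") -> dict:
    """Return a dict that can be pasted into a %%graph_notebook_config cell."""
    if neptune_service not in SERVICE_DEFAULTS:
        raise ValueError(f"unknown neptune_service: {neptune_service!r}")
    config = {
        "host": host,
        "neptune_service": neptune_service,
        "ssl": True,
        "ssl_verify": True,
        "aws_region": aws_region,  # assumed default region, adjust as needed
    }
    config.update(SERVICE_DEFAULTS[neptune_service])
    return config


analytics_config = build_neptune_config("your-neptune-endpoint", "neptune-graph")
print(json.dumps(analytics_config, indent=2))
```

Printing the dict as JSON gives text that can be copied under a `%%graph_notebook_config` line; omitting the second argument yields the Neptune DB variant.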

To set up a new Amazon Neptune cluster, check out the [Amazon Web Services documentation](https://docs.aws.amazon.com/neptune/latest/userguide/manage-console-launch.html).

-When connecting the graph notebook to Neptune, make sure you have a network setup to communicate to the VPC that Neptune runs on. If not, you can follow [this guide](https://github.com/aws/graph-notebook/tree/main/additional-databases/neptune).
+When connecting the graph notebook to Neptune via a private endpoint, make sure you have a network setup to communicate to the VPC that Neptune runs on. If not, you can follow [this guide](https://github.com/aws/graph-notebook/tree/main/additional-databases/neptune).

In addition to the above configuration options, you can also specify the following options:

### Amazon Neptune Proxy Connection

``` python
%%graph_notebook_config
{
"host": "clustername.cluster-ididididid.us-east-1.neptune.amazonaws.com",
"neptune_service": "neptune-db",
"port": 8182,
"ssl": true,
"proxy_port": 8182,
"proxy_host": "host.proxy.com",
"auth_mode": "IAM",
"aws_region": "us-east-1",
"load_from_s3_arn": ""
}
```

Connecting to Amazon Neptune from clients outside the Neptune VPC using AWS Network [Load Balancer](https://aws-samples.github.io/aws-dbs-refarch-graph/src/connecting-using-a-load-balancer/#connecting-to-amazon-neptune-from-clients-outside-the-neptune-vpc-using-aws-network-load-balancer)

In addition to the above configuration options, you can also specify the following options:

@@ -298,6 +338,7 @@ If you are running a SigV4 authenticated endpoint, ensure that your configuratio
%%graph_notebook_config
{
"host": "your-neptune-endpoint",
"neptune_service": "neptune-db",
"port": 8182,
"auth_mode": "IAM",
"load_from_s3_arn": "",
@@ -376,7 +417,7 @@ python3 setup.py bdist_wheel

You should now be able to find the built distribution at

-`./dist/graph_notebook-3.9.0-py3-none-any.whl`
+`./dist/graph_notebook-4.0.0-py3-none-any.whl`

And use it by following the [installation](https://github.com/aws/graph-notebook#installation) steps, replacing

@@ -387,8 +428,7 @@ pip install graph-notebook
with

``` python
-pip install ./dist/graph_notebook-3.9.0-py3-none-any.whl
-
+pip install ./dist/graph_notebook-4.0.0-py3-none-any.whl
```

## Contributing Guidelines
@@ -3,6 +3,7 @@
sudo -u ec2-user -i <<'EOF'

echo "export GRAPH_NOTEBOOK_AUTH_MODE=DEFAULT" >> ~/.bashrc # set to IAM instead of DEFAULT if cluster is IAM enabled
echo "export GRAPH_NOTEBOOK_SERVICE=neptune-db" >> ~/.bashrc
echo "export GRAPH_NOTEBOOK_HOST=CHANGE-ME" >> ~/.bashrc
echo "export GRAPH_NOTEBOOK_PORT=8182" >> ~/.bashrc
echo "export NEPTUNE_LOAD_FROM_S3_ROLE_ARN=" >> ~/.bashrc
@@ -28,8 +29,15 @@ python3 -m ipykernel install --sys-prefix --name python3 --display-name "Python
echo "installing python dependencies..."
pip uninstall NeptuneGraphNotebook -y # legacy uninstall when we used to install from source in s3

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "jupyter_core<=5.3.2"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "jupyter_server<=2.7.3"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "jupyter-console<=6.4.0"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "jupyter-client<=6.1.12"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "ipywidgets==7.7.2"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "jupyterlab_widgets==1.1.1"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "notebook==6.4.12"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "nbclient<=0.7.0"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "itables<=1.4.2"
pip install -i https://pypi.tuna.tsinghua.edu.cn/simple --trusted-host pypi.tuna.tsinghua.edu.cn "awswrangler"

if [[ ${VERSION} == "" ]]; then
@@ -58,6 +66,7 @@ chmod -R a+rw ~/SageMaker/Neptune/*
source ~/.bashrc || exit
HOST=${GRAPH_NOTEBOOK_HOST}
PORT=${GRAPH_NOTEBOOK_PORT}
SERVICE=${GRAPH_NOTEBOOK_SERVICE}
AUTH_MODE=${GRAPH_NOTEBOOK_AUTH_MODE}
SSL=${GRAPH_NOTEBOOK_SSL}
LOAD_FROM_S3_ARN=${NEPTUNE_LOAD_FROM_S3_ROLE_ARN}
@@ -69,13 +78,15 @@ fi
echo "Creating config with
HOST: ${HOST}
PORT: ${PORT}
SERVICE: ${SERVICE}
AUTH_MODE: ${AUTH_MODE}
SSL: ${SSL}
AWS_REGION: ${AWS_REGION}"

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/python -m graph_notebook.configuration.generate_config \
--host "${HOST}" \
--port "${PORT}" \
--neptune_service "${SERVICE}" \
--auth_mode "${AUTH_MODE}" \
--ssl "${SSL}" \
--load_from_s3_arn "${LOAD_FROM_S3_ARN}" \
@@ -3,6 +3,7 @@
sudo -u ec2-user -i <<'EOF'

echo "export GRAPH_NOTEBOOK_AUTH_MODE=DEFAULT" >> ~/.bashrc # set to IAM instead of DEFAULT if cluster is IAM enabled
echo "export GRAPH_NOTEBOOK_SERVICE=neptune-db" >> ~/.bashrc # set to neptune-graph for Neptune Analytics host
echo "export GRAPH_NOTEBOOK_HOST=CHANGE-ME" >> ~/.bashrc
echo "export GRAPH_NOTEBOOK_PORT=8182" >> ~/.bashrc
echo "export NEPTUNE_LOAD_FROM_S3_ROLE_ARN=" >> ~/.bashrc
@@ -28,8 +29,15 @@ python3 -m ipykernel install --sys-prefix --name python3 --display-name "Python
echo "installing python dependencies..."
pip uninstall NeptuneGraphNotebook -y # legacy uninstall when we used to install from source in s3

pip install "jupyter_core<=5.3.2"
pip install "jupyter_server<=2.7.3"
pip install "jupyter-console<=6.4.0"
pip install "jupyter-client<=6.1.12"
pip install "ipywidgets==7.7.2"
pip install "jupyterlab_widgets==1.1.1"
pip install "notebook==6.4.12"
pip install "nbclient<=0.7.0"
pip install "itables<=1.4.2"
pip install awswrangler

if [[ ${VERSION} == "" ]]; then
@@ -58,6 +66,7 @@ chmod -R a+rw ~/SageMaker/Neptune/*
source ~/.bashrc || exit
HOST=${GRAPH_NOTEBOOK_HOST}
PORT=${GRAPH_NOTEBOOK_PORT}
SERVICE=${GRAPH_NOTEBOOK_SERVICE}
AUTH_MODE=${GRAPH_NOTEBOOK_AUTH_MODE}
SSL=${GRAPH_NOTEBOOK_SSL}
LOAD_FROM_S3_ARN=${NEPTUNE_LOAD_FROM_S3_ROLE_ARN}
@@ -69,13 +78,15 @@ fi
echo "Creating config with
HOST: ${HOST}
PORT: ${PORT}
SERVICE: ${SERVICE}
AUTH_MODE: ${AUTH_MODE}
SSL: ${SSL}
AWS_REGION: ${AWS_REGION}"

/home/ec2-user/anaconda3/envs/JupyterSystemEnv/bin/python -m graph_notebook.configuration.generate_config \
--host "${HOST}" \
--port "${PORT}" \
--neptune_service "${SERVICE}" \
--auth_mode "${AUTH_MODE}" \
--ssl "${SSL}" \
--load_from_s3_arn "${LOAD_FROM_S3_ARN}" \
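
The lifecycle scripts above export `GRAPH_NOTEBOOK_SERVICE` and forward it to `generate_config` as `--neptune_service`. The Python sketch below mirrors that hand-off for illustration. `build_generate_config_command` is a hypothetical helper, and treating an unset or empty variable as `neptune-db` is an assumption made here, not something the scripts themselves guarantee.

```python
import shlex


def build_generate_config_command(env: dict) -> list:
    """Assemble the generate_config invocation from notebook environment variables."""
    # Assumption: an unset or empty GRAPH_NOTEBOOK_SERVICE means Neptune DB.
    service = env.get("GRAPH_NOTEBOOK_SERVICE") or "neptune-db"
    return [
        "python", "-m", "graph_notebook.configuration.generate_config",
        "--host", env.get("GRAPH_NOTEBOOK_HOST", "CHANGE-ME"),
        "--port", env.get("GRAPH_NOTEBOOK_PORT", "8182"),
        "--neptune_service", service,
        "--auth_mode", env.get("GRAPH_NOTEBOOK_AUTH_MODE", "DEFAULT"),
    ]


# An explicit mapping is used instead of the real environment so the result is reproducible.
example_env = {
    "GRAPH_NOTEBOOK_HOST": "your-neptune-endpoint",
    "GRAPH_NOTEBOOK_SERVICE": "neptune-graph",
    "GRAPH_NOTEBOOK_PORT": "443",
    "GRAPH_NOTEBOOK_AUTH_MODE": "IAM",
}
command = build_generate_config_command(example_env)
print(shlex.join(command))
```

Passing `dict(os.environ)` instead of `example_env` would read the values that the script writes to `~/.bashrc`.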
2 changes: 1 addition & 1 deletion src/graph_notebook/__init__.py
@@ -3,4 +3,4 @@
SPDX-License-Identifier: Apache-2.0
"""

-__version__ = '3.9.0'
+__version__ = '4.0.0'
23 changes: 17 additions & 6 deletions src/graph_notebook/configuration/generate_config.py
@@ -11,7 +11,8 @@
from graph_notebook.neptune.client import SPARQL_ACTION, DEFAULT_PORT, DEFAULT_REGION, DEFAULT_GREMLIN_SERIALIZER, \
DEFAULT_GREMLIN_TRAVERSAL_SOURCE, DEFAULT_NEO4J_USERNAME, DEFAULT_NEO4J_PASSWORD, DEFAULT_NEO4J_DATABASE, \
NEPTUNE_CONFIG_HOST_IDENTIFIERS, is_allowed_neptune_host, false_str_variants, \
-GRAPHSONV3_VARIANTS, GRAPHSONV2_VARIANTS, GRAPHBINARYV1_VARIANTS
+GRAPHSONV3_VARIANTS, GRAPHSONV2_VARIANTS, GRAPHBINARYV1_VARIANTS, \
+NEPTUNE_DB_SERVICE_NAME, normalize_service_name

DEFAULT_CONFIG_LOCATION = os.path.expanduser('~/graph_notebook_config.json')

@@ -117,6 +118,7 @@ def to_dict(self):

class Configuration(object):
def __init__(self, host: str, port: int,
neptune_service: str = NEPTUNE_DB_SERVICE_NAME,
auth_mode: AuthModeEnum = DEFAULT_AUTH_MODE,
load_from_s3_arn='', ssl: bool = True, ssl_verify: bool = True, aws_region: str = DEFAULT_REGION,
proxy_host: str = '', proxy_port: int = DEFAULT_PORT,
@@ -135,6 +137,7 @@ def __init__(self, host: str, port: int,
or is_allowed_neptune_host(hostname=self.proxy_host, host_allowlist=neptune_hosts)
if is_neptune_host:
self.is_neptune_config = True
self.neptune_service = normalize_service_name(neptune_service)
self.auth_mode = auth_mode
self.load_from_s3_arn = load_from_s3_arn
self.aws_region = aws_region
@@ -165,6 +168,7 @@ def to_dict(self) -> dict:
if self.is_neptune_config:
return {
'host': self.host,
'neptune_service': self.neptune_service,
'port': self.port,
'proxy_host': self.proxy_host,
'proxy_port': self.proxy_port,
@@ -199,26 +203,30 @@ def write_to_file(self, file_path=DEFAULT_CONFIG_LOCATION):


def generate_config(host, port, auth_mode: AuthModeEnum = AuthModeEnum.DEFAULT, ssl: bool = True,
-ssl_verify: bool = True, load_from_s3_arn='',
+ssl_verify: bool = True, neptune_service: str = NEPTUNE_DB_SERVICE_NAME, load_from_s3_arn='',
aws_region: str = DEFAULT_REGION, proxy_host: str = '', proxy_port: int = DEFAULT_PORT,
sparql_section: SparqlSection = SparqlSection(), gremlin_section: GremlinSection = GremlinSection(),
neo4j_section=Neo4JSection(), neptune_hosts: list = NEPTUNE_CONFIG_HOST_IDENTIFIERS):
use_ssl = False if ssl in false_str_variants else True
verify_ssl = False if ssl_verify in false_str_variants else True
-c = Configuration(host, port, auth_mode, load_from_s3_arn, use_ssl, verify_ssl, aws_region, proxy_host, proxy_port,
-sparql_section, gremlin_section, neo4j_section, neptune_hosts)
+c = Configuration(host, port, neptune_service, auth_mode, load_from_s3_arn, use_ssl, verify_ssl, aws_region,
+proxy_host, proxy_port, sparql_section, gremlin_section, neo4j_section, neptune_hosts)
return c


def generate_default_config():
-c = generate_config('change-me', 8182, AuthModeEnum.DEFAULT, True, True, '', DEFAULT_REGION)
+c = generate_config('change-me', 8182, AuthModeEnum.DEFAULT, True, True, NEPTUNE_DB_SERVICE_NAME, '', DEFAULT_REGION)
return c


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--host", help="the host url to form a connection with", required=True)
parser.add_argument("--port", help="the port to use when creating a connection", default=8182)
parser.add_argument("--neptune_service",
default=NEPTUNE_DB_SERVICE_NAME,
help="The neptune service name to use for signing requests. "
"Use 'neptune-db' for Neptune DB, and 'neptune-graph' for Neptune Analytics.")
parser.add_argument("--auth_mode", default=AuthModeEnum.DEFAULT.value,
help="type of authentication the cluster being connected to is using. Can be DEFAULT or IAM")
parser.add_argument("--ssl",
@@ -259,7 +267,10 @@ def generate_default_config():
args = parser.parse_args()

auth_mode_arg = args.auth_mode if args.auth_mode != '' else AuthModeEnum.DEFAULT.value
-config = generate_config(args.host, int(args.port), AuthModeEnum(auth_mode_arg), args.ssl, args.ssl_verify,
+config = generate_config(args.host, int(args.port),
+AuthModeEnum(auth_mode_arg),
+args.ssl, args.ssl_verify,
+args.neptune_service,
args.load_from_s3_arn, args.aws_region, args.proxy_host, int(args.proxy_port),
SparqlSection(args.sparql_path, ''),
GremlinSection(args.gremlin_traversal_source, args.gremlin_username,
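
The `--neptune_service` flag and the `normalize_service_name` call in `Configuration.__init__` suggest that user-supplied service names are mapped onto one canonical value before use. The body of `normalize_service_name` is not part of this diff, so the sketch below is only a guess at the idea: the alias sets, the function name `normalize_service_name_sketch`, and the fallback to `neptune-db` are all assumptions rather than the library's actual behavior.

```python
import argparse

NEPTUNE_DB_SERVICE_NAME = "neptune-db"
NEPTUNE_ANALYTICS_SERVICE_NAME = "neptune-graph"

# Hypothetical alias sets; the real lists live in graph_notebook.neptune.client
# and are not shown in this diff.
DB_ALIASES = {"neptune-db", "db"}
ANALYTICS_ALIASES = {"neptune-graph", "analytics"}


def normalize_service_name_sketch(name: str) -> str:
    """Map a user-supplied name onto one of the two canonical service names."""
    cleaned = name.strip().lower()
    if cleaned in ANALYTICS_ALIASES:
        return NEPTUNE_ANALYTICS_SERVICE_NAME
    if cleaned in DB_ALIASES:
        return NEPTUNE_DB_SERVICE_NAME
    # Assumed fallback: treat anything unrecognized as Neptune DB.
    return NEPTUNE_DB_SERVICE_NAME


parser = argparse.ArgumentParser()
parser.add_argument("--neptune_service", default=NEPTUNE_DB_SERVICE_NAME)

# Parse an explicit argument list so the example does not depend on sys.argv.
args = parser.parse_args(["--neptune_service", " Neptune-Graph "])
service = normalize_service_name_sketch(args.neptune_service)
print(service)
```

Normalizing once, at configuration time, keeps later comparisons such as `service == NEPTUNE_ANALYTICS_SERVICE_NAME` simple string checks.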
10 changes: 8 additions & 2 deletions src/graph_notebook/configuration/get_config.py
@@ -8,7 +8,10 @@
from graph_notebook.configuration.generate_config import DEFAULT_CONFIG_LOCATION, Configuration, AuthModeEnum, \
SparqlSection, GremlinSection, Neo4JSection
from graph_notebook.neptune.client import NEPTUNE_CONFIG_HOST_IDENTIFIERS, is_allowed_neptune_host, false_str_variants, \
-DEFAULT_NEO4J_USERNAME, DEFAULT_NEO4J_PASSWORD, DEFAULT_NEO4J_DATABASE
+DEFAULT_NEO4J_USERNAME, DEFAULT_NEO4J_PASSWORD, DEFAULT_NEO4J_DATABASE, \
+NEPTUNE_DB_SERVICE_NAME, NEPTUNE_ANALYTICS_SERVICE_NAME, NEPTUNE_DB_CONFIG_NAMES, NEPTUNE_ANALYTICS_CONFIG_NAMES

neptune_params = ['neptune_service', 'auth_mode', 'load_from_s3_arn', 'aws_region']

@@ -27,14 +30,17 @@ def get_config_from_dict(data: dict, neptune_hosts: list = NEPTUNE_CONFIG_HOST_I
is_neptune_host = is_allowed_neptune_host(hostname=data["host"], host_allowlist=neptune_hosts)

if is_neptune_host:
neptune_service = data['neptune_service'] if 'neptune_service' in data else NEPTUNE_DB_SERVICE_NAME
if gremlin_section.to_dict()['traversal_source'] != 'g':
print('Ignoring custom traversal source, Amazon Neptune does not support this functionality.\n')
if neo4j_section.to_dict()['username'] != DEFAULT_NEO4J_USERNAME \
or neo4j_section.to_dict()['password'] != DEFAULT_NEO4J_PASSWORD:
print('Ignoring Neo4J custom authentication, Amazon Neptune does not support this functionality.\n')
if neo4j_section.to_dict()['database'] != DEFAULT_NEO4J_DATABASE:
print('Ignoring Neo4J custom database, Amazon Neptune does not support multiple databases.\n')
-config = Configuration(host=data['host'], port=data['port'], auth_mode=AuthModeEnum(data['auth_mode']),
+config = Configuration(host=data['host'], port=data['port'],
+neptune_service=neptune_service,
+auth_mode=AuthModeEnum(data['auth_mode']),
ssl=data['ssl'], ssl_verify=ssl_verify, load_from_s3_arn=data['load_from_s3_arn'],
aws_region=data['aws_region'], sparql_section=sparql_section,
gremlin_section=gremlin_section, neo4j_section=neo4j_section,
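
`get_config_from_dict` falls back to `NEPTUNE_DB_SERVICE_NAME` when a config dict has no `neptune_service` key, which keeps configuration files written before this change loadable. A minimal standalone sketch of that fallback follows. `resolve_service` is a hypothetical name, the constant's value is taken from the README default, and `dict.get` is used as an equivalent of the conditional expression in the diff.

```python
NEPTUNE_DB_SERVICE_NAME = "neptune-db"  # value taken from the README default


def resolve_service(data: dict) -> str:
    """Return the configured service, defaulting to Neptune DB for legacy configs."""
    return data.get("neptune_service", NEPTUNE_DB_SERVICE_NAME)


# A config written before this change has no neptune_service key.
legacy = {"host": "your-neptune-endpoint", "port": 8182}
analytics = {"host": "your-neptune-endpoint", "port": 443, "neptune_service": "neptune-graph"}

print(resolve_service(legacy))
print(resolve_service(analytics))
```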
17 changes: 17 additions & 0 deletions src/graph_notebook/decorators/decorators.py
@@ -12,6 +12,7 @@

import ipywidgets as widgets
from graph_notebook.visualization.template_retriever import retrieve_template
from graph_notebook.neptune.client import NEPTUNE_ANALYTICS_SERVICE_NAME
from gremlin_python.driver.protocol import GremlinServerError
from requests import HTTPError

@@ -133,6 +134,22 @@ def use_magic_variables(*args, **kwargs):
return use_magic_variables


def neptune_db_only(func):
    @functools.wraps(func)
    def check_neptune_db(self, *args, **kwargs):
        # Configurations for non-Neptune hosts carry no neptune_service attribute.
        if not hasattr(self.graph_notebook_config, 'neptune_service'):
            return func(self, *args, **kwargs)
        else:
            service_type = self.graph_notebook_config.neptune_service
            if service_type == NEPTUNE_ANALYTICS_SERVICE_NAME:
                print('This magic is unavailable for Neptune Analytics.')
                return
            else:
                # Forward self explicitly; the wrapper receives it as its first argument.
                return func(self, *args, **kwargs)

    return check_neptune_db


def http_ex_to_html(http_ex: HTTPError):
try:
error = json.loads(http_ex.response.content.decode('utf-8'))
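
To see how a guard like `neptune_db_only` behaves on a method, the self-contained sketch below applies a simplified version of it to a stand-in class. `FakeMagics` and its `status` method are hypothetical and exist only for this demonstration. The sketch forwards `self` to the wrapped function, which a decorator wrapping instance methods needs to do.

```python
import functools
from types import SimpleNamespace

NEPTUNE_ANALYTICS_SERVICE_NAME = "neptune-graph"


def neptune_db_only(func):
    @functools.wraps(func)
    def check_neptune_db(self, *args, **kwargs):
        # A missing attribute (non-Neptune config) is treated like Neptune DB.
        service = getattr(self.graph_notebook_config, "neptune_service", None)
        if service == NEPTUNE_ANALYTICS_SERVICE_NAME:
            print("This magic is unavailable for Neptune Analytics.")
            return None
        return func(self, *args, **kwargs)

    return check_neptune_db


class FakeMagics:
    """Stand-in for the real magics class; only the config attribute matters here."""

    def __init__(self, service: str):
        self.graph_notebook_config = SimpleNamespace(neptune_service=service)

    @neptune_db_only
    def status(self, line: str) -> str:
        return f"status called with {line!r}"


db_result = FakeMagics("neptune-db").status("")
analytics_result = FakeMagics("neptune-graph").status("")
print(db_result)
print(analytics_result)
```

With a Neptune DB config the wrapped method runs normally; with a Neptune Analytics config the guard prints the notice and returns `None` without calling the method.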