
Commit ae5d515

Added documentation for S3 compatible storage
1 parent c51ab98 commit ae5d515

2 files changed: +49 -0

demo-notebooks/guided-demos/mnist_fashion.py

+1 line changed
```diff
@@ -74,6 +74,7 @@ def train_func_distributed():
 # For GPU Training, set `use_gpu` to True.
 use_gpu = True
 
+# To learn more about configuring S3 compatible storage check out our docs -> https://github.com/project-codeflare/codeflare-sdk/blob/main/docs/s3-compatible-storage.md
 trainer = TorchTrainer(
     train_func_distributed,
     scaling_config=ScalingConfig(
```

docs/s3-compatible-storage.md

+48 lines changed
# S3 compatible storage with Ray Train examples

Some of our distributed training examples require an external storage solution so that all nodes can access the same data. <br>
The following are examples of configuring S3 or Minio storage for your Ray Train script or interactive session.

## S3 Bucket

In your Python script, add the following environment variables:
``` python
import os

os.environ["AWS_ACCESS_KEY_ID"] = "XXXXXXXX"
os.environ["AWS_SECRET_ACCESS_KEY"] = "XXXXXXXX"
os.environ["AWS_DEFAULT_REGION"] = "XXXXXXXX"
```
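This is not part of the example itself, but if `boto3` happens to be installed in your environment, the same credentials can be used for a quick sanity check that the bucket is reachable before starting a run (`BUCKET_NAME` is a placeholder for your own bucket):
``` python
import boto3

# boto3 picks up AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
# AWS_DEFAULT_REGION from the environment variables set above.
s3 = boto3.client("s3")

# List at most one object from the bucket; an error here usually points to
# wrong credentials, region, or bucket name.
s3.list_objects_v2(Bucket="BUCKET_NAME", MaxKeys=1)
```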
In your Trainer configuration, you can specify a `run_config` that will use your external storage:
``` python
trainer = TorchTrainer(
    train_func_distributed,
    scaling_config=scaling_config,
    run_config=ray.train.RunConfig(storage_path="s3://BUCKET_NAME/SUB_PATH/", name="unique_run_name"),
)
```
To learn more about Amazon S3 storage, see the AWS [documentation](https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-bucket.html).
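With the credentials and `run_config` in place, the run can be launched as usual and its output lands in the bucket. A minimal sketch, assuming the `trainer` defined above:
``` python
# Launch the distributed run; Ray Train writes checkpoints and run metadata
# under s3://BUCKET_NAME/SUB_PATH/unique_run_name/.
result = trainer.fit()

# The returned Result reports the run's metrics; a checkpoint is only present
# if the training function reported one.
print(result.metrics)
print(result.checkpoint)
```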
## Minio Bucket

In your Python script, add the following function to configure your `run_config`:
``` python
import os

import s3fs
import pyarrow.fs
import ray.train


def get_minio_run_config():
    s3_fs = s3fs.S3FileSystem(
        key=os.getenv("MINIO_ACCESS_KEY", "XXXXX"),
        secret=os.getenv("MINIO_SECRET_ACCESS_KEY", "XXXXX"),
        endpoint_url=os.getenv("MINIO_URL", "XXXXX"),
    )
    custom_fs = pyarrow.fs.PyFileSystem(pyarrow.fs.FSSpecHandler(s3_fs))
    run_config = ray.train.RunConfig(storage_path="training", storage_filesystem=custom_fs)
    return run_config
```
You can adjust the `run_config` above to further suit your needs.
Lastly, the new `run_config` must be added to the Trainer:
``` python
trainer = TorchTrainer(
    train_func_distributed,
    scaling_config=scaling_config,
    run_config=get_minio_run_config(),
)
```
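Putting the pieces together: the `MINIO_*` environment variables must be set before `get_minio_run_config()` runs, i.e. before the Trainer above is constructed. A minimal sketch with placeholder values:
``` python
import os

# Placeholder values; get_minio_run_config() reads these via os.getenv(), so
# they must be set before the Trainer above is constructed.
os.environ["MINIO_ACCESS_KEY"] = "XXXXX"
os.environ["MINIO_SECRET_ACCESS_KEY"] = "XXXXX"
os.environ["MINIO_URL"] = "XXXXX"

# Construct the Trainer as shown above, then launch the run; results are
# written to the `training` bucket through the custom filesystem.
result = trainer.fit()
```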
To find more information on creating a Minio bucket compatible with RHOAI, refer to this [documentation](https://ai-on-openshift.io/tools-and-applications/minio/minio/).<br>
Note: You must have `s3fs` and `pyarrow` installed in your environment for this method.
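If a run fails to write to the bucket, a quick check (outside the training script, assuming the same `MINIO_*` environment variables are set and the `training` bucket already exists) is to list the bucket with `s3fs` directly:
``` python
import os

import s3fs

# Connect with the same credentials and endpoint get_minio_run_config() uses.
fs = s3fs.S3FileSystem(
    key=os.getenv("MINIO_ACCESS_KEY"),
    secret=os.getenv("MINIO_SECRET_ACCESS_KEY"),
    endpoint_url=os.getenv("MINIO_URL"),
)

# An error here usually means the endpoint URL or credentials are wrong, or
# the bucket has not been created yet.
print(fs.ls("training"))
```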
