Commit 03aa70f

SUBMARINE-83. Refine the documents of submarine targeting 0.2.0 release. Contributed by Zhankun Tang.
1 parent 5565f2c commit 03aa70f

9 files changed: 28 additions, 135 deletions

hadoop-submarine/hadoop-submarine-core/README.md

Lines changed: 3 additions & 4 deletions
@@ -37,11 +37,12 @@
 \__________________________________________________________/ (_)
 ```
 
-Submarine is a project which allows infra engineer / data scientist to run *unmodified* Tensorflow programs on YARN.
+Submarine is a project which allows infra engineer / data scientist to run
+*unmodified* Tensorflow or PyTorch programs on YARN or Kubernetes.
 
 Goals of Submarine:
 - It allows jobs easy access data/models in HDFS and other storages.
-- Can launch services to serve Tensorflow/MXNet models.
+- Can launch services to serve Tensorflow/PyTorch models.
 - Support run distributed Tensorflow jobs with simple configs.
 - Support run user-specified Docker images.
 - Support specify GPU and other resources.
@@ -51,5 +52,3 @@ Goals of Submarine:
 Please jump to [QuickStart](src/site/markdown/QuickStart.md) guide to quickly understand how to use this framework.
 
 Please jump to [Examples](src/site/markdown/Examples.md) to try other examples like running Distributed Tensorflow Training for CIFAR 10.
-
-If you're a developer, please find [Developer](src/site/markdown/DeveloperGuide.md) guide for more details.

hadoop-submarine/hadoop-submarine-core/src/site/markdown/DeveloperGuide.md

Lines changed: 0 additions & 24 deletions
This file was deleted.

hadoop-submarine/hadoop-submarine-core/src/site/markdown/Examples.md

Lines changed: 1 addition & 3 deletions
@@ -18,6 +18,4 @@ Here're some examples about Submarine usage.
 
 [Running Distributed CIFAR 10 Tensorflow Job](RunningDistributedCifar10TFJobs.html)
 
-[Running Standalone CIFAR 10 PyTorch Job](RunningSingleNodeCifar10PTJobs.html)
-
-[Running Zeppelin Notebook on YARN](RunningZeppelinOnYARN.html)
+[Running Standalone CIFAR 10 PyTorch Job](RunningSingleNodeCifar10PTJobs.html)

hadoop-submarine/hadoop-submarine-core/src/site/markdown/Index.md

Lines changed: 2 additions & 3 deletions
@@ -12,7 +12,8 @@
 limitations under the License. See accompanying LICENSE file.
 -->
 
-Submarine is a project which allows infra engineer / data scientist to run *unmodified* Tensorflow programs on YARN.
+Submarine is a project which allows infra engineer / data scientist to run
+*unmodified* Tensorflow or PyTorch programs on YARN or Kubernetes.
 
 Goals of Submarine:
 
@@ -43,6 +44,4 @@ Click below contents if you want to understand more.
 
 - [How to write Dockerfile for Submarine PyTorch jobs](WriteDockerfilePT.html)
 
-- [Developer guide](DeveloperGuide.html)
-
 - [Installation guides](HowToInstall.html)

hadoop-submarine/hadoop-submarine-core/src/site/markdown/QuickStart.md

Lines changed: 17 additions & 2 deletions
@@ -18,7 +18,7 @@
 
 Must:
 
-- Apache Hadoop 3.1.x, YARN service enabled.
+- Apache Hadoop version newer than 2.7.3
 
 Optional:
 
@@ -37,6 +37,20 @@ For more details, please refer to:
 
 - [How to write Dockerfile for Submarine PyTorch jobs](WriteDockerfilePT.html)
 
+## Submarine runtimes
+After submarine 0.2.0, it supports two runtimes which are YARN native service
+runtime and Linkedin's TonY runtime. Each runtime can support both Tensorflow
+and Pytorch framework. And the user don't need to worry about the usage
+because the two runtime implements the same interface.
+
+To use the TonY runtime, please set below value in the submarine configuration.
+
+|Configuration Name | Description |
+|:---- |:---- |
+| `submarine.runtime.class` | org.apache.hadoop.yarn.submarine.runtimes.tony.TonyRuntimeFactory |
+
+For more details of TonY runtime, please check [TonY runtime guide](TonYRuntimeGuide.html)
+
 ## Run jobs
 
 ### Commandline options
@@ -164,7 +178,8 @@ See below screenshot:
 
 ![alt text](./images/tensorboard-service.png "Tensorboard service")
 
-If there is no hadoop client, we can also use the java command and the uber jar, hadoop-submarine-all-*.jar, to submit the job.
+After v0.2.0, if there is no hadoop client, we can also use the java command
+and the uber jar, hadoop-submarine-all-*.jar, to submit the job.
 
 ```
 java -cp /path-to/hadoop-conf:/path-to/hadoop-submarine-all-*.jar \

hadoop-submarine/hadoop-submarine-core/src/site/markdown/RunningZeppelinOnYARN.md

Lines changed: 0 additions & 37 deletions
This file was deleted.

hadoop-submarine/hadoop-submarine-tony-runtime/src/site/markdown/QuickStart.md renamed to hadoop-submarine/hadoop-submarine-core/src/site/markdown/TonYRuntimeGuide.md

Lines changed: 5 additions & 5 deletions
@@ -247,16 +247,16 @@ CLASSPATH=$(hadoop classpath --glob): \
 /home/pi/hadoop/TonY/tony-cli/build/libs/tony-cli-0.3.2-all.jar \
 
 java org.apache.hadoop.yarn.submarine.client.cli.Cli job run --name tf-job-001 \
---framework tensorflow \
 --num_workers 2 \
 --worker_resources memory=3G,vcores=2 \
 --num_ps 2 \
 --ps_resources memory=3G,vcores=2 \
 --worker_launch_cmd "venv.zip/venv/bin/python mnist_distributed.py" \
 --ps_launch_cmd "venv.zip/venv/bin/python mnist_distributed.py" \
---insecure
+--insecure \
 --conf tony.containers.resources=PATH_TO_VENV_YOU_CREATED/venv.zip#archive,PATH_TO_MNIST_EXAMPLE/mnist_distributed.py, \
-PATH_TO_TONY_CLI_JAR/tony-cli-0.3.2-all.jar
+PATH_TO_TONY_CLI_JAR/tony-cli-0.3.2-all.jar \
+--conf tony.application.framework=pytorch
 
 ```
 You should then be able to see links and status of the jobs from command line:
@@ -284,7 +284,6 @@ CLASSPATH=$(hadoop classpath --glob): \
 /home/pi/hadoop/TonY/tony-cli/build/libs/tony-cli-0.3.2-all.jar \
 
 java org.apache.hadoop.yarn.submarine.client.cli.Cli job run --name tf-job-001 \
---framework tensorflow \
 --docker_image hadoopsubmarine/tf-1.8.0-cpu:0.0.3 \
 --input_path hdfs://pi-aw:9000/dataset/cifar-10-data \
 --worker_resources memory=3G,vcores=2 \
@@ -297,5 +296,6 @@ java org.apache.hadoop.yarn.submarine.client.cli.Cli job run --name tf-job-001 \
 --env HADOOP_COMMON_HOME=/hadoop-3.1.0 \
 --env HADOOP_HDFS_HOME=/hadoop-3.1.0 \
 --env HADOOP_CONF_DIR=/hadoop-3.1.0/etc/hadoop \
---conf tony.containers.resources=--conf tony.containers.resources=/home/pi/hadoop/TonY/tony-cli/build/libs/tony-cli-0.3.2-all.jar
+--conf tony.containers.resources=PATH_TO_TONY_CLI_JAR/tony-cli-0.3.2-all.jar \
+--conf tony.application.framework=pytorch
 ```

hadoop-submarine/hadoop-submarine-tony-runtime/src/site/resources/css/site.css

Lines changed: 0 additions & 29 deletions
This file was deleted.

hadoop-submarine/hadoop-submarine-tony-runtime/src/site/site.xml

Lines changed: 0 additions & 28 deletions
This file was deleted.
