Commit 092c780

HADOOP-19083. aws sdk is optional in release builds
* update building doc
* LICENSE-binary makes clear it is optional
* hadoop s3guard bucket-info tool reports error better
* docs cover how to install.

It's actually quite hard to manually install; unless we can give better instructions I almost think we'd want to create releases with and without the AWS SDK. Let's target 3.4.1 for that

Change-Id: I2c91963a21b5c289e05218c2cbce0561b8e48b60
1 parent 70273da commit 092c780

4 files changed, +59 -5 lines changed
BUILDING.txt

Lines changed: 9 additions & 1 deletion
@@ -146,7 +146,7 @@ Maven build goals:
 * Run clover : mvn test -Pclover
 * Run Rat : mvn apache-rat:check
 * Build javadocs : mvn javadoc:javadoc
-* Build distribution : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar][-Preleasedocs][-Pyarn-ui]
+* Build distribution : mvn package [-Pdist][-Pdocs][-Psrc][-Pnative][-Dtar][-Preleasedocs][-Pyarn-ui][-Pawssdk]
 * Change Hadoop version : mvn versions:set -DnewVersion=NEWVERSION
 
 Build options:
@@ -159,6 +159,7 @@ Maven build goals:
 * Use -Pyarn-ui to build YARN UI v2. (Requires Internet connectivity)
 * Use -DskipShade to disable client jar shading to speed up build times (in
 development environments only, not to build release artifacts)
+* Use -Pawssdk to include the AWS V2 SDK in the release distribution
 
 YARN Application Timeline Service V2 build options:
 
@@ -371,6 +372,13 @@ Create binary distribution with native code:
 
 $ mvn package -Pdist,native -DskipTests -Dtar
 
+Create binary distribution with AWS SDK:
+
+$ mvn package -Pdist,awssdk -DskipTests -Dtar
+
+This ensures that the hadoop-aws sdk has all its dependencies,
+but does approximately double the size of the tar file.
+
 Create source distribution:
 
 $ mvn package -Psrc -DskipTests

LICENSE-binary

Lines changed: 2 additions & 0 deletions
@@ -362,6 +362,8 @@ org.objenesis:objenesis:2.6
 org.xerial.snappy:snappy-java:1.1.10.4
 org.yaml:snakeyaml:2.0
 org.wildfly.openssl:wildfly-openssl:1.1.3.Final
+
+In distributions which include the aws V2 SDK:
 software.amazon.awssdk:bundle:jar:2.25.53
 
 

hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java

Lines changed: 8 additions & 2 deletions
@@ -422,8 +422,14 @@ public int run(String[] args, PrintStream out)
     CommandFormat commands = getCommandFormat();
     URI fsURI = toUri(s3Path);
 
-    S3AFileSystem fs = bindFilesystem(
-        FileSystem.newInstance(fsURI, getConf()));
+    S3AFileSystem fs;
+    try {
+      fs = bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+    } catch (NoClassDefFoundError e) {
+      println(out, "Failed to instantiate S3A filesystem due to missing class: %s", e);
+      println(out, "Make sure the AWS v2 SDK is on the classpath");
+      throw e;
+    }
     Configuration conf = fs.getConf();
     URI fsUri = fs.getUri();
     println(out, "Filesystem %s", fsUri);
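
The change above reports a missing SDK only after filesystem instantiation has failed. As a rough illustration of the same idea, a standalone probe can check whether an SDK class is loadable before any S3A code runs. This is a minimal sketch and not part of the commit: the class `AwsSdkProbe` and its method names are hypothetical, and it assumes `software.amazon.awssdk.services.s3.S3Client` is a representative class of the AWS SDK v2 bundle.

```java
public class AwsSdkProbe {

  /** Assumption: this class ships in the AWS SDK v2 bundle JAR. */
  static final String SDK_CLASS = "software.amazon.awssdk.services.s3.S3Client";

  /**
   * Returns true if the named class can be located and loaded.
   * The class is not initialized, so no static initializers run.
   */
  static boolean isClassAvailable(String className) {
    try {
      Class.forName(className, false, AwsSdkProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      // ClassNotFoundException: the class is not on the classpath.
      // LinkageError (includes NoClassDefFoundError): a dependency is missing or broken.
      return false;
    }
  }

  public static void main(String[] args) {
    if (isClassAvailable(SDK_CLASS)) {
      System.out.println("AWS v2 SDK found on the classpath");
    } else {
      System.out.println("AWS v2 SDK missing; make sure it is on the classpath");
    }
  }
}
```

Run with the same classpath as the Hadoop command being diagnosed; a probe on a different classpath says nothing about what the S3A connector will see.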

hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md

Lines changed: 40 additions & 2 deletions
@@ -53,8 +53,46 @@ full details.
 
 ## <a name="overview"></a> Overview
 
-Apache Hadoop's `hadoop-aws` module provides support for AWS integration.
-applications to easily use this support.
+Apache Hadoop's `hadoop-aws` module provides support for AWS integration,
+primarily the s3a open source connector to Amazon S3 Storage, including
+Amazon S3 Express One zone storage as well as third-party stores with S3
+compatibility.
+
+## <a name="installation"></a> Installation
+
+### <a name="SDK download"></a> SDK Download
+
+This release uses the AWS SDK for Java 2.0
+
+Unless using a hadoop release with the AWS SDK `bundle.jar` JAR included
+in the binary distribution, the library MUST be downloaded and installed
+into the hadoop distribution.
+
+The exact version of the SDK to be used is listed in the file:
+```
+LICENSE-binary
+```
+The [mvn repository](https://mvnrepository.com/)
+site will list it as a "Compile Dependency" of the
+[hadoop-aws](https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws) artifact.
+
+AWS SDK releases can be downloaded from github at [AWS SDK for Java 2.0](https://github.com/aws/aws-sdk-java-v2)
+
+Or from the [Maven central repository](https://repo1.maven.org/maven2/software/amazon/awssdk/bundle/).
+
+Download the release and place it in the directory `share/hadoop/tools/lib`
+of the hadoop distribution.
+
+* Using an earlier SDK than that this SDK was compiled and tested against
+will not work.
+* Using a later SDK *should* work, but there are no guarantees.
+* The V1 SDK will not work.
+
+Any project declaring a dependency on `hadoop-aws` in their Maven/Ivy/SBT/Gradle
+build will automatically get the specific version of the AWS SDK which this
+module was compiled against.
+
+### <a name="inclusion-on-classpath"></a> Inclusion on classpath
 
 To include the S3A client in Apache Hadoop's default classpath:
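
The installation text added in this commit asks the reader to look up the SDK version in `LICENSE-binary`, download the matching bundle, and place it in `share/hadoop/tools/lib`. The lookup could be sketched as below. This is an illustration only: the class `SdkBundleLocator` and its methods are hypothetical, and the download URL assumes the standard Maven repository layout (`<version>/bundle-<version>.jar`) under the Maven central link quoted in the docs.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SdkBundleLocator {

  /** Matches the LICENSE-binary entry, e.g. software.amazon.awssdk:bundle:jar:2.25.53 */
  private static final Pattern BUNDLE_ENTRY =
      Pattern.compile("software\\.amazon\\.awssdk:bundle:jar:([0-9][\\w.\\-]*)");

  /** Base URL taken from the documentation added in this commit. */
  static final String MAVEN_BASE =
      "https://repo1.maven.org/maven2/software/amazon/awssdk/bundle/";

  /** Extracts the SDK version from the text of LICENSE-binary, if present. */
  static Optional<String> sdkVersion(String licenseText) {
    Matcher m = BUNDLE_ENTRY.matcher(licenseText);
    return m.find() ? Optional.of(m.group(1)) : Optional.empty();
  }

  /** Assumption: standard Maven layout of version directory plus artifact file. */
  static String downloadUrl(String version) {
    return MAVEN_BASE + version + "/bundle-" + version + ".jar";
  }

  public static void main(String[] args) {
    // Sample lines as they appear in the LICENSE-binary hunk of this commit.
    String sample = "org.yaml:snakeyaml:2.0\n"
        + "software.amazon.awssdk:bundle:jar:2.25.53\n";
    String version = sdkVersion(sample)
        .orElseThrow(() -> new IllegalStateException("no AWS SDK entry found"));
    System.out.println("Download: " + downloadUrl(version));
    System.out.println("Install to: share/hadoop/tools/lib/bundle-" + version + ".jar");
  }
}
```

In practice the input would be the contents of the `LICENSE-binary` file of the Hadoop release being installed, so that the downloaded bundle matches the version the module was compiled and tested against.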
