
Commit a5f5648

Add draft of replay bundle specification
1 parent ed9ae0b commit a5f5648

1 file changed

+179
-0
lines changed

# Draft: Native Image Replay Bundles

## Motivation

1. The deployment of a native image is just one step in the lifecycle of an application or service. Real-world
applications run for years and sometimes need to be updated or patched long after deployment (for example, for security
fixes). It would be valuable to have an easy way to redo an image build at some point in the future as accurately as
possible.

2. Another angle comes from the development phase. If image building fails or a malfunctioning image is
created (i.e. the same application runs fine when executed on the JVM), we would like to get bug reports that allow us
to reproduce the problem locally without spending hours replicating the user's setup. We want some way to bundle up what
the user built (or tried to build) into a package that allows us to reproduce the problem immediately on our side.

3. Debugging an image that was created a long time ago is also sometimes needed. It would be helpful to have a single
bundle that contains everything needed to perform this task.

## Replay Bundles

A set of options should be added to the `native-image` command that allows users to create so-called "replay bundles",
which can be used to help with the problems described above. The basic form is:

```shell
native-image --replay-create ...other native-image arguments...
```

This instructs `native-image` to create a replay bundle alongside the image.

For example, after running:

```shell
native-image --replay-create -Dlaunchermode=2 -EBUILD_ENVVAR=env42 \
    -p somewhere/on/my/drive/app.launcher.jar:/mnt/nfs/server0/mycorp.base.jar \
    -cp $HOME/ourclasses:somewhere/logging.jar:/tmp/other.jar:aux.jar \
    -m app.launcher/paw.AppLauncher alaunch
```

the user sees the following build results:

```shell
~/foo$ ls
alaunch alaunch.nirb.jar somewhere aux.jar
```

In addition to the image, an `<imagename>.nirb.jar` file was created. This is the native image replay bundle for the
image that got built. At any later time, if the same version of GraalVM is used, the image can be rebuilt with:

```shell
native-image --replay .../path/to/alaunch.nirb.jar
```

This rebuilds the `alaunch` image with the same image arguments, environment variables, system properties,
class-path and module-path options as in the initial build.

## Replay Bundles File Format

A `<imagename>.nirb.jar` file is a regular jar file that contains all information needed to replay a previous build.
For example, the `alaunch.nirb.jar` replay bundle has the following inner structure:

```
alaunch.nirb.jar
├── build
│   └── report <- Contains information about the build process.
│       │         In case of replay these will be compared against.
│       ├── analysis_results.json
│       ├── build_artifacts.json
│       ├── build.log
│       ├── build_output.json
│       ├── jni_access_details.json
│       └── reflection_details.json
├── input
│   ├── classes <- Contains all class-path and module-path entries passed to the builder
│   │   ├── cp
│   │   │   ├── aux.jar
│   │   │   ├── logging.jar
│   │   │   ├── other.jar
│   │   │   └── ourclasses
│   │   └── p
│   │       ├── app.launcher.jar
│   │       └── mycorp.base.jar
│   └── stage
│       ├── all.env <- All environment variables used in the image build
│       ├── all.properties <- All system properties passed to the builder
│       ├── build.cmd <- Full native-image command line (minus --replay-create option)
│       └── run.cmd <- Arguments to run application on java (for launcher, see below)
├── META-INF
│   ├── MANIFEST.MF <- Specifies rbundle/Launcher as main class
│   └── RBUNDLE.MF <- Contains replay bundle version info:
│                     * Replay-bundle format version
│                     * GraalVM / Native-image version used for build
├── out
│   ├── alaunch.debug <- Native debuginfo for the built image
│   └── sources <- Reachable sources needed for native debugging
└── rbundle
    └── Launcher.class <- Launcher for running of application with `java`
                          (uses files from input directory)
```

A replay bundle consists of several components that need to be described in more detail.

### `META-INF`:

Since the bundle is also a regular jar file, it has a `META-INF` subdirectory with the familiar `MANIFEST.MF`. The
bundle can be used like a regular jar launcher (by running `java -jar <imagename>.nirb.jar`), so that the
application the image was built from is executed on the JVM instead. For that purpose, `MANIFEST.MF` specifies
`rbundle/Launcher` as the main class.

`META-INF` also contains `RBUNDLE.MF`. This file is specific to replay bundles. Its existence makes clear that this is
no ordinary jar file but a native image replay bundle. The file records the version of the native image replay bundle
format itself and the GraalVM version that was used to create the bundle. This can later be used to report a warning
if a bundle is replayed with a GraalVM version different from the one used to create it.

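The version check described above could be sketched roughly as follows. This is only an illustration: the draft does not define the layout of `RBUNDLE.MF`, so the `Key: Value` line format and the key name `GraalVM-Version` are assumptions, as are the function names. The sketch also ignores the line-continuation rules of real jar manifests.

```python
import zipfile


def read_rbundle_info(bundle, entry="META-INF/RBUNDLE.MF"):
    """Parse 'Key: Value' lines of RBUNDLE.MF into a dict.

    Returns None if the entry is missing, i.e. the jar is not a replay bundle.
    The 'Key: Value' layout is an assumption, not part of the draft.
    """
    with zipfile.ZipFile(bundle) as jar:
        if entry not in jar.namelist():
            return None
        text = jar.read(entry).decode("utf-8")
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def version_warning(info, current_version, key="GraalVM-Version"):
    """Return a warning message if the bundle was created by another version.

    The key name 'GraalVM-Version' is hypothetical.
    """
    recorded = info.get(key)
    if recorded is not None and recorded != current_version:
        return (f"bundle was created with GraalVM {recorded}, "
                f"but is being replayed with {current_version}")
    return None
```

A replay would call `read_rbundle_info` first, treat `None` as "not a replay bundle", and print the result of `version_warning` if it is not `None`.
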
### `input`:

This directory contains all the information needed to redo the previous image build. The original class-path and
module-path entries are copied into the `input/classes/cp` folder (original `-cp`/`--class-path` entries) and the
`input/classes/p` folder (original `-p`/`--module-path` entries): jar files are stored as files, and directory-based
class/module-path entries as subdirectories. The `input/stage` folder contains all information needed to replicate the
previous build context.

#### `input/stage`:

Here we have `build.cmd`, which contains all native-image command line options used in the previous build. Note that
**even the initial build that created the bundle already uses a class- and/or module-path that refers to the contents
of the `input/classes` folder**. This way we can guarantee that a replay build sees exactly the same relocated
class/module-path entries as the initial build. The use of `run.cmd` is explained later.

The file `all.env` contains the environment variables that the builder was allowed to see during the initial build, and
`all.properties` contains the respective system properties.

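As an illustration of the relocation into `input/classes`, the following minimal sketch maps the entries of a class-path or module-path option to locations inside the bundle. The function name is hypothetical, the `:` separator assumes a POSIX-style path option, and the way name clashes are resolved (a numeric prefix) is an assumption, since the draft does not say how two entries with the same file name are handled.

```python
import posixpath


def relocate_entries(path_option, target_dir):
    """Map each entry of a ':'-separated path option to a path under target_dir.

    Returns a list of (original_entry, bundle_path) pairs in original order.
    """
    relocated = []
    used_names = set()
    for entry in path_option.split(":"):
        if not entry:
            continue
        base = posixpath.basename(entry.rstrip("/"))
        name, counter = base, 1
        # Assumption: clashing names get a numeric prefix; the draft leaves this open.
        while name in used_names:
            name = f"{counter}_{base}"
            counter += 1
        used_names.add(name)
        relocated.append((entry, posixpath.join(target_dir, name)))
    return relocated


# The class path from the example build, with $HOME already expanded.
cp = relocate_entries(
    "/home/user/ourclasses:somewhere/logging.jar:/tmp/other.jar:aux.jar",
    "input/classes/cp",
)
print(":".join(bundle_path for _, bundle_path in cp))
```

The joined result is the kind of relocated class path that `build.cmd` would record, so that the initial build and any replay refer to the same entries.
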
### `build`:

This folder documents the build process that led to the image that was created alongside the bundle.
The `report` sub-folder holds `build.log`. It is equivalent to what would have been created if the user had appended
`|& tee build.log` to the original native-image command line. Additionally, there are several json files:
* `analysis_results.json`: Contains the results of the static analysis. A rerun should compare the new
`analysis_results.json` file with this one and report deviations in a user-friendly way.
* `build_artifacts.json`: Contains a list of the artifacts that got created during the initial build. As before,
changes should be reported to the user.
* `build_output.json`: Similar information as `build.log`.
* `jni_access_details.json`: Overview of which methods/classes/fields have been made JNI-accessible at image runtime.
* `reflection_details.json`: Same kind of information for reflection access at image runtime.

As already mentioned, a rebuild should compare its newly generated set of json files against the ones in the bundle and
report deviations in a user-friendly way.

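One possible way to report such deviations is to flatten both json documents into path/value pairs and list what was added, removed, or changed. The sketch below is an assumption about how a comparison could look, not the proposed implementation; the function names are hypothetical and the draft does not define the schema of the report files.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a {"a.b[0]": leaf_value} mapping."""
    if isinstance(value, dict):
        items = ((f"{prefix}.{k}" if prefix else str(k), v) for k, v in value.items())
    elif isinstance(value, list):
        items = ((f"{prefix}[{i}]", v) for i, v in enumerate(value))
    else:
        return {prefix: value}
    flat = {}
    for path, child in items:
        flat.update(flatten(child, path))
    return flat


def report_deviations(original_json, replay_json):
    """Return one human-readable line per difference between two reports."""
    old = flatten(json.loads(original_json))
    new = flatten(json.loads(replay_json))
    lines = []
    for path in sorted(old.keys() | new.keys()):
        if path not in new:
            lines.append(f"removed: {path} (was {old[path]!r})")
        elif path not in old:
            lines.append(f"added:   {path} = {new[path]!r}")
        elif old[path] != new[path]:
            lines.append(f"changed: {path}: {old[path]!r} -> {new[path]!r}")
    return lines
```

An empty result means the replayed build produced the same report as the original one. Comparing leaf values by path keeps the output short even for large reports, at the cost of treating reordered list elements as changes.
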
### `out`:

This folder contains all the debuginfo needed to debug the image at some point in the future.

### `rbundle`:

Contains the `Launcher.class` that is used when the bundle is run as a regular Java launcher. The class file is not
specific to a particular bundle. Instead, the `Launcher` class extracts the contents of the `input` directory into a
temporary subdirectory in `$TEMP` and uses the files `input/stage/all.*` and `input/stage/run.cmd` to invoke
`$JAVA_HOME/bin/java` with the environment variables and the arguments (e.g. system properties) needed to run the
application on the JVM.

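The launcher itself would be a Java class, but its logic can be outlined in a few lines. The following sketch is written in Python purely for illustration and rests on assumptions the draft does not settle: that `all.env` holds one `KEY=VALUE` pair per line and that `run.cmd` holds one argument per line. The function names are hypothetical.

```python
import os
import zipfile


def extract_input(bundle, dest_dir):
    """Extract only the `input/` entries of the bundle into dest_dir."""
    with zipfile.ZipFile(bundle) as jar:
        members = [n for n in jar.namelist() if n.startswith("input/")]
        jar.extractall(dest_dir, members=members)
    return os.path.join(dest_dir, "input")


def launch_plan(input_dir, java_home):
    """Return (argv, env) for running the application on the JVM.

    Assumes one KEY=VALUE pair per line in all.env and one argument
    per line in run.cmd; both formats are guesses.
    """
    stage = os.path.join(input_dir, "stage")
    env = {}
    with open(os.path.join(stage, "all.env"), encoding="utf-8") as f:
        for line in f.read().splitlines():
            key, sep, value = line.partition("=")
            if sep:
                env[key] = value
    with open(os.path.join(stage, "run.cmd"), encoding="utf-8") as f:
        args = [line for line in f.read().splitlines() if line]
    return [os.path.join(java_home, "bin", "java"), *args], env
```

The resulting `argv` and `env` could then be handed to a process launcher, with the extracted `input` directory as working directory so that relative class-path entries resolve.
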
## Enforced sanitized image building

### Containerized image building on supported platforms

If available, docker/podman should be used to run the image builder inside a well-defined container image. **This allows
us to prevent the builder from using the network during image build**, thus guaranteeing that the image build result
does not depend on some unknown (and therefore unreproducible) network state. Another advantage is that we can mount
`input/classes` and `$GRAALVM_HOME` read-only into the container and only allow read-write access to the mounted `out`
and `build` directories. This prevents application code that runs at image build time from modifying anything
other than those directories.

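A containerized invocation along these lines could be assembled as in the sketch below. It uses the `run`, `--rm`, `--network none` and `-v host:container[:ro]` options that docker and podman both provide; the container image name, the mount points inside the container, and the function name are placeholders chosen for illustration, not part of the draft.

```python
def container_command(bundle_root, graalvm_home, builder_args,
                      runtime="docker", image="example/builder-base:latest"):
    """Build the argv for running the image builder in a locked-down container.

    bundle_root and graalvm_home should be absolute host paths.
    The image name and the /bundle and /graalvm mount points are hypothetical.
    """
    def mount(host, target, read_only):
        spec = f"{host}:{target}" + (":ro" if read_only else "")
        return ["-v", spec]

    # No network access inside the container during the build.
    cmd = [runtime, "run", "--rm", "--network", "none"]
    # Inputs and the GraalVM installation are mounted read-only.
    cmd += mount(f"{bundle_root}/input/classes", "/bundle/input/classes", True)
    cmd += mount(graalvm_home, "/graalvm", True)
    # Only the output and report directories are writable.
    cmd += mount(f"{bundle_root}/out", "/bundle/out", False)
    cmd += mount(f"{bundle_root}/build", "/bundle/build", False)
    cmd += [image, "/graalvm/bin/native-image", *builder_args]
    return cmd
```

Returning the command as a list rather than a single string avoids shell quoting problems when class-path entries contain spaces.
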
### Fallback for systems without container support

If containerized builder execution is not possible, we can still at least **have the builder run in a sanitized
environment variable state** and make sure that **the only environment variables visible are those explicitly
specified with `-E<env_var_name>=<env_var_value>` or `-E<env_var_name>`** (the latter passes the variable through from
the surrounding environment).

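The two `-E` forms can be turned into a sanitized environment roughly as follows. This is a minimal sketch with a hypothetical function name; in particular, silently skipping a pass-through variable that is not set in the surrounding environment is an assumption, since the draft does not specify that case.

```python
def sanitized_environment(args, surrounding_env):
    """Compute the environment the builder may see from the -E options.

    -E<name>=<value> sets the variable explicitly.
    -E<name> passes the value through from surrounding_env, if present.
    """
    env = {}
    for arg in args:
        if not arg.startswith("-E"):
            continue
        name, sep, value = arg[2:].partition("=")
        if not name:
            continue  # ignore malformed options such as "-E" or "-E=x"
        if sep:
            env[name] = value
        elif name in surrounding_env:
            # Assumption: unset pass-through variables are skipped silently.
            env[name] = surrounding_env[name]
    return env
```

For the example build above, `-EBUILD_ENVVAR=env42` would make `BUILD_ENVVAR` the only variable visible to the builder, regardless of what the user's shell exports.
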
## Handling of image build errors

To ensure replay bundles are feasible for the [second use case described above](#motivation), we have to make sure a
bundle gets created successfully even if the image build fails. Most likely the `out` folder will be missing from the
bundle in this case. But as usual, `build/report/build.log` will contain all the command line output that was shown
during the image build, including any error messages that resulted in the build failure.
