Make it easier to test load all the .sql files during a docker build operation. #731
Comments
We had a similar scenario: not testing the dump at build time, but building a reusable Docker image for integration testing. Here is an attempt at implementing this feature. It does not use exactly the same keywords as the initial request, but the change is quite minimal
and a
I can issue a Pull Request if necessary, but I'm not sure how it should be structured as the
I just stumbled upon this thread while trying to implement the same thing! For lack of better ("better" depends on the priorities of course) alternatives, I want to "bake" the imported/processed SQL right into the Docker image. I ended up with a slightly modified version of Rémi's approach:

    diff --git a/docker-entrypoint.sh.orig b/docker-entrypoint.sh
    old mode 100644
    new mode 100755
    index a383a36..62655a1
    --- a/docker-entrypoint.sh
    +++ b/docker-entrypoint.sh.new
    @@ -343,7 +343,9 @@ _main() {
     		fi
     	fi
     
    -	exec "$@"
    +	if [ "${POSTGRES_INIT_THEN_EXIT:-}" != '1' ]; then
    +		exec "$@"
    +	fi
     }
     
     if ! _is_sourced; then

Here is my Dockerfile:

    FROM ghcr.io/public-transport/gtfs-via-postgres AS sql
    WORKDIR /importer
    ENV DEST_PATH=/tmp/sql/gtfs.sql
    ADD import.sh ./
    RUN --mount=type=cache,target=/tmp/gtfs,sharing=locked \
        ./import.sh

    FROM postgis/postgis:15-3.4-alpine AS import
    # configure access to the container-local PostgreSQL server
    ARG POSTGRES_USER
    ENV POSTGRES_USER=$POSTGRES_USER
    ARG POSTGRES_PASSWORD
    ENV POSTGRES_PASSWORD=$POSTGRES_PASSWORD
    ARG POSTGRES_DB
    ENV POSTGRES_DB=$POSTGRES_DB
    # patch pre-populated docker-entrypoint.sh
    ADD docker-entrypoint.sh /usr/local/bin/
    # tell docker-entrypoint.sh *not to* (re)start PostgreSQL after importing /docker-entrypoint-initdb.d/*
    ENV POSTGRES_INIT_THEN_EXIT=1
    # We prefix our GTFS SQL with `20_`, so that it gets processed *after* the pre-existing PostGIS init script (/docker-entrypoint-initdb.d/10_postgis.sh).
    RUN --mount=type=bind,from=sql,source=/tmp/sql,target=/tmp/sql \
        ln -s /tmp/sql/gtfs.sql /docker-entrypoint-initdb.d/20_gtfs.sql && \
        /usr/local/bin/docker-entrypoint.sh postgres && \
        rm /docker-entrypoint-initdb.d/20_gtfs.sql

    # For the final image, use an "unadulterated" PostGIS image.
    FROM postgis/postgis:15-3.4-alpine
    # Copy over the imported DB from the `import` stage.
    COPY --from=import /var/lib/postgresql/data /var/lib/postgresql/data
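To make the effect of the patched lines easier to see, here is a minimal sketch of the guard in isolation. It is an illustration rather than the actual entrypoint: `finish_entrypoint` is a hypothetical name, and it calls the command directly instead of using `exec`, so that the demo shell keeps running.

```shell
#!/bin/sh
# Sketch of the POSTGRES_INIT_THEN_EXIT guard from the diff.
# finish_entrypoint is a hypothetical helper, not part of docker-entrypoint.sh.
finish_entrypoint() {
	# ${VAR:-} expands to an empty string when the variable is unset,
	# so the comparison stays safe even if the script runs under `set -u`.
	if [ "${POSTGRES_INIT_THEN_EXIT:-}" != '1' ]; then
		# The real entrypoint uses `exec "$@"` here; we call the command
		# instead so this demo can continue afterwards.
		"$@"
	fi
}

# Variable unset: the command runs.
unset POSTGRES_INIT_THEN_EXIT
finish_entrypoint echo "server would start now"

# Variable set to 1: the command is skipped and the function returns 0.
POSTGRES_INIT_THEN_EXIT=1
finish_entrypoint echo "this line is never printed"
```

Because the skipped branch still returns success, a `RUN` step that calls the patched entrypoint fails only when one of the earlier initialization steps fails.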
Issue: Make it easy to test load all files from `docker-entrypoint-initdb.d` and verify they work, during a `docker build`, without fully starting the final db instance.

I have a `Dockerfile` that does `FROM postgis/postgis:10-3.0-alpine`, which itself has `FROM postgres:10-alpine` or such. Then we add a bunch of machine-generated `.sql` files into `docker-entrypoint-initdb.d`, to effectively create a read-only db artifact that can be used in our various environments. Sometimes we find issues where they don't integrate well in the final container we build (e.g., one file creates an index on a column that changed in another). But we don't find out until we actually run the container. We used to do, at the end of our `Dockerfile`, a `RUN /docker-entrypoint.sh postgres --describe-config` to create that behaviour, but at some point a while back, the addition of the `_pg_want_help` function caused these options to short-circuit the db load process there (it always exits with success regardless of whether the `docker-entrypoint-initdb.d` files are valid or not).

In order to catch this at build time, I have created a hacked-up `dbtest.sh` that sources the `docker-entrypoint.sh`, then runs a limited version of the startup process that basically does everything in `_main` except the final `exec "$@"`. But I'm concerned that this will be hard to keep in sync with the `docker-entrypoint.sh` if other changes are made in the future.

I feel like it would be best to modify the `docker-entrypoint.sh` to allow it to stub out that part of the startup into a separate function. Or, add a magic keyword to the `docker-entrypoint.sh` to activate special init-db-but-don't-launch behavior. It also feels like this would be generally useful for people, and also would maybe give people the option to create containers where the initdb.d stuff is already loaded into tables during the container build process (though that specific feature might be bad, too). In our case, `/var/lib/postgresql/data` isn't kept in the build, so the full load of the `docker-entrypoint-initdb.d` will happen on container load, which happens to work well for us.

I can create a PR, but was wondering if there are preferences here as to the preferred paths, first. I think option 3 below seems least disruptive, but also slightly harder to use for some users...

1. `docker-entrypoint.sh postgres-load-only`, which does everything but the `exec "$@"` at the end of `_main`.
2. `docker-entrypoint.sh postgres --load-only`, a flag that the shell script strips out; it then lets the init run and skips the `exec "$@"` at the end of `_main`.
3. `docker_init_all`, which takes almost everything out of `_main` into a function, so that `_main` would effectively just be `docker_init_all "$@"; exec "$@"`. Then a user could just `source docker-entrypoint.sh` and then call `docker_init_all` to clone that functionality. In a Dockerfile, it could look like `RUN . /docker-entrypoint.sh ; docker_init_all`
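
As a rough illustration of option 3, the split could look like the sketch below. It is only a sketch under assumptions: `init_database_dir` and `process_init_files` are hypothetical stand-ins for the real setup steps, and the actual `_main` in `docker-entrypoint.sh` does considerably more.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real setup steps in docker-entrypoint.sh.
init_database_dir()  { echo "initializing data directory"; }
process_init_files() { echo "loading /docker-entrypoint-initdb.d/*"; }

# Option 3: everything _main does except the final exec.
# Returns non-zero as soon as a step fails, so a Dockerfile RUN would fail too.
docker_init_all() {
	init_database_dir || return 1
	process_init_files || return 1
}

# _main shrinks to "initialize, then hand over to the server process".
# It is defined here for illustration only and is not called in this demo.
_main() {
	docker_init_all "$@" || return 1
	exec "$@"
}

# Build-time check: run only the initialization, never the server.
docker_init_all && echo "init files loaded cleanly"
```

The last line plays the role of the `RUN . /docker-entrypoint.sh ; docker_init_all` step from option 3. That only works if sourcing the entrypoint does not itself trigger `_main`, which is what the existing `_is_sourced` check is for.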