Dockerfile and automatic generation of images via GitHub Actions #1469
Conversation
You're right, if I try to start the
I guess --cpu should be set by default in the cpu image.
Alternatively, the documentation could mention that you have to run the
The I've introduced a separate
Force-pushed from 7e5777d to 006a9d8
For completeness' sake, I've added all image variants and the information about the EXTRA_ARGS=--cpu build arg to the README.
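For anyone building locally, a rough sketch of what that looks like (hedged: the build-arg names come from this PR's Dockerfile, the PyTorch CPU index URL is the standard one, and the image tag is just an example):

```bash
# Build the CPU variant locally; EXTRA_ARGS and PYTORCH_INSTALL_ARGS are build args from this PR's Dockerfile.
docker build \
  --build-arg PYTORCH_INSTALL_ARGS="--index-url https://download.pytorch.org/whl/cpu" \
  --build-arg EXTRA_ARGS="--cpu" \
  -t comfyui:cpu .
```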
Force-pushed from fad7f91 to 19422a4
I've slightly modified the Dockerfile to install git and git-lfs, so that ComfyUI Manager works out of the box. I've updated the collapsed documentation in the original post.
Updates: I've updated the requirements files to the latest versions (and created a separate PR for it). I've changed the base image to 3.11-slim-bookworm to make sure that Python 3.11 is used (not 3.12).
Update: I've removed the requirement-*.txt files because I also wanted to build a cu118 version, and since xformers is gone (for now), it just seems easier and more flexible not to have to commit everything to a file. It also might make this PR more acceptable to be merged. I've updated the description in the first post accordingly.
What is blocking this PR?
Hey, I'm using your Docker image and sometimes with custom nodes I get this error
Seems that
Thanks for reaching out. Can you tell me which custom node this is? I'll try to figure out which packages are used by commonly used custom nodes...
I don't know exactly which one but this occurs with https://github.com/ltdrdata/ComfyUI-Impact-Pack with
It looks like this might be solved by installing opencv-python-headless.
I will not be able to try before tomorrow, I will let you know.
When will this be merged? :-)
You can now mount a folder to /app/custom_venv and the image will copy its venv there and launch from there. This will make installed custom_node dependencies persist across restarts.
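As a rough illustration of that (hedged: the /app paths match this PR's compose example, and the ghcr.io tag is the CPU one from the README snippet elsewhere in this PR):

```bash
# Persist installed custom_node dependencies across restarts by mounting /app/custom_venv.
docker run -d -p 8188:8188 \
  -v ./custom_venv:/app/custom_venv \
  -v ./models:/app/models \
  ghcr.io/comfyanonymous/comfyui:latest-cpu
```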
Force-pushed from 31fb6d8 to de3c90e
@ravensorb not likely, see #1469 (comment)
@oxc I think it would be better to migrate this PR to a dedicated repo. Many GitHub organisations separate the software itself from the dockerization. What do you think?
@rizlas the entire point of this PR is to have an image that's officially endorsed by ComfyUI maintainers. If they want a separate repo for this they'll request that, but in my experience many projects keep this support in the same project repo. To have an unofficial image at @oxc own repo defeats the intent of the PR. There's a variety of unofficial images already published, but we want ComfyUI to either endorse one, or better yet accept one to maintain and publish officially, as that would be more trustworthy for users along with centralizing contributions/maintenance on the image.
Yeah, but it's been one year since this PR was opened. One of many PRs aiming to address Docker support. I think that they will not merge any Docker-related PR without also having on board a full-time maintainer for it. So IMHO better to have a single ref point, even as unofficial (but in the end official), than having a years-long PR. My 2c.
Feel free to use the images published to my fork of the repo (without any guarantees). However, as @polarathene has perfectly explained, that is not the intent of this PR. In any case, any such decisions would require input from one of the maintainers, otherwise it's just guesswork.
This PR is the only remaining active one AFAIK. Its author, @oxc, is very responsive to feedback and keeping the PR relevant in hopes it'll eventually get considered. In the meantime, interested users will engage in review / discussion here and provide feedback from their own usage, which should help towards instilling more confidence to accept this PR.

Last I recall in past PRs for Docker, at the time I think the only maintainer was the original author of ComfyUI and they were very much against accepting a PR for Docker, since they were not confident in reviewing contributions or maintaining it. Best thing you can do right now is help demonstrate community demand for this with a 👍 reaction to the PR description at the top (we only have 8 atm), and wait until a maintainer chimes in.

I know that it's a long wait, but the maintainers will prioritize what they're comfortable with, thus until one of them wants an official Docker image the only other option is sufficient demand from the community to justify it (just the 👍 reactions, not +1 comments which do no good). Alternatively, if someone engages with the project long enough with contributions to gain maintainer trust and become an official maintainer themselves, that might be "faster", but is a lot of work 😅

You may see a decision to use a separate repo in the org and assign someone like @oxc as a maintainer for that. They'd be granted sufficient permissions to manage that repo, and don't need to have the ability to manage any secrets. That repo could enforce PRs (protected main branch) and still require approval to merge from the core maintainers if that extra precaution were necessary. Depends what they're comfortable with, but for now we need to demonstrate there is demand for it and that what the PR proposes can be trusted (which community engagement is assisting with).

Most importantly, try not to ping the maintainers unless there is good reason to. They'll be swamped in notifications and other tasks. Wanting a PR looked at is fair, but if they're bothered by it too frequently it'll get the opposite effect you want.
I understand what you're saying, but until a project maintainer adds to the project README / docs a section of endorsed images to direct users, you already have to go find some unofficial published image and place trust that the publisher is not malicious (or review and build manually). What you're proposing still requires discovery/awareness of the third-party maintained image.
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}
A token doesn't need to be created manually, if you provide the permissions as part of the GitHub action, see also: https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/controlling-permissions-for-github_token
This is the token provided by GitHub actions, it's not created manually.
OK, but the write:packages scope is missing in the permissions for the workflow, right?
This works on my fork. If not restricted, the token seems to have read/write permissions by default. Here are some more details:
Good god, you are fast. You are not looking for a job by any chance? 🤣
@huchenlei excuse the ping. 🫣 A lot of good effort has been put into this patch, but it needs some input to progress. I'd love to eventually see an official image, as do many others. Thanks 🍀
I am chipping away at a review with extensive feedback. It's gotten quite verbose, however, with enough change suggestions that it may be easier for me to take that and create yet another PR attempt 😂 I'm also curious about alternative tooling vs

Question - Has the ComfyUI Manager integration changed?

While I work away at the review, as I haven't used ComfyUI Manager myself yet... can anybody chime in with the current situation since this PR was created? There's now an official CLI package for ComfyUI that installs ComfyUI and can install the manager, along with an official registry for manager packages to install, vs the less trustworthy approach prior (IIRC the official registry was in response to some exploit/malware event from a bad actor abusing users' trust of the manager?). I'm curious if that changes anything. I haven't yet explored the CLI install method, but am curious how dependent that is upon a full git clone, which IIRC was decided for ComfyUI Manager to handle updates or something? (still working through past review discussions) I am aware that ComfyUI Manager has some options at least for just leveraging

Concern - Custom Venv with third-party package additions vs Python/PyTorch upgrades?

One other concern that my review also raises is with the venv sync. When a user upgrades to a new image release, that may contain a change in Python version or PyTorch/CUDA. The rsync operation during container startup will copy that over to a custom venv, but will that risk breaking that custom venv? Apart from size bloat with no-longer-used PyTorch packages (which appear to be around 1.5GB in size for the CPU variant), anything installed by the user via ComfyUI Manager or similar to that custom venv would presumably be at risk of breakage? I am not much of a Python user or dev, but if I'm not mistaken, the venv has packages stored in Python-version-specific directories, so upgrading to a newer Python presumably makes those packages no longer used by the venv? (that or the venv configures for the old Python version that'll no longer be available on the system? Although the rsync would correct that AFAIK, site-packages dir aside)

We could consider having the Python runtime installed/pinned separately in a way that it'd be persisted as part of the user's volume, rather than relying on a base image to provide it. That won't really help with the PyTorch change however. If anyone is familiar with the situation with ComfyUI Manager (or anything else a user might set up afterwards), is that keeping track of deps/packages somewhere? Surely it's something those would do on a host to upgrade Python or PyTorch with an existing install instead of a container...? Is there a modified list like a

Image design when mutability of dependencies is desired

Or is it more realistic to just disregard Python/PyTorch upgrades, by keeping that as an initial install at container run time when a volume is first mounted and empty? In that sense the image would be rather lightweight itself, only packaging the ComfyUI repo + convenience scripts. Afterwards image upgrades would get a newer version of ComfyUI, but you'd otherwise have everything else managed externally at your volume mount 🤔 Perhaps a bit slower for the first-time run of the container vs the time downloading a much larger image?
I used your build on the fork to successfully install and run ComfyUI in Docker. Went pretty smoothly, all things considered.
I have been linking volumes to subfolders within the app folder individually. I think the most difficult thing is to figure out which folders are necessary. I wonder if providing a volume for /app on the whole would be detrimental to the deploy.
e: And since I just published this after starting it days ago, I'll edit with some agreements to the previous post -
Or is it more realistic to just disregard Python/PyTorch upgrades, by keeping that as an initial install at container run time when a volume is first mounted and empty?
In that sense the image would be rather lightweight itself, only packaging the ComfyUI repo + convenience scripts. Afterwards image upgrades would get a newer version of ComfyUI, but you'd otherwise have everything else managed externally at your volume mount 🤔
I like this idea as it creates a more flexible base image. The one thing I would caution: the installation of pytorch can be fraught with complications, so the startup install script would need to be good. I have been running a number of torch/cuda containers, and the only ones that have worked easily have been those pre-built with cuda 12.4 and torch.
For maintainers, CI cache usage would be considerably lighter too. The image registry would only have the single variant (the user could just use an ENV to handle the cpu/cuda/rocm at runtime instead).
Possibly worth creating a base image that does runtime install, and a few common version combos? 1 rocm, 2 cuda, and then the base image with just cpu?
That said, without bundling this weight into the image itself, the container would only be practical with an explicit volume mounted.
I have seen images with required volumes and I think that's a fair requirement. This one actually needs a number of required volumes as, at the minimum, you need a volume for models if you want it to do anything of value in the tool. That was why I asked my question above: would it be possible to just add a volume for /app and get all the benefits?
Thank you for the compose example.
# For CPU only
docker pull ghcr.io/comfyanonymous/comfyui:latest-cpu
Would be nice to have a short breakdown of recommended volumes, any key environment variables, or just a quick-start docker run command here.
you have it in the compose file
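For anyone wanting the equivalent as a plain docker run quick-start (a hedged sketch; the volume paths mirror the compose example in this PR and the image tag is the CPU one from the README):

```bash
docker run -d -p 8188:8188 \
  -v ./models:/app/models \
  -v ./input:/app/input \
  -v ./output:/app/output \
  -v ./user:/app/user \
  -v ./custom_nodes:/app/custom_nodes \
  -v ./custom_venv:/app/custom_venv \
  ghcr.io/comfyanonymous/comfyui:latest-cpu
```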
My WIP review addresses this. ComfyUI has a
There's also a YAML extra paths file (see the comfyui repo) that can be used to further customize dir organization if needed, and I think that has an arg as well.
I'm still not that experienced with building PyTorch containers, any additional context on complications would be good. I know that you want to install PyTorch upfront to what is compatible with your GPU, otherwise using the CPU wheel. Since the packages bundle everything there though, I don't know about any other concerns beyond getting the index right and installing that prior to the actual project deps.
Unrelated would be additional system packages, I've seen some images bring in stuff like
One image I saw was installing the persisted venv with the Python version and PyTorch variant in the venv dir name, so that it wouldn't overwrite a pre-existing one should the image have bumped those up in its scripts.
That's something I've also looked into, the only concern here with CUDA 12.4 I think was it was the last one supported by an older Nvidia driver series (550?), not sure if that's still the case, but it's probably the main limitation to keep in mind with compatibility there. I have a link in my review that helps track that information 👍
Yeah, that's basically just shifting some
One image I saw clones the repo at container startup too, it is basically a Dockerized installer to a volume. Their scripts for it though were a bit of a mess that didn't look fun to maintain at all. If you'd like to beat me to it, you should be able to just run a Fedora or Ubuntu container for example, install

I don't know how common / comfortable users are with rootless containers, but if you run those (quite easy with Podman) then your ownership woes with the volume mount are taken care of for you regardless of what UID you have on the host. The rootful container switch to a non-root user can be supported too, it's just more work and has some potential for bugs (not that I'd really want to trust random custom nodes using a root user in a rootful container, it should be secure for the most part these days but I'd still feel iffy about that risk).

Regarding your review feedback on docs, AFAIK once you use
ENV wise I don't think the image itself needs much there for a user to think about. Possibly
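To illustrate the rootless point (a hedged sketch: --userns=keep-id is Podman's flag for mapping your host UID into the container so volume ownership lines up; paths and image tag as in the examples above):

```bash
# Rootless Podman run; files created in ./models end up owned by your host user.
podman run --rm -p 8188:8188 --userns=keep-id \
  -v ./models:/app/models \
  ghcr.io/comfyanonymous/comfyui:latest-cpu
```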
@polarathene - I took that kind of approach in the container that I recently built. You can take a look here if interested:
@ravensorb thanks for the reference. That's a bit tidier than some I've seen elsewhere, but for an official upstream PR here that'd be far too much to get approval. The community can always extend the official image, but without getting that first step it'll be spread out with various third-party attempts 😓 I'd love more context on the additional apt deps you've installed, if you know what those were needed to support. I do have some feedback though from a quick glance.
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Build and push
        uses: docker/build-push-action@v5
Suggested change:
        uses: docker/build-push-action@v6
Hey, I have gone through the whole PR and it has been very helpful. I have created a fork of ComfyUI to add Docker support on the official page, but currently it only supports Nvidia GPUs; since I don't have access to an AMD card I can't really test my code for AMD. Please take a look at my fork and let me know what you think and any problems that my fork has or might face. Thank you.
Found a slight issue with using this as an image. I think that it's related to the pip cache not being a volume? I installed a few modules, but they broke after a power failure. I don't know if I had ever restarted the container after bringing it up initially. Had to exec into the image and manually reinstall missing dependencies. Honestly haven't done much debugging of it, so can't say for certain, but it might be something that must be a volume to work properly rather than an optional volume. Someone with more knowledge of this system may be able to tell me where I've gone wrong. Didn't think about this PR at the time, so didn't keep the logs as I brought the server back up.
@kayakyakr the cache is not included in the image, because the purpose of the cache is to speed up subsequent installs, and you aren't expected to install anything into a completed build. You'd better share your environment (which packages) and the error.
Based on ComfyUI PR comfyanonymous#530
Co-Authored-By: ZacharyACoon
Add a GitHub Action that builds a docker image on every push and tag, and publishes it to GitHub container repository, and -- if a docker username is defined in the repository secrets -- to the Docker Hub.
I got side-tracked while writing up this rather lengthy review.
I didn't quite finish it as I wanted to do some extra investigation on my end and summarize all of this for you. To avoid losing all this information thus far, I'm submitting the review in its current state as I think it's only persisted so long through the browser cache 😓
I don't know if I'll have time until perhaps later in the month to respond to much engagement (if any) on the review. Once I've cleared away some existing work and can refresh over this review again I'll be happy to continue, or spin up a separate PR with my feedback applied.
Hope it's somewhat helpful should anyone read through it 😎
ARG PYTORCH_INSTALL_ARGS=""
ARG EXTRA_ARGS=""
ARG USERNAME=comfyui
ARG USER_UID=1000
The default used by official DB images (redis, postgres, mongo, mysql, etc) is 999 for both UID and GID.
I understand the intention here is for rootful containers to default to a more convenient ownership and/or UID of processes that will run in the container, so that they write to storage with that ownership...
However, should the user install some third-party code, or some other vulnerability be exploited to enable a container escape, this would be similar to escaping as root (assuming the convenience is on user facing systems rather than servers, where the environment is often configured to allow running docker CLI commands without requiring credentials).
As such I'd discourage this default choice. You also shouldn't need to depend on it for building an image.
The official Docker DB images tend to have an entrypoint script with this logic to correct ownership mismatch as the container root user to the non-root user that it switches to internally. That only runs when the container is provided with a --user (Docker CLI) / user: (compose.yaml) option to override the container user.
That referenced entrypoint script then sets a umask since for databases, it's wiser to restrict access and this will ensure any new files or folders created at runtime are not accessible to group/other users, only the file owner.
Finally the entrypoint uses exec to execute a command and replace the current shell process with that (which ensures signals are properly forwarded). There's some other parts in the script to support a custom command / CMD from the user when it's only options, and an ENV they have for alternatively providing those too (like you're already doing in the current Dockerfile iteration).
Introducing an ENTRYPOINT script was suggested in my previous review feedback, with reluctance expressed regarding it adding friction to the review process.
Not handling this sort of thing properly though may result in support requests / complaints that instead has maintainers drop official image support. So I'm proposing one with references and context to provide confidence in it. There should be little need to modify it going forward.
Reference

For context, here's where these were implemented/discussed in the official docker images:

- find + chown: Only chown things in the entrypoint that are not already owned by redis redis/docker-library-redis#166
- setpriv: Allow arbitrary --user values redis/docker-library-redis#48 (gosu was used originally, Valkey's image changed to setpriv due to security concerns, considered as equivalent in functionality, but there is a difference regarding HOME ENV IIRC, setpriv will not change that)
- umask: Insecure permissions for dump.rdb redis/docker-library-redis#305
- exec is important when a container inits via a shell script (vs direct binary call as ENTRYPOINT)
- PID 1 importance with exec + ENTRYPOINT script
If you need anything explained about the bash syntax below, let me know. I have revised it a little with improvements that should help make it easier to grok / maintain.
NOTE: In Debian /bin/sh links to /usr/bin/dash, not bash. In Alpine images you'd get a link to ash (which lacks some bash features like arrays), and other glibc base images may link to bash (Fedora I think), so be sure to use shebangs that explicitly call for bash and consider it as an explicit SHELL instruction in the Dockerfile if potentially using bash in RUN / CMD (without an ENTRYPOINT).
./docker/entrypoint.sh:
#!/usr/bin/env bash
set -e

# Optional, for convenience:
# If the user provided a custom command that starts with a dash,
# then treat the command as args/options to append to the default CMD:
if [[ "${1#-}" != "${1}" ]]; then
  set -- comfyui-start "${@}"
fi

export XDG_DATA_HOME="${XDG_DATA_HOME:-/service/comfyui/data}"
# Workaround until resolved: https://github.com/comfyanonymous/ComfyUI/issues/8110
if [[ ! -d "${XDG_DATA_HOME}/custom_nodes" ]]; then
  mkdir -p "${XDG_DATA_HOME}/custom_nodes"
fi

# Optional, for security (non-root switch) + convenience (chown):
# Support `docker run --user` by only running `chown` when using the default CMD + USER (root):
# 1. Ensure the data directory has the expected ownership of the non-root user (`comfyui` user).
# 2. Switch to the `comfyui` user + group via `setpriv` which will re-run this entrypoint script.
# NOTE: `setpriv` will change the UID/GID to non-root, but will not adjust ENV like HOME.
#       If a service requires such, it must be set explicitly prior to starting that service.
if [[ "${1}" == 'comfyui-start' && "$(id -u)" == '0' && "${PG_UID}" != '0' ]]; then
  echo 'Switching to non-root user'
  # Assign ownership to the comfyui user's data directory (used as ComfyUI `--base-directory`):
  find "${XDG_DATA_HOME}" -not -user comfyui -exec chown comfyui:comfyui '{}' +
  # Assign ownership to the VENV (this is large, switching will be slow)
  #find "/app/venv" -not -user comfyui -exec chown comfyui:comfyui '{}' +
  # Update HOME env for pip to be able to write cache (required for ComfyUI-Manager)
  export HOME='/home/comfyui'
  # Run this entrypoint script again (with any additional args provided), but as the `comfyui` user:
  exec setpriv --reuid=comfyui --regid=comfyui --clear-groups -- "${0}" "${@}"
fi

# Optional, for extra security:
# When `umask` is the default (0022),
# ensure only the owner can access files created by the process at runtime:
if [[ "$(umask)" == '0022' ]]; then
  umask 0077
fi

# Run the intended command, replacing the active process (this shell) as PID 1:
exec "${@}"
./docker/start.sh: (could technically exist as part of the entrypoint, but separating responsibility is better)
#!/usr/bin/env bash
# Support for persisting Python venv to a volume mount:
# https://github.com/comfyanonymous/ComfyUI/pull/1469#issuecomment-2034975575
# - Sync the containers internal venv + correct the absolute path in `pyvenv.cfg`
# - The PATH env gives preference to the custom location when present
#
# NOTE: rsync will copy over any updates from the source path overwriting the
# equivalent at the destination path. Any other files at the destination that
# don't exist at the source path will be left as-is.
if [[ -d "${VIRTUAL_ENV_CUSTOM}" ]]; then
  echo "Syncing '${VIRTUAL_ENV}/' to '${VIRTUAL_ENV_CUSTOM}/'"
  rsync -a "${VIRTUAL_ENV}/" "${VIRTUAL_ENV_CUSTOM}/"
  sed -i "s!${VIRTUAL_ENV}!${VIRTUAL_ENV_CUSTOM}!g" "${VIRTUAL_ENV_CUSTOM}/pyvenv.cfg"
fi

# Implicitly prepend `--cpu` to the args when running the CPU image variant:
if [[ "${IMAGE_VARIANT}" == 'cpu' ]]; then
  set -- '--cpu' "${@}"
fi

exec python -u /opt/comfyui/app/main.py \
  --listen "${COMFYUI_ADDRESS}" \
  --port "${COMFYUI_PORT}" \
  --base-directory "${XDG_DATA_HOME}" \
  ${COMFYUI_EXTRA_ARGS} ${@}
- -P flag removed from rsync, there's like 30k files in the venv at first sync, that's a considerable amount of noise to output, just go with a single line to stdout to communicate any delay instead 😅
- The python -u could drop the -u for ENV PYTHONUNBUFFERED=1 in the Dockerfile which might be better should the container try to use python at any other point (such as a running container with a user shelling in and troubleshooting or running their own python scripts, would also work with python shebangs).
- I still don't really see the point of the --listen + --port options with ENV set to the defaults ComfyUI already uses... users could just do docker run image-name --listen 127.0.0.1 --port 80 (or equivalent command: in compose.yaml), if they need more flexibility via ENV they could manage that themselves, the image itself doesn't really need to support that 🤷‍♂️
# syntax=docker/dockerfile:1
# Default to CPU:
ARG VARIANT="cpu"
ARG PYTORCH_INSTALL_ARGS="--index-url https://download.pytorch.org/whl/${VARIANT}"
ARG PYTHON_VERSION=3.12
FROM python:${PYTHON_VERSION}-slim
# Fail fast on errors or unset variables
SHELL ["/usr/bin/bash", "-eux", "-o", "pipefail", "-c"]
ARG DEBIAN_FRONTEND=noninteractive
RUN <<HEREDOC
# Create comfyui user + group (reserving a fixed ID before any packages install):
# NOTE: Debian specific `adduser` (more convenient than the alternative `useradd`)
# - `--system` intentionally ensures no login shell or password (system user)
# - `--home` will create the home dir (system users default to /nonexistent)
# - `--group` creates a group with the same user and ID (aborts if already taken)
adduser --system --home /home/comfyui --uid 999 --group comfyui
apt-get update
apt-get install -y --no-install-recommends \
git \
git-lfs \
rsync \
fonts-recommended
HEREDOC
ARG XDG_CACHE_HOME=/cache
# This ENV is used by Python tooling, normally it's set by activating a venv:
ENV VIRTUAL_ENV=/app/venv
# This ENV isn't standard, but will be used by the start script (and the PATH below) to prefer a persisted venv:
ENV VIRTUAL_ENV_CUSTOM=/app/custom_venv
# Run python from venv (prefer the custom venv when available)
ENV PATH="${VIRTUAL_ENV_CUSTOM}/bin:${VIRTUAL_ENV}/bin:${PATH}"
ARG PYTORCH_INSTALL_ARGS
# Create a virtual environment + install base packages:
RUN --mount=type=cache,target=/cache <<HEREDOC
python -m venv "${VIRTUAL_ENV}"
pip install torch torchvision torchaudio ${PYTORCH_INSTALL_ARGS}
HEREDOC
# Separate layer to not invalidate the much larger torch layer,
# This layer will invalidate when requirements.txt is updated.
RUN --mount=type=cache,target=/cache \
--mount=type=bind,source=requirements.txt,target=requirements.txt \
<<HEREDOC
pip install -r requirements.txt
# Optional: Prevents a concern with installing custom nodes,
# that introduce an incompatible cv2 dependency:
# https://github.com/comfyanonymous/ComfyUI/pull/1469#issuecomment-2571452051
pip install opencv-python-headless
HEREDOC
# Copy over everything from the git clone, except files matching `.dockerignore` patterns:
COPY . /opt/comfyui/app/
# Workaround due to the top-level file organization of the repo:
# This places the scripts in the PATH and adjusts filenames to look more like commands
RUN <<HEREDOC
cd /opt/comfyui/app
mv docker/start.sh /usr/local/bin/comfyui-start
mv docker/entrypoint.sh /usr/local/bin/comfyui-entrypoint
rm -rf docker/
chmod +x /usr/local/bin/comfyui-*
HEREDOC
ARG VARIANT
ENV IMAGE_VARIANT="${VARIANT}"
# Default environment variables used by `comfyui-start`:
ENV COMFYUI_ADDRESS=0.0.0.0
ENV COMFYUI_PORT=8188
ENV COMFYUI_EXTRA_ARGS=""
ENTRYPOINT ["/usr/local/bin/comfyui-entrypoint"]
CMD ["comfyui-start"]
- Removed PIP_CACHE_DIR=/cache/pip, pip will respect XDG_CACHE_HOME. You don't need to make the cache dir upfront either, it'll do that when it's missing.
- Switched XDG_CACHE_HOME from ENV to ARG, it's only needed for the cache mount during build, afterwards it makes more sense to change the location. I previously highlighted ENV => ARG in prior review feedback.
- Removed WORKDIR, no apparent need for it. It was used for avoiding more explicit/absolute paths with COPY and calling main.py.
- I am inclined to drop the ENV for COMFYUI_ADDRESS + COMFYUI_PORT as per prior review feedback, there's little value in it. You shouldn't be adding them to the image by default for "potential" specific requirements... it is far better to have feature requests for those so they're known instead of unknown and can be documented accordingly (even if that's just via git blame history tying a PR to an issue).
Could additionally:
    volumes:
      - "./models:/app/models"
      - "./input:/app/input"
      - "./temp:/app/output/temp"
      - "./output:/app/output"
      - "./user:/app/user"
      - "./custom_venv:/app/custom_venv"
      - "./custom_nodes:/app/custom_nodes"
It'd be ideal to minimize these volume mounts, one for the ComfyUI state itself, and probably best to keep another separate for the custom venv.
We can use ComfyUI's own support for these various locations, but I don't think base_path + is_default is sufficient alone? (ComfyUI Manager has some support of its own)
comfyui:
  base_path: /home/comfyui/data
  is_default: true
  input: input/
  user: user/
  checkpoints: models/checkpoints/
  clip: models/clip/
  clip_vision: models/clip_vision/
  configs: models/configs/
  controlnet: models/controlnet/
  diffusion_models: |
    models/diffusion_models
    models/unet
  embeddings: models/embeddings/
  loras: models/loras/
  upscale_models: models/upscale_models/
  vae: models/vae/
Even with that, ComfyUI refuses to start if it can't create its own default directories like /user (crash) or /input (error). I was curious about running the app install with it left as root owned, despite the switch to a non-root user. Technically a bug on ComfyUI's end there I think, when new default paths are configured.

Resolved by creating those missing directories in advance.

is_default: true support arrived with the 0.2.3 release (Oct 2024).

UPDATE: The CLI arg --base-directory (is this really not documented anywhere else?) can be used instead of this verbose YAML config.

- It works better than the base setting above since all paths will actually default to that location (the models/ subpath folders retain that prefix due to folder_paths.py).
- --base-directory is fairly new (v0.3.13, released Jan 30 2025).

NOTE: ComfyUI Manager 3.0 will leverage the ComfyUI user/ directory for config and generated files at <USER_DIRECTORY>/default/ComfyUI-Manager/. There's also a snapshot feature, but it states that it's reliant on git for full support. There's some overlap here with comfy CLI (which is presumably an official wrapper to ComfyUI Manager and maybe cm-cli?) 🤔
Another concern with the custom venv syncing is when the image upgrades to a new version of Python, it'll sync over the new site packages folder with PyTorch deps, but you would not have the equivalent for anything third-party, so caution would be needed when upgrading?
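To illustrate why (a sketch, assuming the custom venv lives at /app/custom_venv as in this PR; the version directories are examples only):

```bash
# A venv keeps packages in per-version site-packages directories, so after a Python
# bump the old directory is silently ignored by the new interpreter:
ls /app/custom_venv/lib/
# python3.11/  python3.12/
```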
    ports:
      - "8188:8188"
This makes it more explicit that it's only accessible from the host itself. The default is 0.0.0.0, which to less experienced users will publish the port to public interfaces, which firewalls like UFW (last I checked) will not prevent, since Docker manages iptables directly and its rules have precedence. When firewalld is used instead, this scopes to the docker zone IIRC, so it differs a bit.
Suggested change:
    ports:
      - "127.0.0.1:8188:8188"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
Just for reference, is it assumed the user has done any required setup to support this?
It's simplified for you if you're using Docker Desktop and Nvidia, but otherwise on Linux IIRC there's a little bit of extra setup.
CDI is where OCI spec is adopting this config more broadly, although Intel and AMD haven't quite got there at creating the necessary host files. Nvidia supports it well IIRC but might still require a bit more work than the snippet chosen here at present.
Intel/AMD is a bit more involved IIRC; since you're building ROCm images, without guidance here users will likely report issues to project maintainers about this. You'd want to either clarify the lack of support for those image variants or provide guidance on how to configure that.
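For reference, a hedged sketch of the Linux host-side setup being alluded to for Nvidia (commands from the NVIDIA Container Toolkit tooling; installing the toolkit package itself varies by distro):

```bash
# Register the nvidia runtime with Docker (after installing nvidia-container-toolkit):
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Or the CDI route, generating a device spec that OCI runtimes can reference:
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```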
Regarding AMD/ROCm, I have seen this issue where the user reports a compose config with related config to support that. From that and others I've seen in the past, the required ENV varies by GPU, so generic guidance for AMD hardware is probably a bit difficult to offer, I guess it depends how much of that is specific to a container deployment when it comes to the burden of support from maintainers.
Probably will improve once AMD has better support for CDI 🤔 (I think that can remove the need for specific ENV for example as relevant config is all generated on the host, only needing to provide a single line to the container runtime to reference it in a standardized manner)
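A hedged example of the sort of per-GPU config that issue describes (the /dev/kfd and /dev/dri device mounts are the usual ROCm container requirements; HSA_OVERRIDE_GFX_VERSION and its value are GPU-specific, and the image tag here is purely a placeholder):

```bash
docker run -d -p 8188:8188 \
  --device=/dev/kfd --device=/dev/dri \
  -e HSA_OVERRIDE_GFX_VERSION=10.3.0 \
  ghcr.io/comfyanonymous/comfyui:latest-rocm
```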
user: "1000:1000" | ||
build: . |
user shouldn't be necessary. It doesn't make much sense with your current iteration vs just using USER in the Dockerfile to set that, since you're tying it to build-time anyway.

build is fine, but you could add image to refer to the nvidia image that would be published. That'll be grabbed instead of the build then. You may want to add the related build-arg if you think it's worth including build in the compose.yaml, but there's not much advantage to that if the user doesn't intend to build regularly vs just running the equivalent CLI command.

NOTE: Mixing image + build would have any build triggered replace the image locally, which may be a subtle source of bug reports (similar to relying on latest tags that become stale over time without an explicit pull and are prone to breaking changes).

FWIW: Without image, Compose will build the image and tag it based on the project + service name (so ComfyUI/comfyui:latest). Generally though I think a user interested in this compose file would prefer the registry image over a local build.
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          build-args: |
            PYTORCH_INSTALL_ARGS=${{ matrix.pytorch_install_args }}
            EXTRA_ARGS=${{ matrix.extra_args }}
          cache-from: type=gha,scope=${{ github.ref_name }}-${{ matrix.id }}
          cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-${{ matrix.id }}
Better to indicate the source of what implicit labels are being configured?:
Suggested change:
      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          # Defaults:
          # https://github.com/docker/metadata-action/blob/8e1d5461f02b7886d3c1a774bfbd873650445aa2/src/meta.ts#L509-L522
          labels: ${{ steps.meta.outputs.labels }}
          build-args: |
            VARIANT=${{ matrix.variant }}
          cache-from: type=gha,scope=${{ github.ref_name }}-${{ matrix.id }}
          cache-to: type=gha,mode=max,scope=${{ github.ref_name }}-${{ matrix.id }}
Cache wise I can't comment atm as I'm not that familiar with the type=gha one yet. My Dockerfile suggestions should ensure layers could be shared up until the PyTorch package install, but I don't know if there's much size in that to be worthwhile trying to share across variants cache 😅 (I wouldn't want to complicate maintenance further)
Reference

These are the defaults configured, mostly sourced from the repository metadata API (docker/metadata-action acquires it here, and this is the equivalent response for this repo):
const res: Array<string> = [
  `org.opencontainers.image.title=${this.repo.name || ''}`,
  `org.opencontainers.image.description=${this.repo.description || ''}`,
  `org.opencontainers.image.url=${this.repo.html_url || ''}`,
  `org.opencontainers.image.source=${this.repo.html_url || ''}`,
  `org.opencontainers.image.version=${this.version.main || ''}`,
  `org.opencontainers.image.created=${this.date.toISOString()}`,
  `org.opencontainers.image.revision=${this.context.sha || ''}`,
  `org.opencontainers.image.licenses=${this.repo.license?.spdx_id || ''}`
];
- this.version.main is set to the parsed version. For type=edge that'll be the branch name, and type=semver that'd be the release tag (with v prefix stripped), along with any prefix/suffix. So overriding that may be better if it's only meant to represent the ComfyUI release itself, not a variant like the tag.
- The created label will default to the workflow run timestamp (ISO 8601 timestamp), but the action itself provides {{ date }} and {{ commit_date }} expressions. Otherwise you'd manually do such like I've done in the past:
steps:
  - name: "Get current timestamp (ISO 8601)"
    id: build-info
    run: echo "date=$(date --utc --iso-8601=seconds)" >> "${GITHUB_OUTPUT}"
  # ...
  # docker/metadata-action step:
      labels: |
        org.opencontainers.image.created=${{ steps.build-info.outputs.date }}
Rough equivalent defaults as explicit labels

So roughly explicit definitions of these using context expressions (this is roughed out and not verified, github.event.repository should mostly be reliable across workflow types IIRC 😅):
labels: |
  org.opencontainers.image.title=${{ github.event.repository.name }}
  org.opencontainers.image.description=${{ github.event.repository.description }}
  org.opencontainers.image.url=${{ github.event.repository.html_url }}
  org.opencontainers.image.source=${{ github.event.repository.html_url }}
  org.opencontainers.image.version=${{ github.ref_type == 'tag' && github.ref_name || 'edge' }}
  org.opencontainers.image.created={{ date }}
  org.opencontainers.image.revision=${{ github.sha }}
  org.opencontainers.image.licenses=${{ github.event.repository.license?.spdx_id }}
- revision is using the github.sha context, which IIRC is tied to the commit SHA triggering the workflow event, which might not always align with the actual git clone commit being built, care should be taken there 😅
- created is relying on the {{ date }} placeholder from the action. {{ commit_date }} may be more relevant if it should reflect the date of the source (to align with the revision?). Depends if it's more useful to know what time the container itself was built. OCI spec states it should reflect the date the image was built, similar to some other fields regarding documentation and urls.
- version applies my earlier suggestion to use the tag when present or edge as a fallback (instead of the branch name itself). This excludes any prefix or suffix for variants that is used in the actual image tags.
- licenses may not work, I'm not sure if ?. is supported
- No || '' fallback needed, as a null (missing) value should already be treated as ''.
.dockerignore (Outdated)
.git
__pycache__/
*.py[cod]
input
models
notebooks
output
I think that every docker user knows that containers do not persist data if not instructed otherwise thus update should be handled in other ways (e.g. docker tags)
Just want to point out that a VOLUME instruction in an image will implicitly create an anonymous volume. Docker doesn't presently have a way to opt out of that like Podman does.

It can be problematic with Docker Compose (where the intent is to persist that volume across image updates, and it does so by associating the volume to the compose service name rather than the image itself), hence an entirely different image with the same VOLUME path will get that same mount. That can be unexpected, such as with a service name db and switching from MongoDB to Redis, which both share /data internally.

It'd also not be wise to use that with data that is necessary for the image to function (such as the venv); if that were on the same internal volume that were updated with image updates, for example, you'd implicitly get failures when the old venv of your VOLUME is no longer compatible.
Generally you make an immutable image and have volumes to persist state. That works well for a typical deployment.
Then there is the alternative, where you have a project that's say NodeJS or Python based. Instead of building a custom image with everything baked in, you just use the official image (and any extras that it may require system package wise), and you run that with the volume mount to install and persist your project deps. No git clone or anything within the image.
I've done the latter before for projects serving thousands of users from a single server. The image just provides the environment, the project is version controlled with git, and there's a bit of manual work (or script automation) to push out an update. Specialized images are kinda redundant/wasteful for that purpose.
This type of approach seems to align with what is being done here with reliance upon third-party customization / deps. The volume mount isn't just application state, it's messing with deps and that's going to have some interplay with the Python and PyTorch versions should you use the image to upgrade.
What @TomK32 describes would be a very light image that doesn't have much point in being published or upgraded after install? That's easier to maintain and would effectively be a documented / committed Dockerfile that you build locally instead.

Publishing an image with the different permutations (CPU + GPU toolkits) and image updates with git source baked in makes sense when there's less runtime modification going on. It's meant to be convenient/useful, but when you're no longer persisting state there's going to be more risk of breakage should you expect to rely on updates via image upgrades 🤷‍♂️ (which from the sounds of it will be a concern when adding stuff like ComfyUI Manager)

ComfyUI Manager could instead be bundled if it's the popular choice, but that doesn't quite solve the issue given how syncing a venv would be done, nor the fact that this addition still installs more deps. With the Comfy CLI tool, you could at least extend this image as a base and install deps in a deterministic manner. You could then just rebuild that to avoid all these problems. A little inconvenient for users that install via UI, if there's no way to export a list (ideally one would be managed/updated as a file without user interaction required).
.*
!.git
__pycache__/
*.py[cod]
input
models
notebooks
output
Prior review feedback of mine to consider.
This ignores the standard folders input, models, output being copied over. models has one folder with some actual content in it (last update was 2 years ago, so I assume it's not important?). Other than that, the folders are expected to exist at run time, or ComfyUI attempts to create them (and would fail if it lacked the correct permissions).
EDIT: Non-issue as it seems better to prefer providing the --base-directory arg like I've mentioned in my review feedback for the compose.yaml volumes config.
# Fail fast on errors or unset variables
SHELL ["/bin/bash", "-eux", "-o", "pipefail", "-c"]
While it won't affect this image build, should anyone extend it with ComfyUI Manager, note that presently cm-cli would not be reliable to fail early.

UPDATE:
- That issue is resolved, but requires an explicit opt-in for proper exit behaviour.
- comfy-cli has not yet adopted support for it (and maintenance seems to be lagging, along with being prone to breakage from transitive deps, such that comfy-cli might cause more friction than value).
name: Build and publish Docker images
Proposed workflow change (Rough example, I've ran out of time for now to refine and test):
name: 'Build and Publish Docker Images'

on:
  workflow_dispatch:
  push:
    branches:
      - master
    tags:
      - 'v*.*.*'

permissions:
  contents: read
  packages: write

jobs:
  docker:
    name: 'Deploy Docker Images'
    uses: comfyanonymous/ComfyUI/.github/workflows/generic_publish.yml@master
    with:
      image-variant: CUDA
      project-name: comfyui
      revision: ${{ github.sha }}
      source: https://github.com/${{ github.repository }}
      version: ${{ github.ref_type == 'tag' && github.ref_name || 'edge' }}
    secrets: inherit
generic_publish.yml:
name: 'Build and Publish the Docker Image'

on:
  workflow_call:
    inputs:
      image-variant:
        required: false
        type: string
      project-name:
        required: true
        type: string
      revision:
        required: true
        type: string
      source:
        required: true
        type: string
      version:
        required: true
        type: string

permissions:
  contents: read
  packages: write

jobs:
  publish-images:
    name: 'Build and Publish'
    runs-on: ubuntu-latest
    steps:
      - name: 'Checkout'
        uses: actions/checkout@v4

      - name: 'Find the latest tag (sorted via SemVer)'
        id: tag
        env:
          REPO_URL: ${{ github.event.repository.html_url }}
          TAG_FILTER: '^refs/tags/v[0-9]+\.[0-9]+\.[0-9]+$'
        run: |
          TAG=$(
            git ls-remote --exit-code --refs --tags --sort='-v:refname' "${REPO_URL}" \
              | awk '{print $2}' | grep -E ${TAG_FILTER} | head -n 1
          )
          echo "/${TAG}"

      # This step will support building non-native platform archs via emulation (slow)
      # Skip this when you can compile for that platform natively, or cross-compile (Zig, Go)
      - name: 'Set up QEMU'
        uses: docker/setup-qemu-action@v3
        with:
          platforms: arm64

      - name: 'Set up Docker Buildx'
        uses: docker/setup-buildx-action@v3

      # The metadata action appends labels and sets the image names + tags to publish.
      # Tags for the image to be published, based on the following rules:
      # https://github.com/docker/metadata-action?tab=readme-ov-file#typeedge
      # https://github.com/docker/metadata-action?tab=readme-ov-file#typesemver
      # Updates to master branch will publish an image with the `:edge` tag (useful between official releases)
      # Pushing a git tag will publish an image with all 3 semver tags
      #
      # NOTE: Publishing the semver tags like this is useful for software that monitors an image
      # for updates only by a digest change to the image tag at the registry, like WatchTower does:
      # https://github.com/containrrr/watchtower/issues/1802
      #
      # NOTE: By default the `:latest` image tag will be updated for the `semver` type below,
      # you may want to have more control over that if your release policy could publish a patch
      # release to an older major/minor release (as this would unintentionally update the latest tag too):
      # https://github.com/docker/metadata-action?tab=readme-ov-file#latest-tag
      - name: 'Prepare image tags'
        id: docker-metadata
        uses: docker/metadata-action@v5
        with:
          images: |
            docker.io/comfyanonymous/${{ inputs.project-name }}
            ghcr.io/comfyanonymous/${{ inputs.project-name }}
          tags: |
            type=edge,branch=master
            # Remove the enable condition once v0 releases are no longer published:
            # https://github.com/docker/metadata-action/tree/master#major-version-zero
            type=semver,pattern={{major}},enable=${{ !startsWith(github.ref, 'refs/tags/v0.') }}
            type=semver,pattern={{major}}.{{minor}}
            type=semver,pattern={{major}}.{{minor}}.{{patch}}
          # https://specs.opencontainers.org/image-spec/annotations/#pre-defined-annotation-keys
          labels: |
            org.opencontainers.image.title=ComfyUI for ${{ inputs.image-variant }}
            org.opencontainers.image.description=${{ github.event.repository.description }}
            # Project + organization info:
            org.opencontainers.image.vendor=Comfy Org
            org.opencontainers.image.url=https://www.comfy.org/
            org.opencontainers.image.documentation=https://docs.comfy.org/
            org.opencontainers.image.authors=https://www.comfy.org/about
            org.opencontainers.image.licenses=GPL-3.0
            # Build info:
            org.opencontainers.image.source=${{ github.event.repository.html_url }}
            org.opencontainers.image.revision=${{ github.sha }}
            org.opencontainers.image.created={{ date }}
            org.opencontainers.image.version=${{ github.ref_type == 'tag' && github.ref_name || 'edge' }}

      # Publish to image registries (GHCR and DockerHub):
      - name: 'Login to GitHub Container Registry'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # Requires a maintainer to configure credentials as secrets, unlike GHCR:
      - name: 'Login to DockerHub'
        uses: docker/login-action@v3
        with:
          registry: docker.io
          username: ${{ secrets.DOCKER_USERNAME }}
          password: ${{ secrets.DOCKER_PASSWORD }}

      # Build the image from a Dockerfile for the requested platforms,
      # then publish a multi-platform image at the registries with the tags and labels declared above
      - name: 'Build and publish images'
        uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          labels: ${{ steps.docker-metadata.outputs.labels }}
          tags: ${{ steps.docker-metadata.outputs.tags }}
          # annotations: ${{ steps.meta.outputs.labels }}
          # Customize any Dockerfile ARG values here:
          build-args: |
            VARIANT=${{ inputs.image-variant }}
          # Prefer OCI media types and OCI artifacts (can remove this line once these become default)
          outputs: type=image,oci-mediatypes=true,oci-artifact=true
          # Disable provenance attestation (avoid publishing `unknown/unknown` platforms):
          # https://docs.docker.com/build/attestations/slsa-provenance/
          # https://github.com/docker-mailserver/docker-mailserver/issues/3582#issuecomment-2716365953
          provenance: false
The first action snippet would need to be adapted to using a matrix config like you've got here already. My WIP iteration at the time was a little different:
jobs:
  docker:
    strategy:
      # https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#jobsjob_idstrategyfail-fast
      fail-fast: false
      matrix:
        # compute_platform: [CPU, CUDA, ROCm]
        include:
          - compute-platform: CPU
            pytorch-index: cpu
          - compute-platform: CUDA
            pytorch-index: cu124
            version: 12.4
          - compute-platform: ROCm
            pytorch-index: rocm6.2
            version: '6.2'
I currently don't have the resources to keep running ComfyUI and to keep working on this. Is anyone interested in taking this over? @polarathene, you have put a lot of thought and work into your reviews, would you be willing to continue on this? Any feedback from an official maintainer would also be welcome, to gauge whether this has generally any chance of being merged.
Did you move on to an alternative project?
Yes I will contribute a PR with my own feedback applied when time permits.
I have a small PR open to fix a minor bug where the comfy init code panics if a custom nodes directory doesn't exist, yet that's received no feedback. I am doubtful of a PR for this going through review any time soon unless there is sufficient demand to warrant the maintainers engaging. We can remain hopeful I guess, but I think for now the maintainers have higher priorities for where their time is spent.
I have stopped playing with local image generators, especially because my hardware was always a bit underpowered. More importantly though, work is keeping me busy on several fronts.
No worries, I understand. I don't have much time to spend with projects like these myself but I like to try stay informed of where AI tooling is going. I have very little disk space for these kind of images, so I'm slowly learning more about building images for CUDA and trying to get more confident in compatibility/performance concerns. These are a fair bit more daunting than my typical

I'll comment here again with a link to a new PR once I've managed to tackle that, might be sometime next month. In the meantime perhaps leave this PR open for visibility? Then I can close it once my own PR is up. If someone else wants to tackle a new PR by all means chime in here, otherwise I'll get to it when I can.
If anyone finds it helpful, I have a small repo here doing automatic builds of Docker images for ComfyUI: https://github.com/TheRealPSV/comfyui-docker
TL;DR
Building on the great work of ZacharyACoon, I'm providing a Dockerfile that is a bit less complex and more aligned with common best practices, and makes builds more cachable.
Additionally, this PR contains a GitHub Actions workflow that builds the image in five different PyTorch flavours (cu118, cu121, rocm5.6, rocm5.7-nightly, cpu), and pushes them to the GitHub Container Registry (works out of the box) and optionally to Docker Hub (requires a username + token configured in action secrets).
You can see the action in action (no pun intended) in my fork: https://github.com/oxc/ComfyUI/actions/workflows/docker.yml
Dockerfile
The first commit adds the Dockerfile. Since you mentioned that you are not too familiar with those, and are unsure about maintaining it, I will go through the commands in the file and give some additional explanations. Hopefully that makes the decision easier.
The Dockerfile is used by docker to build the docker image. It basically is the recipe used to cook a final image. The image itself can then be uploaded to a remote registry, and from there everybody can pull it to their local registry to simply run it in a container.
The docker-compose.yml file can make it easier to run the docker image using docker compose, but is not strictly a requirement.
Expand here to see the commented Dockerfile
# syntax=docker/dockerfile:1.4
Specifying this syntax directive allows us to use certain features like RUN --mount=type=cache.
ARG at the top of the file defines build-args that can be set when building the docker image and used in FROM statements.
FROM ${BASE_IMAGE}
FROM specifies the base image that we are building our image on. This Dockerfile uses a python image based on Debian by default, selecting a python version that is supported by PyTorch.
I personally have not seen making the base image configurable as a build-arg, but it does not really hurt, so I decided to keep it. It might be useful for people who are building their own images, and just want to use a different base image that provides a few more installed tools, for example.
ARG after a FROM define that this build stage uses the mentioned args. These args also include default values, which can be overridden on the command line.
RUN executes the given shell command. These commands install git in the container, and create the user/group which will be running comfyui.
USER defines the user/group as which to run the given commands.
WORKDIR /app
WORKDIR defines that all following commands will run in the context of the given directory in the container. You can think of it a bit like cd.
Create the transformers cache directory that we will be setting later as env.
Run python to create a virtual environment with the venv module.
Set the PATH so that pip and python will be run in the context of our venv.
Now install the PyTorch dependencies, passing the PYTORCH_INSTALL_ARGS so that users can select which PyTorch index/platform they want to use. This will be passed by the GitHub Action.
Now we COPY the requirements.txt to the container, and run pip install on it.
We are using RUN --mount=type=cache to mount a cache directory for our pip cache that is managed by the docker runtime and shared between builds.
Why are we only copying the requirements file? Here you need to know that docker tries to cache all commands when building an image, but as soon as one command fails to use the cached version, all further commands have to be re-run also. For this reason, it helps to do the more stable stuff earlier in the image, and the more frequently changing things later.
Since copying the program code of ComfyUI will change with every commit, copying it here would invalidate all caches for requirements etc. We need the requirements file however, so we copy only that. As long as it does not change, the result of the pip install can be cached. One can always run the build command with --no-cache to force a re-run.
You can read more about the cache here: https://docs.docker.com/build/cache/
COPY --chown=${USER_UID}:${USER_GID} . .
Now, as a last step, we copy the whole program code into the container image.
Finally, we define some more environment variables that can be overridden, and define the CMD that will be run when the container is launched.
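Tying the walkthrough together, a hedged example of a local build that exercises those build-args (--no-cache forces the re-run mentioned above; the ROCm index URL is just one of the PyTorch flavours listed earlier, and the tag is an example):

```bash
docker build --no-cache \
  --build-arg PYTORCH_INSTALL_ARGS="--index-url https://download.pytorch.org/whl/rocm5.6" \
  -t comfyui:rocm5.6 .
```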
GitHub Action
The GitHub action spawns several jobs in a matrix, one for each supported PyTorch flavor (cu118, cu121, rocm5.6, rocm5.7-nightly, cpu).
For each flavor, it builds the container image, and pushes it to the GitHub Container Registry. This works out of the box with the information provided by GitHub Actions.
If a DOCKERHUB_USERNAME and a DOCKERHUB_TOKEN secret are configured on the GitHub repository, the action will also login and push to Docker Hub. Here is the image the action on my fork created: https://hub.docker.com/r/obeliks/comfyui

The workflow uses several actions provided by Docker, one for generating image metadata from the GitHub context, one for logging in, and one for building and pushing. Ultimately, the workflow has been pieced together from the following examples (and adjusted accordingly):
I have opted not to add extra complexity to persist the pip cache in GitHub Actions cache. In most cases, the persisted layer cache will suffice. Take a look at this run that uses cached layers, compared to this one that creates most of them from scratch. The downloading of those large layers that contain pytorch wheels still takes some time, but is much faster than doing the full build.
Note that it still makes sense to have those pip caches in a cache mount, because it is useful for local building, and should prevent the cache from ending up in the image without having to erase it explicitly.