Description
Required Info:
- AWS ParallelCluster version [e.g. 3.1.1]: 3.11.1, 3.13.2
- Full cluster configuration without any credentials or personal data.
Region: us-west-2
Image:
  Name: slurm-rocky-test
  Tags:
    - Key: Name
      Value: slurm-rocky-test
  RootVolume:
    Size: 50
    Encrypted: yes
    KmsKeyId: arn:aws:kms:us-west-2:171496337684:key/4b63e407-423e-496f-b937-fd5ca6421fc4
Build:
  InstanceType: m5.4xlarge
  ParentImage: ami-022aac693cf236af2
  Tags:
    - Key: purpose
      Value: infrastructure
  SecurityGroupIds:
    - sg-74e42c12
    - sg-be56d4d8
  SubnetId: subnet-00bbd054b223b7501
- Cluster name: N/A
Bug description and how to reproduce:
When running pcluster build-image with a Rocky 9 parent AMI that has Docker CE installed, the image build fails. The Image Builder logs show this error:
2025-08-06T20:07:28.609Z
CmdExecution: Stderr: OS='rocky9'
PLATFORM='RHEL'
if [[ ${PLATFORM} == RHEL ]]; then
  yum -y update krb5-libs
  yum -y groupinstall development && sudo yum -y install wget jq
  if [[ ${OS} != alinux2023 ]]; then
    # Do not install curl on al2023 since curl-minimal-8.5.0-1.amzn2023* is already shipped and conflicts.
    yum -y install curl
  fi
elif [[ ${PLATFORM} == DEBIAN ]]; then
  if [[ "false" == "false" ]]; then
    # disable apt-daily.timer to avoid dpkg lock
    flock $(apt-config shell StateDir Dir::State/d | sed -r "s/.*'(.*)\/?'$/\1/")/daily_lock systemctl disable --now apt-daily.timer apt-daily.service apt-daily-upgrade.timer apt-daily-upgrade.service
    # disable unattended upgrades
    sed "/Update-Package-Lists/s/\"1\"/\"0\"/; /Unattended-Upgrade/s/\"1\"/\"0\"/;" /etc/apt/apt.conf.d/20auto-upgrades > "/etc/apt/apt.conf.d/51pcluster-unattended-upgrades"
  fi
  apt-cache search build-essential
  apt-get clean
  apt-get -y update
  apt-get -y install build-essential curl wget jq
fi
Errors during downloading metadata for repository 'docker-ce-stable':
- Status code: 404 for https://download.docker.com/linux/rhel/9.6/x86_64/stable/repodata/repomd.xml (IP: 3.175.34.7)
Error: Failed to download metadata for repo 'docker-ce-stable': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
Errors during downloading metadata for repository 'docker-ce-stable':
- Status code: 404 for https://download.docker.com/linux/rhel/9.6/x86_64/stable/repodata/repomd.xml (IP: 3.175.34.15)
Error: Failed to download metadata for repo 'docker-ce-stable': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
Errors during downloading metadata for repository 'docker-ce-stable':
- Status code: 404 for https://download.docker.com/linux/rhel/9.6/x86_64/stable/repodata/repomd.xml (IP: 3.175.34.116)
Error: Failed to download metadata for repo 'docker-ce-stable': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
You can in fact verify that https://download.docker.com/linux/rhel/9.6/x86_64/stable/repodata/repomd.xml will 404. This is because the URL is expected to be https://download.docker.com/linux/rhel/9/x86_64/stable/repodata/repomd.xml for all RHEL 9 releases.
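To make the substitution concrete, here is a minimal sketch of how the repo URL changes with the value of releasever. The `expand_baseurl` helper is hypothetical and only mimics the variable substitution yum/dnf performs; the baseurl template is an assumption based on the URLs in the errors above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: mimics how yum/dnf substitutes $releasever and
# $basearch in a repo baseurl. It is not part of dnf or ParallelCluster.
expand_baseurl() {
    local template="$1" releasever="$2" basearch="$3"
    local url="${template//\$releasever/${releasever}}"
    url="${url//\$basearch/${basearch}}"
    printf '%s\n' "${url}"
}

# Assumed baseurl template, inferred from the failing URLs in the log.
TEMPLATE='https://download.docker.com/linux/rhel/$releasever/$basearch/stable'

# With the major version only, the path Docker actually publishes:
expand_baseurl "${TEMPLATE}" 9 x86_64
# -> https://download.docker.com/linux/rhel/9/x86_64/stable

# With releasever set to the minor release, the path that returns 404:
expand_baseurl "${TEMPLATE}" 9.6 x86_64
# -> https://download.docker.com/linux/rhel/9.6/x86_64/stable
```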
It looks like the URL changes due to this section earlier in the image builder steps:
2025-08-06T20:07:20.212Z
CmdExecution: Stderr: OS='rocky9'
PLATFORM='RHEL'
KERNEL_VERSION=$(uname -a)
RELEASE_VERSION='9.6'
if [[ ${PLATFORM} == RHEL ]]; then
  if [[ ${OS} == rhel9 ]] || [[ ${OS} == rocky9 ]]; then
    if [[ ! -f /etc/yum/vars/releasever ]]; then
      echo "yes" > /opt/parallelcluster/pin_releasesever
      echo ${RELEASE_VERSION} > /etc/yum/vars/releasever
      yum clean all
    fi
  fi
  PACKAGE_LIST="kernel-headers-$(uname -r) kernel-devel-$(uname -r)"
  if [[ ${OS} != "rocky8" ]] && [[ ${OS} != "rhel8" ]]; then
    PACKAGE_LIST+=" kernel-devel-matched-$(uname -r)"
  fi
  if [[ ${OS} == "rocky8" ]] || [[ ${OS} == "rocky9" ]] ; then
    for PACKAGE in ${PACKAGE_LIST}
    do
      yum install -y ${PACKAGE}
      if [ $? -ne 0 ]; then
        # Enable vault repository
        sed -i 's|^#baseurl=http://dl.rockylinux.org/$contentdir|baseurl=http://dl.rockylinux.org/vault/rocky|g' /etc/yum.repos.d/*.repo
        sed -i 's|^#baseurl=https://dl.rockylinux.org/$contentdir|baseurl=https://dl.rockylinux.org/vault/rocky|g' /etc/yum.repos.d/*.repo
        yum install -y ${PACKAGE}
      fi
    done
  else
    for PACKAGE in ${PACKAGE_LIST}
    do
      yum -y install ${PACKAGE}
    done
  fi
  yum install -y yum-plugin-versionlock
  # listing all the packages because wildcard does not work as expected
  yum versionlock kernel kernel-core kernel-modules
  if [[ ${OS} == "rocky8" ]] || [[ ${OS} == "rocky9" ]] ; then
    yum versionlock rocky-release rocky-repos
  elif [[ ${OS} == "rhel8" ]] || [[ ${OS} == "rhel9" ]] ; then
    yum versionlock redhat-release
  fi
else
  apt-get -y install linux-headers-$(uname -r)
  apt-mark hold linux-aws* linux-base* linux-headers* linux-image*
fi
(This section of the log also shows the same Docker yum errors, but there they appear to be ignored.)
I don't know how common it is to overwrite /etc/yum/vars/releasever like this, but it is problematic at least for Docker CE, which is a pretty widely used tool.
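One possible workaround on the parent AMI, sketched below, is to hard-code the major version in the Docker repo file so it no longer depends on releasever. This is a sketch under assumptions, not a verified fix: the `pin_repo_major_version` function is hypothetical, and it assumes the repo file references the literal string `$releasever` in its baseurl. The demo runs against a temporary file rather than a real repo file.

```shell
#!/usr/bin/env bash
# Hypothetical workaround: replace the literal $releasever in a repo file
# with a fixed major version, so a pinned minor release (e.g. 9.6) is not
# substituted into the URL. Uses GNU sed in-place editing.
pin_repo_major_version() {
    local repo_file="$1" major="$2"
    sed -i 's|\$releasever|'"${major}"'|g' "${repo_file}"
}

# Demo on a temporary file with an assumed repo definition.
demo_file="$(mktemp)"
printf '%s\n' \
    '[docker-ce-stable]' \
    'baseurl=https://download.docker.com/linux/rhel/$releasever/$basearch/stable' \
    > "${demo_file}"

pin_repo_major_version "${demo_file}" 9
cat "${demo_file}"
rm -f "${demo_file}"
```

On a real image the call would target the Docker repo file, for example `pin_repo_major_version /etc/yum.repos.d/docker-ce.repo 9` (the path is an assumption). Separately, the logged script only writes /etc/yum/vars/releasever when that file does not already exist, so pre-creating it in the parent AMI might also avoid the override; that is untested here.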