Commit d603b70

Attempt to detect QEMU hangs
When building cross-platform images with Docker, QEMU is often used under the hood and can have a bug that causes forked processes to deadlock.

Before spawning workers we test for that bug.

Fix: #495
Closes: #497

Co-Authored-By: Sarun Rattanasiri <[email protected]>
1 parent 5e87800 commit d603b70

2 files changed: +42 -2 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -1,5 +1,7 @@
 # Unreleased

+* Attempt to detect a QEMU bug that can cause `bootsnap precompile` to hang forever when building ARM64 docker images
+  from x86_64 machines. See #495.
 * Improve CLI to detect cgroup CPU limits and avoid spawning too many worker processes.

 # 1.18.4

lib/bootsnap/cli/worker_pool.rb

Lines changed: 40 additions & 2 deletions
@@ -2,6 +2,7 @@

 require "etc"
 require "rbconfig"
+require "io/wait" unless IO.method_defined?(:wait_readable)

 module Bootsnap
   class CLI
@@ -17,12 +18,19 @@ def create(size:, jobs:)
         end

         def default_size
-          size = [Etc.nprocessors, cpu_quota || 0].min
+          nprocessors = Etc.nprocessors
+          size = [nprocessors, cpu_quota || nprocessors].min
           case size
           when 0, 1
             0
           else
-            size
+            if fork_defunct?
+              $stderr.puts "warning: faulty fork(2) detected, probably in cross platform docker builds. " \
+                "Disabling parallel compilation."
+              0
+            else
+              size
+            end
           end
         end
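
A note on the sizing change in this hunk: with the old expression, a `nil` quota (no cgroup limit) falls back to `0`, so the minimum is always `0` and the `case` selects the non-parallel path. The sketch below isolates the two expressions to show the difference. `old_size` and `new_size` are hypothetical helper names used only for illustration; they are not part of bootsnap.

```ruby
# Hypothetical helpers mirroring the removed and added expressions in the hunk.
def old_size(nprocessors, cpu_quota)
  # A nil quota becomes 0, which always wins the min.
  [nprocessors, cpu_quota || 0].min
end

def new_size(nprocessors, cpu_quota)
  # A nil quota now means "no limit", so the processor count is used.
  [nprocessors, cpu_quota || nprocessors].min
end

old_size(8, nil) # => 0
new_size(8, nil) # => 8
old_size(8, 2.0) # => 2.0
new_size(8, 2.0) # => 2.0
```

When a quota is present both versions agree; only the unlimited case changes.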

@@ -32,6 +40,7 @@ def cpu_quota
               # cgroups v2: https://docs.kernel.org/admin-guide/cgroup-v2.html#cpu-interface-files
               cpu_max = File.read("/sys/fs/cgroup/cpu.max")
               return nil if cpu_max.start_with?("max ") # no limit
+
               max, period = cpu_max.split.map(&:to_f)
               max / period
             elsif File.exist?("/sys/fs/cgroup/cpu,cpuacct/cpu.cfs_quota_us")
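
For readers unfamiliar with the cgroups v2 `cpu.max` file read in this hunk: it holds a quota and a period separated by a space, with the literal `max` as the quota when there is no limit. The sketch below applies the same parsing steps to a string instead of the real file. `parse_cpu_max` is a hypothetical name introduced here, and the sample contents are made-up values.

```ruby
# Hypothetical helper: same parsing steps as the hunk, applied to a string.
def parse_cpu_max(contents)
  return nil if contents.start_with?("max ") # no limit

  max, period = contents.split.map(&:to_f)
  max / period
end

parse_cpu_max("max 100000\n")    # => nil
parse_cpu_max("200000 100000\n") # => 2.0
parse_cpu_max("50000 100000\n")  # => 0.5
```

The result is a number of CPUs as a float, which `default_size` then compares against the processor count.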
@@ -40,11 +49,40 @@ def cpu_quota
               # If the cpu.cfs_quota_us is -1, cgroup does not adhere to any CPU time restrictions
               # https://docs.kernel.org/scheduler/sched-bwc.html#management
               return nil if max <= 0
+
               period = File.read("/sys/fs/cgroup/cpu,cpuacct/cpu.cfs_period_us").to_f
               max / period
             end
           end
         end
+
+        def fork_defunct?
+          return true unless ::Process.respond_to?(:fork)
+
+          # Ref: https://github.com/Shopify/bootsnap/issues/495
+          # The second forked process will hang on some QEMU environments
+          r, w = IO.pipe
+          pids = 2.times.map do
+            ::Process.fork do
+              exit!(true)
+            end
+          end
+          w.close
+          r.wait_readable(1) # Wait at most 1s
+
+          defunct = false
+
+          pids.each do |pid|
+            _pid, status = ::Process.wait2(pid, ::Process::WNOHANG)
+            if status.nil? # Didn't exit in 1s
+              defunct = true
+              Process.kill(:KILL, pid)
+              ::Process.wait2(pid)
+            end
+          end
+
+          defunct
+        end
       end

       class Inline
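
The probe added above relies on a pipe as a cheap "all children are gone" signal: each forked child inherits the write end, so once the parent closes its own copy, the read end only reaches EOF (and becomes readable) after every child has exited. The following is a minimal standalone sketch of that technique, not the bootsnap implementation. The helper name `children_exit_in_time?`, its keyword arguments, and the optional block that lets a child misbehave on purpose are all assumptions made for illustration. It requires a platform where `Process.fork` is available.

```ruby
require "io/wait" unless IO.method_defined?(:wait_readable)

# Hypothetical helper: forks `count` children and reports whether all of them
# had exited by the time the pipe signalled EOF or `timeout` seconds elapsed.
# An optional block runs inside each child before it exits.
def children_exit_in_time?(count: 2, timeout: 1)
  r, w = IO.pipe
  pids = Array.new(count) do
    Process.fork do
      yield if block_given?
      exit!(true)
    end
  end
  w.close                  # drop the parent's copy of the write end
  r.wait_readable(timeout) # returns early on EOF, i.e. once all children exited

  all_exited = true
  pids.each do |pid|
    _pid, status = Process.wait2(pid, Process::WNOHANG)
    next if status # child already reaped

    all_exited = false
    Process.kill(:KILL, pid)
    Process.wait2(pid) # reap the killed child
  end
  all_exited
ensure
  r&.close
end

# A child that sleeps past the timeout stands in for a hung fork.
children_exit_in_time?(timeout: 0.2) { sleep 10 } # => false
```

Waiting on the pipe instead of sleeping for a fixed second keeps the healthy case fast, since the parent wakes as soon as the last child is gone.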