Closed
Changes from all commits — 395 commits
8af4472
Improve error message for `torch.fft.ihfft2` when input's dtype is co…
shink Jun 3, 2025
d375e64
[cutlass backend][forward fix] hex the cutlass key instead of decode …
henrylhtsang Jun 2, 2025
b86aaaa
Revert "[dynamo][guards] Flush cache to more accurately measure guard…
pytorchmergebot Jun 3, 2025
a7e496a
Revert "[dynamo] Record the pre-graph bytecode using fast record func…
pytorchmergebot Jun 3, 2025
13044b2
Move c10/macros/Export.h to torch/standalone (#154850)
desertfire Jun 3, 2025
d91c85b
[c10d][fr] Split cuda and non-cuda fr logic into two cpp file (#154929)
fduwjj Jun 3, 2025
a4da1d4
[Graph Partition] support standalone_compile (#154698)
BoyuanFeng Jun 3, 2025
ea5b9ec
Combine sticky pgo key with job id (#154863)
bobrenjc93 Jun 2, 2025
71a0af8
[TEST][Quantization] Skip test_learnable due to hypothesis (#152819)
Aidyn-A Jun 3, 2025
635b73e
[dynamo][guards] Flush cache to more accurately measure guard overhea…
anijain2305 Jun 3, 2025
00dfd38
[Tiling rewrite pt1] Normalize reads and writes to common iter space …
eellison Jun 2, 2025
9cdce68
[MPS][BE] Reimplement log1p as Metal shader (#154936)
malfet Jun 3, 2025
e9266f8
[BE] Use vendored packaging for testing (#154946)
malfet Jun 3, 2025
0adbde4
Analyze coalesced mem (#153730)
eellison Jun 2, 2025
812deec
Add option to define OpenBLAS version for manylinux Dockerfile_2_28_a…
davsva01 Jun 3, 2025
2608927
Solve for tilings (#153748)
eellison Jun 2, 2025
a1a268a
[dtensor] fix simplefsdp mixed-precision training bugs (#154975)
ruisizhang123 Jun 3, 2025
3685b10
Turn on compile with NVSHMEM (#154538)
kwen2501 May 29, 2025
250e9af
Removing per torch.compile audit. (#154572)
AlannaBurke Jun 3, 2025
283f876
[PP] Fix disabled flaky tests (#154856)
H-Huang Jun 2, 2025
ff92b42
[c10d][gloo] Integrate vendor generic FR into gloo (#152614)
fduwjj Jun 3, 2025
1f131fe
Update bug-report.yml (#154857)
albanD Jun 3, 2025
4014297
Add type annotation to orthogonal_ (#154927)
cora-codes Jun 3, 2025
31405a6
[typing] Add missing type annotations to torch.nn.init module (#154504)
janumiko Jun 3, 2025
f714599
[MPS][BE] Extend torch.special. to integer dtypes (#155002)
malfet Jun 3, 2025
462579a
Update merge_rules.yaml (#155008)
svekars Jun 3, 2025
4672350
[AOTDispatch] Use the proper meta function for `_amp_foreach_non_fini…
StrongerXi Jun 3, 2025
6f7694f
[dynamo] Reconstruct defaultdict properly (#154931)
StrongerXi Jun 3, 2025
e8183f8
add #pragma once to stable/library.h (#154920)
janeyx99 Jun 3, 2025
c014e9d
[inductor][test] test_padding.py: use inductor TestCase instead of dy…
davidberard98 Jun 3, 2025
85fb13d
[BE] Cleanup cuda 12.4 artifacts from scripts and workflows (#154893)
atalman Jun 3, 2025
ea7b233
[flex attention][triton pin] triton_helpers shim for TMA apis (#154858)
davidberard98 Jun 2, 2025
cc96feb
[dynamo] Mark a vt unspecialized nn module variable source earlier (#…
anijain2305 Jun 3, 2025
10c3e6e
[inductor][dynamo] Include operator name in size/stride/alignment ass…
karthickai Jun 3, 2025
debd095
Avoid index integer overflow in gemm_notrans_ (#154809)
cyyever Jun 3, 2025
8e1474d
[inductor] small cleanups in torch/_inductor/codegen/mps.py (#154921)
swolchok Jun 2, 2025
69a57d9
add JSON output support for operator benchmark (#154410)
LifengWang Jun 3, 2025
55873dc
[1/3] Add header file for Graph in nativert (#154530)
yiming0416 Jun 3, 2025
b4c399d
[2/3] Add source file for Graph in nativert (#154531)
yiming0416 Jun 3, 2025
71499fe
[3/3] Add build rule and test for Graph in nativert (#154532)
yiming0416 Jun 3, 2025
5735729
[Cutlass] Cleanup gemm_template evt handling (#154775)
mlazos Jun 2, 2025
1c2b9ce
[Cutlass] Support bias arg for fp8 GEMM (#154761)
mlazos Jun 2, 2025
36596ad
[Cutlass] fp8 dynamic shapes test (#154829)
mlazos Jun 3, 2025
4224a7d
[Cutlass] EVT dynamic shapes support (#154835)
mlazos Jun 3, 2025
6c40e66
[Inductor] Add attention pattern for model DistilBert in transformers…
etaf Jun 3, 2025
8f0e3f4
[Inductor UT] Reuse test_fused_attention.py for Intel GPU. (#154110)
etaf Jun 3, 2025
cbdacd3
[AOTI][Intel GPU] Support multi_arch_kernel_binary option for XPU. (#…
etaf Jun 3, 2025
6c2f941
[dynamo][dynamic] Recompilation hint for nn module integer attributes…
anijain2305 Jun 3, 2025
40a8770
Incorporate coalesce analysis in codegen (#153751)
eellison Jun 3, 2025
50de6ae
Revert "[BE][Ez]: Fully type nn.utils.clip_grad (#154801)"
pytorchmergebot Jun 4, 2025
d8e4c1c
[BE] Define `REGISTER_UNARY_TI_DISPATCH` (#155081)
malfet Jun 4, 2025
e276054
[PT] expose FlightRecord API for building (#154866)
tianfengfrank Jun 4, 2025
34e3930
fix numpy compatibility for 2d small list indices (#154806)
ngimel Jun 4, 2025
37e6bf8
Switch to _apply_to_tensors for dataclass input (#154897)
mori360 Jun 4, 2025
3e57de1
[ONNX] Create support for rotary embeddings (#154745)
justinchuby Jun 4, 2025
437df54
[Inductor] Fix a few FX conversion bugs. (#154958)
blaine-rister Jun 4, 2025
9f39028
[MPS][BE] Move sigmoid op to Metal (#155080)
malfet Jun 4, 2025
6b0c6f2
[BE] Delete pre-CUDA-10.1 code from SparseCUDABlas (#155079)
malfet Jun 4, 2025
72fe1d5
Add randint_like tensor overload for high (#154899)
bobrenjc93 Jun 3, 2025
ec35a36
[ROCm][Windows] Fix building tests for multiple architectures (#154979)
tvukovic-amd Jun 4, 2025
4d93985
[c10d] Separate monitoring thread into a class in PGNCCL (#153977)
fduwjj Jun 3, 2025
7b07434
[Intel GPU] Support f32 intermediate dtype, headdim size <=576 and f3…
LuFinch Jun 4, 2025
75b24c2
Export `torch::utils::tensor_to_numpy` (#154178)
shink Jun 4, 2025
d2bfd97
[export] Refactor pt2 save/load (#152495)
angelayi Jun 4, 2025
1e20745
[ez][AOTI] Fix index offset for Optional Tensor Return (#155073)
yiming0416 Jun 4, 2025
0f10df7
[Intel GPU] Make SDPA output has the same stride as Query. (#154340)
LuFinch Jun 4, 2025
2af78d3
Skip another test file that doesn't run gradcheck for slow gradcheck …
soulitzer Jun 4, 2025
31d12b3
Fix avg_pool2d param kernel_size description (#154353)
zeshengzong Jun 4, 2025
ca0c298
[ONNX] Allow exporter to export SDPA to Attention onnx operator (#154…
kkhode Jun 4, 2025
cf9cad3
Add __main__ guards to tests (#154716)
AnthonyBarbier Jun 4, 2025
c8d44a2
Add __main__ guards to fx tests (#154715)
AnthonyBarbier Jun 4, 2025
3f34d26
Add __main__ guards to distributed tests (#154628)
AnthonyBarbier Jun 4, 2025
1a55fb0
Add __main__ guards to jit tests (#154725)
AnthonyBarbier Jun 4, 2025
a0f2544
Revert "[dynamo][dynamic] Recompilation hint for nn module integer at…
pytorchmergebot Jun 4, 2025
a99a01a
Revert "[dynamo] Mark a vt unspecialized nn module variable source ea…
pytorchmergebot Jun 4, 2025
3ce5102
[ROCm] fix CI failures from inductor periodic (#154896)
jeffdaily Jun 4, 2025
3fa3dbd
Revert "[Cutlass] EVT dynamic shapes support (#154835)"
pytorchmergebot Jun 4, 2025
6f93ce3
Revert "[Cutlass] fp8 dynamic shapes test (#154829)"
pytorchmergebot Jun 4, 2025
2091267
Revert "Add __main__ guards to jit tests (#154725)"
pytorchmergebot Jun 4, 2025
aed938f
Enable check_gomp for Ubuntu OSes (#155119)
malfet Jun 4, 2025
8f08f90
Bump pillow from 10.0.1 to 10.3.0 in /.github/requirements (#154416)
dependabot[bot] Jun 4, 2025
4405dc1
Revert "Always set CPU affinity for benchmark jobs (#154569)"
pytorchmergebot Jun 4, 2025
e9c31fb
[torch.compile] handle a custom __delattr__ method correctly (#150899)
SandishKumarHN Jun 4, 2025
b3e666a
[easy] Bump STATIC_CUDA_LAUNCHER_VERSION=1 (#154861)
jamesjwu Jun 2, 2025
34c6371
Add NVSHMEM to PYTORCH_EXTRA_INSTALL_REQUIREMENTS (#154568)
kwen2501 May 29, 2025
9eb7e67
[PT2][memory] correct wait tensor output size (#153569)
xuanzhang816 May 17, 2025
9567168
[c10d][gloo] Enable using c10::Half for gloo (#153862)
fduwjj Jun 4, 2025
1d67849
[AOTInductor] Activate CPU test for package and update weights (#155078)
yushangdi Jun 4, 2025
1970803
[AOTI] Extend torchgen to generate C shim with version number (#147745)
desertfire Jun 2, 2025
681a818
[dynamo] [1/3] updated gbid mapping for initial registry creation (#1…
Sidharth123-cpu Jun 4, 2025
6c8241c
[dynamo] [2/3] added add_new_gb_type functionality (#154886)
Sidharth123-cpu Jun 4, 2025
4d57644
Fix incorrect get_default_qat_qconfig in prepare_qat_fx docs. (#155100)
b-koopman Jun 4, 2025
e5afbe3
Inductor logging + analysis of torch.profile (#149697)
exclamaforte Jun 4, 2025
0404785
[dynamo] [3/3] added cmd_update_gb_type which supports updating an ex…
Sidharth123-cpu Jun 4, 2025
b084e1b
[HOP] Rework Autograd DispatchKey for scan and map (#153336)
bohnstingl Jun 4, 2025
65a5eb8
Fix for ambiguity in linalg.norm()'s ord argument of +2 & -2 (#155148)
ANotFox Jun 4, 2025
c8566a0
[export] Use patching in test (#155132)
angelayi Jun 4, 2025
6f23ca5
[dynamo] sample gb_registry json file for website testing purposes (#…
Sidharth123-cpu Jun 4, 2025
671553b
Update documentation wording for transformer-related layers (#155123)
mikaylagawarecki Jun 4, 2025
f5e2e4c
[Inductor] Include math and torch in launcher scope (#154673)
muchulee8 Jun 4, 2025
992be94
[MPS][BE] Better error messages (#155150)
malfet Jun 4, 2025
c881f2d
[reland][dynamo] Mark a vt unspecialized nn module variable source ea…
anijain2305 Jun 4, 2025
7cf5b36
Release GIL in PG destructor (#154976)
tushar00jain Jun 4, 2025
1083bc7
[Memory Snapshot] Add Flag to Toggle Global and Local Callbacks for A…
sraikund16 Jun 4, 2025
51b4c51
add missing check for caching triton template caching (#154891)
laithsakka Jun 4, 2025
b0a2ca6
support more prologue functions in generated templates cache (#154892)
laithsakka Jun 4, 2025
fb5a787
[HOP] Added clone for outputs of create_bw_fn that are aliasing the i…
bohnstingl Jun 4, 2025
d5f6422
Add CPython generator/contextlib tests (#150796)
guilhermeleobas May 29, 2025
21f45f7
Add CPython int/float tests (#150795)
guilhermeleobas May 29, 2025
3398d1d
support bmm and mm_plus_mm in generated templates cache (#154904)
laithsakka Jun 4, 2025
642687a
[MPS][BE] Some refactor in preparation for 64-bit iterators (#155178)
malfet Jun 5, 2025
5e03433
Revert "Inductor logging + analysis of torch.profile (#149697)"
pytorchmergebot Jun 5, 2025
a01bb9d
[CI][CUDA] Re-enable the test-nan-assert on CUDA12 (#154448)
nWEIdia Jun 5, 2025
b9312c5
SDPA support gfx950 (#155103)
xw285cornell Jun 5, 2025
450180f
[c10d][fr] Add the log of thread name and thread id into fr (#155142)
fduwjj Jun 4, 2025
fa63de0
Handle empty linemaps in PyCodeCache (#155064)
jamesjwu Jun 4, 2025
80703ca
[FlexAttention] Allow dispatch to SAC for flex (#150080)
drisspg Jun 3, 2025
5130ac6
Revert "Add randint_like tensor overload for high (#154899)"
pytorchmergebot Jun 5, 2025
93012d2
Revert "[forward fix] add support for MemoryFormat after type tighten…
pytorchmergebot Jun 5, 2025
1c82878
[test][dynamo] skip test_deopt_from_append_list on python>=3.13.3 (#1…
davidberard98 Jun 4, 2025
bb43ced
Inductor unit tests: cuda 12.6 -> 12.8 (#155056)
davidberard98 Jun 4, 2025
d3c8f36
Revert "[Intel GPU] Make SDPA output has the same stride as Query. (#…
pytorchmergebot Jun 5, 2025
be16f21
[Graph Partition] add symints to get_graph_inputs (#154679)
BoyuanFeng Jun 5, 2025
bee9c70
[reland][dynamo] Record the pre-graph bytecode using fast record func…
anijain2305 Jun 5, 2025
5b65628
Workflow to tag trunk commits with `trunk/{commit-sha}` tags (#155170)
izaitsevfb Jun 5, 2025
fa3c38c
Add tensor overlap check for `cross` (#154999)
zeshengzong Jun 5, 2025
9a4c08d
[MPS] Parametrize `test_scaled_dot_product_attention_autocast` (#155005)
hvaara Jun 5, 2025
f60b271
Revert "Inductor unit tests: cuda 12.6 -> 12.8 (#155056)"
pytorchmergebot Jun 5, 2025
523b637
Revert "[test][dynamo] skip test_deopt_from_append_list on python>=3.…
pytorchmergebot Jun 5, 2025
3c72b9f
Revert "SDPA support gfx950 (#155103)"
pytorchmergebot Jun 5, 2025
e01fde8
Revert "[reland][dynamo] Record the pre-graph bytecode using fast rec…
pytorchmergebot Jun 5, 2025
196c95d
Add dont constant fold flag (#154945)
shiyang-weng Jun 5, 2025
a1057cd
Revert "Add CPython generator/contextlib tests (#150796)"
pytorchmergebot Jun 5, 2025
9bf6593
Fix docstring for `torch.UntypedStorage.from_file` (#155067)
kiersten-stokes Jun 5, 2025
ed661a5
[MPS] Fix complex scalar binding to Metal tensors (#155184)
malfet Jun 5, 2025
7999735
[CUDA][MPS] Fix torch.arange bound validation for large float inputs …
narekmalk Jun 5, 2025
2f3f833
[BE] Document device memory apis in correct module (#155126)
janeyx99 Jun 4, 2025
e895e96
Update docs build to specify <3.13 in CONTRIBUTING (#155140)
janeyx99 Jun 4, 2025
cd361fc
[CI] Migrate focal (ubuntu 20.04) images to jammy (ubuntu 22.04) (#15…
atalman Jun 5, 2025
a14f427
[BE] Update cudnn to 9.10.1.4 (#155122)
Skylion007 Jun 5, 2025
13ea0f2
[dynamo][dynamic] Recompilation hint for nn module integer attributes…
anijain2305 Jun 4, 2025
cadcb5d
[inductor] disable compiler on the compiled_module_main (#155169)
anijain2305 Jun 5, 2025
be2ab96
Inductor unit tests: cuda 12.6 -> 12.8 (#155056)
davidberard98 Jun 5, 2025
2481c4b
[cutlass backend] add teraflops and increase rep for benchmark script…
henrylhtsang Jun 4, 2025
a3098a7
Add pinned numpy and fix build (#155129)
albanD Jun 5, 2025
05dd638
Revert "Add dont constant fold flag (#154945)"
pytorchmergebot Jun 5, 2025
dd41a39
[MPS] Fix unary/binary ops for 2**32+ elem tensors (#155183)
malfet Jun 5, 2025
5e93abe
Address docs for clip_grad functions (#155125)
jbschlosser Jun 4, 2025
c8c892b
[scan] disable functionalization key in backward tracing (#154343)
ydwu4 Jun 4, 2025
606d73b
Adding from_node for nodes in gm.module() (#155053)
yushangdi Jun 5, 2025
5911f87
[Cutlass] fp8 dynamic shapes test (#154829)
mlazos Jun 5, 2025
9a42f01
[Cutlass] EVT dynamic shapes support (#154835)
mlazos Jun 5, 2025
a85ad55
[ROCm][Windows] Fix offload gpu arch list in tests (#155212)
tvukovic-amd Jun 5, 2025
7dcc77e
Turn on new tiling by default (#154768)
eellison Jun 5, 2025
0827464
Replace runtime type parameterization (#155221)
eellison Jun 5, 2025
0a092c7
Enable CPP Extension Open Registration tests on Arm (#144774)
murste01 Jun 5, 2025
e1180c7
Add Intel GPU info collection to the collect env script (#137846)
jingxu10 Jun 5, 2025
9e88d6c
[ROCm] manywheel missing hipsparselt deps (#155254)
jeffdaily Jun 5, 2025
fa705f7
[BE] minor refactor + some comments on behavior (#154695)
nmacchioni Jun 2, 2025
26f066b
Add AOTI model name config (#154129)
yushangdi Jun 5, 2025
9bae2fc
[profiler] Enable all configured activities in CUPTI Range profiler m…
briancoutinho Jun 5, 2025
72453a6
[PT2][comms] put `visualize_overlap` in a try-except block (#155222)
xuanzhang816 Jun 5, 2025
28796f7
Redo D75092426: [internal] Expose additional metadata to compilation …
xmfan Jun 5, 2025
0db3e0c
Revert "Add Intel GPU info collection to the collect env script (#137…
pytorchmergebot Jun 6, 2025
d3d64c6
Revert "Add pinned numpy and fix build (#155129)"
pytorchmergebot Jun 6, 2025
c6b4f98
Add Intel GPU info collection to the collect env script (#137846)
jingxu10 Jun 6, 2025
e694280
Custom FX pass for inductor's backend registration (#154841)
marpioch Jun 6, 2025
07da8a4
[CI] fix xpu-smi hang issue on some xpu runners (#155194)
chuanqi129 Jun 6, 2025
9d59b51
Make device check throw specific error (#155085)
zeshengzong Jun 6, 2025
36a722e
[typo] Fix 'intialize' -> 'initialize' in proxy_tensor.py (#155301)
nakanoh Jun 6, 2025
9656251
Revert "[BE] Update cudnn to 9.10.1.4 (#155122)"
pytorchmergebot Jun 6, 2025
271ca67
[reland][dynamo] Record the pre-graph bytecode using fast record func…
anijain2305 Jun 6, 2025
58e5d20
[BE] Delete IS_SPMM_AVAILABLE() logic (#155296)
malfet Jun 6, 2025
10cef1e
Remove torch XPU ABI=0 build logic for old compiler (#150095)
guangyey Jun 3, 2025
6b1211d
[BE]: Backport runtime_checkable perf improvements/behavior from 3.12…
Skylion007 Jun 6, 2025
907aea0
Add claude local md files (#155299)
drisspg Jun 6, 2025
348fd45
Support detached checkout in tools/nightly.py (#154314)
ezyang May 25, 2025
529e035
[inductor] Add typing to _inductor/ir.py (#149958)
rec Jun 6, 2025
231eb99
[MPS][BE] Extend ndim_and_dtypes to 4 elements (#155272)
malfet Jun 5, 2025
b0fbbef
Revert "Turn on new tiling by default (#154768)"
pytorchmergebot Jun 6, 2025
7e4c097
Revert "[inductor] Add typing to _inductor/ir.py (#149958)"
pytorchmergebot Jun 6, 2025
fc77269
Add randint_like tensor overload for high (#154899)
bobrenjc93 Jun 5, 2025
7ae7c14
Reduce scope of s390x CI (#155208)
AlekseiNikiforovIBM Jun 6, 2025
706bc41
pass mempool arg through emptyCache (#155315)
ngimel Jun 6, 2025
64436c3
[logs] Add autotuning data (#154771)
stashuk-olek Jun 6, 2025
453bc9f
[a2av] 2D all-to-all-vdev (#155058)
kwen2501 Jun 6, 2025
2e2ea72
[Inductor] Support autotuning in the FX backend. (#155049)
blaine-rister Jun 6, 2025
1ccc57e
Log backward no-op to tlparse and pt2 compile events. (#154544)
jovianjaison Jun 6, 2025
749757a
[a2av] Align length of major dimension in output of 2D a2av (#155172)
kwen2501 Jun 6, 2025
067fd0b
[dynamo][cleanup] Simplify disabling of the helper functions on tenso…
anijain2305 Jun 6, 2025
4f5b344
DOC: Convert to markdown: torch.overrides.rst, type_info.rst, utils.r…
loganthomas Jun 6, 2025
0d8c029
[FSDP2] keep root unsharded when not specifying reshard_after_forward…
weifengpy Jun 6, 2025
bc5a11b
[easy][invoke_subgraph] Remove skip from already fixed test (#155286)
anijain2305 Jun 6, 2025
d2a2bfc
Turn on new tiling by default (#154768)
eellison Jun 6, 2025
c95705d
[Docs] Convert to markdown: torch.compiler_troubleshooting_old.rst, t…
kiszk Jun 6, 2025
457dd79
[BE][Ez]: Remove unnecessary accesses of dim vector (#155334)
Skylion007 Jun 6, 2025
cd82096
DOC: Convert to markdown: ddp_comm_hooks.rst, debugging_environment_v…
framoncg Jun 6, 2025
9b4db09
Add C shim for at::pad and fix some typos (#155226)
janeyx99 Jun 6, 2025
83d2225
[BE][Ez]: Improve typing in torch._logging (#155345)
Skylion007 Jun 7, 2025
be2e432
[CI]Update windows runner to windows-2022 (#154368)
abhishek-iitmadras Jun 7, 2025
5596cef
Fix segfault during NumPy string tensor conversion (#155364)
malfet Jun 6, 2025
10cd1de
[ROCm] Make optional features in LoadHIP better conditioned. (#155305)
stellaraccident Jun 7, 2025
81b0b30
[dynamo] constant fold torch.cuda.is_initialized (#155300)
williamwen42 Jun 6, 2025
400f439
[pt][easy] Rename metadata column (#155365)
stashuk-olek Jun 7, 2025
f140fac
[MPS] Implement erfc (#155382)
malfet Jun 7, 2025
30387ab
[ROCm] Adds initialization support for PyTorch when built from ROCm w…
stellaraccident Jun 7, 2025
5fbaa04
SDPA support gfx950 (#155103)
xw285cornell Jun 7, 2025
da1f898
[nativert] move function schema to torch (#154948)
dolpm Jun 7, 2025
386aa72
[BE] Cleanup old ExecuTorch codegen and runtime code (#154165)
larryliu0820 Jun 6, 2025
c1f531f
[Graph Partition] move cpu scalar tensor to gpu (#154464)
BoyuanFeng Jun 7, 2025
0f3f597
[invoke_subgraph] Throw assertion on uncaptured speculate_subgraph (#…
anijain2305 Jun 7, 2025
db49182
[invoke_subgraph] Add logging (#155284)
anijain2305 Jun 7, 2025
694028f
update get_default_device to also respect torch.device ctx manager (#…
kshitij12345 Jun 7, 2025
29e6033
[Break XPU] Fix failed test cases which are introduced by community f…
etaf Jun 6, 2025
456f40c
Add docblock for autotune_cache.py (#155133)
bobrenjc93 Jun 4, 2025
ab56e5a
[CUDA][BUILD] Add back the capability to use env TORCH_CUDA_ARCH_LIST…
nWEIdia Jun 7, 2025
783a4c1
[ROCm] fix nightly wheel, second attempt (#155388)
jeffdaily Jun 7, 2025
f6e18bc
Fix CUDA 12.8 docker tag (#155087)
cyyever Jun 7, 2025
2596e3d
[inductor] use int64 for large index (#154575)
shunting314 Jun 6, 2025
11bc298
Fix some incorrect reST markups in the document (#154831)
Jun 7, 2025
f1f49e5
[CI] remove xfail sm89 job (#155244)
clee2000 Jun 7, 2025
abf4da0
[Profiler] Induce Inductor Import before Profiling (#155243)
sraikund16 Jun 7, 2025
0756ebc
Add docblock to torch/_dynamo/trace_rules.py (#155401)
bobrenjc93 Jun 7, 2025
1339e88
Add docblock to torch/_dynamo/side_effects.py (#155403)
bobrenjc93 Jun 8, 2025
09328eb
Update auto-tuning support for _scaled_grouped_mm (#150944)
alexsamardzic Jun 6, 2025
b981fb6
Add docblock to torch/_dynamo/variables/builtin.py (#155402)
bobrenjc93 Jun 8, 2025
49888e6
[BE] Polish `Makefile` (#155425)
XuehaiPan Jun 8, 2025
27df0c5
Revert "[inductor] use int64 for large index (#154575)"
pytorchmergebot Jun 8, 2025
30293b8
Preserve Enum types during torch.export serialization and deserializa…
narekmalk Jun 8, 2025
95448b2
Revert "[Inductor] Improve typing, and prepare for ABI-compatible AOT…
pytorchmergebot Jun 8, 2025
3d82a1d
Add checks for empty tensor list (#155383)
malfet Jun 7, 2025
d41f62b
Fix/issue #155027 (#155252)
abhinav-TB Jun 8, 2025
2908c10
Document the default garbage_collection_threshold value and improve t…
ParagEkbote Jun 8, 2025
be2ad70
Fix dynamo tracing into AOTAutogradCache results in cpu tensors (#155…
jamesjwu Jun 6, 2025
6fb6293
Revert "Add Intel GPU info collection to the collect env script (#137…
pytorchmergebot Jun 9, 2025
9b4a748
[nativert] Move Weights to PyTorch core (#155156)
yiming0416 Jun 9, 2025
9968c85
[Dynamo] Replace `unimplemented` with `unimplemented_v2` in `torch/_d…
shink Jun 9, 2025
e158486
[1/n]adding torch.distributed.run option to provide destination for e…
aschhabra Jun 9, 2025
79aef14
[ONNX] Set the name of the producing node using the value name (#155413)
justinchuby Jun 9, 2025
b9b84d8
Generate unique id for tensor storage object by observing the week po…
shengfukevin Jun 9, 2025
4a4cac0
Update torch-xpu-ops commit pin (#154962)
CuiYifeng Jun 9, 2025
6c05f2f
[test] use JK to force graph break on slow aliasing/mutation/dynamic_…
bdhirsh Jun 9, 2025
0083032
[aotd] Support mutations in reordering_to_mimic_autograd_engine (#155…
IvanKobzarev Jun 6, 2025
79bdafe
Revert "Custom FX pass for inductor's backend registration (#154841)"
pytorchmergebot Jun 9, 2025
3863bbb
[BE]: Update cusparselt to 0.7.1 (#155232)
Skylion007 Jun 9, 2025
8b72f5e
Build XCCL as default and make XCCL the default distributed backend f…
Chao1Han Jun 3, 2025
e38f914
Update torch/distributed/distributed_c10d.py
Chao1Han Jun 3, 2025
e590a9a
restore register_backend change
Chao1Han Jun 5, 2025
0b47037
Update CMakeLists.txt
guangyey Jun 6, 2025
da27ae5
Update CMakeLists.txt
guangyey Jun 6, 2025
1 change: 1 addition & 0 deletions .ci/aarch64_linux/aarch64_ci_build.sh
@@ -27,6 +27,7 @@ if [ "$DESIRED_CUDA" = "cpu" ]; then
USE_PRIORITIZED_TEXT_FOR_LD=1 python /pytorch/.ci/aarch64_linux/aarch64_wheel_ci_build.py --enable-mkldnn
else
echo "BASE_CUDA_VERSION is set to: $DESIRED_CUDA"
export USE_SYSTEM_NCCL=1
#USE_PRIORITIZED_TEXT_FOR_LD for enable linker script optimization https://github.com/pytorch/pytorch/pull/121975/files
USE_PRIORITIZED_TEXT_FOR_LD=1 python /pytorch/.ci/aarch64_linux/aarch64_wheel_ci_build.py --enable-mkldnn --enable-cuda
fi
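The one-line addition above works because exported variables are inherited by child processes; a minimal sketch (the python invocation stands in for the wheel-build script and is illustrative only):

```shell
# Hedged sketch of why the added `export USE_SYSTEM_NCCL=1` takes effect:
# variables exported in the driving shell script are visible in the
# environment of any child process it launches.
export USE_SYSTEM_NCCL=1
python3 -c 'import os; print(os.environ["USE_SYSTEM_NCCL"])'   # prints 1
```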
2 changes: 1 addition & 1 deletion .ci/docker/almalinux/Dockerfile
@@ -78,7 +78,7 @@ RUN bash ./install_mnist.sh
FROM base as all_cuda
COPY --from=cuda11.8 /usr/local/cuda-11.8 /usr/local/cuda-11.8
COPY --from=cuda12.6 /usr/local/cuda-12.6 /usr/local/cuda-12.6
COPY --from=cuda12.4 /usr/local/cuda-12.8 /usr/local/cuda-12.8
COPY --from=cuda12.8 /usr/local/cuda-12.8 /usr/local/cuda-12.8

# Final step
FROM ${BASE_TARGET} as final
80 changes: 43 additions & 37 deletions .ci/docker/build.sh
@@ -50,30 +50,21 @@ if [[ "$image" == *xla* ]]; then
exit 0
fi

if [[ "$image" == *-focal* ]]; then
UBUNTU_VERSION=20.04
elif [[ "$image" == *-jammy* ]]; then
if [[ "$image" == *-jammy* ]]; then
UBUNTU_VERSION=22.04
elif [[ "$image" == *ubuntu* ]]; then
extract_version_from_image_name ubuntu UBUNTU_VERSION
elif [[ "$image" == *centos* ]]; then
extract_version_from_image_name centos CENTOS_VERSION
fi

if [ -n "${UBUNTU_VERSION}" ]; then
OS="ubuntu"
elif [ -n "${CENTOS_VERSION}" ]; then
OS="centos"
else
echo "Unable to derive operating system base..."
exit 1
fi

DOCKERFILE="${OS}/Dockerfile"
# When using ubuntu - 22.04, start from Ubuntu docker image, instead of nvidia/cuda docker image.
if [[ "$image" == *cuda* && "$UBUNTU_VERSION" != "22.04" ]]; then
DOCKERFILE="${OS}-cuda/Dockerfile"
elif [[ "$image" == *rocm* ]]; then
if [[ "$image" == *rocm* ]]; then
DOCKERFILE="${OS}-rocm/Dockerfile"
elif [[ "$image" == *xpu* ]]; then
DOCKERFILE="${OS}-xpu/Dockerfile"
@@ -98,7 +89,7 @@ tag=$(echo $image | awk -F':' '{print $2}')
# configuration, so we hardcode everything here rather than do it
# from scratch
case "$tag" in
pytorch-linux-focal-cuda12.6-cudnn9-py3-gcc11)
pytorch-linux-jammy-cuda12.6-cudnn9-py3-gcc11)
CUDA_VERSION=12.6.3
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.10
@@ -110,7 +101,7 @@ case "$tag" in
TRITON=yes
;;
pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9-inductor-benchmarks)
CUDA_VERSION=12.8
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=9
@@ -121,7 +112,31 @@ case "$tag" in
TRITON=yes
INDUCTOR_BENCHMARKS=yes
;;
pytorch-linux-focal-cuda12.6-cudnn9-py3-gcc9)
pytorch-linux-jammy-cuda12.8-cudnn9-py3.12-gcc9-inductor-benchmarks)
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.12
GCC_VERSION=9
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
INDUCTOR_BENCHMARKS=yes
;;
pytorch-linux-jammy-cuda12.8-cudnn9-py3.13-gcc9-inductor-benchmarks)
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.13
GCC_VERSION=9
VISION=yes
KATEX=yes
UCX_COMMIT=${_UCX_COMMIT}
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
INDUCTOR_BENCHMARKS=yes
;;
pytorch-linux-jammy-cuda12.6-cudnn9-py3-gcc9)
CUDA_VERSION=12.6.3
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.10
@@ -168,8 +183,8 @@ case "$tag" in
TRITON=yes
INDUCTOR_BENCHMARKS=yes
;;
pytorch-linux-focal-cuda11.8-cudnn9-py3-gcc9)
CUDA_VERSION=11.8.0
pytorch-linux-jammy-cuda12.8-cudnn9-py3-gcc9)
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
ANACONDA_PYTHON_VERSION=3.10
GCC_VERSION=9
@@ -179,25 +194,25 @@ case "$tag" in
UCC_COMMIT=${_UCC_COMMIT}
TRITON=yes
;;
pytorch-linux-focal-py3-clang10-onnx)
pytorch-linux-jammy-py3-clang12-onnx)
ANACONDA_PYTHON_VERSION=3.9
CLANG_VERSION=10
CLANG_VERSION=12
VISION=yes
ONNX=yes
;;
pytorch-linux-focal-py3.9-clang10)
pytorch-linux-jammy-py3.9-clang12)
ANACONDA_PYTHON_VERSION=3.9
CLANG_VERSION=10
CLANG_VERSION=12
VISION=yes
TRITON=yes
;;
pytorch-linux-focal-py3.11-clang10)
pytorch-linux-jammy-py3.11-clang12)
ANACONDA_PYTHON_VERSION=3.11
CLANG_VERSION=10
CLANG_VERSION=12
VISION=yes
TRITON=yes
;;
pytorch-linux-focal-py3.9-gcc9)
pytorch-linux-jammy-py3.9-gcc9)
ANACONDA_PYTHON_VERSION=3.9
GCC_VERSION=9
VISION=yes
@@ -252,9 +267,9 @@ case "$tag" in
DOCS=yes
INDUCTOR_BENCHMARKS=yes
;;
pytorch-linux-jammy-cuda11.8-cudnn9-py3.9-clang12)
pytorch-linux-jammy-cuda12.8-cudnn9-py3.9-clang12)
ANACONDA_PYTHON_VERSION=3.9
CUDA_VERSION=11.8
CUDA_VERSION=12.8.1
CUDNN_VERSION=9
CLANG_VERSION=12
VISION=yes
@@ -303,15 +318,15 @@ case "$tag" in
GCC_VERSION=11
TRITON_CPU=yes
;;
pytorch-linux-focal-linter)
pytorch-linux-jammy-linter)
# TODO: Use 3.9 here because of this issue https://github.com/python/mypy/issues/13627.
# We will need to update mypy version eventually, but that's for another day. The task
# would be to upgrade mypy to 1.0.0 with Python 3.11
PYTHON_VERSION=3.9
;;
pytorch-linux-jammy-cuda11.8-cudnn9-py3.9-linter)
pytorch-linux-jammy-cuda12.8-cudnn9-py3.9-linter)
PYTHON_VERSION=3.9
CUDA_VERSION=11.8
CUDA_VERSION=12.8.1
;;
pytorch-linux-jammy-aarch64-py3.10-gcc11)
ANACONDA_PYTHON_VERSION=3.10
@@ -370,14 +385,6 @@ esac

tmp_tag=$(basename "$(mktemp -u)" | tr '[:upper:]' '[:lower:]')

#when using cudnn version 8 install it separately from cuda
if [[ "$image" == *cuda* && ${OS} == "ubuntu" ]]; then
IMAGE_NAME="nvidia/cuda:${CUDA_VERSION}-cudnn${CUDNN_VERSION}-devel-ubuntu${UBUNTU_VERSION}"
if [[ ${CUDNN_VERSION} == 9 ]]; then
IMAGE_NAME="nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION}"
fi
fi

no_cache_flag=""
progress_flag=""
# Do not use cache and progress=plain when in CI
@@ -394,7 +401,6 @@ docker build \
--build-arg "LLVMDEV=${LLVMDEV:-}" \
--build-arg "VISION=${VISION:-}" \
--build-arg "UBUNTU_VERSION=${UBUNTU_VERSION}" \
--build-arg "CENTOS_VERSION=${CENTOS_VERSION}" \
--build-arg "DEVTOOLSET_VERSION=${DEVTOOLSET_VERSION}" \
--build-arg "GLIBC_VERSION=${GLIBC_VERSION}" \
--build-arg "CLANG_VERSION=${CLANG_VERSION}" \
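The `case "$tag"` block modified above is keyed on the part of the image name after the colon; a small sketch of that extraction (the image name here is an assumed placeholder):

```shell
# Sketch of the tag extraction at the top of build.sh: awk splits the
# image reference on ':' and keeps the second field, which then drives
# the per-tag configuration case statement.
image="pytorch-ci:pytorch-linux-jammy-py3.9-gcc9"
tag=$(echo "$image" | awk -F':' '{print $2}')
echo "$tag"   # prints pytorch-linux-jammy-py3.9-gcc9
```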
2 changes: 1 addition & 1 deletion .ci/docker/ci_commit_pins/executorch.txt
@@ -1 +1 @@
b173722085b3f555d6ba4533d6bbaddfd7c71144
f50bfa92602b45dca884a9e511e5d9ddbe8ba314
2 changes: 0 additions & 2 deletions .ci/docker/common/install_base.sh
@@ -36,8 +36,6 @@ install_ubuntu() {
# See https://github.com/pytorch/pytorch/issues/144768
if [[ "$UBUNTU_VERSION" == "20.04"* && "$CUDA_VERSION" == "11.8"* ]]; then
maybe_libnccl_dev="libnccl2=2.15.5-1+cuda11.8 libnccl-dev=2.15.5-1+cuda11.8 --allow-downgrades --allow-change-held-packages"
elif [[ "$UBUNTU_VERSION" == "20.04"* && "$CUDA_VERSION" == "12.4"* ]]; then
maybe_libnccl_dev="libnccl2=2.26.2-1+cuda12.4 libnccl-dev=2.26.2-1+cuda12.4 --allow-downgrades --allow-change-held-packages"
else
maybe_libnccl_dev=""
fi
58 changes: 4 additions & 54 deletions .ci/docker/common/install_cuda.sh
@@ -54,23 +54,9 @@ function install_118 {
ldconfig
}

function install_124 {
CUDNN_VERSION=9.1.0.70
echo "Installing CUDA 12.4.1 and cuDNN ${CUDNN_VERSION} and NCCL and cuSparseLt-0.6.2"
install_cuda 12.4.1 cuda_12.4.1_550.54.15_linux

install_cudnn 12 $CUDNN_VERSION

CUDA_VERSION=12.4 bash install_nccl.sh

CUDA_VERSION=12.4 bash install_cusparselt.sh

ldconfig
}

function install_126 {
CUDNN_VERSION=9.5.1.17
echo "Installing CUDA 12.6.3 and cuDNN ${CUDNN_VERSION} and NCCL and cuSparseLt-0.6.3"
echo "Installing CUDA 12.6.3 and cuDNN ${CUDNN_VERSION} and NCCL and cuSparseLt-0.7.1"
install_cuda 12.6.3 cuda_12.6.3_560.35.05_linux

install_cudnn 12 $CUDNN_VERSION
@@ -113,40 +99,6 @@ function prune_118 {
rm -rf $CUDA_BASE/libnvvp $CUDA_BASE/nsightee_plugins $CUDA_BASE/nsight-compute-2022.3.0 $CUDA_BASE/nsight-systems-2022.4.2/
}

-function prune_124 {
-echo "Pruning CUDA 12.4"
-#####################################################################################
-# CUDA 12.4 prune static libs
-#####################################################################################
-export NVPRUNE="/usr/local/cuda-12.4/bin/nvprune"
-export CUDA_LIB_DIR="/usr/local/cuda-12.4/lib64"
-
-export GENCODE="-gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_90,code=sm_90"
-export GENCODE_CUDNN="-gencode arch=compute_50,code=sm_50 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_90,code=sm_90"
-
-if [[ -n "$OVERRIDE_GENCODE" ]]; then
-export GENCODE=$OVERRIDE_GENCODE
-fi
-if [[ -n "$OVERRIDE_GENCODE_CUDNN" ]]; then
-export GENCODE_CUDNN=$OVERRIDE_GENCODE_CUDNN
-fi
-
-# all CUDA libs except CuDNN and CuBLAS
-ls $CUDA_LIB_DIR/ | grep "\.a" | grep -v "culibos" | grep -v "cudart" | grep -v "cudnn" | grep -v "cublas" | grep -v "metis" \
-| xargs -I {} bash -c \
-"echo {} && $NVPRUNE $GENCODE $CUDA_LIB_DIR/{} -o $CUDA_LIB_DIR/{}"
-
-# prune CuDNN and CuBLAS
-$NVPRUNE $GENCODE_CUDNN $CUDA_LIB_DIR/libcublas_static.a -o $CUDA_LIB_DIR/libcublas_static.a
-$NVPRUNE $GENCODE_CUDNN $CUDA_LIB_DIR/libcublasLt_static.a -o $CUDA_LIB_DIR/libcublasLt_static.a
-
-#####################################################################################
-# CUDA 12.4 prune visual tools
-#####################################################################################
-export CUDA_BASE="/usr/local/cuda-12.4/"
-rm -rf $CUDA_BASE/libnvvp $CUDA_BASE/nsightee_plugins $CUDA_BASE/nsight-compute-2024.1.0 $CUDA_BASE/nsight-systems-2023.4.4/
-}

function prune_126 {
echo "Pruning CUDA 12.6"
#####################################################################################
@@ -183,7 +135,7 @@ function prune_126 {

function install_128 {
CUDNN_VERSION=9.8.0.87
-echo "Installing CUDA 12.8.1 and cuDNN ${CUDNN_VERSION} and NCCL and cuSparseLt-0.6.3"
+echo "Installing CUDA 12.8.1 and cuDNN ${CUDNN_VERSION} and NCCL and cuSparseLt-0.7.1"
# install CUDA 12.8.1 in the same container
install_cuda 12.8.1 cuda_12.8.1_570.124.06_linux

@@ -203,11 +155,9 @@ do
case "$1" in
11.8) install_118; prune_118
;;
-12.4) install_124; prune_124
-;;
-12.6) install_126; prune_126
+12.6|12.6.*) install_126; prune_126
;;
-12.8) install_128;
+12.8|12.8.*) install_128;
;;
*) echo "bad argument $1"; exit 1
;;
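Not part of the diff — a minimal sketch of why the case arms are widened to `12.6|12.6.*` and `12.8|12.8.*`: bash's `case` patterns match literally, so the old bare `12.6)` arm would reject a point-release argument like `12.6.3`. The `match_cuda` helper below is a hypothetical stand-in for the real dispatch, used only for illustration.

```shell
#!/usr/bin/env bash
# Illustration only: mirrors the widened case patterns from the diff.
# match_cuda is a hypothetical helper, not part of the install script.
match_cuda() {
  case "$1" in
    12.6|12.6.*) echo "install_126" ;;
    12.8|12.8.*) echo "install_128" ;;
    *) echo "bad argument" ;;
  esac
}

match_cuda 12.6      # -> install_126
match_cuda 12.6.3    # -> install_126 (point release now accepted)
match_cuda 12.8.1    # -> install_128
```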
2 changes: 0 additions & 2 deletions .ci/docker/common/install_cudnn.sh
@@ -8,8 +8,6 @@ if [[ -n "${CUDNN_VERSION}" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.8.0.87_cuda12-archive"
elif [[ ${CUDA_VERSION:0:4} == "12.6" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.5.1.17_cuda12-archive"
-elif [[ ${CUDA_VERSION:0:2} == "12" ]]; then
-CUDNN_NAME="cudnn-linux-x86_64-9.1.0.70_cuda12-archive"
elif [[ ${CUDA_VERSION:0:2} == "11" ]]; then
CUDNN_NAME="cudnn-linux-x86_64-9.1.0.70_cuda11-archive"
else
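Not part of the diff — the branches in install_cudnn.sh key off a version prefix via bash substring expansion `${VAR:offset:length}`. A quick sketch of how those prefix checks behave:

```shell
#!/usr/bin/env bash
# Illustration only: the prefix tests used throughout these scripts.
CUDA_VERSION=12.6.3
echo "${CUDA_VERSION:0:4}"   # prints 12.6 (selects the 12.6 cuDNN branch)
echo "${CUDA_VERSION:0:2}"   # prints 12  (selects the CUDA-12 family)
```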
10 changes: 1 addition & 9 deletions .ci/docker/common/install_cusparselt.sh
@@ -11,15 +11,7 @@ if [[ ${CUDA_VERSION:0:4} =~ ^12\.[5-8]$ ]]; then
if [ ${TARGETARCH} = 'amd64' ] || [ "${TARGETARCH}" = 'x86_64' ]; then
arch_path='x86_64'
fi
-CUSPARSELT_NAME="libcusparse_lt-linux-${arch_path}-0.6.3.2-archive"
-curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-${arch_path}/${CUSPARSELT_NAME}.tar.xz
-elif [[ ${CUDA_VERSION:0:4} == "12.4" ]]; then
-arch_path='sbsa'
-export TARGETARCH=${TARGETARCH:-$(uname -m)}
-if [ ${TARGETARCH} = 'amd64' ] || [ "${TARGETARCH}" = 'x86_64' ]; then
-arch_path='x86_64'
-fi
-CUSPARSELT_NAME="libcusparse_lt-linux-${arch_path}-0.6.2.3-archive"
+CUSPARSELT_NAME="libcusparse_lt-linux-${arch_path}-0.7.1.0-archive"
curl --retry 3 -OLs https://developer.download.nvidia.com/compute/cusparselt/redist/libcusparse_lt/linux-${arch_path}/${CUSPARSELT_NAME}.tar.xz
elif [[ ${CUDA_VERSION:0:4} == "11.8" ]]; then
CUSPARSELT_NAME="libcusparse_lt-linux-x86_64-0.4.0.7-archive"
15 changes: 1 addition & 14 deletions .ci/docker/common/install_onnx.sh
@@ -8,16 +8,6 @@ retry () {
"$@" || (sleep 10 && "$@") || (sleep 20 && "$@") || (sleep 40 && "$@")
}

-# A bunch of custom pip dependencies for ONNX
-pip_install \
-beartype==0.15.0 \
-filelock==3.9.0 \
-flatbuffers==2.0 \
-mock==5.0.1 \
-ninja==1.10.2 \
-networkx==2.5 \
-numpy==1.24.2

# ONNXRuntime should be installed before installing
# onnx-weekly. Otherwise, onnx-weekly could be
# overwritten by onnx.
@@ -29,11 +19,8 @@ pip_install \
transformers==4.36.2

pip_install coloredlogs packaging

pip_install onnxruntime==1.18.1
-pip_install onnxscript==0.2.6 --no-deps
-# required by onnxscript
-pip_install ml_dtypes
+pip_install onnxscript==0.3.0

# Cache the transformers model to be used later by ONNX tests. We need to run the transformers
# package to download the model. By default, the model is cached at ~/.cache/huggingface/hub/
3 changes: 1 addition & 2 deletions .ci/docker/common/install_openblas.sh
@@ -4,8 +4,7 @@
set -ex

cd /
-git clone https://github.com/OpenMathLib/OpenBLAS.git -b v0.3.29 --depth 1 --shallow-submodules
-
+git clone https://github.com/OpenMathLib/OpenBLAS.git -b "${OPENBLAS_VERSION:-v0.3.29}" --depth 1 --shallow-submodules

OPENBLAS_BUILD_FLAGS="
NUM_THREADS=128
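Not part of the diff — the new clone line and the Dockerfile's `ARG OPENBLAS_VERSION` hang together via `${VAR:-default}` expansion, which falls back to the pinned tag whenever the build arg is unset or empty. A quick sketch (the override value below is an arbitrary example, not a tested version):

```shell
#!/usr/bin/env bash
# Illustration only: default-value expansion used for OPENBLAS_VERSION.
unset OPENBLAS_VERSION
echo "${OPENBLAS_VERSION:-v0.3.29}"   # prints v0.3.29 (fallback to the pin)

OPENBLAS_VERSION=v0.3.30              # arbitrary example override
echo "${OPENBLAS_VERSION:-v0.3.29}"   # prints v0.3.30
```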
7 changes: 6 additions & 1 deletion .ci/docker/common/install_triton.sh
@@ -51,7 +51,12 @@ as_jenkins git clone --recursive ${TRITON_REPO} triton
cd triton
as_jenkins git checkout ${TRITON_PINNED_COMMIT}
as_jenkins git submodule update --init --recursive
-cd python
+
+# Older Triton versions keep setup.py in ./python; newer versions have it at the repo root
+if [ ! -f setup.py ]; then
+cd python
+fi
+
pip_install pybind11==2.13.6

# TODO: remove patch setup.py once we have a proper fix for https://github.com/triton-lang/triton/issues/4527
1 change: 1 addition & 0 deletions .ci/docker/manywheel/Dockerfile_2_28_aarch64
@@ -58,6 +58,7 @@ RUN git config --global --add safe.directory "*"

FROM base as openblas
# Install openblas
+ARG OPENBLAS_VERSION
ADD ./common/install_openblas.sh install_openblas.sh
RUN bash ./install_openblas.sh && rm install_openblas.sh
