Description
I'm creating this issue to capture any work that will need to be done post-merge. Inevitably, we've had to leave some things unfinished to make the ABI-breaking window.
For reference, this is the PI merge change: #14145
Tasks from review
- rename UR xpti stream to ur.call #14922
- implement UR_LOG_TRACING env var in UR instead of the SYCL RT; currently it's implemented in ur.hpp - first raised in this comment #14926
- rename SYCL_ENABLE_PLUGINS, accounting for the difference in UR terminology (PLUGIN -> ADAPTER) - first raised in this comment #14927
- create new target component for unified-runtime, currently UR libs are installed as part of COMPONENT level-zero-sycl-dev - see this comment #14924
- review all documentation pertaining to PI and either remove it, update it or re-direct it to point at docs in the UR repo #14928
- move loadOsLibrary and unloadOsLibrary to os_util. #14923
- drop tracing from sycl/test-e2e/External/RSBench/acc.test test. mentioned here #14932
- replace bool with ur_bool_t in sycl/include/sycl/info/device_traits.def and the uses of it. #14919
- Create a minimal header only containing basic UR types oneapi-src/unified-runtime#1890
- Update PI use in xpti samples #15109
- The changes from this PR seem to have been reverted after PI removal, so we need to investigate this. This was first mentioned in this comment
- OpenCL devices that don't support the UR_DEVICE_INFO_IP_VERSION query are reporting it, which is now causing issues. #15149
Regressions
This is the list of known regressions that have had XFAIL added to them in anticipation of post-merge fixes.
From sycl/test/:
- native_cpu/atomic-base.cpp #14726
- native_cpu/call_host_func.cpp
- native_cpu/check-pi-output.cpp
- native_cpu/driver-fsycl.cpp
- native_cpu/example-sycl-application.cpp #14660
- native_cpu/global-id-range.cpp
- native_cpu/globaloffsetchecks.cpp
- native_cpu/link-noinline.cpp #14745
- native_cpu/local-id-range.cpp #14746
- native_cpu/local_basic.cpp
- native_cpu/multi-devices-swap.cpp
- native_cpu/multi-devices.cpp
- native_cpu/multiple_tu.cpp
- native_cpu/no-dead-arg.cpp
- native_cpu/no-opt.cpp
- native_cpu/readwrite_rectops.cpp
- native_cpu/scalar_args.cpp
- native_cpu/sycl-external-static.cpp
- native_cpu/sycl-external.cpp
- native_cpu/unnamed.cpp
- native_cpu/unused-regression.cpp
- native_cpu/user-defined-private-type.cpp
- native_cpu/user-defined-type.cpp
- native_cpu/usm_basic.cpp
- native_cpu/vector-add.cpp
(all but a handful of these will be fixed by the inclusion of oneapi-src/unified-runtime#1871)
From sycl/test-e2e/:
- AddressSanitizer/common/config-red-zone-size.cpp #14658
- AddressSanitizer/common/kernel-debug.cpp
- AddressSanitizer/multiple-reports/multiple_kernels.cpp
- AddressSanitizer/multiple-reports/one_kernel.cpp
- AddressSanitizer/use-after-free/quarantine-free.cpp
- Basic/aspects.cpp
- Basic/interop/check_carrying_real_kernel_IDs.cpp #14663
- Basic/interop/construction_ocl.cpp #14665
- DeprecatedFeatures/kernel_interop.cpp #14675
- DeprecatedFeatures/opencl_interop.cpp #14676
- DeprecatedFeatures/sampler_ocl.cpp #14679
- DeprecatedFeatures/set_arg_interop.cpp #14680
- DeprecatedFeatures/subbuffer_interop.cpp #14681
- ESIMD/sycl_esimd_mix.cpp #14682
- Graph/Explicit/kernel_bundle.cpp #14702
- Graph/RecordReplay/kernel_bundle.cpp #14763
- KernelAndProgram/cache_env_vars.cpp #14709
- KernelAndProgram/cache_env_vars_lin.cpp #14712
- KernelCompiler/kernel_compiler_sycl.cpp #14662
- OnlineCompiler/online_compiler_OpenCL.cpp #14711
- Plugin/interop-opencl-make-kernel-bundle.cpp #14706
- Plugin/interop-opencl-make-kernel.cpp #14708
- Plugin/interop-opencl.cpp #14661
- Plugin/level_zero_batch_barrier.cpp #14704
- Plugin/level_zero_dynamic_batch_test.cpp #14721
- Plugin/level_zero_usm_device_read_only.cpp #14738
- Plugin/sycl-ls-gpu-default-any.cpp #14741
- Regression/local-arg-align.cpp #14722
- Regression/set-arg-local-accessor.cpp #14723
- SpecConstants/2020/image_selection.cpp #14724
- USM/memory_coherency_hip.cpp #14742
- USM/source_kernel_indirect_access.cpp #14714
- XPTI/basic_event_collection_linux.cpp #14744
- syclcompat/math/math_vectorized_isgreater_test.cpp #14703
- syclcompat/memory/memory_management_test2.cpp #14659
Windows only:
- Windows urAdapterRelease() occurs before releaseDefaultContext() #14768
- Basic/queue/release.cpp
- Scheduler/ReleaseResourcesTest.cpp
- Regression/pi_release.cpp #14950
- Regression/context_is_destroyed_after_exception.cpp
- DiscardEvents/discard_events_usm_ooo_queue.cpp #14775
- KernelAndProgram/cache_env_vars_win.cpp - KernelAndProgram/cache_env_vars.cpp #14709
- KernelAndProgram/disable-caching.cpp #14968
- Plugin/dll-detach-order.cpp #14767
- SubGroup/load_store.cpp #15006
- BFloat16/bfloat16_conversions.cpp (seen on L0 gen12) #15011
CUDA only:
- HostInteropTask/interop-task-cuda-buffer-migrate.cpp
- HostInteropTask/interop-task-cuda.cpp #14666
- InorderQueue/in_order_usm_implicit.cpp #14664
New regressions as of merge on 24/07
- Basic/vector/load_store.cpp (CL gen12 + arc) #14749
- Matrix/element_wise_all_ops.cpp (CL/L0 arc only) #14795
- Matrix/element_wise_all_ops_1d.cpp (CL/L0 arc only)
- Matrix/element_wise_all_ops_1d_cont.cpp (CL/L0 arc only)
- Matrix/element_wise_all_ops_scalar.cpp (CL/L0 arc only)
- Matrix/element_wise_all_sizes.cpp (CL/L0 arc only)
- Basic/kernel_bundle/kernel_bundle_api.cpp #14764
- EnqueueNativeCommand/custom-command-cuda.cpp (nvidia/cuda) #14804
- EnqueueNativeCommand/custom-command-multiple-dev-cuda.cpp (nvidia/cuda) #14805
- SubGroup/load_store.cpp (gen12 CL) #14765
unittests:
- SYCL2020/KernelBundleStateFiltering.cpp
New regressions as of merge on 26/07
- Assert/check_resource_leak.cpp #14806
- Basic/reqd_work_group_size.cpp #14841
- DeviceGlobal/device_global_arrow.cpp
- DeviceGlobal/device_global_device_only.cpp
- DeviceGlobal/device_global_operator_passthrough.cpp
- DeviceGlobal/device_global_subscript.cpp
It seems we have also introduced some regressions in the sycl cts, so: