Commit bd3b296

abhilash1910 authored and hodlen committed
Fix f16_sycl cpy call from Arc (ggml-org#5411)
* fix f16_sycl cpy call
* rm old logic
* add fp16 build CI
* use macro
* format fix
1 parent a838165 commit bd3b296

2 files changed: +46 -3 lines changed

.github/workflows/build.yml

Lines changed: 41 additions & 0 deletions
@@ -184,6 +184,47 @@ jobs:
           cmake -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx ..
           cmake --build . --config Release -j $(nproc)
 
+  ubuntu-22-cmake-sycl-fp16:
+    runs-on: ubuntu-22.04
+
+    continue-on-error: true
+
+    steps:
+      - uses: actions/checkout@v2
+
+      - name: add oneAPI to apt
+        shell: bash
+        run: |
+          cd /tmp
+          wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
+          sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
+          rm GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
+          sudo add-apt-repository "deb https://apt.repos.intel.com/oneapi all main"
+
+      - name: install oneAPI dpcpp compiler
+        shell: bash
+        run: |
+          sudo apt update
+          sudo apt install intel-oneapi-compiler-dpcpp-cpp
+
+      - name: install oneAPI MKL library
+        shell: bash
+        run: |
+          sudo apt install intel-oneapi-mkl-devel
+
+      - name: Clone
+        id: checkout
+        uses: actions/checkout@v3
+
+      - name: Build
+        id: cmake_build
+        run: |
+          source /opt/intel/oneapi/setvars.sh
+          mkdir build
+          cd build
+          cmake -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_SYCL_F16=ON ..
+          cmake --build . --config Release -j $(nproc)
+
   # TODO: build with LLAMA_NO_METAL because test-backend-ops fail on "Apple Paravirtual device" and I don't know
   #       how to debug it.
   #       ref: https://github.com/ggerganov/llama.cpp/actions/runs/7131777249/job/19420981052#step:5:1124

ggml-sycl.cpp

Lines changed: 5 additions & 3 deletions
@@ -12148,7 +12148,8 @@ inline void ggml_sycl_op_dequantize_mul_mat_vec(
     const int64_t src1_ncols, const int64_t src1_padded_row_size,
     const dpct::queue_ptr &stream) {
 
-    const int64_t ne00 = src0->ne[0];
+    GGML_TENSOR_BINARY_OP_LOCALS
+
     const int64_t row_diff = row_high - row_low;
 
     // on some GPUs it is faster to convert src1 to half and to use half precision intrinsics
@@ -12167,8 +12168,9 @@ inline void ggml_sycl_op_dequantize_mul_mat_vec(
         } else {
             src1_dfloat = src1_dfloat_a.alloc(ne00);
             ggml_cpy_f32_f16_sycl((const char *)src1_ddf_i, (char *)src1_dfloat,
-                                  ne00, ne00, 1, sizeof(float), 0, 0, ne00, 1,
-                                  sizeof(sycl::half), 0, 0, stream);
+                                  ne00, ne00, ne01, ne02, nb00, nb01, nb02,
+                                  nb03, ne10, ne11, ne12, nb10, nb11, nb12,
+                                  nb13, stream);
         }
     }
 #else