* support SYCL backend windows build
* add windows build in CI
* add for win build CI
* correct install oneMKL
* fix install issue
* fix ci
* fix install cmd
* fix install cmd
* fix install cmd
* fix install cmd
* fix install cmd
* fix win build
* fix win build
* fix win build
* restore other CI part
* restore as base
* rm no new line
* fix no new line issue, add -j
* fix grammar issue
* allow to trigger manually, fix format issue
* fix format
* add newline
* fix format
* fix format
* fix format issue
---------
Co-authored-by: Abhilash Majumder <[email protected]>
Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device
```
## Windows
### Setup Environment
1. Install Intel GPU driver.
Please install Intel GPU driver by official guide: [Install GPU Drivers](https://www.intel.com/content/www/us/en/products/docs/discrete-gpus/arc/software/drivers.html).
2. Install Intel® oneAPI Base toolkit.
a. Please follow the procedure in [Get the Intel® oneAPI Base Toolkit ](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html).
Installing to the default folder, **C:\Program Files (x86)\Intel\oneAPI**, is recommended.
The following guide uses the default folder as an example. If you installed to a different folder, substitute your own path in the steps below.
b. Enable oneAPI running environment:
- In the Windows Search box, type 'oneAPI'.
Search for and open "Intel oneAPI command prompt for Intel 64 for Visual Studio 2022".
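If a plain `cmd` window is already open, the oneAPI environment can also be enabled by calling the setvars script directly; a minimal sketch, assuming the default install folder:

```
:: Assumes oneAPI was installed to the default folder.
"C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64
```

After this, the compiler and runtime libraries are on the PATH for that window only.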
max compute_units 512, max work group size 1024, max sub group size 32, global mem size 16225243136
```
|Attribute|Note|
|-|-|
|compute capability 1.3|Level Zero runtime, recommended|
|compute capability 3.0|OpenCL runtime, slower than Level Zero in most cases|
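Output like the above can be produced by listing the available SYCL devices before picking a device ID; a sketch, assuming the `ls-sycl-device` example tool was built into `build\bin`:

```
:: Prints each SYCL device with its ID, compute capability and memory size.
build\bin\ls-sycl-device.exe
```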
4. Set device ID and execute llama.cpp
Set the device ID to 0 via **set GGML_SYCL_DEVICE=0**:
```
set GGML_SYCL_DEVICE=0
build\bin\main.exe -m models\llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e -ngl 33 -s 0
```
or run it via the script:
```
.\examples\sycl\win-run-llama2.bat
```
Note:
- By default, mmap is used to read the model file. In some cases it causes a hang at startup; pass **--no-mmap** to disable mmap and avoid this issue.
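Combining step 4 with the note above, a run that both selects device 0 and disables mmap can be sketched as (model path and prompt as in the example above):

```
:: Select GPU 0 and disable mmap to avoid the startup hang.
set GGML_SYCL_DEVICE=0
build\bin\main.exe -m models\llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:\nStep 1:" -n 400 -e -ngl 33 -s 0 --no-mmap
```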
5. Check the device ID in the output
For example:
```
Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device
```
## Environment Variable
|LLAMA_SYCL|ON (mandatory)|Enable build with SYCL code path. <br>For FP32/FP16, LLAMA_SYCL=ON is mandatory.|
|LLAMA_SYCL_F16|ON (optional)|Enable FP16 build with SYCL code path. Faster for long-prompt inference. <br>Leave it unset for FP32.|
|CMAKE_C_COMPILER|icx|Use icx compiler for SYCL code path|
|CMAKE_CXX_COMPILER|icpx (Linux), icx (Windows)|Use icpx/icx for SYCL code path|
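For reference, a Windows build configuration consistent with the table above might look like the following; this is a sketch (the generator choice is an assumption), run from the oneAPI command prompt:

```
:: Configure with the SYCL code path and the icx compiler (Windows).
mkdir build
cd build
cmake -G "MinGW Makefiles" .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icx -DCMAKE_BUILD_TYPE=Release
make -j
```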
#### Running
## Known Issue
- Hang during startup
llama.cpp uses mmap by default to read the model file and copy it to the GPU. On some systems, the underlying memcpy misbehaves and blocks.
Solution: add **--no-mmap**.
## Q&A
- Error: `error while loading shared libraries: libsycl.so.7: cannot open shared object file: No such file or directory`.
The oneAPI running environment is not enabled.
Install oneAPI base toolkit and enable it by: `source /opt/intel/oneapi/setvars.sh`.
- On Windows, no output is printed and no error is reported.