PR #3119 tidied up llama by removing a library build and op registration.
This seems to have broken the cmake flow for examples/xnnpack/quantization/test_quantize.sh, resulting in a duplicate registration of quantized_decomposed::embedding_byte.out.
The buck2 build still works, but the cmake build of the library does not, presumably because kernels/quantized/targets.bzl was updated while the cmake side was not. A sketch of the failure mode follows.
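For illustration, here is a minimal Python analogue of the collision, assuming both registration paths (the yaml-generated kernel library and exir/passes/_quant_patterns_and_replacements.py) end up defining the same schema in the quantized_decomposed namespace. The op signature below is simplified and hypothetical; the real one lives in kernels/quantized/quantized.yaml.

```python
import torch
from torch.library import Library

# Hypothetical, simplified schema for quantized_decomposed::embedding_byte.out.
SCHEMA = "embedding_byte.out(Tensor weight, Tensor scales, *, Tensor(a!) out) -> Tensor(a!)"

lib = Library("quantized_decomposed", "DEF")
lib.define(SCHEMA)  # first registration succeeds

# A second definition of the same name/overload is rejected, which is the
# analogue of the duplicate-registration failure seen in the cmake flow.
try:
    lib.define(SCHEMA)
except RuntimeError as e:
    print(f"duplicate registration rejected: {e}")
```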
Removing the duplicate entries in kernels/quantized/quantized.yaml breaks llama2 quantization runs on xnnpack, and removing the duplicates in exir/passes/_quant_patterns_and_replacements.py instead causes llama testing to fail.
Can we look for a cleaner solution for quantized operator registration that is consistent across users? At a minimum, the xnnpack, Arm backend, and llama2 flows all depend on these ops.
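One possible direction, sketched under the assumption that the Python-side definitions could be made idempotent so every user funnels through the same path (define_once below is a hypothetical helper, not existing code):

```python
import torch
from torch.library import Library

def define_once(lib: Library, schema: str, op_name: str) -> None:
    # Hypothetical helper: skip the define if some other user (e.g. the
    # yaml-generated kernel library) has already registered the op, so that
    # the xnnpack, Arm backend, and llama2 flows can share one entry point.
    if hasattr(torch.ops.quantized_decomposed, op_name):
        return
    lib.define(schema)

lib = Library("quantized_decomposed", "DEF")
schema = "embedding_byte.out(Tensor weight, Tensor scales, *, Tensor(a!) out) -> Tensor(a!)"
define_once(lib, schema, "embedding_byte")  # registers the op
define_once(lib, schema, "embedding_byte")  # no-op instead of an error
```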
I have a patch demonstrating the problem here: 0790d93. This is blocking a fairly large commit of Arm backend code.