[DO NOT MERGE]Qualcomm AI Engine Direct - Mimi Decoder low SQNR reproduce #1

winskuo-quic · 2025-04-15T03:16:30Z

Summary

Please DO NOT MERGE this PR.
This PR reproduces an issue where nn.Module inference works fine, but after torch.export.export the SQNR drops from 120 to 8.

Command:
python examples/qualcomm/oss_scripts/moshi/mimi.py -b build-android -s $device -m SM8650 --chunks_per_batch 125
To run the nn.Module path, which works correctly, uncomment lines 264-267 and comment out lines 269-273.
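For readers unfamiliar with the metric, here is a minimal sketch of how an SQNR figure like the ones quoted in the summary can be computed. It assumes the common definition, 10 * log10(signal power / noise power), reported in dB; the script in this PR may define it differently. The function name `sqnr_db` and the sample data are hypothetical and only for illustration.

```python
import math


def sqnr_db(reference, candidate):
    """SQNR in dB under the assumed definition 10 * log10(P_signal / P_noise).

    `reference` is the trusted output (e.g. from the eager nn.Module) and
    `candidate` is the output being checked (e.g. from the exported program).
    Both are flat sequences of floats of equal length.
    """
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    signal_power = sum(r * r for r in reference)
    noise_power = sum((r - c) ** 2 for r, c in zip(reference, candidate))
    if noise_power == 0.0:
        return math.inf  # outputs are identical
    if signal_power == 0.0:
        return -math.inf  # all-zero reference with a non-zero error
    return 10.0 * math.log10(signal_power / noise_power)


# Hypothetical data: a sine wave and a copy with a 10% gain error.
reference = [math.sin(0.01 * i) for i in range(1, 1001)]
scaled = [1.1 * r for r in reference]

identical_sqnr = sqnr_db(reference, reference)
scaled_sqnr = sqnr_db(reference, scaled)
print(identical_sqnr, scaled_sqnr)
```

If the 120 and 8 figures are in dB under this definition, 120 would mean the error power is about twelve orders of magnitude below the signal power, while 8 would mean the error power is roughly 16% of the signal power.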

Differential Revision: D75911655 Pull Request resolved: pytorch#11344

BNNS copy crashes the process when the dtypes differ (pytorch#11714).

With the example in this PR (pytorch#11714), we crash the process on main. Here is the stack trace from LLDB:

```
Process 19234 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
    frame #0: 0x0000000190ac9388 libsystem_kernel.dylib`__pthread_kill + 8
libsystem_kernel.dylib`__pthread_kill:
->  0x190ac9388 <+8>:  b.lo   0x190ac93a8 ; <+40>
    0x190ac938c <+12>: pacibsp
    0x190ac9390 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x190ac9394 <+20>: mov    x29, sp
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
  * frame #0: 0x0000000190ac9388 libsystem_kernel.dylib`__pthread_kill + 8
    frame #1: 0x0000000190b0288c libsystem_pthread.dylib`pthread_kill + 296
    frame #2: 0x0000000190a0bc60 libsystem_c.dylib`abort + 124
    frame #3: 0x0000000190910174 libsystem_malloc.dylib`malloc_vreport + 892
    frame #4: 0x0000000190913c90 libsystem_malloc.dylib`malloc_report + 64
    frame #5: 0x000000019091821c libsystem_malloc.dylib`___BUG_IN_CLIENT_OF_LIBMALLOC_POINTER_BEING_FREED_WAS_NOT_ALLOCATED + 32
    frame #6: 0x000000019d2f4084 libBNNS.dylib`___lldb_unnamed_symbol1620 + 564
    frame #7: 0x000000019d2f5bac libBNNS.dylib`___lldb_unnamed_symbol1628 + 680
    frame #8: 0x000000019d69ce48 libBNNS.dylib`BNNSCopy + 616
    frame #9: 0x000000030c74d950 _portable_lib.cpython-310-darwin.so`(anonymous namespace)::copy_using_bnns(executorchcoreml::MultiArray const&, executorchcoreml::MultiArray&) + 188
    frame #10: 0x000000030c74cfdc _portable_lib.cpython-310-darwin.so`(anonymous namespace)::copy(executorchcoreml::MultiArray const&, executorchcoreml::MultiArray&, executorchcoreml::MultiArray::CopyOptions) + 72
    frame #11: 0x000000030c74ceec _portable_lib.cpython-310-darwin.so`executorchcoreml::MultiArray::copy(executorchcoreml::MultiArray&, executorchcoreml::MultiArray::CopyOptions) const + 148
    frame #12: 0x000000030c7488d4 _portable_lib.cpython-310-darwin.so`invocation function for block in (anonymous namespace)::copy(MLMultiArray*, executorchcoreml::MultiArray&) + 376
    frame #13: 0x000000030c748ac8 _portable_lib.cpython-310-darwin.so`invocation function for block in (anonymous namespace)::copy(MLMultiArray*, executorchcoreml::MultiArray&) + 52
    frame #14: 0x000000019ad33f4c CoreML`CoreML::MultiArrayBuffer::getBytesWithHandler(void (void const*, unsigned long) block_pointer) const + 340
    frame #15: 0x000000019ad34138 CoreML`-[MLMultiArray(ScopedBufferAccess) getBytesWithHandler:] + 152
    frame #16: 0x000000030c7485ec _portable_lib.cpython-310-darwin.so`(anonymous namespace)::copy(MLMultiArray*, executorchcoreml::MultiArray&) + 296
    frame #17: 0x000000030c744f68 _portable_lib.cpython-310-darwin.so`(anonymous namespace)::set_outputs(std::__1::vector<executorchcoreml::MultiArray, std::__1::allocator<executorchcoreml::MultiArray>>&, NSArray<MLMultiArray*>*) + 180
```

With this PR, the process succeeds.

DRAFT PR to reproduce low sqnr

2e1b801

winskuo-quic changed the base branch from dev1/winskuo/mimi_stage2 to main April 15, 2025 03:18

winskuo-quic changed the base branch from main to dev1/winskuo/mimi_stage2 April 15, 2025 03:19

chenweng-quic pushed a commit that referenced this pull request Jun 9, 2025

Use GraphBuilder in test_replace_ops_passes. #1

fbe2e58

Differential Revision: D75911655 Pull Request resolved: pytorch#11344
