689 Commits

Author SHA1 Message Date
xiaying b66ec85018 OpenCL:Bugfix: Fix recordable bug for llm 2025-06-06 08:48:18 +08:00
xiaying 09b6cd9862 MNN:Sync: Sync 3.2.0 2025-06-05 15:15:29 +08:00
jxt1234 97223628bc
Merge pull request #3569 from linlinxu-1989/sme_fp16_fp32_new
Bump MNN KleidiAI ukernel to fp16_sme2 fp32_sme2 ukernel
2025-05-27 15:37:07 +08:00
jxt1234 c0dde12d89
Merge pull request #3564 from tumuyan/patch-1
fix MNN_SUPPORT_RENDER OFF
2025-05-27 15:07:56 +08:00
jxt1234 eef4a726cd
Merge pull request #3546 from HenryDen/buf_fix
Bug fixes for the KleidiAI integration of MNN
2025-05-27 14:54:37 +08:00
linlin.xu 556acbfea2 Add macro for kleidiai only resource 2025-05-27 14:46:50 +08:00
linlin.xu a8f2d34b15 Bump MNN KleidiAI ukernel to fp16_sme2 fp32_sme2 ukernel
Add fp16 and fp32 sme2 kernel
1. Add rhs pack nxk kernel lhs pack kernel
2. Add fp16 GEMM kernel and GEMV kernel
3. Add fp32 GEMM kernel and GEMV kernel
2025-05-27 10:21:24 +08:00
tumuyan caf73d6859
fix MNN_SUPPORT_RENDER OFF 2025-05-24 10:42:48 +08:00
zhaode.wzd bd36a3f749 [MNN:Sync] Sync internal:
1. SmolVLM, FastVLM support.
2. QNN backend init.
3. Qwen3 MoE support.
4. Speculative decoding init.
5. Some bugfixes.
2025-05-23 15:24:18 +08:00
Shuheng Deng 986a065fff Bug fixes for the KleidiAI integration of MNN 2025-05-22 17:35:57 +08:00
xiaying d2477b3091 MNN:Bugfix: Fix iOS compile error 2025-05-09 14:19:41 +08:00
xiaying e457f0ce33 OpenCL:Bugfix: Fix llm bench OpenCL crash 2025-05-09 14:12:58 +08:00
zhaode.wzd a019d971ad [MNN:Sync] Sync Internal 3.1.4. 2025-05-08 12:39:44 +08:00
xiaying ba766b8bea LLM:Feature, OpenCL:Bugfix: Sync internal bugfixes (support
dynamic llm prompt setting, fix OpenCL softmax error on NVIDIA)
2025-04-30 18:00:38 +08:00
jxt1234 f65ff90302
Merge pull request #3397 from xhzheng1895/kai_1.7.0_integration
Integrate Asym int4 ukernels to MNN
2025-04-30 11:15:41 +08:00
xiaying 0769b81b58 MNN:Sync: Sync Internal 3.1.3 2025-04-28 11:50:24 +08:00
xinhao.zheng 9471946f9a Refine some code to avoid redundant data reorder. 2025-04-27 15:57:31 +08:00
xinhao.zheng 81530694c8 Revert some changes to avoid compile warnings 2025-04-25 15:33:55 +08:00
xinhao.zheng c043aed737 Fix thread number calculation bug 2025-04-25 08:27:01 +08:00
xinhao.zheng e376109335 Integrate Asym int4 ukernels to MNN
Integrate two kernels, matmul_clamp_f16_qsi8d32p_qai4c32p
and matmul_clamp_f32_qsi8d32p_qai4c32p. KleidiAI now supports
asymmetric and block-wise quantization with f32/f16 activation.

Remove the .tar.gz package from the source code. It is now downloaded from
https://gitlab.arm.com/kleidi/kleidiai/-/releases when CMake runs.
2025-04-24 08:32:00 +08:00
王召德 513bf36512
Merge pull request #3350 from xhzheng1895/kai_update
Update KleidiAI version to 1.5.0
2025-04-01 14:54:22 +08:00
xiaying 7391896be3 OpenCL:Bugfix: Fix LayerNorm init bug 2025-03-28 18:48:22 +08:00
xiaying e9c3296942 MNN:Sync: Sync Internal 3.1.2 2025-03-27 11:30:13 +08:00
xinhao.zheng dae2266a43 Update KleidiAI version to 1.5.0 2025-03-27 09:41:26 +08:00
Xi Guo b41edfc29e
Merge branch 'master' into build-qnx 2025-03-12 13:07:45 +08:00
xiaying c0247c6998 MNN:Sync: Sync Internal 3.1.1 2025-03-12 11:35:16 +08:00
Xi Guo d0487ac726 fix: incompatible insertion of unique_ptr in mTensorIdxMap 2025-03-10 20:08:03 +08:00
Xi Guo db0311da39 feat: add compatibility support for QNX Neutrino 2025-03-10 19:14:25 +08:00
garryling 6fe7f9a8c4 fix(nnapi): fix when input is constant, convert tensor to nchw or nhwc 2025-03-05 23:19:37 +08:00
xiaying 4497538e38 HIAI:Bugfix: Fix bug for ohos hiai backend compile, fix bug for resize and unary
op adapter
2025-03-05 10:16:05 +08:00
Weizhao Ouyang aca7c96cfa Fix MNN_HEADERS_KLEIDIAI compile warnings
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
2025-03-03 09:48:42 +08:00
zhaode.wzd d9a6ce3ac1 [MNN:Sync] Sync Internal 3.1.0. 2025-02-24 11:44:27 +08:00
xiaying b935891ece MNN:Sync: Sync a few bugfixes, add qwen2.5-vl support 2025-02-17 19:11:14 +08:00
xiaying ee9fff2d6c Vulkan:Bugfix: Fix Windows compile error 2025-02-12 11:56:52 +08:00
xiaying 3b6ddc0341 MNN:Sync: Sync Internal 3.0.5 2025-02-12 11:14:19 +08:00
xinhao.zheng 1ce1ea35c9 Fix some typos in Linux HWCAPS detection 2025-02-12 08:36:32 +08:00
xinhao.zheng 9e57159bce Refine some static into init 2025-02-11 15:33:06 +08:00
xinhao.zheng 332912cb6b Integrate KleidiAI sme int4 kernel
Add logic to select micro kernel functions when SME2 is enabled.
The thread number is forced to 1 when running matmul, for better
energy efficiency.
2025-02-11 14:23:54 +08:00
xinhao.zheng e5999b5157 Resolve some conflicts 2025-02-11 10:28:51 +08:00
xhzheng1895 444816b145
Merge branch 'alibaba:master' into mnn_kai_interface_refactor 2025-02-11 10:21:56 +08:00
xinhao.zheng 462bbaaab4 Resolve some conflicts 2025-02-11 10:21:22 +08:00
xinhao.zheng b7687c21ea Resolve some conflicts 2025-02-11 10:19:47 +08:00
xinhao.zheng 17ff81af27 Resolve some conflicts 2025-02-11 10:16:27 +08:00
linlin.xu 3d890bc69d Optimize Stft func
1. When the CPUStft constructor is called for the first time, initialize gCosTable and gSinTable.
2. When calling the CPUStft execution function, retrieve the corresponding values from gCosTable and gSinTable based on the index.
2025-02-10 11:27:48 +08:00
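
The table-lookup idea in the Stft commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: `TrigTables`, `buildTables`, and `dftBin` are hypothetical names, and the table layout (one entry per angle 2*pi*i/n) is an assumption, since the log does not show how gCosTable and gSinTable are organized.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical container for the precomputed tables (assumed layout:
// entry i holds cos/sin of the angle 2*pi*i/n).
struct TrigTables {
    std::vector<float> cosTable;
    std::vector<float> sinTable;
};

// Build both tables once; a caller would do this at construction time
// and reuse the result for every later execution.
inline TrigTables buildTables(std::size_t n) {
    TrigTables t;
    t.cosTable.resize(n);
    t.sinTable.resize(n);
    const double step = 2.0 * 3.14159265358979323846 / static_cast<double>(n);
    for (std::size_t i = 0; i < n; ++i) {
        t.cosTable[i] = static_cast<float>(std::cos(step * static_cast<double>(i)));
        t.sinTable[i] = static_cast<float>(std::sin(step * static_cast<double>(i)));
    }
    return t;
}

// Compute one DFT bin of a frame using table lookups by index instead of
// calling cos/sin inside the loop.
inline std::complex<float> dftBin(const std::vector<float>& frame,
                                  std::size_t k,
                                  const TrigTables& t) {
    const std::size_t n = frame.size();
    assert(t.cosTable.size() == n && t.sinTable.size() == n);
    float re = 0.0f;
    float im = 0.0f;
    for (std::size_t j = 0; j < n; ++j) {
        const std::size_t idx = (j * k) % n;  // angle index wraps around the table
        re += frame[j] * t.cosTable[idx];
        im -= frame[j] * t.sinTable[idx];
    }
    return {re, im};
}
```

The trade-off is a one-time setup cost and O(n) memory per table in exchange for removing trigonometric calls from the per-frame loop.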
Daniel 206b23ebdc
filename fix 2025-01-31 22:39:21 +00:00
xiaying 766815282f MNN:Sync: Sync Internal 3.0.4 2025-01-22 16:28:36 +08:00
juju812 c0e14f2324
CPURuntime: Bugfix: check i8mm support by AT_HWCAP2
According to arch/arm64/include/uapi/asm/hwcap.h, HWCAP2_I8MM should be checked via getauxval(AT_HWCAP2)
2025-01-20 20:41:42 +08:00
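
The check described in the commit above can be sketched like this. It is an illustration under stated assumptions rather than the MNN implementation: `hasI8mmBit` and `cpuSupportsI8mm` are hypothetical names, and the bit value `1 << 13` is the HWCAP2_I8MM definition from the arm64 kernel header the commit cites, hard-coded here so the sketch compiles on other platforms.

```cpp
#if defined(__linux__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP2
#endif

// HWCAP2_I8MM as defined in arch/arm64/include/uapi/asm/hwcap.h.
// Hard-coded (assumption for this sketch) so the code builds off-target.
constexpr unsigned long kHwcap2I8mm = 1UL << 13;

// Pure helper: test the I8MM bit in an AT_HWCAP2 word.
inline bool hasI8mmBit(unsigned long hwcap2) {
    return (hwcap2 & kHwcap2I8mm) != 0;
}

// Query the running CPU. The flag lives in AT_HWCAP2, not AT_HWCAP,
// which is the point of the bugfix. Returns false on non-arm64 or
// non-Linux builds, where the query is not applicable.
inline bool cpuSupportsI8mm() {
#if defined(__linux__) && defined(__aarch64__) && defined(AT_HWCAP2)
    return hasI8mmBit(getauxval(AT_HWCAP2));
#else
    return false;
#endif
}
```

Keeping the bit test separate from the `getauxval` call makes the logic checkable on any host, independent of the CPU it runs on.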
xinhao.zheng 73344cd4d0 Refine some functions. 2025-01-09 09:58:26 +08:00
xinhao.zheng 82efb15c3f Refactor MNN KleidiAI interface
Refactor MNN_KleidiAI interface to support more model types,
and facilitate subsequent KleidiAI ukernels' integration.

Re-abstract information stored in class KleidiAI:
1) static info: not related to loaded model, initialized when
interface is constructed and never changed.
2) status: will change while pipeline is running.

Decouple the interface from the loaded model to support more complex
mixes of multiple model types. Add mAccelType to the MNN data structure;
the KleidiAI interface relies on this type to decide which branch
to take.

Move some pack functions to mnn_kleidiai_util.cpp.

Add CPU feature detection in source/backend/cpu/CPURuntime.hpp.
Subsequent ukernels need SME information.
2025-01-07 10:18:20 +08:00
王召德 dd43b5aa4b
Merge pull request #3120 from yiyangfan01/kleidiai_0.5.0
Update the KleidiAI version to r0.5.0.
2025-01-02 16:06:06 +08:00