root/MNN

Commit Graph

Author	SHA1	Message	Date
xiaying	7fde7c7079	MNN:Sync: Sync Internal 3.2.4	2025-09-22 23:05:26 +08:00
xiaying	318a3de860	MNN:Sync: Sync Internal 3.2.3	2025-08-22 18:04:08 +08:00
Shuheng Deng	b9087ffc15	Integrate KleidiAI SME2 Q4 asymmetric kernels into MNN Change-Id: I9413ea48d0a6de0a4077150fb93ce1da52a36d76	2025-08-19 10:24:15 +08:00
zhaode.wzd	55a59a7ebc	[MNN:Sync] Sync Internal reranker, gpt-oss.	2025-08-08 12:24:23 +08:00
yanzhang	a52a884d6d	fix: failures for other platforms Signed-off-by: yanzhang <yanzhang.wang@arm.com> Change-Id: Id7b3da7f3d0c4ef88f4c48bfa87471eaab764973	2025-08-05 15:30:22 +08:00
yanzhang	f699ca2842	Replace the KleidiAI macro with the hint config Signed-off-by: yanzhang <yanzhang.wang@arm.com> Change-Id: I9dcc4f68e1e67a11266b66589006650b197cde1e	2025-08-04 15:58:03 +08:00
SixtyWang	380370fcf8	Fix bug in KleidiAIDenseConvolution, KleidiAIConvolution and QI4_SYM_CHNLQT_F32 - Corrected outputWidth calculation in KleidiAIDenseConvolution - Fixed use-after-free due to late call to getPostParameters in KleidiAIConvolution - Resolved SME symmetry quantization kernel problem	2025-07-31 16:56:51 +08:00
xiaying	db0f559f9d	MNN:Sync: Sync Internal 3.2.2	2025-07-23 14:33:57 +08:00
yanzhang	8e7a63d622	Add imatmul fp16 support for DenseConv Change-Id: Ifb6972146a9c7013328fc75e30ab27c6e3d92d6a Signed-off-by: yanzhang <yanzhang.wang@arm.com>	2025-07-16 18:24:21 +08:00
yanzhang	4c9f48b76b	Minor fixes after rebase. - Enable fp32 1x1 impl after it's right. - Remove winograd memory test because kleidiai not support. Change-Id: Ia61ad8c251490e93a2a365020a6183a944e769b2 Signed-off-by: yanzhang <yanzhang.wang@arm.com>	2025-07-10 15:42:55 +08:00
yanzhang	5e3c8a3c12	Integrate the KleidiAI imatmul with fp32. Note, - No support for fp16 and int8 currently. Signed-off-by: yanzhang <yanzhang.wang@arm.com> Change-Id: If17c911977dd7eb0603f41d64b8ba879f468ab98	2025-07-10 15:42:19 +08:00
SixtyWang	b5b5845787	Refactor Kleidiai code to fix MNN unit test issues - Refactor getInstance function - Add 1x1 convolution check in canAccelerate - Use NHWC as input/output format for Kleidiai and convert format in onExecute - Remove KAI_CONV_NCHW_IN_OUT macro - Fix SME build issue on M4	2025-07-08 16:22:23 +08:00
SixtyWang	da8b7337c4	Refactor the KleidiAI integration ConvInt8 code in MNN	2025-07-08 16:21:26 +08:00
Shuheng Deng	875814bfb9	Refactor the KleidiAI integration convolution code in MNN Change-Id: I2e45fb7ec10793afffad8c25c305c1cb1e260753	2025-07-04 09:54:00 +08:00
Jules	7a08d2e2ee	Fix address offset overflow caused by using int	2025-05-30 10:38:23 +00:00
jxt1234	97223628bc	Merge pull request #3569 from linlinxu-1989/sme_fp16_fp32_new Bump MNN KleidiAI ukernel to fp16_sme2 fp32_sme2 ukernel	2025-05-27 15:37:07 +08:00
jxt1234	eef4a726cd	Merge pull request #3546 from HenryDen/buf_fix Bugs fix for the KleidiAi integration of MNN	2025-05-27 14:54:37 +08:00
linlin.xu	556acbfea2	Add macro for kleidiai only resource	2025-05-27 14:46:50 +08:00
linlin.xu	a8f2d34b15	Bump MNN KleidiAI ukernel to fp16_sme2 fp32_sme2 ukernel Add fp16 and fp32 sme2 kernel 1. Add rhs pack nxk kernel lhs pack kernel 2. Add fp16 GEMM kernel and GEMV kernel 3. Add fp32 GEMM kernel and GEMV kernel	2025-05-27 10:21:24 +08:00
zhaode.wzd	bd36a3f749	[MNN:Sync] Sync internal: 1. SmolVLM, FastVLM support. 2. QNN backend init. 3. Qwen3 MoE support. 4. Speculative decoding init. 5. Some bugfix.	2025-05-23 15:24:18 +08:00
Shuheng Deng	986a065fff	Bugs fix for the KleidiAi integration of MNN	2025-05-22 17:35:57 +08:00
xiaying	d2477b3091	MNN:Bugfix: Fix bug for ios Compile error	2025-05-09 14:19:41 +08:00
zhaode.wzd	a019d971ad	[MNN:Sync] Sync Internal 3.1.4.	2025-05-08 12:39:44 +08:00
jxt1234	f65ff90302	Merge pull request #3397 from xhzheng1895/kai_1.7.0_integration Integrate Asym int4 ukernels to MNN	2025-04-30 11:15:41 +08:00
xiaying	0769b81b58	MNN:Sync: Sync Internal 3.1.3	2025-04-28 11:50:24 +08:00
xinhao.zheng	9471946f9a	Refine some code to avoid redundant data reorder.	2025-04-27 15:57:31 +08:00
xinhao.zheng	c043aed737	Fix thread num calculate bug	2025-04-25 08:27:01 +08:00
xinhao.zheng	e376109335	Integrate Asym int4 ukernels to MNN Integrate two kernels matmul_clamp_f16_qsi8d32p_qai4c32p and matmul_clamp_f32_qsi8d32p_qai4c32p. Now KleidiAI support asymmetric & block wise quantified & f32/f16activation. Remove .tar.gz package from source code. Will download from https://gitlab.arm.com/kleidi/kleidiai/-/releases when cmake.	2025-04-24 08:32:00 +08:00
Xi Guo	b41edfc29e	Merge branch 'master' into build-qnx	2025-03-12 13:07:45 +08:00
xiaying	c0247c6998	MNN:Sync: Sync Internal 3.1.1	2025-03-12 11:35:16 +08:00
Xi Guo	db0311da39	feat: add compatibility support for QNX Neutrino	2025-03-10 19:14:25 +08:00
xiaying	3b6ddc0341	MNN:Sync: Sync Internal 3.0.5	2025-02-12 11:14:19 +08:00
xinhao.zheng	332912cb6b	Integrate KleidiAI sme int4 kernel Add logic to select micro kernel functions when SME2 is enable. Thread number will be forced to 1 when run matmul, for better energy efficiency ratio.	2025-02-11 14:23:54 +08:00
xhzheng1895	444816b145	Merge branch 'alibaba:master' into mnn_kai_interface_refactor	2025-02-11 10:21:56 +08:00
xiaying	766815282f	MNN:Sync: Sync Internal 3.0.4	2025-01-22 16:28:36 +08:00
xinhao.zheng	73344cd4d0	Refine some functions.	2025-01-09 09:58:26 +08:00
xinhao.zheng	82efb15c3f	Refactor MNN KleidiAI interface Refactor MNN_KleidiAI interface to support more model types, and facilitate subsequent KleidiAI ukernels' integration. Re-abstract information stored in class KleidiAI: 1) static info: not related to loaded model, initialized when interface is constructed and never changed. 2) status: will change while pipeline is running. Let interface and loaded model decouple for more complex mix of multiple types of models. Add mAccelType in MNN data structure, kleidiAI interface will rely on this type to decide which branch to go. Move some pack functions to mnn_kleidiai_util.cpp. Add CPU feature detection in source/backend/cpu/CPURuntime.hpp. Subsequent ukernels need SME information.	2025-01-07 10:18:20 +08:00
jxt1234	d6e10c25cb	Merge pull request #3145 from alibaba/feature/sync MNN:Sync: sync internal 3.0.3	2024-12-31 17:20:33 +08:00
xiaying	27da5e7d48	MNN:Sync: sync internal 3.0.3	2024-12-31 15:34:41 +08:00
kekxv	0f1f18ae5e	fix: ssize_t undefined on older gcc versions. Add #include <sys/types.h> to fix ssize_t being undefined on older gcc versions	2024-12-30 09:13:43 +08:00
xiaying	da4023c222	MNN:Sync: Sync Internal 3.0.2	2024-12-19 20:34:17 +08:00
xiaying	809bff1b30	MNN:Sync: Sync Internal 3.0.1	2024-12-02 10:12:08 +08:00
xiaying	5b901d9d87	MNN:Sync: Sync Internal 3.0.0	2024-11-18 14:40:27 +08:00
yanxing	8f6a1234ae	add acthalf and blockwise condition in canAccelerate.	2024-10-28 17:11:28 +08:00
xinhao.zheng	28115248d7	Refine rhs pack	2024-10-24 14:07:29 +08:00
xinhao.zheng	39dadd08d2	Refine some code	2024-10-22 15:11:21 +08:00
xinhao.zheng	6f5be724a9	Update MNN to latest version	2024-10-22 14:47:33 +08:00
xinhao.zheng	644f22f028	Update mnn_kleidiai interface.	2024-10-22 14:30:07 +08:00
王召德	95a6e4190a	Bugfix of thread workload.	2024-10-22 14:29:55 +08:00
xinhao.zheng	f075372c14	Integrate kleidiAI release v0.1.0 into MNN 2.9.3 Put KleidiAI files in folder source/backend/cpu/arm/kleidiAI/kai, download from arm gitlab and remain unchanged. Maybe will remove these files and download them when build. MNNKleidiAI.cpp is interface between MNN and KleidiAI. Rewrite function in class DenseConvInt8TiledExecutor , in ConvInt8TiledExecutor.cpp, to call KleidiAI functions. Maybe implement a new execution later. Changes to GeometryConvUtils.cpp and ShapeTensorConvert.cpp are for the input and output of DenseConvInt8TiledExecutor is NCHW, rather than NC4HW4, to avoid redundant pack/unpack and get better performance.	2024-10-22 14:29:13 +08:00

193 Commits