Commit Graph

658 Commits

Author SHA1 Message Date
xiaying 0d1f5fcf72 Fix bug for MatMulExecution 2023-03-25 10:02:06 +08:00
xiaying a931d40ddc Optimize for one-broadcast matmul 2023-03-25 10:02:06 +08:00
xiaying 4935bfc1ae Optimize cuda gather for region x,y,z large 2023-03-25 10:02:06 +08:00
xiaying 7629ba674e [MNN:Sync] Sync Internal 2.4.1 2023-03-20 11:32:29 +08:00
Zeekim a92de8a5bc fix(opencl): fix mLWS dimension bigger than gws 2023-03-08 11:22:52 +08:00
xiaying de94d480fd [MNN:Bugfix] Fix ssize_t compile error in windows 2023-02-28 12:48:55 +08:00
xiaying 4e2ad365e8 [MNN:Sync] Sync Internal Gitlab 2023-02-28 10:41:24 +08:00
javer aeb0ff897d add create nativeFromBuffer interface 2023-02-20 10:35:44 +08:00
王召德 86080f24c1
Merge pull request #2226 from DaydreamCoding/feature/fix_msvc_bf16
MSVC adapt for BF16
2023-02-17 10:09:56 +08:00
xiaying 4a609006eb [MNN:Sync] Sync Internal 2.3.1 2023-02-15 10:30:27 +08:00
zhaode.wzd 03008e04b7 fix&add package_scripts 2023-01-11 17:33:00 +08:00
zhaode.wzd b142695010 [Sync] Sync Internal changes. 2023-01-11 15:08:58 +08:00
xiaying d46b6b998d [MNN:Sync] Sync Internal 2.3.0 2022-12-30 15:18:58 +08:00
xiaying b1f5664ced [MNN:Internal] Sync to 2.2.3 2022-12-24 09:42:39 +08:00
xiaying ad5d243c9f [MNN:Sync] A few bugfixes
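1. Support ONNX If with an empty subgraph (this occurs when the condition is always true or always false)
    2. Fix incorrect dimension computation in the Where operator for zero-shape inputs
    3. Fix Reduce computation on zero-shape inputs for non-prod cases
    4. Fix compile error on arch64-linux
    5. Fix comment errors in the NNAPI header file
    6. Fix several training-related issues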
2022-12-04 15:17:36 +08:00
xiaying 8908e80d44 [Sync] Sync Internal 2.2.2 2022-11-18 22:35:31 +08:00
zhaode.wzd c683c5c6c2 [Sync] Sync Internal Gitlab 2.2.1 2022-11-08 17:05:14 +08:00
xiaying acb3bb6c62 [Sync] Sync Internal Gitlab 2.2.0 2022-10-30 08:44:24 +08:00
xiaying aa25623600 Fix bug for remain area for GridSampler 2022-10-13 11:13:19 +08:00
xiaying db53f951e6 [Sync] Sync Internal 2.1.2 2022-09-30 10:02:52 +08:00
wtiandong 2768a397bd Merge remote-tracking branch 'origin/master' into interp3D 2022-09-27 17:13:37 +08:00
wtiandong 9e284352b5 merge interp3d_op_param into interp_op_param
2022-09-27 17:07:23 +08:00
jxt1234 58901eefd0
Merge pull request #2075 from NiuCY/ConvTranspose3D
Add ConvTranspose3D
2022-09-27 15:25:16 +08:00
jxt1234 a5ed4942e8
Merge pull request #2079 from jokerz0624/accelerate/TensorConvert
accelerate MNNPackTranspose
2022-09-27 09:59:54 +08:00
zw22zw22 634474cc85 fix bugs for leakrelu in hiai ir for mnn ir 2022-09-27 07:53:27 +08:00
jokerz0624 71d0975d37 improvement(TensorConvert): accelerate MNNPackTranspose with SIMD when channel=3 2022-09-24 16:49:20 +08:00
NiuChenyu 5c67cfa8d6 Add ConvTranspose3D 2022-09-23 17:59:27 +08:00
zw22zw22 299525c7c7 fix binaryOP bugs
fix compiling error for hiai
2022-09-23 16:05:08 +08:00
jxt1234 f001a65c81
Merge pull request #2051 from DaydreamCoding/patch-12
Fix ThreadPool behavior
2022-09-19 10:47:41 +08:00
jokerz0624 62d529379b feat(iOS/macOS): add Apple A16 and M2 support in CPU family 2022-09-18 12:58:17 +08:00
zhaode.wzd 4753255227 [MNN:Sync] Sync Internal 2.1.1 contain below changes.
[Pymnn:Bugfix] Fix usage and small bug in pymnn.
    [Docs:Update] Update docs/cpp markdown
    [Docs:Update] Add docs check.
    [MNN:Update] Update VecHalf.hpp
    [MNN:Bugfix] Fix compile errors caused by "#define MNN_THREAD_LOCK_CPU"
    [Geometry:Bugfix] Fix bug for resize of broadcastto: https://github.com/alibaba/MNN/issues/2040
    [Docs:Update] Update inference api usage.
    [Pymnn:Bugfix] Close hiai load to fix resource leak.
    [MNN:Update] Down gradle version for demo compile
2022-09-09 17:24:37 +08:00
DaydreamCoding d5e08a913f Fix ThreadPool behavior : Pipeline::encode may anr when other session call ThreadPool::active()
Pipeline::encode -> GeometryComputerUtils::shapeComputeAndGeometryTransform -> onExecute
2022-09-08 14:37:31 +08:00
wtiandong 71aae927ff Add Interp3D Support
1. add PyTorch interpolation 3D to Onnx to MNN converter
2. add interpolation3D nearest CPU/OpenCL implementation

all added OPs are verified
update opencl_program.cc
2022-09-07 15:56:08 +08:00
jxt1234 41db47f2c6
Merge pull request #2038 from DaydreamCoding/patch-11
Fix VecHalf.hpp not include neon header
2022-09-05 10:33:40 +08:00
jxt1234 5fbfa5a1d7
Merge pull request #2032 from MambaWong/master
Fix compile errors caused by "#define MNN_THREAD_LOCK_CPU"
2022-09-05 10:29:58 +08:00
DaydreamCoding 206f73eda7
Update VecHalf.hpp
fix header include
2022-09-01 15:33:13 +08:00
xiaying fafafef5c5 [MNN:Sync] Sync Internal 2.1.0 2022-08-31 20:11:16 +08:00
jason_w 282e771445
Fix compile errors caused by "#define MNN_THREAD_LOCK_CPU"
error: sort is not a member of std
error: gettid was not declared in this scope
error: __NR_sched_setaffinity was not declared in this scope
error: syscall was not declared in this scope
2022-08-25 12:25:30 +08:00
zhaode.wzd 76b8ace520 Sync Internal 2.0.5 2022-08-23 21:21:29 +08:00
xiaying 68708c5d66 Sync Internal 2.0.4 2022-08-12 10:30:48 +08:00
xiaying 719910c1c5 Fix bug for NC4HW4 broadcast not active 2022-08-10 11:13:50 +08:00
xiaying 8330da263a [Sync] Sync internal 2.0.3 2022-07-22 09:59:30 +08:00
xiaying eb51926f84 [MNN:Sync] Sync internal Gitlab to 2.0.2 2022-07-19 13:52:07 +08:00
xiaying 3cf5126828 Fix compute error for fp16 convolution dw 2022-07-15 12:47:48 +08:00
xiaying 8e0f544ea6 Fix Remain treat bug for AVX512-Int8 2022-07-15 12:47:48 +08:00
xiaying 89288cc509 Update README.md, fix CPU Runtime compile bug for Android - armv8.2 2022-07-12 12:43:06 +08:00
jxt1234 7102b4890b
Merge pull request #1821 from snadampal/aarch64_linux_fp16
backend: cpu: runtime: linux aarch64 hwcaps setting for ARMV82
2022-07-12 11:32:18 +08:00
Brian Li 2801621ea5 Add missing headers 2022-07-12 01:28:27 +08:00
xiaying 2ec9495719 [MNN:Sync] Sync 2.0. 2022-07-11 10:56:37 +08:00
hebin 4679f848c4 fix windows compile on avx512 2022-06-29 16:43:49 +08:00