xiaying
0d1f5fcf72
Fix bug for MatMulExecution
2023-03-25 10:02:06 +08:00
xiaying
a931d40ddc
Optimize for one-broadcast matmul
2023-03-25 10:02:06 +08:00
xiaying
4935bfc1ae
Optimize cuda gather for region x,y,z large
2023-03-25 10:02:06 +08:00
xiaying
7629ba674e
[MNN:Sync] Sync Internal 2.4.1
2023-03-20 11:32:29 +08:00
Zeekim
a92de8a5bc
fix(opencl): fix mLWS dimension bigger than gws
2023-03-08 11:22:52 +08:00
xiaying
4e2ad365e8
[MNN:Sync] Sync Internal Gitlab
2023-02-28 10:41:24 +08:00
王召德
86080f24c1
Merge pull request #2226 from DaydreamCoding/feature/fix_msvc_bf16
...
MSVC adapt for BF16
2023-02-17 10:09:56 +08:00
xiaying
4a609006eb
[MNN:Sync] Sync Internal 2.3.1
2023-02-15 10:30:27 +08:00
zhaode.wzd
03008e04b7
fix&add package_scripts
2023-01-11 17:33:00 +08:00
zhaode.wzd
b142695010
[Sync] Sync Internal changes.
2023-01-11 15:08:58 +08:00
xiaying
d46b6b998d
[MNN:Sync] Sync Internal 2.3.0
2022-12-30 15:18:58 +08:00
xiaying
b1f5664ced
[MNN:Internal] Sync to 2.2.3
2022-12-24 09:42:39 +08:00
xiaying
ad5d243c9f
[MNN:Sync] A few bugfixes
...
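1. Support ONNX If with an empty subgraph (this occurs when the condition is always true or always false)
2. Fix incorrect dimension computation in the Where operator for zero-shape inputs
3. Fix Reduce computation for zero-shape inputs in the non-prod case
4. Fix compile error on arch64-linux
5. Fix incorrect comment in the NNAPI header
6. Fix some training-related issues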
2022-12-04 15:17:36 +08:00
xiaying
8908e80d44
[Sync] Sync Internal 2.2.2
2022-11-18 22:35:31 +08:00
zhaode.wzd
c683c5c6c2
[Sync] Sync Internal Gitlab 2.2.1
2022-11-08 17:05:14 +08:00
xiaying
acb3bb6c62
[Sync] Sync Internal Gitlab 2.2.0
2022-10-30 08:44:24 +08:00
xiaying
aa25623600
Fix bug in remaining-area handling for GridSampler
2022-10-13 11:13:19 +08:00
xiaying
db53f951e6
[Sync] Sync Internal 2.1.2
2022-09-30 10:02:52 +08:00
wtiandong
2768a397bd
Merge remote-tracking branch 'origin/master' into interp3D
2022-09-27 17:13:37 +08:00
wtiandong
9e284352b5
merge interp3d_op_param into interp_op_param
2022-09-27 17:07:23 +08:00
jxt1234
a5ed4942e8
Merge pull request #2079 from jokerz0624/accelerate/TensorConvert
...
accelerate MNNPackTranspose
2022-09-27 09:59:54 +08:00
zw22zw22
634474cc85
fix bugs for leakrelu in hiai ir for mnn ir
2022-09-27 07:53:27 +08:00
jokerz0624
71d0975d37
improvement(TensorConvert): accelerate MNNPackTranspose with SIMD when channel=3
2022-09-24 16:49:20 +08:00
zw22zw22
299525c7c7
fix binaryOP bugs
...
fix compiling error for hiai
2022-09-23 16:05:08 +08:00
jxt1234
f001a65c81
Merge pull request #2051 from DaydreamCoding/patch-12
...
Fix ThreadPool behavior
2022-09-19 10:47:41 +08:00
jokerz0624
62d529379b
feat(iOS/macOS): add Apple A16 and M2 support in CPU family
2022-09-18 12:58:17 +08:00
DaydreamCoding
d5e08a913f
Fix ThreadPool behavior: Pipeline::encode may cause an ANR when another session calls ThreadPool::active()
...
Pipeline::encode -> GeometryComputerUtils::shapeComputeAndGeometryTransform -> onExecute
2022-09-08 14:37:31 +08:00
wtiandong
71aae927ff
Add Interp3D Support
...
1. add PyTorch interpolation 3D to Onnx to MNN converter
2. add interpolation3D nearest CPU/OpenCL implementation
all added OPs are verified
update opencl_program.cc
2022-09-07 15:56:08 +08:00
jxt1234
41db47f2c6
Merge pull request #2038 from DaydreamCoding/patch-11
...
Fix VecHalf.hpp not including the NEON header
2022-09-05 10:33:40 +08:00
jxt1234
5fbfa5a1d7
Merge pull request #2032 from MambaWong/master
...
Fix compile errors caused by "#define MNN_THREAD_LOCK_CPU"
2022-09-05 10:29:58 +08:00
DaydreamCoding
206f73eda7
Update VecHalf.hpp
...
fix header include
2022-09-01 15:33:13 +08:00
xiaying
fafafef5c5
[MNN:Sync] Sync Internal 2.1.0
2022-08-31 20:11:16 +08:00
jason_w
282e771445
Fix compile errors caused by "#define MNN_THREAD_LOCK_CPU"
...
error: sort is not a member of std
error: gettid was not declared in this scope
error: __NR_sched_setaffinity was not declared in this scope
error: syscall was not declared in this scope
2022-08-25 12:25:30 +08:00
zhaode.wzd
76b8ace520
Sync Internal 2.0.5
2022-08-23 21:21:29 +08:00
xiaying
68708c5d66
Sync Internal 2.0.4
2022-08-12 10:30:48 +08:00
xiaying
8330da263a
[Sync] Sync internal 2.0.3
2022-07-22 09:59:30 +08:00
xiaying
eb51926f84
[MNN:Sync] Sync internal Gitlab to 2.0.2
2022-07-19 13:52:07 +08:00
xiaying
3cf5126828
Fix compute error for fp16 convolution dw
2022-07-15 12:47:48 +08:00
xiaying
8e0f544ea6
Fix remainder handling bug for AVX512-Int8
2022-07-15 12:47:48 +08:00
xiaying
89288cc509
Update README.md, fix CPU Runtime compile bug for Android - armv8.2
2022-07-12 12:43:06 +08:00
jxt1234
7102b4890b
Merge pull request #1821 from snadampal/aarch64_linux_fp16
...
backend: cpu: runtime: linux aarch64 hwcaps setting for ARMV82
2022-07-12 11:32:18 +08:00
Brian Li
2801621ea5
Add missing headers
2022-07-12 01:28:27 +08:00
xiaying
2ec9495719
[MNN:Sync] Sync 2.0.
2022-07-11 10:56:37 +08:00
hebin
4679f848c4
fix windows compile on avx512
2022-06-29 16:43:49 +08:00
雁行
6cf30db8f6
Optimize DepthToSpace from raster to CoreML execution.
2022-06-29 14:59:46 +08:00
xiaying
c02c8cc145
Fix compile bug for ios simulator for m1
2022-06-28 14:10:51 +08:00
xiaying
2d13d6a495
2022-06-27 10:52:11 +08:00
xiaying
d3ffdf4229
[MNN:Sync] Sync internal gitlab
2022-06-24 18:30:05 +08:00
xiaying
aeaac3fde3
[MNN:Sync] Sync internal gitlab
2022-06-10 10:39:50 +08:00
xiaying
c5592d284b
Fix bug where the buffer was not allocated large enough
2022-06-09 15:57:17 +08:00