Commit Graph

88 Commits

Author SHA1 Message Date
xiaying 56255c7d84 [MNN:Bugfix] Bugfix for quan x86 2021-06-24 14:06:10 +08:00
tianhang.yth 4eb1096b9c fix alpha div by zero bug and arm server compile bug 2021-06-24 10:38:55 +08:00
Joker 4184860ae4 feat(arm82): add GridSample op support in arm82 backend 2021-06-23 14:10:31 +08:00
xiaying f6422c315c [MNN:Bugfix] Fix bug for ConvInt8TiledExecutor onClone 2021-06-16 16:20:42 +08:00
xiaying 8d9f86bc4a fix compile bug for c++ < 14 2021-06-16 15:24:46 +08:00
hush-alibaba 58545d6ca1 Synchronize internal github for version 1.2.0 (#1518) 2021-06-11 17:17:13 +08:00
xiaying b62c2eb687 [BF16:Bugfix] Fix compile bug for BF16 in NO SSE and NO NEON 2021-04-21 15:54:01 +08:00
xiaying 3c4ba7c595 [MNN:Sync] Sync internal gitlab 2021-04-16 14:50:43 +08:00
弗人 95bcb842a0 [PATCH 08/36] [Train:Feature:Bugfix] train quant support full quant 2021-04-16 14:29:36 +08:00
jason_w 29a7128bdb Remove unnecessary '+' 2021-04-13 08:12:38 +08:00
xiaying d91fc63976 [MNN:Sync] Sync internal Gitlab 2021-04-08 15:34:23 +08:00
xiaying 5e127496fc Sync Internal Github 2021-02-07 10:47:03 +08:00
Joker 3a121b71ff improvement(arm82): accelerate data format converting between NC8HW8 and NCHW in armv82 2021-01-30 16:06:59 +08:00
xiaying 7f94e02410 [PATCH 04/19] [Converter:Bugfix] Support group convolution for PB 2021-01-08 14:16:24 +08:00
xiaying 2d1b129121 [MNN:Sync] Sync internal git 2021-01-06 16:29:37 +08:00
xiaying aedc8f6a68 [PATCH 341/350] Add avx512 patch 2021-01-06 15:57:22 +08:00
xiaying 0fe2b0dfee [PATCH 278/350] [MNN:Speed] Support OneDNN for MNN Convolution 2021-01-06 15:57:17 +08:00
houjiang e3262eac4c [PATCH 243/350] [MNN::Refine] Refine code style. 2021-01-06 15:57:14 +08:00
houjiang d7d7ece8f4 [PATCH 214/350] [MNN::Refine] Rearrange weights for 1x1 and generic convolution. 2021-01-06 15:57:12 +08:00
houjiang 5cb6071469 [PATCH 203/350] [MNN::Feature] Rearrange weights. 2021-01-06 15:57:11 +08:00
弗人 74fd76e72c [PATCH 201/350] [MNN:Feature] add support for sparse+convint8 2021-01-06 15:57:11 +08:00
xiaying 57d67dde2d [PATCH 191/350] Speed up Int8Tofloat for x86-sse 2021-01-06 15:57:10 +08:00
xiaying a131c8cd26 [PATCH 188/350] [MNN:Speed] Optimize ConvInt8 for remain case 2021-01-06 15:57:09 +08:00
xiaying a943d45796 [PATCH 187/350] Float2Int8 opt for x86-sse 2021-01-06 15:57:09 +08:00
Hui Shu ab711d484c Synchronize internal master to Github 2020-12-15 14:12:35 +08:00
xiaying 703697d720 [PATCH 61/78] [MNN:Refractor] move unuseful code to backupcode 2020-11-25 18:57:55 +08:00
xiaying 1bd8d27131 [PATCH 25/78] [MNN:Speed] Add asm for avx int8 2020-11-25 18:57:52 +08:00
Hui Shu d6795ad031 Github release 1.1.0 2020-11-05 16:49:17 +08:00
Evgeny Proydakov 06db7ab189 Fixed sevaral clang warnings for ios build. (-Wshadow, -Wunused-variable) 2020-09-30 14:46:48 +03:00
Evgeny Proydakov a3998d638b Fixed several compile warnins in .cpp (-Wunused-variable) 2020-09-25 14:17:51 +03:00
xiaying 7c0b04cb01 [MNN:Sync] Sync internal github 2020-07-23 10:35:12 +08:00
xiaying 4ddb4408eb Reduce memory alloc / release in ConvolutionTiledExecutor 2020-07-23 10:26:24 +08:00
xiaying 26842dc60f [MNN:Sync] Delete unuseful code: Convolution3x3 2020-07-04 01:28:31 +08:00
xiaying 255db932eb [MNN:Sync] Sync Internal Github 2020-07-04 01:21:30 +08:00
xiaying 49ca95571d Revert "[MNN:Speed] Optmize winograd convolution". This reverts commit 9e34b9a856ccf9d2a81bc9387a1c7dfbc6a12e5d. 2020-07-04 01:06:21 +08:00
xiaying e708cff674 Optmize winograd convolution 2020-07-04 01:06:20 +08:00
xiaying dfe1d06c08 Support multi-thread for 1x1 convolution 2020-07-04 01:06:19 +08:00
xiaying c7051d367c Temply forbid not im2col case for 1x1 conv 2020-07-04 01:06:19 +08:00
xiaying 93ea95ff30 Add MNNUnPackC4ForMatMul_C 2020-07-04 01:06:19 +08:00
xiaying a38a551993 Add MNNPackC4ForMatMul_A 2020-07-04 01:06:18 +08:00
xiaying a750fe0956 Rename _AVX_MNNGemm16x6 as _AVX_MNNPackedMatMul 2020-07-04 01:06:18 +08:00
xiaying 77f44dc1af Rename NHWC<->NC4HW4 as pack/unpack transpose 2020-07-04 01:06:18 +08:00
xiaying d13f1bc0b6 Optimize Strassen Merge C Function for x86 2020-07-04 01:06:18 +08:00
xiaying 1567e74e40 Optmize NHWCToNC4HW4 and NC4HW4ToNHWC 2020-07-04 01:06:18 +08:00
xiaying d88cde6237 Use strassen for Convolution1x1Strassen 2020-07-04 01:06:18 +08:00
xiaying ae91cab1b8 Support Strassen for new matmul 2020-07-04 01:06:18 +08:00
xiaying ac15e9fcec Support get pack mode for each platform 2020-07-04 01:06:18 +08:00
xiaying 5fc7acd37e support transpose, fix bug for not align 2020-07-04 01:06:18 +08:00
xiaying 3dc9cbb740 Optmize CPUMatMul for x86 avx256 by 16x6; 540, 320, 540 from 3.8 ms -> 2.5 ms; 1024, 1024, 1024 from 39 ms -> 31 ms 2020-07-04 01:06:18 +08:00
Interfish af807b050b Update: fix release buffer 2020-06-14 17:11:37 +08:00