599 Commits

Author SHA1 Message Date
xiaying 2a0427775e Fix bug for Expr16 2022-06-07 11:29:05 +08:00
xiaying f093e0d170 [Bugfix] fix bug for Arm82Interp two quote 2022-05-30 17:54:34 +08:00
xiaying c95df2a932 Fix compile bug for arm82-armv7a 2022-05-30 17:24:20 +08:00
xiaying d98e274ab1 fix compile bug for build mini 2022-05-30 17:24:20 +08:00
Yulv-git 8a43ea2011 Fix some typos in source/. 2022-05-27 23:46:44 +08:00
xiaying 1f0c6d4f21 Fix bug for metal use float binary instead of int binary 2022-05-18 14:31:04 +08:00
tianbu.xsw 5bb30c3c93 opencl bugfix for binary and geometryGather 2022-05-18 14:31:04 +08:00
xiaying 30c66fc79c Fix bug for Metal backend copy crash, reduce memory alloc when freq resize 2022-05-18 14:31:04 +08:00
xiaying 44f0fa62be Fix compile bug for windows compile 2022-05-13 14:02:37 +08:00
jxt1234 1311bc9255
Merge pull request #1843 from jokerz0624/arm82_support
feat(ROI): add ARM82/AVX support for ROIPooling/ROIAlign
2022-05-06 20:00:59 +08:00
jxt1234 6487dd8a10
Merge pull request #1892 from xthan/patch-1
Fix bug in MNNSigmoidLowp
2022-05-06 19:56:38 +08:00
xiaying c0aee19d32 [Sync] Sync internal gitlab 2022-05-06 19:51:20 +08:00
Xintong Han 35b1ef182c
Fix bug in MNNSigmoidLowp
When dataSize is not a multiple of 4, the calculation is wrong as it does not move the dst address after the for loop.
2022-03-28 13:23:01 +08:00
xiaying 0c718e552b [Sync] Sync internal Gitlab 2022-02-18 11:30:27 +08:00
Joker 02a1565bbb feat(ROI): add ARM82/AVX support for ROIPooling/ROIAlign 2022-01-29 18:03:25 +08:00
xiaying 1b626d72c3 [MNN:Sync] Sync internal gitlab 2022-01-04 10:50:40 +08:00
xiaying a2e1ed4c67 [MNN:Bugfix] Fix compile bug for Arm82 - armv7a 2021-12-13 11:20:49 +08:00
xiaying b3c5feefdb [Converter:Bugfix] Support Onnx::TopK for dynamic shape 2021-12-10 15:16:28 +08:00
tianhang.yth a14ef5e265 update MetalLib.h for low version macos 2021-12-01 12:13:50 +08:00
xiaying 69dba73dc7 [MNN:Sync] Sync internal gitlab
Main Feature:
1. Add OpenCV API and Numpy API Support
2. Protobuf move into MNN
3. Add more op for torchscript convert
4. Add recompute to speed up geometry compute
5. Add ModuleBasic Test
2021-11-30 10:10:53 +08:00
xiaying 71cd04e91c Fix compile bug for sse fma 2021-11-19 10:23:50 +08:00
aaron-wu f995ca6a8f fix(op): replace the _mm_load_ps and _mm_store_ps with _mm_loadu_ps and _mm_storeu_ps, to avoid segmentation faults when not aligned 2021-11-16 16:07:50 +08:00
aaron-wu e35ea54638 feat(op): Add SSE instruction set optimization for ROIAlign and ROIPooling op 2021-11-15 14:53:12 +08:00
xiaying 95402e79b4 [MNN:Bugfix] Fix Compile bug for other backends 2021-11-12 17:49:50 +08:00
jxt1234 e86c0ba30a
Merge pull request #1746 from no5-aaron-wu/dev_aaron_wu
add CPUROIAlign op and unit-test and so on
2021-11-12 17:13:04 +08:00
xiaying 361bbc90d5 Fix bug for DenseConvolutionTiledExecutor opt not care width = 1, but kernel X >1 and padX > 0 2021-11-12 09:56:59 +08:00
xiaying 0bcc70922d [MNN:Bugfix] Fix compile bug for gnu of arm82 /bf16 2021-11-10 17:52:30 +08:00
aaron-wu 074bf5e275 fix(op): add assert to var samplingRatioW and samplingRatioH 2021-11-09 11:20:22 +08:00
aaron-wu 8e773602bf fix(schema): merge parameters for RoiPooling and RoiAlign into one table as RoiParameters 2021-11-09 11:11:27 +08:00
aaron-wu 7afb6abd1b fix(op): precalculate pos and area which are shared by all channels; add defensive programming for boundary case 2021-11-09 10:00:51 +08:00
aaron-wu 094d5697ae feat(op): add neon realization of CPUROIAlign op 2021-11-09 10:00:51 +08:00
aaron-wu 1af7d6f4d1 fix(op): fix compile error in linux system 2021-11-09 10:00:50 +08:00
aaron-wu cfac71f919 feat(op): add CPUROIAlign op and unit test 2021-11-09 10:00:50 +08:00
xiaying 75413768b0 Fix bug for onResize of CPURNNSequenceGRU 2021-11-04 12:55:59 +08:00
xiaying 2fdd11e718 [MNN:Bugfix] Use fabsf instead of abs 2021-11-02 12:06:10 +08:00
jxt1234 0b69ba78d2
Merge pull request #1739 from jun-lv-17/fix-depthwiseconvint8-issue
Fix conv1d depthwise conv int8 calculation issue.
2021-11-02 11:39:32 +08:00
xiaying b1d923e76c Fix compile bug for bf16 when sse / neon is close 2021-11-02 11:34:14 +08:00
xiaying ed8a2da0b4 [MNN:Bugfix] Fix bug for CPURaster for fuse singleConvert of dim == 3 2021-11-02 10:56:35 +08:00
xiaying 0fdb9d768f Add Clamp for fp32 -> fp16 2021-11-01 14:25:34 +08:00
aaron-wu 9acad284fa fix(op): increase compatibility of NCHW format for inputs[1](rois) in CPUROIPooling op 2021-10-30 15:18:38 +08:00
jun.lv 0b299e951c Fix conv1d depthwise conv int8 calculation issue. 2021-10-29 18:58:58 +08:00
jxt1234 e121c1527a
Merge pull request #1718 from jokerz0624/acc/GridSample
improvement(GridSample): give areaRemain one better handle in Arm82
2021-10-25 10:56:14 +08:00
jxt1234 70cd0c5b27
Merge pull request #1724 from DaydreamCoding/patch-10
fix memory leak
2021-10-25 10:55:21 +08:00
Joker af9c543115 improvement(ConvWino): use fma to accelerate computation 2021-10-22 14:24:29 +08:00
xiaying da3688119d [MNN:Bugfix] Fix shape compute and content bug for batch > 1's rnngru 2021-10-20 11:56:16 +08:00
DaydreamCoding 0ec11813f0 fix memory leak 2021-10-15 13:39:20 +08:00
xiaying 7f50ae689d Fix zero shape bug for TensorArray 2021-10-14 15:00:01 +08:00
Joker 6f8dafdd5b improvement(GridSample): give areaRemain one better handle in Arm82 2021-10-12 12:15:31 +08:00
jxt1234 1b2e168d6e
Merge pull request #1678 from jokerz0624/acc/GridSample
improvement(GridSample): accelerate GridSample in CPU/Arm82/AVX2/AVX512
2021-10-08 19:34:24 +08:00
jxt1234 f5101c9b2b
Merge pull request #1712 from jun-lv-17/o-master
Fix CPUEltwiseInt8Add calculation issue.
2021-10-08 19:33:28 +08:00