xiaying
2a0427775e
Fix bug for Expr16
2022-06-07 11:29:05 +08:00
xiaying
f093e0d170
[Bugfix] fix bug for Arm82Interp two quote
2022-05-30 17:54:34 +08:00
xiaying
c95df2a932
Fix compile bug for arm82-armv7a
2022-05-30 17:24:20 +08:00
xiaying
d98e274ab1
fix compile bug for build mini
2022-05-30 17:24:20 +08:00
Yulv-git
8a43ea2011
Fix some typos in source/.
2022-05-27 23:46:44 +08:00
xiaying
1f0c6d4f21
Fix bug for metal use float binary instead of int binary
2022-05-18 14:31:04 +08:00
tianbu.xsw
5bb30c3c93
opencl bugfix for binary and geometryGather
2022-05-18 14:31:04 +08:00
xiaying
30c66fc79c
Fix bug for Metal backend copy crash, reduce memory alloc when freq resize
2022-05-18 14:31:04 +08:00
xiaying
44f0fa62be
Fix compile bug for windows compile
2022-05-13 14:02:37 +08:00
jxt1234
1311bc9255
Merge pull request #1843 from jokerz0624/arm82_support
...
feat(ROI): add ARM82/AVX support for ROIPooling/ROIAlign
2022-05-06 20:00:59 +08:00
jxt1234
6487dd8a10
Merge pull request #1892 from xthan/patch-1
...
Fix bug in MNNSigmoidLowp
2022-05-06 19:56:38 +08:00
xiaying
c0aee19d32
[Sync] Sync internal gitlab
2022-05-06 19:51:20 +08:00
Xintong Han
35b1ef182c
Fix bug in MNNSigmoidLowp
...
When dataSize is not a multiple of 4, the calculation is wrong as it does not move the dst address after the for loop.
2022-03-28 13:23:01 +08:00
xiaying
0c718e552b
[Sync] Sync internal Gitlab
2022-02-18 11:30:27 +08:00
Joker
02a1565bbb
feat(ROI): add ARM82/AVX support for ROIPooling/ROIAlign
2022-01-29 18:03:25 +08:00
xiaying
1b626d72c3
[MNN:Sync] Sync internal gitlab
2022-01-04 10:50:40 +08:00
xiaying
a2e1ed4c67
[MNN:Bugfix] Fix compile bug for Arm82 - armv7a
2021-12-13 11:20:49 +08:00
xiaying
b3c5feefdb
[Converter:Bugfix] Support Onnx::TopK for dynamic shape
2021-12-10 15:16:28 +08:00
tianhang.yth
a14ef5e265
update MetalLib.h for low version macos
2021-12-01 12:13:50 +08:00
xiaying
69dba73dc7
[MNN:Sync] Sync internal gitlab
...
Main Feature:
1. Add OpenCV API and Numpy API Support
2. Protobuf move into MNN
3. Add more op for torchscript convert
4. Add recompute to speed up geometry compute
5. Add ModuleBasic Test
2021-11-30 10:10:53 +08:00
xiaying
71cd04e91c
Fix compile bug for sse fma
2021-11-19 10:23:50 +08:00
aaron-wu
f995ca6a8f
fix(op): replace _mm_load_ps and _mm_store_ps with _mm_loadu_ps and _mm_storeu_ps to avoid segmentation faults on unaligned addresses
2021-11-16 16:07:50 +08:00
aaron-wu
e35ea54638
feat(op): Add SSE instruction set optimization for ROIAlign and ROIPooling op
2021-11-15 14:53:12 +08:00
xiaying
95402e79b4
[MNN:Bugfix] Fix Compile bug for other backends
2021-11-12 17:49:50 +08:00
jxt1234
e86c0ba30a
Merge pull request #1746 from no5-aaron-wu/dev_aaron_wu
...
add CPUROIAlign op and unit-test and so on
2021-11-12 17:13:04 +08:00
xiaying
361bbc90d5
Fix bug in DenseConvolutionTiledExecutor opt, which did not handle width = 1 with kernel X > 1 and padX > 0
2021-11-12 09:56:59 +08:00
xiaying
0bcc70922d
[MNN:Bugfix] Fix compile bug for gnu of arm82 /bf16
2021-11-10 17:52:30 +08:00
aaron-wu
074bf5e275
fix(op): add assert to var samplingRatioW and samplingRatioH
2021-11-09 11:20:22 +08:00
aaron-wu
8e773602bf
fix(schema): merge parameters for RoiPooling and RoiAlign into one table as RoiParameters
2021-11-09 11:11:27 +08:00
aaron-wu
7afb6abd1b
fix(op): precalculate pos and area, which are shared by all channels; add defensive programming for boundary cases
2021-11-09 10:00:51 +08:00
aaron-wu
094d5697ae
feat(op): add neon implementation of CPUROIAlign op
2021-11-09 10:00:51 +08:00
aaron-wu
1af7d6f4d1
fix(op): fix compile error in linux system
2021-11-09 10:00:50 +08:00
aaron-wu
cfac71f919
feat(op): add CPUROIAlign op and unit test
2021-11-09 10:00:50 +08:00
xiaying
75413768b0
Fix bug for onResize of CPURNNSequenceGRU
2021-11-04 12:55:59 +08:00
xiaying
2fdd11e718
[MNN:Bugfix] Use fabsf instead of abs
2021-11-02 12:06:10 +08:00
jxt1234
0b69ba78d2
Merge pull request #1739 from jun-lv-17/fix-depthwiseconvint8-issue
...
Fix conv1d depthwise conv int8 calculation issue.
2021-11-02 11:39:32 +08:00
xiaying
b1d923e76c
Fix compile bug for bf16 when sse / neon is disabled
2021-11-02 11:34:14 +08:00
xiaying
ed8a2da0b4
[MNN:Bugfix] Fix bug for CPURaster for fuse singleConvert of dim == 3
2021-11-02 10:56:35 +08:00
xiaying
0fdb9d768f
Add Clamp for fp32 -> fp16
2021-11-01 14:25:34 +08:00
aaron-wu
9acad284fa
fix(op): increase compatibility of NCHW format for inputs[1](rois) in CPUROIPooling op
2021-10-30 15:18:38 +08:00
jun.lv
0b299e951c
Fix conv1d depthwise conv int8 calculation issue.
2021-10-29 18:58:58 +08:00
jxt1234
e121c1527a
Merge pull request #1718 from jokerz0624/acc/GridSample
...
improvement(GridSample): handle areaRemain better in Arm82
2021-10-25 10:56:14 +08:00
jxt1234
70cd0c5b27
Merge pull request #1724 from DaydreamCoding/patch-10
...
fix memory leak
2021-10-25 10:55:21 +08:00
Joker
af9c543115
improvement(ConvWino): use fma to accelerate computation
2021-10-22 14:24:29 +08:00
xiaying
da3688119d
[MNN:Bugfix] Fix shape compute and content bug for batch > 1's rnngru
2021-10-20 11:56:16 +08:00
DaydreamCoding
0ec11813f0
fix memory leak
2021-10-15 13:39:20 +08:00
xiaying
7f50ae689d
Fix zero shape bug for TensorArray
2021-10-14 15:00:01 +08:00
Joker
6f8dafdd5b
improvement(GridSample): handle areaRemain better in Arm82
2021-10-12 12:15:31 +08:00
jxt1234
1b2e168d6e
Merge pull request #1678 from jokerz0624/acc/GridSample
...
improvement(GridSample): accelerate GridSample in CPU/Arm82/AVX2/AVX512
2021-10-08 19:34:24 +08:00
jxt1234
f5101c9b2b
Merge pull request #1712 from jun-lv-17/o-master
...
Fix CPUEltwiseInt8Add calculation issue.
2021-10-08 19:33:28 +08:00