root/MNN

Commit Graph

Author	SHA1	Message	Date
xiaying	8330da263a	[Sync] Sync internal 2.0.3	2022-07-22 09:59:30 +08:00
xiaying	eb51926f84	[MNN:Sync] Sync internal Gitlab to 2.0.2	2022-07-19 13:52:07 +08:00
xiaying	3cf5126828	Fix compute error for fp16 convolution dw	2022-07-15 12:47:48 +08:00
xiaying	8e0f544ea6	Fix Remain treat bug for AVX512-Int8	2022-07-15 12:47:48 +08:00
xiaying	89288cc509	Update README.md, fix CPU Runtime compile bug for Android - armv8.2	2022-07-12 12:43:06 +08:00
jxt1234	7102b4890b	Merge pull request #1821 from snadampal/aarch64_linux_fp16 backend: cpu: runtime: linux aarch64 hwcaps setting for ARMV82	2022-07-12 11:32:18 +08:00
Brian Li	2801621ea5	Add missing headers	2022-07-12 01:28:27 +08:00
xiaying	2ec9495719	[MNN:Sync] Sync 2.0.	2022-07-11 10:56:37 +08:00
hebin	4679f848c4	fix windows compile on avx512	2022-06-29 16:43:49 +08:00
雁行	6cf30db8f6	Opt DepthToSapce from raster to coreml execution.	2022-06-29 14:59:46 +08:00
xiaying	c02c8cc145	Fix compile bug for ios simulator for m1	2022-06-28 14:10:51 +08:00
xiaying	2d13d6a495		2022-06-27 10:52:11 +08:00
xiaying	d3ffdf4229	[MNN:Sync] Sync internal gitlab	2022-06-24 18:30:05 +08:00
xiaying	aeaac3fde3	[MNN:Sync] Sync internal gitlab	2022-06-10 10:39:50 +08:00
xiaying	c5592d284b	Fix bug for buffer not alloc enough	2022-06-09 15:57:17 +08:00
xiaying	eac58ab5f5	Fix bug for expanddim < -1	2022-06-08 15:39:04 +08:00
Sunita Nadampalli	aa7dc95047	backend: cpu: runtime: linux aarch64 hwcaps setting for ARMV82	2022-06-07 17:33:27 +00:00
xiaying	2a0427775e	Fix bug for Expr16	2022-06-07 11:29:05 +08:00
xiaying	f093e0d170	[Bugfix] fix bug for Arm82Interp two quote	2022-05-30 17:54:34 +08:00
xiaying	c95df2a932	Fix compile bug for arm82-armv7a	2022-05-30 17:24:20 +08:00
xiaying	d98e274ab1	fix compile bug for build mini	2022-05-30 17:24:20 +08:00
Yulv-git	8a43ea2011	Fix some typos in source/.	2022-05-27 23:46:44 +08:00
xiaying	1f0c6d4f21	Fix bug for metal use float binary intead of int binary	2022-05-18 14:31:04 +08:00
tianbu.xsw	5bb30c3c93	opencl bugfix for binary and geometryGather	2022-05-18 14:31:04 +08:00
xiaying	30c66fc79c	Fix bug for Metal backend copy crash, reduce memory alloc when freq resize	2022-05-18 14:31:04 +08:00
xiaying	44f0fa62be	Fix compile bug for windows compile	2022-05-13 14:02:37 +08:00
jxt1234	1311bc9255	Merge pull request #1843 from jokerz0624/arm82_support feat(ROI): add ARM82/AVX support for ROIPooling/ROIAlign	2022-05-06 20:00:59 +08:00
jxt1234	6487dd8a10	Merge pull request #1892 from xthan/patch-1 Fix bug in MNNSigmoidLowp	2022-05-06 19:56:38 +08:00
xiaying	c0aee19d32	[Sync] Sync internal gitlab	2022-05-06 19:51:20 +08:00
Xintong Han	35b1ef182c	Fix bug in MNNSigmoidLowp When dataSize is not a multiple of 4, the calculation is wrong as it does not move the dst address after the for loop.	2022-03-28 13:23:01 +08:00
xiaying	0c718e552b	[Sync] Sync internal Gitlab	2022-02-18 11:30:27 +08:00
Joker	02a1565bbb	feat(ROI): add ARM82/AVX support for ROIPooling/ROIAlign	2022-01-29 18:03:25 +08:00
xiaying	1b626d72c3	[MNN:Sync] Sync internal gitlab	2022-01-04 10:50:40 +08:00
xiaying	a2e1ed4c67	[MNN:Bugfix] Fix compile bug for Arm82 - armv7a	2021-12-13 11:20:49 +08:00
xiaying	b3c5feefdb	[Converter:Bugfix] Support Onnx::TopK for dynamic shape	2021-12-10 15:16:28 +08:00
tianhang.yth	a14ef5e265	update MetalLib.h for low version macos	2021-12-01 12:13:50 +08:00
xiaying	69dba73dc7	[MNN:Sync] Sync internal gitlab Main Feature: 1. Add OpenCV API and Numpy API Support 2. Protobuf move into MNN 3. Add more op for torchscript convert 4. Add recompute to speed up geometry compute 5. Add ModuleBasic Test	2021-11-30 10:10:53 +08:00
xiaying	71cd04e91c	Fix compile bug for sse fma	2021-11-19 10:23:50 +08:00
aaron-wu	f995ca6a8f	fix(op): replace the _mm_load_ps and _mm_store_ps with _mm_loadu_ps and _mm_storeu_ps, to avoid segment errors when not aligned	2021-11-16 16:07:50 +08:00
aaron-wu	e35ea54638	feat(op): Add SSE instruction set optimization for ROIAlign and ROIPooling op	2021-11-15 14:53:12 +08:00
xiaying	95402e79b4	[MNN:Bugfix] Fix Compile bug for other backends	2021-11-12 17:49:50 +08:00
jxt1234	e86c0ba30a	Merge pull request #1746 from no5-aaron-wu/dev_aaron_wu add CPUROIAlign op and unit-test and so on	2021-11-12 17:13:04 +08:00
xiaying	361bbc90d5	Fix bug for DenseConvolutionTiledExecutor opt not care width = 1, but kernel X >1 and padX > 0	2021-11-12 09:56:59 +08:00
xiaying	0bcc70922d	[MNN:Bugfix] Fix compile bug for gnu of arm82 /bf16	2021-11-10 17:52:30 +08:00
aaron-wu	074bf5e275	fix(op): add assert to var samplingRatioW and samplingRatioH	2021-11-09 11:20:22 +08:00
aaron-wu	8e773602bf	fix(schema): merge parameters for RoiPooling and RoiAlign into one table as RoiParameters	2021-11-09 11:11:27 +08:00
aaron-wu	7afb6abd1b	fix(op): precalculate pos and area which shared by all channels; add defense programming for boundary case	2021-11-09 10:00:51 +08:00
aaron-wu	094d5697ae	feat(op): add neon realization of CPUROIAlign op	2021-11-09 10:00:51 +08:00
aaron-wu	1af7d6f4d1	fix(op): fix compile error in linux system	2021-11-09 10:00:50 +08:00
aaron-wu	cfac71f919	feat(op): add CPUROIAlign op and uint test	2021-11-09 10:00:50 +08:00
xiaying	75413768b0	Fix bug for onResize of CPURNNSequenceGRU	2021-11-04 12:55:59 +08:00
xiaying	2fdd11e718	[MNN:Bugfix] Use fabsf instead of abs	2021-11-02 12:06:10 +08:00
jxt1234	0b69ba78d2	Merge pull request #1739 from jun-lv-17/fix-depthwiseconvint8-issue Fix conv1d depthwise conv int8 calculation issue.	2021-11-02 11:39:32 +08:00
xiaying	b1d923e76c	Fix compile bug for bf16 when sse / neon is close	2021-11-02 11:34:14 +08:00
xiaying	ed8a2da0b4	[MNN:Bugfix] Fix bug for CPURaster for fuse singleConvert of dim == 3	2021-11-02 10:56:35 +08:00
xiaying	0fdb9d768f	Add Clamp for fp32 -> fp16	2021-11-01 14:25:34 +08:00
aaron-wu	9acad284fa	fix(op): increase compatibility of NCHW format for inputs[1](rois) in CPUROIPooling op	2021-10-30 15:18:38 +08:00
jun.lv	0b299e951c	Fix conv1d depthwise conv int8 calculation issue.	2021-10-29 18:58:58 +08:00
jxt1234	e121c1527a	Merge pull request #1718 from jokerz0624/acc/GridSample improvement(GridSample): give areaRemain one better handle in Arm82	2021-10-25 10:56:14 +08:00
jxt1234	70cd0c5b27	Merge pull request #1724 from DaydreamCoding/patch-10 fix memory leak	2021-10-25 10:55:21 +08:00
Joker	af9c543115	improvement(ConvWino): use fma to accelerate computation	2021-10-22 14:24:29 +08:00
xiaying	da3688119d	[MNN:Bugfix] Fix shape compute and content bug for batch > 1's rnngru	2021-10-20 11:56:16 +08:00
DaydreamCoding	0ec11813f0	fix memory leak	2021-10-15 13:39:20 +08:00
xiaying	7f50ae689d	Fix zero shape bug for TensorArray	2021-10-14 15:00:01 +08:00
Joker	6f8dafdd5b	improvement(GridSample): give areaRemain one better handle in Arm82	2021-10-12 12:15:31 +08:00
jxt1234	1b2e168d6e	Merge pull request #1678 from jokerz0624/acc/GridSample improvement(GridSample): accelerate GridSample in CPU/Arm82/AVX2/AVX512	2021-10-08 19:34:24 +08:00
jxt1234	f5101c9b2b	Merge pull request #1712 from jun-lv-17/o-master Fix CPUEltwiseInt8Add calculation issue.	2021-10-08 19:33:28 +08:00
jxt1234	f3bf2e2a3f	Merge pull request #1713 from jokerz0624/feat/support_A15 feat(ios): add Apple A15 support in CPU family	2021-10-08 19:32:54 +08:00
Joker	5feeb1beb9	feat(ios): add Apple A15 support in CPU family	2021-09-30 20:34:08 +08:00
jun.lv	a5fa7eb446	Fix CPUEltwiseInt8Add calculation issue.	2021-09-30 14:58:01 +08:00
Joker	640993bc56	improvement(GridSample): give areaRemain one better handle in AVX2/AVX512	2021-09-30 11:01:23 +08:00
tianbu.xsw	04bb665a00	fix raster op type print error	2021-09-28 11:35:24 +08:00
tianbu.xsw	18ba76bea6	file write bugfix	2021-09-28 11:35:18 +08:00
jxt1234	ec16cae757	Merge pull request #1705 from jokerz0624/fix/AVX512_detecting fix(AVX512): fix detecting AVX512 features on Darwin	2021-09-26 17:17:44 +08:00
xiaying	a867e23543	Fix compile bug for some ndk	2021-09-25 15:23:06 +08:00
xiaying	b550871abb	Fix bug for raw cpu winograd crash	2021-09-24 16:05:38 +08:00
周科	e4f0fd58cc	fix(AVX512): fix detecting AVX512 features on Darwin	2021-09-24 10:06:47 +08:00
Joker	c41503556d	improvement(GridSample): accelerate GridSample in CPU/Arm82/AVX2/AVX512	2021-09-22 16:04:07 +08:00
xiaying	03c7b5347b	[MNN:Sync] Sync internal Gitlab	2021-09-18 15:52:30 +08:00
xiaying	d4d040c57e	Fix bug for single convert for NHWC <-> NC4HW4 don't care stride	2021-09-17 15:22:44 +08:00
jxt1234	bc355e84ca	Merge pull request #1635 from DaydreamCoding/patch-7 fix BF16 x86_64 Apple M1 failed	2021-09-15 14:13:53 +08:00
jxt1234	40526149f2	Merge pull request #1640 from jiuzhuanzhuan/master fix bug of header file missing when build for aarch64 with open ARM82	2021-09-15 14:12:06 +08:00
xiaying	0c26d47a84	[MNN:Bugfix] Fix bug for Squeeze for axis < 0	2021-09-14 21:02:11 +08:00
shufu	8e464d290c	feat(OpenCL):add GridSample support on OpenCL Backend	2021-09-08 14:11:25 +08:00
xiaying	3a0fb480d2	Fix crash bug for origin quan model	2021-09-06 17:18:29 +08:00
xiaying	55ce936d9c	Support scale input for Interp	2021-09-06 17:18:29 +08:00
jxt1234	d21fd2a910	Merge pull request #1650 from jokerz0624/vulkan_GridSample Vulkan GridSample	2021-09-04 07:49:32 +08:00
tianbu.xsw	612199d0ee	quant weight valid range issue	2021-09-03 17:06:24 +08:00
Joker	3461faa390	feat(Vulkan): add GridSample op support in Vulkan backend	2021-09-01 15:00:56 +08:00
xiaying	5dfe97e4c8	Fix bug for dim = 0's shape compute	2021-08-30 13:34:11 +08:00
xiaying	575a2c97dd	Fix bug for CH Fused but W not fused fastblit error	2021-08-25 17:23:36 +08:00
qingzhu	4eda674234	fix bug of header file missing when build for aarch64 with open ARM82	2021-08-24 20:06:46 +08:00
DaydreamCoding	264a6039bb	fix BF16 x86_64 Apple M1 failed	2021-08-19 18:39:51 +08:00
yuyang	f1997b9a5f	[MNN:Bugfix]fix illegal opcode(MV) bug for some GNU compiler	2021-08-18 18:13:00 +08:00
DaydreamCoding	955c213661	MSVC adapt for BF16	2021-08-15 17:17:32 +08:00
jxt1234	6ad2a632fd	Merge pull request #1609 from DaydreamCoding/patch-5 adapt MSVC	2021-08-12 13:00:14 +08:00
jxt1234	b8d8fc9d73	Merge pull request #1612 from MambaWong/master fix libMNN.so: undefined symbol	2021-08-12 12:56:32 +08:00
xiaying	e9d38acd6b	Fix Prelu bug for multi-batch	2021-08-11 10:21:31 +08:00
xiaying	312b003f4c	[MNN:Bugfix] Fix bug for ARM TopKV2 different	2021-08-05 19:41:37 +08:00
jason_w	6aca4ea6b9	fix symbol lookup error: libMNN.so: undefined symbol: option(MNN_SUPPORT_TFLITE_QUAN "Enable MNN's tflite quantized op" OFF)	2021-08-05 11:08:13 +08:00
DaydreamCoding	ca9700ec42	adapt MSVC	2021-08-03 17:29:54 +08:00
xiaying	1a7d0a6173	Optimize for Transpose compute	2021-08-03 15:50:39 +08:00
xiaying	9c5e6e13b5	Fix bug for fuse for OpCommonUtils	2021-08-03 15:50:06 +08:00
xiaying	d8fc15d84b	[MNN:Sync] Sync internal github Commits: 8148ae75c 弗人 bugfix 14cb8ec7f 弗人 [Converter:Bugfix] bugfix for onnx depthwise convtranspose 476fbcd90 雁行 [MNN:Feature] Open AVX cast and bugfix for contentCFG. 5e26b9fd3 雁行 [Test:Feature] Add android test. 37e147b25 雁行 [MNN:Bugfix] Bugfix for floordiv. 144c185f5 tianbu.xsw hangxing fix hiai b4fd429d6 tianbu.xsw updateCacheFile bugfix -- update cache size d4ba572a8 雁行 [MNN:Bugfix] Support int8 in AVX2 and some Bugfix. 43061f07e xiaying [MNN:Bugfix] Fix bug for module mode run part of model 398cc5ab6 tianhang.yth refactor demo 736380600 xiaying [Express:Bugfix] Fix memory leak for copy branch b8dab0a27 tianhang.yth MNNFloat2Int8 sizeQuad=0 crash fix 94b95bfed ghz [BugFix]1.Better method for fast pack valid check 6a921f85e xiaying [Converter:Bugfix] Fix bug for Fuseconsttosubgraph 5f77ae889 tianhang.yth numThread bugfix a807ef879 tianhang.yth add createSession(configs, runtimeinfo) API, add pymnn demo, pymnn logcat bugfix ad05409d3 xiaying [MNN:Bugfix] Fix bug for StaticModule's sizecompute overflow, add error print for module mode 9d81b8299 xiaying [MNN:Bugfix] Fix bug for Unique op for output size = 1 03b15e9af xiaying [Test:Feature] Add MatMulBConst Test, Fix bug for single Convert c944a76ee tianhang.yth add auto backend and getSessionInfo @tianbu 91fa7267b ghz [BugFix]1.fix the error in eP check bf0041f77 ghz [BugFix]1.Fix the logic error in eP check. 2.Fix the sp align error 693871672 雁行 [CPU:Bugfix] rm adrp instruction for clang compiler bug. 1b8f6b3d8 ghz 1.Fix the wronly use of r13 in arm32 version. 2.Fix the missing callee register save and restore process. feb7ecc4c 弗人 modify log of python offline quant 040c04811 ghz [BufFix]1.replace platform-related regs. 2.fix the same problem in arm32 version 609f37db8 弗人 add log for python quant, python convert 5511dd30a ghz [BugFix]1.Add testcases in SparseConv to check all functional code branch. 2. 
Fix the bug in "MNNPackC4ForMatMul_A.S" in arm64, which is caused by the missing check of eReal parameter. a93ff9280 tianhang.yth add tf.Unique op support 9729ff773 allen.lk [Bugfix] Fix one arm32 instruction syntax that clang works but gcc DOES NOT work. use index instruction instead. 297c1ad14 雁行 [Expr:Bugfix] bugfix for tensor content used by shape compute. ef8c369e3 弗人 catch exception 07c2dd670 弗人 add dependence to setup, base64 encode url, add time log 177e590c1 弗人 [Python:Feature] add aliyun log for python quant tool 40a7928cf allen.lk [Debug:Sparse] 1.Add group parameter in torchscript converter. 2. Stop split running to avoid memory corruption when check failed in TransformGroupConvolution 3. fix Op split issue in TransformGroupConvolution 3bdea84a1 allen.lk [Debug:Sparse] Fix and warning one kind of segmentfault cause by memory corruption when resize ConvolutionWinograd. Avoid to use some registers as arm restriction. c3c6fbdbd allen.lk [Debug:Sparse] Fix and warning one kind of segmentfault cause by memory corruption when resize ConvolutionWinograd. Avoid to use some registers as arm restriction. bc590eee4 雁行 [Converter:Bugfix] bugfix for onnx instancenormalization convert. d8918593f tianhang.yth add auto backend and getSessionInfo @tianbu 83a198ed7 杭行 update d0dd3e09b 杭行 update 99540202e xiaying [Converter:Optimize] Opt the tensor convert insert 333d8db82 allen.lk [Debug:Sparse] Fix All platform-register r9 / x18 issue on arm32 and arm64. 
db5994672 杭行 merge 6293de7b8 tianbu.xsw fix pymnn updateCacheFile 5c2e11cb1 tianbu.xsw do updateCache in createSession 6e7641ff4 tianbu.xsw do not limit cacheFile for a model 5287a65e4 tianbu.xsw bugfix 52ba53a91 tianbu.xsw revert pymnn api 60284d830 tianbu.xsw bugfix 6d8077490 tianbu.xsw rename updateCacheFile api params 3cb172710 tianhang.yth updateCacheFile API size default value is 0 c5b69aabf tianbu.xsw updateCacheFile python api fix 5d5da7aa5 tianbu.xsw reflector code 5707877a4 雁行 [MNN:Speed] Speedup for softmax in x86 and arm. 2a211825c tianbu.xsw reflector code for updateCacheFile 76db3a835 tianbu.xsw [Cache Feature]: Add updateCacheFile API for increment cache b06b0fd43 allen.lk [Debug:Sparse] Fix and warning one kind of segmentfault cause by memory corruption when resize ConvolutionWinograd. Avoid to use some registers as arm restriction. e68bfa495 雁行 [Converter:Feature] Add UUID when model convert. a9cb935dc xiaying [MNN:Speed] Support c4nhwc for more fastblit 019f40353 xiaying [Converter:Refractor] Reduce memory used by MNNConvert(bert from 5G -> 1G) d2a6d3d05 xiaying [MNN:Bugfix] Fix bug for identity output not find 604d0801b xiaying [Converter:Bugfix] Fix bug for FuseGeLu 4bada2367 xiaying [MNN:Refractor] SegmentMean rewrite as segment 82070e708 xiaying [MNN:Bugfix] Fix bug for GeometryBinary e8ea4266e xiaying Fix bug for ShapeTensorConvert compute for dim = 1 error 1f1cf1991 xiaying [Tools:Bugfix] Fix system compability for fastTestOnnx 6f422efe2 xiaying [Tools:Bugfix] Remove color for checkDir for easy to dump 968f7ec88 xiaying [MNN:Speed] Support turn broadcast binary to loop 3e7aaf46f xiaying [MNN:Refractor] Set Convolution1x1Strassen support variable input/output ptr 1f65ab163 xiaying [MNN:Bugfix] Fix bug for mini mnn can't convert model d65953d47 xiaying [MNN:Bugfix] Fix bug for armv7a - android-14 + ARM82 8b68be45c xiaying [MNN:Feature] Add segment 8a8f264f5 xiaying [Vulkan:Bugfix] Remove unuseful print 025bb0fda xiaying [Converter:Bugfix] Fix 
bug for oneof don't support 43900251e tianbu.xsw enable setCacheFile python API ebfb05c74 tianbu.xsw [Metal Feature] support metallib obtain from walle transfer task 9665c0a79 弗人 add check for path in json file c66fef224 xiaying [Converter:Bugfix] Fix bug for oneof don't support 42f192852 xiaying [MNN:Bugfix] Fix bug for not set output / saveTensor into origin Schedule's outputs 1b95354ff 雁行 [Feature]: Support shape compute for SetDiff1D, and null input for Prod. 83966d043 xiaying [Test:Feature] Add test for static module 42d1be933 xiaying [Converter:Bugfix] Fix bug for mnn convert and static model add more outputs for origin model 9067531c3 xiaying [Converter:Refractor] formatLicence 99558bed9 xiaying [Converter:Bugfix] Count the op for unuseful and controlflow 4f6da0fa7 allen.lk [Feature:GRUMultiOutput] fix multi output dimension type c6b219bce xiaying [Converter:Feature] Turn torch converter to object dd4e68a37 xiaying [Converter:Feature] Support dump supported ops 80b6a60a3 xiaying [Converter:Info] If has output name, print output name instead of computed 015278fc3 xiaying [MNN:Refractor] Revert IfModule's debug info 23ac967c4 xiaying Don't transform for multi-input convolution/deconvolution b02b0d4de xiaying Fix bug for multi-input for conv1d 254d8b1d4 xiaying Fix bug for Conv1dSqueezeMove for multi input convolution 1d d47d0b9ca xiaying Fix bug for CPURaster's fuse nc4hw4 357c5bd33 xiaying Fix ConvBiasAdd for conv's inputs op > 1 55b1f0c9c xiaying [Converter:Bugfix] Don't transform for multi-input convolution/deconvolution 1902a30f5 xiaying [Converter:Bugfix] Fix bug for Conv1dSqueezeMove for multi input convolution 1d c23fe617b xiaying [MNN:Bugfix] Fix bug for multi-input for conv1d 8ff018426 xiaying [MNN:Bugfix] Fix bug for CPURaster's fuse nc4hw4 d4e8cd602 xiaying [Converter:Bugfix] Fix ConvBiasAdd for conv's inputs op > 1 846266b42 tianbu.xsw return when program and tune both nullptr fd67c76a9 xiaying [Converter:Bugfix] DepthwiseConvWeightMerge only valid 
for tflite e77a242c4 xiaying [Converter:Feature] Support tflite's half pixel be054c377 tianbu.xsw [OpenCL Bugfix] do not rewrite cache when binary program is produced 51e65aa35 xiaying [Converter:Feature] Support tflite for fp16 and multi-input convolution 1ccdfdeb5 tianbu.xsw redefine svm macro name 31234d372 tianbu.xsw [OpenCL SVM] add macro for only use wrapper d739e35da xiaying [MNN:Bugfix] Fix compile bug for grid op 24ab13c79 Joker feat(arm82): add GridSample op support in arm82 backend, AVX(by xiaying) 7b142978e xiaying [AVX512:Speed] Optimize for e <= 8 5f6febe7b tianbu.xsw code refactor 998d91b57 xiaying [Express:Speed] Merge submodule for speed 22c89146f tianhang.yth fix alpha div by zero bug and arm server compile bug 8f829a170 tianbu.xsw [OpenCL Pad] unify conv/deconv pad computing 4a28f603e xiaying [Express:Speed] Shared Const for All Submodule c74cf28f3 xiaying [MNN:Refractor] Seperate Const init and schedule 2a1eebb7a xiaying [Tools:Bugfix] Fix bug for modelTest.py count size 72f04008c xiaying [MNN:Refractor] Delete unuseful const op 1e735d03c xiaying [Converter:Bugfix] Fix bug for static module gen 4dfadbc6e xiaying [MNN:Refractor] Rewrite const init mode 1fcf0417a xiaying [MNN:Bugfix] Fix bug for deconvolutin multi-input for multi-batch 41d429cfd xiaying [Train:Bugfix] Revert convert NCHW for mnistTrain f947a5f01 xiaying [Test:Feature] Add testTrain dad59b6f6 tianbu.xsw move realize code from Backend.hpp to Tensor.cpp cf4473ad1 xiaying [Train:Bugfix] Support pad for GeometryPoolGrad 91ab13734 xiaying [MNN:Bugfix] Fix compile bug for avx512 742e80f47 xiaying [MNN:Refractor] Opt the logic for checknan judge 12543b841 xiaying [ARM82:Bugfix] Fix compile bug for ios 3a2b0a49f xiaying [ARM82:Speed] Opt Pack / Unpack for armv8 c0f1995cd xiaying [ARM82:Speed] Opt MNNPackC8FP16 and MNNUnpackC8FP16 by asm e0fc77dcf xiaying [MNN:Speed] Fix bug for DeconvolutionWithStride for C4HW4, open it 584bec578 xiaying [MNN:Bugfix] Fix bug for format set error for onnx 
d5bd4148d xiaying [MNN:Bugfix] Fix bug for format set error for onnx b00265841 xiaying [MNN:Bugfix] Fix bug for SparseConvolutionTiledExecutor bb09188ac xiaying [Test:Bugfix] Fix bug for run into sparse auto 426d1babd xiaying [MNN:Refractor] Small bugfix for Group convolution and pack 7d0ea1c46 tianbu.xsw [testModel Feature] support testModel.out input resize 4169c54ce xiaying [MNN:Bugfix] Fix bug for checkNAN for origin 412a82222 xiaying [Test:Bugfix] Fix bug for CheckNAN's error of matmul 319b1d425 xiaying [MNN:Bugfix] Fix bug for multi-batch for ConvInt8 050b728a6 xiaying [Test:Bugfix] Use NCHW for ConvInt8Test 7db3423a1 xiaying [OpenCL:Bugfix] Fix bug for opencl::image,opencl::buffer for C4HW4 adcec6a7f xiaying [Vulkan:Bugfix] Fix bug for invalid tensor size limit d2a7cf4e9 xiaying [Vulkan:Bugfix] Fix bug for onCopyBuffer of nc4hw4 557bebdd3 xiaying [MNN:Bugfix] Fix bug for BF16-ARM32 bbe186649 tianbu.xsw [Update AUTO mode]: fix MNN_FORWARD_AUTO choose priority 6deb23439 xiaying [MNN:Bugfix] Fix bug for GeometryBinary don't care about NC4HW4 same size b137590e4 xiaying [MNN:Bugfix] Fix bug for GeometryBinary don't care about NC4HW4 same size 7003558ea xiaying [Converter:Bugfix] Fix bug for onnx pad for serveral case b5f8cae5a xiaying [Converter:Bugfix] Fix bug for onnx pad for serveral case 29b09e125 xiaying [MNN:Bugfix] Fix bug for arm64-bf16 42ce00770 xiaying [MNN:Bugfix] Fix bug for ARM64 - float a2d89fc18 雁行 [Converter:Feature] Support Binary Unary for Torch. 7f1c0deb1 xiaying [MNN:Bugfix] Fix bug for Raster for Int8 8335a6f18 tianbu.xsw [OpenCL Shared Memory] modify data_format method b359e031b xiaying [ARM82:Bugfix] Fix bug for arm82 and speed up pack / unpack c8 24bf3fc88 雁行 [Convert:Feature] Support LayerNormFuse without gamma beta. 
3e629624b xiaying [MNN:Bugfix] Fix bug for float - armv7a 2b7908ec7 tianbu.xsw modify workItemSize 3cee0d413 xiaying [MNN:Bugfix] test wrong clear 9cbbfb998 xiaying [MNN:Bugfix] fix compile bug for c++ < 14 2d7a44484 xiaying [MNN:Bugfix] fix compile bug for c++ < 14 eb7d0cb53 xiaying [Test:Bugfix] Don't test for NC4HW4 directly 7b40ca8d1 xiaying [MNN:Bugfix] Fix bug for ConvolutionGroup 2694d8a91 xiaying [MNN:Bugfix] Fix bug for CPUGridSample f89af60f6 xiaying [MNN:Bugfix] Fix compile bug for arm a151abcdd xiaying [MNN:Bugfix] Fix bug for convert for int8 / int16 b254dbe61 雁行 [MNN:Bugfix] Bugfix for Conv onClone. d08150631 xiaying [MNN:Bugfix] Fix bug for fast rcnn e5568a0df xiaying [MNN:Bugfix] Fix bug for CPURaster treat NC4HW4 fast blit 128318933 雁行 [Raster:Bugfix] bugfix for Raster merge onResize. 03caacbea xiaying [MNN:Bugfix] fix bug for CPUDeconvolution and Convolution1x1Strassen for iw != ow e1e3c245c xiaying [MNN:Bugfix] Fix bug for ConvolutionWinograd 2524cbc6d xiaying [MNN:Bugfix] Fix bug for CPUSoftmax 44ec79b8f xiaying [MNN:Bugfix] Fix bug for CPUConvolutionDepthwise / Scale / DeconvolutionDW 21ae956ce xiaying [MNN:Bugfix] Fix bug for Multi-Batch-TiledExecutor 09a5069c7 xiaying [MNN:Speed] Add offset for src and dst 6776c6784 xiaying [MNN:Bugfix] Fix bug for trainable model cc83ae30b xiaying [MNN:Bugfix] Fix bug for trainable model	2021-07-29 11:47:13 +08:00
allen.lk	36da4f10ec	Fix one arm32 instruction syntax that clang works but gcc DOES NOT work. use index instruction instead.	2021-07-12 14:34:53 +08:00
xiaying	f0f961fb21	[MNN:Bugfix] Fix bug for ShapeTensorConvert compute for dim = 1 error	2021-07-01 16:06:33 +08:00
xiaying	8bf22bca17	[MNN:Bugfix] Fix bug for rearrange for convint8 crash	2021-06-29 12:13:33 +08:00
xiaying	56255c7d84	[MNN:Bugfix] Bugfix for quan x86	2021-06-24 14:06:10 +08:00
xiaying	01c8d87189	[ARM82:Bugfix] Fix compile bug for 32 bit so open arm82	2021-06-24 11:53:13 +08:00
tianbu.xsw	a7981e2180	unify conv/deconv pad computing	2021-06-24 10:40:40 +08:00
xiaying	3c8d3d11e0	Optimize for e <= 8	2021-06-24 10:39:07 +08:00
tianhang.yth	4eb1096b9c	fix alpha div by zero bug and arm server compile bug	2021-06-24 10:38:55 +08:00
Joker	df80f7328b	improvement(arm82): recover the accelerating code for MNNPackUNIT/MNNUnpackUNIT	2021-06-23 15:27:52 +08:00
Joker	4184860ae4	feat(arm82): add GridSample op support in arm82 backend	2021-06-23 14:10:31 +08:00
xiaying	935f70e790	Fix bug for deconvolutin multi-input for multi-batch	2021-06-22 20:40:36 +08:00
xiaying	2733909863	Support pad for GeometryPoolGrad	2021-06-22 19:17:05 +08:00
xiaying	f6422c315c	[MNN:Bugfix] Fix bug for ConvInt8TiledExecutor onClone	2021-06-16 16:20:42 +08:00
xiaying	8d9f86bc4a	fix compile bug for c++ < 14	2021-06-16 15:24:46 +08:00
xiaying	02741a55ff	[MNN:Bugfix] Fix bug for StridedSlice for begin shape << 0	2021-06-15 21:49:46 +08:00
hush-alibaba	58545d6ca1	Synchronize internal github for version 1.2.0 (#1518 )	2021-06-11 17:17:13 +08:00
jxt1234	dba3085e3b	Merge pull request #1492 from Napoleon-Jm/fix/unused_tmp_obj fix: remove unused tmp obj.	2021-05-25 13:28:10 +08:00
jxt1234	f60567b45b	Merge pull request #1491 from alibaba/feature/bugfix Fix bug for newAxis stridedslice compute shape error	2021-05-25 13:27:21 +08:00
恺心	8494d4ef72	fix: remove unused tmp obj.	2021-05-24 18:35:55 +08:00
xiaying	5e7cce05ef	Fix bug for newAxis stridedslice compute shape error	2021-05-24 15:12:27 +08:00
恺心	3fe3faab29	fix: buffer's management on gl backend need follow the rule from storage type.	2021-05-20 14:35:27 +08:00
xiaying	6277ad84d8	Fix bug for corner data not right for cuda-bilinear	2021-05-18 16:21:28 +08:00
jxt1234	3ab8725569	Merge pull request #1395 from WillTao-RD/master fix opencl runtime error of MSVC; add 'clGetMemObjectInfo' wrapper	2021-05-08 19:52:33 +08:00
tianhang.yth	d85952d826	sync from internal repo	2021-04-28 18:02:10 +08:00
DaydreamCoding	84ede68d86	bugfix for schedule by path fix for setup initPipelineInfosFromNet when schedule by path	2021-04-25 19:38:43 +08:00
DaydreamCoding	ff07663ca8	fix Schedule judge variable	2021-04-25 19:04:32 +08:00
雁行	183f0f803d	[PATCH 7/7] [Arm82:Bugfix] Add HardSwish and fix NEON Bug.	2021-04-22 13:51:14 +08:00
雁行	453264cb5e	[PATCH 6/7] [QAUNT:Bugfix] Bugfix for IDST encode when weight value = 1.	2021-04-22 13:51:14 +08:00
tianbu.xsw	f26783a84e	[PATCH 2/7] [OpenCL Feature] bugfix for HardSwish	2021-04-22 13:51:13 +08:00
雁行	ef2f7503a1	[PATCH 1/7] [Arm82:Bugfix] Add HardSwish and fix NEON Bug.	2021-04-22 13:51:13 +08:00
xiaying	b62c2eb687	[BF16:Bugfix] Fix compile bug for BF16 in NO SSE and NO NEON	2021-04-21 15:54:01 +08:00
xiaying	c2a2a24e8e	[IOS:Bugfix] Fix compile bug for IOS Demo	2021-04-21 15:01:56 +08:00
xiaying	3c4ba7c595	[MNN:Sync] Sync internal gitlab	2021-04-16 14:50:43 +08:00
tianbu	9f693b108e	[PATCH 27/36] [CUDA Feature] bugfix for multi-input depthwiseconv	2021-04-16 14:29:38 +08:00
tianbu.xsw	0c3cb3c689	[PATCH 26/36] [OpenCL Feature] bugfix for resizeSession	2021-04-16 14:29:38 +08:00
xiaying	3e06cabf38	[PATCH 22/36] Fix bug for CPUScatterNd crash for invalid input	2021-04-16 14:29:37 +08:00
tianbu.xsw	089253a9a0	[PATCH 21/36] delete deconv_2d_buf kernel	2021-04-16 14:29:37 +08:00
tianbu.xsw	1c003f8af8	[PATCH 20/36] merge image and buffer kernel	2021-04-16 14:29:37 +08:00
tianbu.xsw	64ab57c5f4	[PATCH 18/36] delete unused log	2021-04-16 14:29:37 +08:00
tianbu.xsw	7422afe0ff	[PATCH 17/36] add MNN_OPENCL_BUFFER_CLOSED macro	2021-04-16 14:29:37 +08:00
tianbu.xsw	7a314796e3	[PATCH 16/36] reback some revises	2021-04-16 14:29:37 +08:00
tianbu.xsw	d9786e5351	[PATCH 15/36] [OpenCL Feature] support deconvolution for OpenCL Buffer	2021-04-16 14:29:36 +08:00
弗人	9621a1f8ad	[PATCH 14/36] bug fix for old type 4 models	2021-04-16 14:29:36 +08:00
弗人	a3bbcd01b9	[PATCH 12/36] [Train:Featue] support full quant for train quant, encode when save	2021-04-16 14:29:36 +08:00
弗人	5aae654351	[PATCH 09/36] remove useless code	2021-04-16 14:29:36 +08:00
弗人	95bcb842a0	[PATCH 08/36] [Train:Feature:Bugfix] train quant support full quant	2021-04-16 14:29:36 +08:00

717 Commits