root/MNN

Commit Graph

Author	SHA1	Message	Date
tianbu.xsw	f8702ceaca	add opencl kernel profile & revise some info in onExecute to onResize stage	2020-07-04 01:06:21 +08:00
xiaying	49ca95571d	Revert "[MNN:Speed] Optmize winograd convolution" This reverts commit 9e34b9a856ccf9d2a81bc9387a1c7dfbc6a12e5d.	2020-07-04 01:06:21 +08:00
xiaying	c536436f64	Remove useless asm for easy compability for window	2020-07-04 01:06:20 +08:00
xiaying	e708cff674	Optmize winograd convolution	2020-07-04 01:06:20 +08:00
xiaying	8e79b4abc4	Revert "[MNN:Speed] Add AVX512 MNNConvSlideWindowMiddle" This reverts commit 498b977df2db2ddbd9e6938f8cd2a0c3d5b616d7.	2020-07-04 01:06:20 +08:00
root	85c2baaf6a	Revert "[CV:Bugfix] Avoid use sse 4.1" This reverts commit c2fc280ffd8504c3c8b499b38fba57ba7eb4e349.	2020-07-04 01:06:20 +08:00
xiaying	6396ec97bb	Avoid use sse 4.1	2020-07-04 01:06:20 +08:00
xiaying	73b1a97315	Fix bug for compile error in linux	2020-07-04 01:06:20 +08:00
xiaying	97f8b91fee	Add AVX512 MNNConvSlideWindowMiddle	2020-07-04 01:06:20 +08:00
root	868115665b	Fix bug for _AVX512_MNNGemmFloatUnit_4.S's error	2020-07-04 01:06:19 +08:00
xiaying	a718ef7382	Add YUV_I420	2020-07-04 01:06:19 +08:00
xiaying	2428ab001d	Optmize YUV -> RGBA	2020-07-04 01:06:19 +08:00
xiaying	96f40fddcd	Add sse opt for blitter	2020-07-04 01:06:19 +08:00
xiaying	c6abcb9088	Fix bug for asm of _AVX512_MNNGemmFloatUnit_4	2020-07-04 01:06:19 +08:00
xiaying	d2dd9ae22a	Add _AVX512_MNNGemmFloatUnit_4	2020-07-04 01:06:19 +08:00
xiaying	746b50a56d	Small opt _AVX_MNNPackedMatMul	2020-07-04 01:06:19 +08:00
xiaying	9f4f6c091d	Add ../source/backend/cpu/x86_x64/avx/_AVX512_MNNPackedMatMul.S	2020-07-04 01:06:19 +08:00
xiaying	acb7ca17aa	Fix bug for asm align number	2020-07-04 01:06:19 +08:00
xiaying	dfe1d06c08	Support multi-thread for 1x1 convolution	2020-07-04 01:06:19 +08:00
xiaying	c7051d367c	Temply forbid not im2col case for 1x1 conv	2020-07-04 01:06:19 +08:00
xiaying	93ea95ff30	Add MNNUnPackC4ForMatMul_C	2020-07-04 01:06:19 +08:00
xiaying	a38a551993	Add MNNPackC4ForMatMul_A	2020-07-04 01:06:18 +08:00
xiaying	a750fe0956	Rename _AVX_MNNGemm16x6 as _AVX_MNNPackedMatMul	2020-07-04 01:06:18 +08:00
xiaying	77f44dc1af	Rename NHWC<->NC4HW4 as pack/unpack transpose	2020-07-04 01:06:18 +08:00
xiaying	d13f1bc0b6	Optimize Strassen Merge C Function for x86	2020-07-04 01:06:18 +08:00
xiaying	1567e74e40	Optmize NHWCToNC4HW4 and NC4HW4ToNHWC	2020-07-04 01:06:18 +08:00
xiaying	d88cde6237	Use strassen for Convolution1x1Strassen	2020-07-04 01:06:18 +08:00
xiaying	ae91cab1b8	Support Strassen for new matmul	2020-07-04 01:06:18 +08:00
xiaying	ac15e9fcec	Support get pack mode for each platform	2020-07-04 01:06:18 +08:00
houjiang	91c70ba559	Fix cpu binary.	2020-07-04 01:06:18 +08:00
xiaying	5fc7acd37e	support transpose, fix bug for not align	2020-07-04 01:06:18 +08:00
xiaying	a3a43a6d9b	Add prefetch for _AVX_MNNGemm16x6.S , from 31 ms -> 29 ms	2020-07-04 01:06:18 +08:00
xiaying	3dc9cbb740	Optmize CPUMatMul for x86 avx256 by 16x6: 540, 320, 540 from 3.8 ms -> 2.5 ms; 1024, 1024, 1024 from 39 ms -> 31 ms	2020-07-04 01:06:18 +08:00
xiaying	cf0896e71a	Add 16x6 GEMM	2020-07-04 01:06:17 +08:00
xiaying	8ea506cb57	Add asm for _AVX_MNNGemmFloatUnit_4, 1024x1024x1024 from 39 ms -> 37 ms	2020-07-04 01:06:17 +08:00
xiaying	8bce0519af	Use AVX to optimize mnnmatrix add, but make slow in mac	2020-07-04 01:06:17 +08:00
xiaying	976e6d0e6f	Use ASM MNNMatrixAdd instead of C	2020-07-04 01:06:17 +08:00
jxt1234	e9cde2ffe4	Merge pull request #945 from krayzemli/fix_BufferAllocator: Don't increase reference count when extracting a block from a non-splitable freelist	2020-07-03 11:11:59 +08:00
jxt1234	47af4892d0	Merge pull request #941 from krayzemli/fix_CpuQuantizedAdd: Fix out-of-bounds access in CPUQuantizedAdd::onExecute	2020-07-03 10:29:59 +08:00
jxt1234	234f423e54	Merge pull request #942 from krayzemli/fix_CPUQuantizedLogistic: Fix CPUQuantizedLogistic::onExecute access to the model which could have been released	2020-07-03 10:07:11 +08:00
Roman Maltsev	b750d419b8	Fix memory leak in CPUDetectionPostProcess	2020-07-02 17:45:32 +07:00
Roman Maltsev	36bd8f1a35	Fix CPUQuantizedLogistic::onExecute access to the model which could have been released	2020-07-02 17:06:58 +07:00
Roman Maltsev	2d3d4a2242	Don't increase reference count when extracting a block from a non-splitable free list, since returning a block to a non-mergeable free list does not increment this count.	2020-07-02 17:00:33 +07:00
Roman Maltsev	98bac405be	Fix out-of-bounds access in CPUQuantizedAdd::onExecute	2020-07-02 16:42:09 +07:00
jxt1234	f3dd23a048	Merge pull request #848 from Interfish/master: Add BlstmComputer	2020-06-23 20:53:31 +08:00
誉阳	0d84ab23c5	[PATCH 9/9] armv82 support prelu	2020-06-19 16:48:01 +08:00
誉阳	510ef0fe11	[PATCH 8/9] fix some bugs	2020-06-19 16:48:01 +08:00
誉阳	3ed28acab1	[PATCH 7/9] fix compile bug in android studio	2020-06-19 16:48:01 +08:00
誉阳	7a1f7a03d7	[PATCH 6/9] fix android compile bug for armv7	2020-06-19 16:48:01 +08:00
誉阳	103d8a04dc	[PATCH 5/9] fix bug	2020-06-19 16:48:00 +08:00

231 Commits