Commit Graph

30 Commits

Author SHA1 Message Date
xiaying 57d67dde2d [PATCH 191/350] Speed up Int8Tofloat for x86-sse 2021-01-06 15:57:10 +08:00
xiaying a131c8cd26 [PATCH 188/350] [MNN:Speed] Optimize ConvInt8 for remain case 2021-01-06 15:57:09 +08:00
xiaying a943d45796 [PATCH 187/350] Float2Int8 opt for x86-sse 2021-01-06 15:57:09 +08:00
Hui Shu ab711d484c Synchronize internal master to Github 2020-12-15 14:12:35 +08:00
xiaying 14510593f8 [PATCH 26/78] [MNN:Speed] Add avx2 expc8 2020-11-25 18:57:52 +08:00
Hui Shu d6795ad031 Github release 1.1.0 2020-11-05 16:49:17 +08:00
xiaying 67298f2ec5 [MNN:Bugfix] Remove _AVX512_MNNGemmFloatUnit_4 temporarily 2020-07-04 13:18:05 +08:00
xiaying 3b3ebb72b9 Remove useless code of x86 opt 2020-07-04 11:39:31 +08:00
xiaying 255db932eb [MNN:Sync] Sync Internal Github 2020-07-04 01:21:30 +08:00
xiaying 8e79b4abc4 Revert "[MNN:Speed] Add AVX512 MNNConvSlideWindowMiddle" (reverts commit 498b977df2db2ddbd9e6938f8cd2a0c3d5b616d7) 2020-07-04 01:06:20 +08:00
xiaying 97f8b91fee Add AVX512 MNNConvSlideWindowMiddle 2020-07-04 01:06:20 +08:00
xiaying d2dd9ae22a Add _AVX512_MNNGemmFloatUnit_4 2020-07-04 01:06:19 +08:00
xiaying 9f4f6c091d Add ../source/backend/cpu/x86_x64/avx/_AVX512_MNNPackedMatMul.S 2020-07-04 01:06:19 +08:00
xiaying dfe1d06c08 Support multi-thread for 1x1 convolution 2020-07-04 01:06:19 +08:00
xiaying 93ea95ff30 Add MNNUnPackC4ForMatMul_C 2020-07-04 01:06:19 +08:00
xiaying a38a551993 Add MNNPackC4ForMatMul_A 2020-07-04 01:06:18 +08:00
xiaying a750fe0956 Rename _AVX_MNNGemm16x6 as _AVX_MNNPackedMatMul 2020-07-04 01:06:18 +08:00
xiaying d13f1bc0b6 Optimize Strassen Merge C Function for x86 2020-07-04 01:06:18 +08:00
xiaying ac15e9fcec Support get pack mode for each platform 2020-07-04 01:06:18 +08:00
xiaying 5fc7acd37e support transpose, fix bug for not align 2020-07-04 01:06:18 +08:00
xiaying 3dc9cbb740 Optimize CPUMatMul for x86 avx256 by 16x6: (540, 320, 540) from 3.8 ms -> 2.5 ms; (1024, 1024, 1024) from 39 ms -> 31 ms 2020-07-04 01:06:18 +08:00
xiaying bf6285a178 [MNN:Sync] Sync internal github 2020-04-29 10:12:16 +08:00
xiaying a76be60722 [MNN:Sync] Fix compile bug for Windows; fix bug for devices that do not support FMA 2020-04-14 22:52:24 +08:00
xiaying cd26aab2da Add gemm_unit for x86, mla first use mul 2020-04-14 22:39:40 +08:00
xiaying 3f99ae2a0d Optimize x86 by reorder weight 2020-04-14 22:39:40 +08:00
xiaying 48c92a41e7 [MNN:Sync] Sync internal git for remain patch 2020-03-22 20:33:03 +08:00
海境 90e06944db Update 2020-02-26 09:57:17 +08:00
Zhang 002ac367e4 Update 2019-12-27 22:16:57 +08:00
liqing 73ad3413cc - dynamic computation graph (beta)
	- add support (/express)
	- add tests
	- add benchmarks using it (/benchmark/exprModels)
- Python
	- MNN engine and tools were submitted to pip
	- available on Windows/macOS/Linux
- Engine/Converter
	- add support for per-op benchmarking
	- refactor optimizer by separating steps
- CPU
	- add support for Conv3D, Pool3D, ELU, ReverseSequence
	- fix ArgMax, Permute, Scale, BinaryOp, Slice, SliceTf
- OpenCL
	- add half transform on CPU
	- add broadcast support for binary
	- optimize Conv2D, Reshape, Eltwise, Gemm, etc.
- OpenGL
	- add sub, real div support for binary
	- add support for unary
	- optimize Conv2D, Reshape
- Vulkan
	- add max support for eltwise
- Metal
	- fix missing metallib problem
- Train/Quantization
	- use express to refactor training code
2019-09-26 21:02:07 +08:00
liqing 487a0fbd0a beta 0.2.0.9
- fix quantization tool compiling on Windows
- fix converter compiling on Windows
- fix eltwise optimization on Windows
- separate sse & avx for Windows
- add LeakyReLU support for TensorFlow
- fix reshape, const for TensorFlow
- fix dimension format error for ONNX ops
- optimize winograd, ReLU for OpenCL
- add fp16 availability & dimensions size check-up for OpenCL
- optimize GEMM for arm32
- fix ExpandDims shape calculation when inputs size == 1
2019-09-01 19:25:26 +08:00