17 Commits

Author SHA1 Message Date
Jules 110ae7f645 Implement cpuMask-based near-singleton global Thread Pool 2025-07-16 06:14:14 +00:00
xiaying c6f25cafc6 MNN:Sync: Sync Internal 2.9.3 2024-07-22 20:51:06 +08:00
Colin Ian King 6fc74e29a0 Only yield at the end if the completion loop needs another pass
Currently the yield occurs every time the completion loop
iterates, and this is quite an expensive kernel system call. It is
not really required if we break out of the loop, so move the
yield to the end of the do-while loop to reduce the yielding overhead.

Perf metrics show that the current code eats up ~2.4% CPU yielding
whereas this change reduces this down to ~0.6% of the total CPU run
time.

Signed-off-by: Colin Ian King <colin.king@intel.com>
2023-04-14 15:57:16 +01:00
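
The change described in this commit can be sketched roughly as follows. This is a minimal illustration and not the actual ThreadPool.cpp code: `waitForCompletion` and the `done` flags are hypothetical names, and the sketch assumes the waiter simply polls one atomic flag per worker. The yield happens only when the scan finds unfinished work, so a pass that completes exits the do-while loop without the extra system call.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not MNN's code): spin until every flag in `done` is set.
// Returns how many times the waiting thread yielded.
inline std::size_t waitForCompletion(const std::vector<std::atomic<bool>>& done) {
    std::size_t yields = 0;
    bool complete;
    do {
        complete = true;
        for (const auto& flag : done) {
            if (!flag.load(std::memory_order_acquire)) {
                complete = false;
                break;
            }
        }
        // Yield only at the end, and only if another pass is needed.
        if (!complete) {
            std::this_thread::yield();
            ++yields;
        }
    } while (!complete);
    return yields;
}
```

When all flags are already set, the function returns after a single pass with zero yields, which is the case the commit message is optimizing.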
jason_w 282e771445
Fix compile errors caused by "#define MNN_THREAD_LOCK_CPU"
error: sort is not a member of std
error: gettid was not declared in this scope
error: __NR_sched_setaffinity was not declared in this scope
error: syscall was not declared in this scope
2022-08-25 12:25:30 +08:00
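
Errors of this kind usually mean the required headers are missing. The sketch below shows one common set of includes on Linux that declares each of the four names. It is an assumption about the general remedy, not a copy of the actual fix in this commit. `currentThreadId` and `pinCurrentThreadToLowestCpu` are hypothetical helpers, and the thread id is obtained through `syscall(__NR_gettid)` because older C libraries do not declare a `gettid()` wrapper.

```cpp
#include <algorithm>      // std::sort
#include <sched.h>        // cpu_set_t, CPU_ZERO, CPU_SET
#include <sys/syscall.h>  // __NR_gettid, __NR_sched_setaffinity
#include <sys/types.h>    // pid_t
#include <unistd.h>       // syscall
#include <vector>

// Hypothetical helper: kernel thread id of the calling thread (Linux only).
inline pid_t currentThreadId() {
    return static_cast<pid_t>(syscall(__NR_gettid));
}

// Hypothetical helper: pin the calling thread to the lowest-numbered CPU in `cpus`.
// Returns false if the list is empty or the kernel rejects the request.
inline bool pinCurrentThreadToLowestCpu(std::vector<int> cpus) {
    if (cpus.empty()) {
        return false;
    }
    std::sort(cpus.begin(), cpus.end());
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpus.front(), &mask);
    return syscall(__NR_sched_setaffinity, currentThreadId(), sizeof(mask), &mask) == 0;
}
```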
DaydreamCoding 5a9d779237 [PATCH 196/350] fix threadpool may not be destroyed 2021-01-06 15:57:10 +08:00
Hui Shu ab711d484c Synchronize internal master to Github 2020-12-15 14:12:35 +08:00
DaydreamCoding d4316ada8f
fix threadpool may not be destroyed 2020-12-03 15:25:45 +08:00
xiaying abb1d47641 Revert "[MNN:Bugfix] Fix bug for ThreadPool destroy dead lock"
This reverts commit 07e85594366c6b48b6f44293cde02420215ab32d.
2020-11-06 10:21:49 +08:00
Hui Shu d6795ad031 Github release 1.1.0 2020-11-05 16:49:17 +08:00
guanmoyu 3e4fd79997 fix ThreadPool bug 2019-12-30 15:56:55 +08:00
Zhang 002ac367e4
Update 2019-12-27 22:16:57 +08:00
liqing d6b00d04f4 - build:
	- unify schema building in core and converter;
	- add more build script for android;
	- add linux build script for python;

- ops impl:
	- add floor mod support in binary;
	- use eltwise impl in add/max/sub/mul binary for optimization;
	- remove fake double support in cast;
	- fix 5d support for concat;
	- add adjX and adjY support for batch matmul;
	- optimize conv2d back prop filter;
	- add pad mode support for conv3d;
	- fix bug in conv2d & conv depthwise with very small feature map;
	- optimize binary without broadcast;
	- add data types support for gather;
	- add gather ND support;
	- use uint8 data type in gather v2;
	- add transpose support for matmul;
	- add matrix band part;
	- add dim != 4 support for padding, reshape & tensor convert;
	- add pad type support for pool3d;
	- make ops based on TensorFlow Lite quantization optional;
	- add all & any support for reduction;
	- use type in parameter as output type in reduction;
	- add int support for unary;
	- add variable weight support for conv2d;
	- fix conv2d depthwise weights initialization;
	- fix type support for transpose;
	- fix grad outputs count for reduce grad and reshape grad;
	- fix priorbox & detection output;
	- fix metal softmax error;

- python:
	- add runSessionWithCallBackInfo interface;
	- add max nodes limit (1400) for visualization tool;
	- fix save error in python3;
	- align default dim;

- convert:
	- add extra design for optimization;
	- add more post converting optimizers;
	- add caffe v1 weights blob support;
	- add cast, unary, conv transpose support for onnx model;
	- optimize batchnorm, conv with variable weights, prelu, reshape, slice, upsample for onnx model;
	- add cos/sin/atan/tan support for unary for tensorflow model;
	- add any/all support for reduction for tensorflow model;
	- add elu, conv3d, pool3d support for tensorflow model;
	- optimize argmax, batchnorm, concat, batch to space, conv with variable weights, prelu, slice for tensorflow model;

- others:
	- fix size computer lock;
	- fix thread pool deadlock;
	- add express & parameters in express;
	- rewrite blitter chooser without static map;
	- add tests for expr;
2019-10-29 13:37:26 +08:00
liqing f085106da9 release 0.2.0.6
- fix bugs in quantization
- add evaluating tool for quantization
- add ADMM support in quantization
- fix lock in thread pool
- fix fusing for deconv
- fix reshape converting from ONNX to MNN
- turn off blob size checking by default
2019-08-07 16:44:09 +08:00
alikesun 33558a192a
Update ThreadPool.cpp
Replace unique_lock with lock_guard in the right places.
2019-07-30 01:12:52 -08:00
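
As a rough illustration of where each lock type fits, consider the sketch below. `TaskQueue` is a hypothetical class and not taken from ThreadPool.cpp. A plain critical section only needs `std::lock_guard`, while waiting on a `std::condition_variable` still requires `std::unique_lock`, because `wait` must be able to unlock and relock the mutex.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>

// Hypothetical queue used only to contrast the two lock types.
class TaskQueue {
public:
    void push(int task) {
        {
            // Simple scoped critical section: lock_guard is sufficient.
            std::lock_guard<std::mutex> lock(mMutex);
            mTasks.push(task);
        }
        mCondition.notify_one();
    }

    int pop() {
        // condition_variable::wait needs a unique_lock so it can release the mutex.
        std::unique_lock<std::mutex> lock(mMutex);
        mCondition.wait(lock, [this] { return !mTasks.empty(); });
        int task = mTasks.front();
        mTasks.pop();
        return task;
    }

private:
    std::mutex mMutex;
    std::condition_variable mCondition;
    std::queue<int> mTasks;
};
```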
liqing 7bb0df92dc beta 0.2.0.5
- CPU
	- add support for DepthToSpace & SpaceToDepth ops
- OpenGL
	- add Android demo
	- add half / float runtime option
	- add support for ROIPooling, Squeeze
	- fix bugs in conv im2col
- OpenCL
	- fix Concat, Eltwise, Reshape bugs
- Tools
	- add KL threshold method in quantization tool
	- support optimization for graph with multiple rnn
2019-07-25 13:36:35 +08:00
如幻 732ba68b19 beta 0.2.0.4
- bug fix for quantization tool
- bug fix/performance update for thread pool
- bug fix for converters
- tutorial/doc update
- more op support
2019-07-19 17:36:12 +08:00
liqing a367406308 beta 0.2.0.3
- add quantization tool & cpu impl & demo/exec
- add thread pool
- add tests
- fix onnx converter tensor name mismatch
- optimize cpu performance with SSE for windows
2019-07-11 13:56:52 +08:00