Main Features:
1. Add OpenCV API and NumPy API support
2. Move Protobuf into MNN
3. Add more ops for TorchScript conversion
4. Add recompute to speed up geometry computation
5. Add ModuleBasic test
# integration
- add Travis CI
- fix build parameters for Python
# converter
- add half storage option for MNN converter
- fix op names being lost in converter
- fix converter bugs in printing inputs and outputs and in removing Identity ops at outputs
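
The half storage option listed above concerns storing weights as 16-bit floats. As a rough illustration of the size and precision trade-off, and not MNN's actual converter code, the sketch below uses only Python's standard `struct` module. `pack_weights` and `unpack_weights` are hypothetical helper names introduced for this example.

```python
import struct

def pack_weights(weights, half=False):
    """Serialize floats as little-endian float32, or float16 when half=True."""
    code = "e" if half else "f"  # 'e' = IEEE 754 binary16, 'f' = binary32
    return struct.pack("<%d%s" % (len(weights), code), *weights)

def unpack_weights(blob, half=False):
    """Inverse of pack_weights (hypothetical helper, not an MNN API)."""
    size = 2 if half else 4
    code = "e" if half else "f"
    return list(struct.unpack("<%d%s" % (len(blob) // size, code), blob))

weights = [0.5, -1.25, 3.0, 0.1]
full_blob = pack_weights(weights)             # 4 bytes per weight
half_blob = pack_weights(weights, half=True)  # 2 bytes per weight
restored = unpack_weights(half_blob, half=True)

print(len(full_blob), len(half_blob))
print(restored)
```

Values such as 0.5 survive the round trip exactly, while 0.1 comes back only approximately, which is the precision cost paid for halving the storage size.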
# ops
- add quantized Convolution & Deconvolution support on OpenCL
- add support for more expressions
- add DetectionPostProcess Op for TensorFlow Lite (SSD is supported directly now)
- add support for LSTM & ELU for ONNX
- add support for ONNX Convolution whose weights are not constant
- fix Unary Op compile error on Linux
- fix Metal backend buffer reuse after resize
- fix Metal raw memory access after model releasing
- fix redundant transpose in Winograd generator
- add Express support (/express)
- add tests
- add benchmarks using Express (/benchmark/exprModels)
- Python
  - MNN engine and tools are now installable via pip
  - available on Windows/macOS/Linux
- Engine/Converter
  - add support for per-op benchmarking
  - refactor optimizer by separating steps
- CPU
  - add support for Conv3D, Pool3D, ELU, ReverseSequence
  - fix ArgMax, Permute, Scale, BinaryOp, Slice, SliceTf
- OpenCL
  - add half transform in CPU
  - add broadcast support for binary
  - optimize Conv2D, Reshape, Eltwise, Gemm, etc.
- OpenGL
  - add sub and real div support for binary
  - add support for unary
  - optimize Conv2D, Reshape
- Vulkan
  - add max support for eltwise
- Metal
  - fix missing metallib problem
- Train/Quantization
  - use express to refactor training code
  - add NaN check
  - add quantization support for ScaleAdd Op
  - add binary to eltwise optimization
  - add console logs for quantization tool
  - improve documentation for quantization tool
- replace redundant dimension flags with dimension format
- optimize performance of TensorFlow Lite Quantized Convolution
- fix axis support for ONNX softmax
- fix get performance compile error on Windows
- add quantization tool, CPU implementation, and demo/exec
- add thread pool
- add tests
- fix ONNX converter tensor name mismatch
- optimize CPU performance with SSE for Windows
- replace FreeImage with stb_image
- warn about Unicode errors when compiling on Windows
- separate clang/gcc build scripts for Android
- add default values in fbs
- optimize CPU conv / conv depthwise / deconv / deconv depthwise / lstm / sigmoid
- add sub support in eltwise
- add reciprocal / log1p / log in unary
- add zero like / select / set diff 1d
- add batch support for permute
- add training code
- fix Metal error in dynamic separate storage type handling