- Refactor getInstance function
- Add 1x1 convolution check in canAccelerate
- Use NHWC as the input/output format for KleidiAI and convert the format in onExecute
- Remove KAI_CONV_NCHW_IN_OUT macro
- Fix SME build issue on M4
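The two convolution-related items above can be illustrated with a minimal sketch. It is not the code from this change: `ConvParams`, `isPointwiseConv` and `nchwToNhwc` are hypothetical names, the exact conditions tested in canAccelerate are assumed, and plain NCHW is assumed as the source layout for simplicity.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical convolution description; not MNN's real parameter struct.
struct ConvParams {
    int kernelX, kernelY;
    int strideX, strideY;
    int padX, padY;
    int dilateX, dilateY;
};

// Assumed form of a 1x1 check: a pointwise convolution with unit stride,
// no padding and no dilation reduces to a plain matrix multiply.
inline bool isPointwiseConv(const ConvParams& p) {
    return p.kernelX == 1 && p.kernelY == 1 &&
           p.strideX == 1 && p.strideY == 1 &&
           p.padX == 0 && p.padY == 0 &&
           p.dilateX == 1 && p.dilateY == 1;
}

// Reorder a dense NCHW buffer into NHWC (channels become the innermost axis).
inline std::vector<float> nchwToNhwc(const std::vector<float>& src,
                                     int n, int c, int h, int w) {
    std::vector<float> dst(src.size());
    for (int in = 0; in < n; ++in) {
        for (int ic = 0; ic < c; ++ic) {
            for (int ih = 0; ih < h; ++ih) {
                for (int iw = 0; iw < w; ++iw) {
                    const std::size_t from = ((static_cast<std::size_t>(in) * c + ic) * h + ih) * w + iw;
                    const std::size_t to   = ((static_cast<std::size_t>(in) * h + ih) * w + iw) * c + ic;
                    dst[to] = src[from];
                }
            }
        }
    }
    return dst;
}
```

With channels innermost, each output pixel of a 1x1 convolution reads one contiguous row of the input, which is the shape a GEMM-style kernel expects.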
Refactor the MNN_KleidiAI interface to support more model types
and to ease integration of subsequent KleidiAI ukernels.
Re-abstract the information stored in class KleidiAI:
1) static info: independent of the loaded model, initialized when the
interface is constructed and never changed afterwards.
2) status: changes while the pipeline is running.
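The split between static info and status could look roughly like the sketch below. All names (`StaticInfo`, `RunStatus`, `AccelInterface` and their fields) are hypothetical; the actual members of class KleidiAI are not listed in this log.

```cpp
// Hypothetical: properties of the machine, fixed once the interface exists.
struct StaticInfo {
    bool hasDot  = false;
    bool hasI8mm = false;
    bool hasSme  = false;
};

// Hypothetical: values that may change while the pipeline runs.
struct RunStatus {
    int  threadNumber = 1;
    bool modelLoaded  = false;
};

class AccelInterface {
public:
    explicit AccelInterface(const StaticInfo& info) : mStaticInfo(info) {}

    // Read-only access: static info cannot be modified after construction.
    const StaticInfo& staticInfo() const { return mStaticInfo; }

    // Status is mutable through the non-const accessor.
    RunStatus& status() { return mStatus; }
    const RunStatus& status() const { return mStatus; }

private:
    const StaticInfo mStaticInfo;
    RunStatus mStatus;
};
```

Declaring the static part `const` lets the compiler enforce the "never changed" rule instead of relying on convention.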
Decouple the interface from the loaded model so that more complex
mixes of multiple model types can be handled. Add mAccelType to the
MNN data structure; the KleidiAI interface relies on this type to
decide which branch to take.
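A minimal sketch of type-driven dispatch follows. Only the member name `mAccelType` comes from this log; the enum values, the `LayerInfo` struct and the branch names are hypothetical placeholders.

```cpp
#include <string>

// Hypothetical acceleration types; the real enumerators are not given here.
enum class AccelType { None, Int4Weight, Int8Weight, Fp32Pointwise };

// Hypothetical per-layer record carrying the type chosen at load time.
struct LayerInfo {
    AccelType mAccelType = AccelType::None;
};

// The interface looks only at mAccelType, not at which model the layer
// belongs to, so layers from different models can be mixed freely.
inline std::string selectBranch(const LayerInfo& layer) {
    switch (layer.mAccelType) {
        case AccelType::Int4Weight:    return "int4-matmul";
        case AccelType::Int8Weight:    return "int8-matmul";
        case AccelType::Fp32Pointwise: return "fp32-1x1-conv";
        case AccelType::None:          break;
    }
    return "fallback";
}
```

Storing the type on the layer keeps the interface itself free of model-specific state.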
Move some pack functions to mnn_kleidiai_util.cpp.
Add CPU feature detection in source/backend/cpu/CPURuntime.hpp,
because subsequent ukernels need SME information.
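One way SME support can be queried at runtime is sketched below. This is not the code in CPURuntime.hpp: `cpuSupportsSme` is a hypothetical name, the Linux path assumes the kernel headers define `HWCAP2_SME`, and the macOS path assumes the sysctl key `hw.optional.arm.FEAT_SME` is available. On any other platform the sketch simply reports no SME.

```cpp
#include <cstddef>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#elif defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

// Returns true when the running CPU reports SME support.
inline bool cpuSupportsSme() {
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP2_SME)
    // The kernel exposes CPU features as bits of the auxiliary vector.
    return (getauxval(AT_HWCAP2) & HWCAP2_SME) != 0;
#elif defined(__APPLE__)
    int value = 0;
    std::size_t size = sizeof(value);
    // A missing key (older OS or non-Arm CPU) is treated as "not supported".
    if (sysctlbyname("hw.optional.arm.FEAT_SME", &value, &size, nullptr, 0) != 0) {
        return false;
    }
    return value != 0;
#else
    return false;
#endif
}
```

The result does not change during a process's lifetime, so it fits the "static info" category and can be cached once at construction.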
Put the KleidiAI files in folder source/backend/cpu/arm/kleidiAI/kai;
they are downloaded from Arm GitLab and left unchanged. These files may
later be removed and downloaded at build time instead.
MNNKleidiAI.cpp is the interface between MNN and KleidiAI.
Rewrite functions in class DenseConvInt8TiledExecutor,
in ConvInt8TiledExecutor.cpp, to call KleidiAI functions.
A new execution may be implemented later.
The changes to GeometryConvUtils.cpp and ShapeTensorConvert.cpp make
the input and output of DenseConvInt8TiledExecutor NCHW
rather than NC4HW4, which avoids redundant pack/unpack and improves
performance.
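To show the kind of work a pack step involves, here is a minimal sketch of an NCHW to NC4HW4 pack. `packNchwToNc4hw4` is a hypothetical helper, and it assumes the conventional layout [N, ceil(C/4), H*W, 4] with zero padding; MNN's own pack routines may differ in detail.

```cpp
#include <cstddef>
#include <vector>

// Pack a dense NCHW buffer into channel blocks of 4 (NC4HW4).
// Channels beyond C in the last block are left as zero padding.
inline std::vector<float> packNchwToNc4hw4(const std::vector<float>& src,
                                           int n, int c, int h, int w) {
    const int c4   = (c + 3) / 4;  // number of 4-channel blocks
    const int area = h * w;
    std::vector<float> dst(static_cast<std::size_t>(n) * c4 * area * 4, 0.0f);
    for (int b = 0; b < n; ++b) {
        for (int ch = 0; ch < c; ++ch) {
            for (int i = 0; i < area; ++i) {
                const std::size_t from = (static_cast<std::size_t>(b) * c + ch) * area + i;
                const std::size_t to =
                    ((static_cast<std::size_t>(b) * c4 + ch / 4) * area + i) * 4 + ch % 4;
                dst[to] = src[from];
            }
        }
    }
    return dst;
}
```

Every element is read and written once per pack, and again per unpack, so skipping both around the executor saves two full passes over the tensor.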
Main Features:
1. Add OpenCV API and NumPy API support
2. Move Protobuf into MNN
3. Add more ops for TorchScript conversion
4. Add recompute to speed up geometry computation
5. Add ModuleBasic test