Merge pull request #3616 from alibaba/feature/sherpa-mnn

Apps: Feature: Add sherpa-mnn
jxt1234 2025-06-11 19:16:20 +08:00 committed by GitHub
commit 1a3ed2bc14
773 changed files with 94869 additions and 0 deletions

apps/frameworks/sherpa-mnn/.gitignore vendored Normal file

@@ -0,0 +1,5 @@
SourcePackages
build-*
*.xcworkspace
!build-*.sh
*.lock

apps/frameworks/sherpa-mnn/CHANGELOG.md Normal file

@@ -0,0 +1,475 @@
## 1.10.46
* Fix kokoro lexicon. (#1886)
* Fix missing sense_voice support in speaker-identification-with-vad-non-streaming-asr.py (#1884)
* Fix generating Chinese lexicon for Kokoro TTS 1.0 (#1888)
* Reduce vad-whisper-c-api example code. (#1891)
* JNI Exception Handling (#1452)
* Fix #1901: UnicodeEncodeError running export_bpe_vocab.py (#1902)
* Fix publishing pre-built windows libraries (#1905)
* Fix Whisper model token normalization (#1904)
* feat: add mic example for better compatibility (#1909)
* Add onnxruntime 1.18.1 for Linux aarch64 GPU (#1914)
* Add C++ API for streaming zipformer ASR on RK NPU (#1908)
* Change [1<<28] to [1<<10] to fix build issues on GOARCH=386, where [1<<28] is too large (#1916)
* Flutter Config toJson/fromJson (#1893)
* Fix publishing linux pre-built artifacts (#1919)
* Set go.mod to use go 1.17 and use unsafe.Slice to optimize the code (#1920)
* fix: AddPunct panic for Go (#1921)
* Fix publishing macos pre-built artifacts (#1922)
* Minor fixes for rknn (#1925)
* Build wheels for rknn linux aarch64 (#1928)
## 1.10.45
* [update] Fix a bug where creating the golang instance succeeded while creating the C struct failed (#1860)
* fixed typo in RTF calculations (#1861)
* Export FireRedASR to sherpa-onnx. (#1865)
* Add C++ and Python API for FireRedASR AED models (#1867)
* Add Kotlin and Java API for FireRedAsr AED model (#1870)
* Add C API for FireRedAsr AED model. (#1871)
* Add CXX API for FireRedAsr (#1872)
* Add JavaScript API (node-addon) for FireRedAsr (#1873)
* Add JavaScript API (WebAssembly) for FireRedAsr model. (#1874)
* Add C# API for FireRedAsr Model (#1875)
* Add Swift API for FireRedAsr AED Model (#1876)
* Add Dart API for FireRedAsr AED Model (#1877)
* Add Go API for FireRedAsr AED Model (#1879)
* Add Pascal API for FireRedAsr AED Model (#1880)
## 1.10.44
* Export MatchaTTS fa-en model to sherpa-onnx (#1832)
* Add C++ support for MatchaTTS models not from icefall. (#1834)
* OfflineRecognizer supports create stream with hotwords (#1833)
* Add PengChengStarling models to sherpa-onnx (#1835)
* Support specifying voice in espeak-ng for kokoro tts models. (#1836)
* Fix: make sherpa_onnx_loge print only in debug mode (#1838)
* Add Go API for audio tagging (#1840)
* Fix CI (#1841)
* Update readme to contain links for pre-built Apps (#1853)
* Modify the model used (#1855)
* Flutter OnlinePunctuation (#1854)
* Fix splitting text by languages for kokoro tts. (#1849)
## 1.10.43
* Add MFC example for Kokoro TTS 1.0 (#1815)
* Update sherpa-onnx-tts.js VitsModelConfig.model can be none (#1817)
* Fix passing gb2312 encoded strings to tts on Windows (#1819)
* Support scaling the duration of a pause in TTS. (#1820)
* Fix building wheels for linux aarch64. (#1821)
* Fix CI for Linux aarch64. (#1822)
## 1.10.42
* Fix publishing wheels (#1746)
* Update README to include https://github.com/xinhecuican/QSmartAssistant (#1755)
* Add Kokoro TTS to MFC examples (#1760)
* Refactor node-addon C++ code. (#1768)
* Add keyword spotter C API for HarmonyOS (#1769)
* Add ArkTS API for Keyword spotting. (#1775)
* Add Flutter example for Kokoro TTS (#1776)
* Initialize the audio session for iOS ASR example (#1786)
* Fix: Prepend 0 to tokenization to prevent word skipping for Kokoro. (#1787)
* Export Kokoro 1.0 to sherpa-onnx (#1788)
* Add C++ and Python API for Kokoro 1.0 multilingual TTS model (#1795)
* Add Java and Kotlin API for Kokoro TTS 1.0 (#1798)
* Add Android demo for Kokoro TTS 1.0 (#1799)
* Add C API for Kokoro TTS 1.0 (#1801)
* Add CXX API for Kokoro TTS 1.0 (#1802)
* Add Swift API for Kokoro TTS 1.0 (#1803)
* Add Go API for Kokoro TTS 1.0 (#1804)
* Add C# API for Kokoro TTS 1.0 (#1805)
* Add Dart API for Kokoro TTS 1.0 (#1806)
* Add Pascal API for Kokoro TTS 1.0 (#1807)
* Add JavaScript API (node-addon) for Kokoro TTS 1.0 (#1808)
* Add JavaScript API (WebAssembly) for Kokoro TTS 1.0 (#1809)
* Add Flutter example for Kokoro TTS 1.0 (#1810)
* Add iOS demo for Kokoro TTS 1.0 (#1812)
* Add HarmonyOS demo for Kokoro TTS 1.0 (#1813)
## 1.10.41
* Fix UI for Android TTS Engine. (#1735)
* Add iOS TTS example for MatchaTTS (#1736)
* Add iOS example for Kokoro TTS (#1737)
* Fix dither binding in Pybind11 to ensure independence from high_freq in FeatureExtractorConfig (#1739)
* Fix keyword spotting. (#1689)
* Update readme to include https://github.com/hfyydd/sherpa-onnx-server (#1741)
* Reduce vad-moonshine-c-api example code. (#1742)
* Support Kokoro TTS for HarmonyOS. (#1743)
## 1.10.40
* Fix building wheels (#1703)
* Export kokoro to sherpa-onnx (#1713)
* Add C++ and Python API for Kokoro TTS models. (#1715)
* Add C API for Kokoro TTS models (#1717)
* Fix style issues (#1718)
* Add C# API for Kokoro TTS models (#1720)
* Add Swift API for Kokoro TTS models (#1721)
* Add Go API for Kokoro TTS models (#1722)
* Add Dart API for Kokoro TTS models (#1723)
* Add Pascal API for Kokoro TTS models (#1724)
* Add JavaScript API (node-addon) for Kokoro TTS models (#1725)
* Add JavaScript (WebAssembly) API for Kokoro TTS models. (#1726)
* Add Kotlin and Java API for Kokoro TTS models (#1728)
* Update README.md for KWS to not use git lfs. (#1729)
## 1.10.39
* Fix building without TTS (#1691)
* Add README for android libs. (#1693)
* Fix: export-onnx.py(expected all tensors to be on the same device) (#1699)
* Fix passing strings from C# to C. (#1701)
## 1.10.38
* Fix initializing TTS in Python. (#1664)
* Remove spaces after punctuations for TTS (#1666)
* Add constructor fromPtr() for all flutter class with factory ctor. (#1667)
* Add Kotlin API for Matcha-TTS models. (#1668)
* Support Matcha-TTS models using espeak-ng (#1672)
* Add Java API for Matcha-TTS models. (#1673)
* Avoid adding tail padding for VAD in generate-subtitles.py (#1674)
* Add C API for MatchaTTS models (#1675)
* Add CXX API for MatchaTTS models (#1676)
* Add JavaScript API (node-addon-api) for MatchaTTS models. (#1677)
* Add HarmonyOS examples for MatchaTTS. (#1678)
* Upgraded to .NET 8 and made code style a little more internally consistent. (#1680)
* Update workflows to use .NET 8.0 also. (#1681)
* Add C# and JavaScript (wasm) API for MatchaTTS models (#1682)
* Add Android demo for MatchaTTS models. (#1683)
* Add Swift API for MatchaTTS models. (#1684)
* Add Go API for MatchaTTS models (#1685)
* Add Pascal API for MatchaTTS models. (#1686)
* Add Dart API for MatchaTTS models (#1687)
## 1.10.37
* Add new tts models for Latvia and Persian+English (#1644)
* Add a byte-level BPE Chinese+English non-streaming zipformer model (#1645)
* Support removing invalid utf-8 sequences. (#1648)
* Add TeleSpeech CTC to non_streaming_server.py (#1649)
* Fix building macOS libs (#1656)
* Add Go API for Keyword spotting (#1662)
* Add Swift online punctuation (#1661)
* Add C++ runtime for Matcha-TTS (#1627)
## 1.10.36
* Update AAR version in Android Java demo (#1618)
* Support linking onnxruntime statically for Android (#1619)
* Update readme to include Open-LLM-VTuber (#1622)
* Rename maxNumStences to maxNumSentences (#1625)
* Support using onnxruntime 1.16.0 with CUDA 11.4 on Jetson Orin NX (Linux arm64 GPU). (#1630)
* Update readme to include jetson orin nx and nano b01 (#1631)
* feat: add checksum action (#1632)
* Support decoding with byte-level BPE (bbpe) models. (#1633)
* feat: enable c api for android ci (#1635)
* Update README.md (#1640)
* SherpaOnnxVadAsr: Offload runSecondPass to background thread for improved real-time audio processing (#1638)
* Fix GitHub actions. (#1642)
## 1.10.35
* Add missing changes about the speaker identification demo for HarmonyOS (#1612)
* Provide sherpa-onnx.aar for Android (#1615)
* Use aar in Android Java demo. (#1616)
## 1.10.34
* Fix building node-addon package (#1598)
* Update doc links for HarmonyOS (#1601)
* Add on-device real-time ASR demo for HarmonyOS (#1606)
* Add speaker identification APIs for HarmonyOS (#1607)
* Add speaker identification demo for HarmonyOS (#1608)
* Add speaker diarization API for HarmonyOS. (#1609)
* Add speaker diarization demo for HarmonyOS (#1610)
## 1.10.33
* Add non-streaming ASR support for HarmonyOS. (#1564)
* Add streaming ASR support for HarmonyOS. (#1565)
* Fix building for Android (#1568)
* Publish `sherpa_onnx.har` for HarmonyOS (#1572)
* Add VAD+ASR demo for HarmonyOS (#1573)
* Fix publishing har packages for HarmonyOS (#1576)
* Add CI to build HAPs for HarmonyOS (#1578)
* Add microphone demo about VAD+ASR for HarmonyOS (#1581)
* Fix getting microphone permission for HarmonyOS VAD+ASR example (#1582)
* Add HarmonyOS support for text-to-speech. (#1584)
* Fix: support both old and new websockets request headers format (#1588)
* Add on-device text-to-speech (TTS) demo for HarmonyOS (#1590)
## 1.10.32
* Support cross-compiling for HarmonyOS (#1553)
* HarmonyOS support for VAD. (#1561)
* Fix publishing flutter iOS app to appstore (#1563).
## 1.10.31
* Publish pre-built wheels for Python 3.13 (#1485)
* Publish pre-built macos xcframework (#1490)
* Fix reading tokens.txt on Windows. (#1497)
* Add two-pass ASR Android APKs for Moonshine models. (#1499)
* Support building GPU-capable sherpa-onnx on Linux aarch64. (#1500)
* Publish pre-built wheels with CUDA support for Linux aarch64. (#1507)
* Export the English TTS model from MeloTTS (#1509)
* Add Lazarus example for Moonshine models. (#1532)
* Add isolate_tts demo (#1529)
* Add WebAssembly example for VAD + Moonshine models. (#1535)
* Add Android APK for streaming Paraformer ASR (#1538)
* Support static build for windows arm64. (#1539)
* Use xcframework for Flutter iOS plugin to support iOS simulators.
## 1.10.30
* Fix building node-addon for Windows x86. (#1469)
* Begin to support https://github.com/usefulsensors/moonshine (#1470)
* Publish pre-built JNI libs for Linux aarch64 (#1472)
* Add C++ runtime and Python APIs for Moonshine models (#1473)
* Add Kotlin and Java API for Moonshine models (#1474)
* Add C and C++ API for Moonshine models (#1476)
* Add Swift API for Moonshine models. (#1477)
* Add Go API examples for adding punctuations to text. (#1478)
* Add Go API for Moonshine models (#1479)
* Add JavaScript API for Moonshine models (#1480)
* Add Dart API for Moonshine models. (#1481)
* Add Pascal API for Moonshine models (#1482)
* Add C# API for Moonshine models. (#1483)
## 1.10.29
* Add Go API for offline punctuation models (#1434)
* Support https://huggingface.co/Revai/reverb-diarization-v1 (#1437)
* Add more models for speaker diarization (#1440)
* Add Java API example for hotwords. (#1442)
* Add java android demo (#1454)
* Add C++ API for streaming ASR. (#1455)
* Add C++ API for non-streaming ASR (#1456)
* Handle NaN embeddings in speaker diarization. (#1461)
* Add speaker identification with VAD and non-streaming ASR using ALSA (#1463)
* Support GigaAM CTC models for Russian ASR (#1464)
* Add GigaAM NeMo transducer model for Russian ASR (#1467)
## 1.10.28
* Fix swift example for generating subtitles. (#1362)
* Allow more online models to load tokens file from the memory (#1352)
* Fix CI errors introduced by supporting loading keywords from buffers (#1366)
* Fix running MeloTTS models on GPU. (#1379)
* Support Parakeet models from NeMo (#1381)
* Export Pyannote speaker segmentation models to onnx (#1382)
* Support Agglomerative clustering. (#1384)
* Add Python API for clustering (#1385)
* support whisper turbo (#1390)
* context_state is not set correctly when previous context is passed after reset (#1393)
* Speaker diarization example with onnxruntime Python API (#1395)
* C++ API for speaker diarization (#1396)
* Python API for speaker diarization. (#1400)
* C API for speaker diarization (#1402)
* docs(nodejs-addon-examples): add guide for pnpm user (#1401)
* Go API for speaker diarization (#1403)
* Swift API for speaker diarization (#1404)
* Update readme to include more external projects using sherpa-onnx (#1405)
* C# API for speaker diarization (#1407)
* JavaScript API (node-addon) for speaker diarization (#1408)
* WebAssembly example for speaker diarization (#1411)
* Handle audio files less than 10s long for speaker diarization. (#1412)
* JavaScript API with WebAssembly for speaker diarization (#1414)
* Kotlin API for speaker diarization (#1415)
* Java API for speaker diarization (#1416)
* Dart API for speaker diarization (#1418)
* Pascal API for speaker diarization (#1420)
* Android JNI support for speaker diarization (#1421)
* Android demo for speaker diarization (#1423)
## 1.10.27
* Add non-streaming ONNX models for Russian ASR (#1358)
* Fix building Flutter TTS examples for Linux (#1356)
* Support passing utf-8 strings from JavaScript to C++. (#1355)
* Fix sherpa_onnx.go to support returning empty recognition results (#1353)
## 1.10.26
* Add links to projects using sherpa-onnx. (#1345)
* Support lang/emotion/event results from SenseVoice in Swift API. (#1346)
* Support specifying max speech duration for VAD. (#1348)
* Add APIs about max speech duration in VAD for various programming languages (#1349)
## 1.10.25
* Allow tokens and hotwords to be loaded directly from a buffered string (#1339)
* Fix computing features for CED audio tagging models. (#1341)
* Preserve previous result as context for next segment (#1335)
* Add Python binding for online punctuation models (#1312)
* Fix vad.Flush(). (#1329)
* Fix wasm app for streaming paraformer (#1328)
* Build websocket related binaries for embedded systems. (#1327)
* Fixed the C api calls and created the TTS project file (#1324)
* Re-implement LM rescore for online transducer (#1231)
## 1.10.24
* Add VAD and keyword spotting for the Node package with WebAssembly (#1286)
* Fix releasing npm package and fix building Android VAD+ASR example (#1288)
* add Tokens []string, Timestamps []float32, Lang string, Emotion string, Event string (#1277)
* add vad+sense voice example for C API (#1291)
* ADD VAD+ASR example for dart with CircularBuffer. (#1293)
* Fix VAD+ASR example for Dart API. (#1294)
* Avoid SherpaOnnxSpeakerEmbeddingManagerFreeBestMatches freeing null. (#1296)
* Fix releasing wasm app for vad+asr (#1300)
* remove extra files from linux/macos/windows jni libs (#1301)
* two-pass Android APK for SenseVoice (#1302)
* Downgrade flutter sdk versions. (#1305)
* Reduce onnxruntime log output. (#1306)
* Provide prebuilt .jar files for different java versions. (#1307)
## 1.10.23
* flutter: add lang, emotion, event to OfflineRecognizerResult (#1268)
* Use a separate thread to initialize models for lazarus examples. (#1270)
* Object pascal examples for recording and playing audio with portaudio. (#1271)
* Text to speech API for Object Pascal. (#1273)
* update kotlin api for better release native object and add user-friendly apis. (#1275)
* Update wave-reader.cc to support 8/16/32-bit waves (#1278)
* Add WebAssembly for VAD (#1281)
* WebAssembly example for VAD + Non-streaming ASR (#1284)
## 1.10.22
* Add Pascal API for reading wave files (#1243)
* Pascal API for streaming ASR (#1246)
* Pascal API for non-streaming ASR (#1247)
* Pascal API for VAD (#1249)
* Add more C API examples (#1255)
* Add emotion, event of SenseVoice. (#1257)
* Support reading multi-channel wave files with 8/16/32-bit encoded samples (#1258)
* Enable IPO only for Release build. (#1261)
* Add Lazarus example for generating subtitles using Silero VAD with non-streaming ASR (#1251)
* Fix looking up OOVs in lexicon.txt for MeloTTS models. (#1266)
## 1.10.21
* Fix ffmpeg c api example (#1185)
* Fix splitting sentences for MeloTTS (#1186)
* Non-streaming WebSocket client for Java. (#1190)
* Fix copying asset files for flutter examples. (#1191)
* Add Chinese+English tts example for flutter (#1192)
* Add speaker identification and verification example for Dart API (#1194)
* Fix reading non-standard wav files. (#1199)
* Add ReazonSpeech Japanese pre-trained model (#1203)
* Describe how to add new words for MeloTTS models (#1209)
* Remove libonnxruntime_providers_cuda.so as a dependency. (#1210)
* Fix setting SenseVoice language. (#1214)
* Support passing TTS callback in Swift API (#1218)
* Add MeloTTS example for ios (#1223)
* Add online punctuation and casing prediction model for English language (#1224)
* Fix python two pass ASR examples (#1230)
* Add blank penalty for various language bindings
## 1.10.20
* Add Dart API for audio tagging
* Add Dart API for adding punctuations to text
## 1.10.19
* Prefix all C API functions with SherpaOnnx
## 1.10.18
* Fix the case when recognition results contain the symbol `"`. It caused
issues when converting results to a json string.
## 1.10.17
* Support SenseVoice CTC models.
* Add Dart API for keyword spotter.
## 1.10.16
* Support zh-en TTS model from MeloTTS.
## 1.10.15
* Downgrade onnxruntime from v1.18.1 to v1.17.1
## 1.10.14
* Support whisper large v3
* Update onnxruntime from v1.18.0 to v1.18.1
* Fix invalid utf8 sequence from Whisper for Dart API.
## 1.10.13
* Update onnxruntime from 1.17.1 to 1.18.0
* Add C# API for Keyword spotting
## 1.10.12
* Add Flush to VAD so that the last speech segment can be detected. See also
https://github.com/k2-fsa/sherpa-onnx/discussions/1077#discussioncomment-9979740
## 1.10.11
* Support the iOS platform for Flutter.
## 1.10.10
* Build sherpa-onnx into a single shared library.
## 1.10.9
* Fix released packages. piper-phonemize was not included in v1.10.8.
## 1.10.8
* Fix released packages. There should be a lib directory.
## 1.10.7
* Support Android for Flutter.
## 1.10.2
* Fix passing C# string to C++
## 1.10.1
* Enable to stop TTS generation
## 1.10.0
* Add inverse text normalization
## 1.9.30
* Add TTS
## 1.9.29
* Publish with CI
## 0.0.3
* Fix path separator on Windows.
## 0.0.2
* Support specifying lib path.
## 0.0.1
* Initial release.

apps/frameworks/sherpa-mnn/CMakeLists.txt Normal file

@@ -0,0 +1,452 @@
cmake_minimum_required(VERSION 3.13 FATAL_ERROR)
set(CMAKE_OSX_DEPLOYMENT_TARGET "10.14" CACHE STRING "Minimum OS X deployment version. Used only for macOS")
set(CMAKE_POLICY_DEFAULT_CMP0063 NEW)
set(CMAKE_POLICY_DEFAULT_CMP0069 NEW)
project(sherpa-mnn)
message(STATUS "MNN's dir: ${MNN_LIB_DIR}")
include_directories(${MNN_LIB_DIR}/include)
link_directories(${MNN_LIB_DIR}/lib)
# Remember to update
# ./CHANGELOG.md
# ./new-release.sh
set(SHERPA_MNN_VERSION "1.10.46")
# Disable warning about
#
# "The DOWNLOAD_EXTRACT_TIMESTAMP option was not given and policy CMP0135 is
# not set.
if (CMAKE_VERSION VERSION_GREATER_EQUAL "3.24.0")
cmake_policy(SET CMP0135 NEW)
endif()
option(SHERPA_MNN_ENABLE_PYTHON "Whether to build Python" OFF)
option(SHERPA_MNN_ENABLE_TESTS "Whether to build tests" OFF)
option(SHERPA_MNN_ENABLE_CHECK "Whether to build with assert" OFF)
option(BUILD_SHARED_LIBS "Whether to build shared libraries" OFF)
option(SHERPA_MNN_ENABLE_PORTAUDIO "Whether to build with portaudio" ON)
option(SHERPA_MNN_ENABLE_JNI "Whether to build JNI interface" OFF)
option(SHERPA_MNN_ENABLE_C_API "Whether to build C API" ON)
option(SHERPA_MNN_ENABLE_WEBSOCKET "Whether to build websocket server/client" ON)
option(SHERPA_MNN_ENABLE_GPU "Enable ONNX Runtime GPU support" OFF)
option(SHERPA_MNN_ENABLE_DIRECTML "Enable ONNX Runtime DirectML support" OFF)
option(SHERPA_MNN_ENABLE_WASM "Whether to enable WASM" OFF)
option(SHERPA_MNN_ENABLE_WASM_SPEAKER_DIARIZATION "Whether to enable WASM for speaker diarization" OFF)
option(SHERPA_MNN_ENABLE_WASM_TTS "Whether to enable WASM for TTS" OFF)
option(SHERPA_MNN_ENABLE_WASM_ASR "Whether to enable WASM for ASR" OFF)
option(SHERPA_MNN_ENABLE_WASM_KWS "Whether to enable WASM for KWS" OFF)
option(SHERPA_MNN_ENABLE_WASM_VAD "Whether to enable WASM for VAD" OFF)
option(SHERPA_MNN_ENABLE_WASM_VAD_ASR "Whether to enable WASM for VAD+ASR" OFF)
option(SHERPA_MNN_ENABLE_WASM_NODEJS "Whether to enable WASM for NodeJS" OFF)
option(SHERPA_MNN_ENABLE_BINARY "Whether to build binaries" ON)
option(SHERPA_MNN_ENABLE_TTS "Whether to build TTS related code" ON)
option(SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION "Whether to build speaker diarization related code" ON)
option(SHERPA_MNN_LINK_LIBSTDCPP_STATICALLY "True to link libstdc++ statically. Used only when BUILD_SHARED_LIBS is OFF on Linux" ON)
option(SHERPA_MNN_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE "True to use pre-installed onnxruntime if available" ON)
option(SHERPA_MNN_ENABLE_SANITIZER "Whether to enable ubsan and asan" OFF)
option(SHERPA_MNN_BUILD_C_API_EXAMPLES "Whether to enable C API examples" ON)
option(SHERPA_MNN_ENABLE_RKNN "Whether to build for RKNN NPU" OFF)
set(SHERPA_MNN_LINUX_ARM64_GPU_ONNXRUNTIME_VERSION "1.11.0" CACHE STRING "Used only for Linux ARM64 GPU. If you use Jetson Nano B01, set it to 1.11.0. If you use Jetson Orin NX, set it to 1.16.0. If you use the NVIDIA Jetson Orin Nano Engineering Reference Developer Kit Super (JetPack 6.2 [L4T 36.4.3]), set it to 1.18.1")
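# Illustrative configure invocation (a sketch, not from this repo's docs; the
# MNN_LIB_DIR value is a placeholder, and every -D flag below is one of the
# options defined above):
#   cmake -B build \
#     -DMNN_LIB_DIR=/path/to/MNN/build \
#     -DBUILD_SHARED_LIBS=ON \
#     -DSHERPA_MNN_ENABLE_TTS=ON \
#     -DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
#     -DCMAKE_BUILD_TYPE=Release
#   cmake --build build -j8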
set(CMAKE_ARCHIVE_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")
set(CMAKE_LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/lib")
set(CMAKE_RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/bin")
if(NOT WIN32)
set(CMAKE_SKIP_BUILD_RPATH FALSE)
set(BUILD_RPATH_USE_ORIGIN TRUE)
set(CMAKE_INSTALL_RPATH_USE_LINK_PATH TRUE)
endif()
if(NOT APPLE)
set(SHERPA_MNN_RPATH_ORIGIN "$ORIGIN")
else()
set(SHERPA_MNN_RPATH_ORIGIN "@loader_path")
endif()
if(NOT WIN32)
set(CMAKE_INSTALL_RPATH ${SHERPA_MNN_RPATH_ORIGIN})
set(CMAKE_BUILD_RPATH ${SHERPA_MNN_RPATH_ORIGIN})
endif()
if(NOT CMAKE_BUILD_TYPE)
message(STATUS "No CMAKE_BUILD_TYPE given, default to Release")
set(CMAKE_BUILD_TYPE Release)
endif()
if(DEFINED ANDROID_ABI AND NOT SHERPA_MNN_ENABLE_JNI AND NOT SHERPA_MNN_ENABLE_C_API)
message(STATUS "Set SHERPA_MNN_ENABLE_JNI to ON for Android")
set(SHERPA_MNN_ENABLE_JNI ON CACHE BOOL "" FORCE)
endif()
if(SHERPA_MNN_ENABLE_PYTHON AND NOT BUILD_SHARED_LIBS)
message(STATUS "Set BUILD_SHARED_LIBS to ON since SHERPA_MNN_ENABLE_PYTHON is ON")
set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
endif()
if(SHERPA_MNN_ENABLE_GPU)
message(WARNING "\
Compiling for NVIDIA GPU is enabled. Please make sure cudatoolkit
is installed on your system. Otherwise, you will get errors at runtime.
Hint: You don't need sudo permission to install CUDA toolkit. Please refer to
https://k2-fsa.github.io/k2/installation/cuda-cudnn.html
to install CUDA toolkit if you have not installed it.")
if(NOT BUILD_SHARED_LIBS)
message(STATUS "Set BUILD_SHARED_LIBS to ON since SHERPA_MNN_ENABLE_GPU is ON")
set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
endif()
endif()
if(SHERPA_MNN_ENABLE_DIRECTML)
message(WARNING "\
Compiling with DirectML enabled. Please make sure Windows 10 SDK
is installed on your system. Otherwise, you will get errors at runtime.
Please refer to
https://onnxruntime.ai/docs/execution-providers/DirectML-ExecutionProvider.html#requirements
to install Windows 10 SDK if you have not installed it.")
if(NOT BUILD_SHARED_LIBS)
message(STATUS "Set BUILD_SHARED_LIBS to ON since SHERPA_MNN_ENABLE_DIRECTML is ON")
set(BUILD_SHARED_LIBS ON CACHE BOOL "" FORCE)
endif()
endif()
# see https://cmake.org/cmake/help/latest/prop_tgt/MSVC_RUNTIME_LIBRARY.html
# https://stackoverflow.com/questions/14172856/compile-with-mt-instead-of-md-using-cmake
if(MSVC)
add_compile_options(
$<$<CONFIG:>:/MT> #---------|
$<$<CONFIG:Debug>:/MTd> #---|-- Statically link the runtime libraries
$<$<CONFIG:Release>:/MT> #--|
$<$<CONFIG:RelWithDebInfo>:/MT>
$<$<CONFIG:MinSizeRel>:/MT>
)
endif()
if(CMAKE_SYSTEM_NAME STREQUAL OHOS)
set(CMAKE_CXX_FLAGS "-Wno-unused-command-line-argument ${CMAKE_CXX_FLAGS}")
set(CMAKE_C_FLAGS "-Wno-unused-command-line-argument ${CMAKE_C_FLAGS}")
endif()
message(STATUS "CMAKE_BUILD_TYPE: ${CMAKE_BUILD_TYPE}")
message(STATUS "CMAKE_INSTALL_PREFIX: ${CMAKE_INSTALL_PREFIX}")
message(STATUS "BUILD_SHARED_LIBS ${BUILD_SHARED_LIBS}")
message(STATUS "SHERPA_MNN_ENABLE_PYTHON ${SHERPA_MNN_ENABLE_PYTHON}")
message(STATUS "SHERPA_MNN_ENABLE_TESTS ${SHERPA_MNN_ENABLE_TESTS}")
message(STATUS "SHERPA_MNN_ENABLE_CHECK ${SHERPA_MNN_ENABLE_CHECK}")
message(STATUS "SHERPA_MNN_ENABLE_PORTAUDIO ${SHERPA_MNN_ENABLE_PORTAUDIO}")
message(STATUS "SHERPA_MNN_ENABLE_JNI ${SHERPA_MNN_ENABLE_JNI}")
message(STATUS "SHERPA_MNN_ENABLE_C_API ${SHERPA_MNN_ENABLE_C_API}")
message(STATUS "SHERPA_MNN_ENABLE_WEBSOCKET ${SHERPA_MNN_ENABLE_WEBSOCKET}")
message(STATUS "SHERPA_MNN_ENABLE_GPU ${SHERPA_MNN_ENABLE_GPU}")
message(STATUS "SHERPA_MNN_ENABLE_WASM ${SHERPA_MNN_ENABLE_WASM}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_SPEAKER_DIARIZATION ${SHERPA_MNN_ENABLE_WASM_SPEAKER_DIARIZATION}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_TTS ${SHERPA_MNN_ENABLE_WASM_TTS}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_ASR ${SHERPA_MNN_ENABLE_WASM_ASR}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_KWS ${SHERPA_MNN_ENABLE_WASM_KWS}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_VAD ${SHERPA_MNN_ENABLE_WASM_VAD}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_VAD_ASR ${SHERPA_MNN_ENABLE_WASM_VAD_ASR}")
message(STATUS "SHERPA_MNN_ENABLE_WASM_NODEJS ${SHERPA_MNN_ENABLE_WASM_NODEJS}")
message(STATUS "SHERPA_MNN_ENABLE_BINARY ${SHERPA_MNN_ENABLE_BINARY}")
message(STATUS "SHERPA_MNN_ENABLE_TTS ${SHERPA_MNN_ENABLE_TTS}")
message(STATUS "SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION ${SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION}")
message(STATUS "SHERPA_MNN_LINK_LIBSTDCPP_STATICALLY ${SHERPA_MNN_LINK_LIBSTDCPP_STATICALLY}")
message(STATUS "SHERPA_MNN_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE ${SHERPA_MNN_USE_PRE_INSTALLED_ONNXRUNTIME_IF_AVAILABLE}")
message(STATUS "SHERPA_MNN_ENABLE_SANITIZER: ${SHERPA_MNN_ENABLE_SANITIZER}")
message(STATUS "SHERPA_MNN_BUILD_C_API_EXAMPLES: ${SHERPA_MNN_BUILD_C_API_EXAMPLES}")
message(STATUS "SHERPA_MNN_ENABLE_RKNN: ${SHERPA_MNN_ENABLE_RKNN}")
if(BUILD_SHARED_LIBS OR SHERPA_MNN_ENABLE_JNI)
set(CMAKE_CXX_VISIBILITY_PRESET hidden)
set(CMAKE_VISIBILITY_INLINES_HIDDEN 1)
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
endif()
if(BUILD_SHARED_LIBS AND NOT CMAKE_SYSTEM_NAME STREQUAL iOS AND CMAKE_BUILD_TYPE STREQUAL Release)
# Don't use LTO for iOS since it causes the following error
# error: unable to find any architecture information in the binary
# at '/Users/fangjun/open-source/sherpa-onnx/build-ios/build/os64/sherpa-onnx.a':
# Unknown header: 0xb17c0de
# See also https://forums.developer.apple.com/forums/thread/714324
include(CheckIPOSupported)
check_ipo_supported(RESULT ipo)
if(ipo)
message(STATUS "IPO is enabled")
set(CMAKE_INTERPROCEDURAL_OPTIMIZATION ON)
else()
message(STATUS "IPO is not available")
endif()
endif()
if(SHERPA_MNN_ENABLE_TTS)
message(STATUS "TTS is enabled")
add_definitions(-DSHERPA_MNN_ENABLE_TTS=1)
else()
message(WARNING "TTS is disabled")
add_definitions(-DSHERPA_MNN_ENABLE_TTS=0)
endif()
if(SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION)
message(STATUS "speaker diarization is enabled")
add_definitions(-DSHERPA_MNN_ENABLE_SPEAKER_DIARIZATION=1)
else()
message(WARNING "speaker diarization is disabled")
add_definitions(-DSHERPA_MNN_ENABLE_SPEAKER_DIARIZATION=0)
endif()
if(SHERPA_MNN_ENABLE_DIRECTML)
message(STATUS "DirectML is enabled")
add_definitions(-DSHERPA_MNN_ENABLE_DIRECTML=1)
else()
message(STATUS "DirectML is disabled")
add_definitions(-DSHERPA_MNN_ENABLE_DIRECTML=0)
endif()
if(SHERPA_MNN_ENABLE_WASM_SPEAKER_DIARIZATION)
if(NOT SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION to ON if you want to build WASM for speaker diarization")
endif()
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for speaker diarization")
endif()
endif()
if(SHERPA_MNN_ENABLE_WASM_TTS)
if(NOT SHERPA_MNN_ENABLE_TTS)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_TTS to ON if you want to build WASM for TTS")
endif()
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for TTS")
endif()
endif()
if(SHERPA_MNN_ENABLE_WASM_ASR)
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for ASR")
endif()
endif()
if(SHERPA_MNN_ENABLE_WASM_NODEJS)
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for NodeJS")
endif()
add_definitions(-DSHERPA_MNN_ENABLE_WASM_KWS=1)
endif()
if(SHERPA_MNN_ENABLE_WASM)
add_definitions(-DSHERPA_MNN_ENABLE_WASM=1)
endif()
if(SHERPA_MNN_ENABLE_WASM_KWS)
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for KWS")
endif()
add_definitions(-DSHERPA_MNN_ENABLE_WASM_KWS=1)
endif()
if(SHERPA_MNN_ENABLE_WASM_VAD)
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for VAD")
endif()
endif()
if(SHERPA_MNN_ENABLE_WASM_VAD_ASR)
if(NOT SHERPA_MNN_ENABLE_WASM)
message(FATAL_ERROR "Please set SHERPA_MNN_ENABLE_WASM to ON if you enable WASM for VAD+ASR")
endif()
endif()
if(NOT CMAKE_CXX_STANDARD)
set(CMAKE_CXX_STANDARD 17 CACHE STRING "The C++ version to be used.")
endif()
set(CMAKE_CXX_EXTENSIONS OFF)
message(STATUS "C++ Standard version: ${CMAKE_CXX_STANDARD}")
include(CheckIncludeFileCXX)
if(SHERPA_MNN_ENABLE_RKNN)
add_definitions(-DSHERPA_MNN_ENABLE_RKNN=1)
endif()
if(UNIX AND NOT APPLE AND NOT SHERPA_MNN_ENABLE_WASM AND NOT CMAKE_SYSTEM_NAME STREQUAL Android AND NOT CMAKE_SYSTEM_NAME STREQUAL OHOS)
check_include_file_cxx(alsa/asoundlib.h SHERPA_MNN_HAS_ALSA)
if(SHERPA_MNN_HAS_ALSA)
message(STATUS "With Alsa")
add_definitions(-DSHERPA_MNN_ENABLE_ALSA=1)
else()
message(WARNING "\
Could not find alsa/asoundlib.h !
We won't build sherpa-onnx-alsa
To fix that, please do:
(1) sudo apt-get install alsa-utils libasound2-dev
(2) rm -rf build
(3) re-try
")
endif()
endif()
check_include_file_cxx(cxxabi.h SHERPA_MNN_HAVE_CXXABI_H)
check_include_file_cxx(execinfo.h SHERPA_MNN_HAVE_EXECINFO_H)
if(WIN32)
add_definitions(-DNOMINMAX) # Otherwise, std::max() and std::min() won't work
endif()
if(WIN32 AND MSVC)
# disable various warnings for MSVC
# 4244: 'return': conversion from 'unsigned __int64' to 'int', possible loss of data
# 4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data
# 4305: 'argument': truncation from 'double' to 'const float'
# 4334: '<<': result of 32-bit shift implicitly converted to 64 bits
# 4800: 'int': forcing value to bool 'true' or 'false'
# 4996: 'fopen': This function or variable may be unsafe
set(disabled_warnings
/wd4244
/wd4267
/wd4305
/wd4334
/wd4800
/wd4996
)
message(STATUS "Disabled warnings: ${disabled_warnings}")
foreach(w IN LISTS disabled_warnings)
string(APPEND CMAKE_CXX_FLAGS " ${w} ")
endforeach()
add_compile_options("$<$<C_COMPILER_ID:MSVC>:/utf-8>")
add_compile_options("$<$<CXX_COMPILER_ID:MSVC>:/utf-8>")
endif()
list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake/Modules)
list(APPEND CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)
if(SHERPA_MNN_ENABLE_WASM)
# Enable it for debugging in case there is something wrong.
# string(APPEND CMAKE_CXX_FLAGS " -g4 -s ASSERTIONS=2 -s SAFE_HEAP=1 -s STACK_OVERFLOW_CHECK=1 ")
endif()
if(NOT BUILD_SHARED_LIBS AND CMAKE_SYSTEM_NAME STREQUAL Linux)
if(SHERPA_MNN_LINK_LIBSTDCPP_STATICALLY)
message(STATUS "Link libstdc++ statically")
set(CMAKE_CXX_FLAGS " ${CMAKE_CXX_FLAGS} -static-libstdc++ -static-libgcc ")
else()
message(STATUS "Link libstdc++ dynamically")
endif()
endif()
include(kaldi-native-fbank)
include(kaldi-decoder)
include(simple-sentencepiece)
set(ONNXRUNTIME_DIR ${onnxruntime_SOURCE_DIR})
message(STATUS "ONNXRUNTIME_DIR: ${ONNXRUNTIME_DIR}")
if(SHERPA_MNN_ENABLE_PORTAUDIO AND SHERPA_MNN_ENABLE_BINARY)
# portaudio is used only in building demo binaries and the sherpa-onnx-core
# library does not depend on it.
include(portaudio)
endif()
if(SHERPA_MNN_ENABLE_PYTHON)
include(pybind11)
endif()
if(SHERPA_MNN_ENABLE_TESTS)
enable_testing()
include(googletest)
endif()
if(SHERPA_MNN_ENABLE_WEBSOCKET)
include(websocketpp)
include(asio)
endif()
if(SHERPA_MNN_ENABLE_TTS)
include(espeak-ng-for-piper)
set(ESPEAK_NG_DIR ${espeak_ng_SOURCE_DIR})
message(STATUS "ESPEAK_NG_DIR: ${ESPEAK_NG_DIR}")
include(piper-phonemize)
include(cppjieba) # For Chinese TTS. It is a header-only C++ library
endif()
if(SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION)
include(hclust-cpp)
endif()
# if(NOT MSVC AND CMAKE_BUILD_TYPE STREQUAL Debug AND (CMAKE_CXX_COMPILER_ID STREQUAL "Clang" OR CMAKE_CXX_COMPILER_ID STREQUAL "AppleClang"))
if(SHERPA_MNN_ENABLE_SANITIZER)
message(WARNING "enable ubsan and asan")
set(CMAKE_REQUIRED_LIBRARIES -lubsan -lasan)
include(CheckCCompilerFlag)
set(flags -fsanitize=undefined )
string(APPEND flags " -fno-sanitize-recover=undefined ")
string(APPEND flags " -fsanitize=integer ")
string(APPEND flags " -fsanitize=nullability ")
string(APPEND flags " -fsanitize=implicit-conversion ")
string(APPEND flags " -fsanitize=bounds ")
string(APPEND flags " -fsanitize=address ")
if(OFF)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${flags} -Wall -Wextra")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${flags} -Wall -Wextra")
else()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} ${flags}")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${flags}")
endif()
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} ${flags}")
add_compile_options(-fno-omit-frame-pointer)
endif()
add_subdirectory(sherpa-mnn)
if(SHERPA_MNN_ENABLE_C_API AND SHERPA_MNN_ENABLE_BINARY AND SHERPA_MNN_BUILD_C_API_EXAMPLES)
set(SHERPA_MNN_PKG_WITH_CARGS "-lcargs")
add_subdirectory(c-api-examples)
add_subdirectory(cxx-api-examples)
endif()
if(SHERPA_MNN_ENABLE_WASM)
add_subdirectory(wasm)
endif()
message(STATUS "CMAKE_CXX_FLAGS: ${CMAKE_CXX_FLAGS}")
if(NOT BUILD_SHARED_LIBS)
if(APPLE)
set(SHERPA_MNN_PKG_CONFIG_EXTRA_LIBS "-lc++ -framework Foundation")
endif()
if(UNIX AND NOT APPLE)
set(SHERPA_MNN_PKG_CONFIG_EXTRA_LIBS "-lstdc++ -lm -pthread -ldl")
endif()
endif()
if(NOT BUILD_SHARED_LIBS)
# See https://people.freedesktop.org/~dbn/pkg-config-guide.html
if(SHERPA_MNN_ENABLE_TTS)
configure_file(cmake/sherpa-onnx-static.pc.in ${PROJECT_BINARY_DIR}/sherpa-onnx.pc @ONLY)
else()
configure_file(cmake/sherpa-onnx-static-no-tts.pc.in ${PROJECT_BINARY_DIR}/sherpa-onnx.pc @ONLY)
endif()
else()
configure_file(cmake/sherpa-onnx-shared.pc.in ${PROJECT_BINARY_DIR}/sherpa-onnx.pc @ONLY)
endif()
install(
FILES
${PROJECT_BINARY_DIR}/sherpa-onnx.pc
DESTINATION
./
)
message(STATUS "CMAKE_CXX_FLAGS: ${CMAKE_CXX_FLAGS}")
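# Usage note (a sketch; assumes the chosen install prefix is on PKG_CONFIG_PATH
# and a hypothetical main.c that calls the C API): downstream programs can pull
# compile/link flags from the installed sherpa-onnx.pc, e.g.
#   export PKG_CONFIG_PATH=/path/to/install:$PKG_CONFIG_PATH
#   gcc main.c -o main $(pkg-config --cflags --libs sherpa-onnx)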


@@ -0,0 +1 @@
filter=-./mfc-examples

apps/frameworks/sherpa-mnn/LICENSE Normal file

@@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

apps/frameworks/sherpa-mnn/MANIFEST.in Normal file

@@ -0,0 +1,12 @@
include LICENSE
include README.md
include CMakeLists.txt
recursive-include c-api-examples *.*
recursive-include sherpa-onnx *.*
recursive-include cmake *.*
prune */__pycache__
prune android
prune sherpa-onnx/java-api
prune ios-swift
prune ios-swiftui

apps/frameworks/sherpa-mnn/NOTICE Normal file

@@ -0,0 +1,19 @@
# NOTICE
## Project Info
- **Name**: sherpa-mnn
- **License**: Apache 2.0
## Dependencies
- [MNN](https://github.com/alibaba/MNN/)
## Modifications
This project is derived from sherpa-onnx (https://github.com/k2-fsa/sherpa-onnx).
Key changes include:
- Use MNN instead of onnxruntime for deep learning model inference
- Rename sherpa-onnx to sherpa-mnn
## Copyright
Copyright (c) 2022-2023 Xiaomi Corporation. All rights reserved. Copyright (c) 2025, MNN Team.


@@ -0,0 +1,93 @@
# sherpa-mnn
This project is adapted from sherpa-onnx, replacing all onnxruntime calls with MNN.
## Preparing the MNN environment and models
### Building MNN
Download MNN: https://github.com/alibaba/MNN/
When building MNN, additionally pass `-DMNN_SEP_BUILD=OFF` and `-DCMAKE_INSTALL_PREFIX=.`:
```
mkdir build
cd build
cmake .. -DMNN_LOW_MEMORY=ON -DMNN_SEP_BUILD=OFF -DCMAKE_INSTALL_PREFIX=. -DMNN_BUILD_CONVERTER=ON
make -j4
make install
```
### Model conversion
In the directory where MNNConvert was built (the build directory above), run the following commands to convert each downloaded FP32 onnx model to mnn. Quantizing during conversion is recommended: it reduces the model size and, when the MNN library is built with `MNN_LOW_MEMORY` enabled, also lowers runtime memory usage and improves performance. Do not convert int8 onnx models directly.
```
mkdir sherpa-mnn-models
./MNNConvert -f ONNX --modelFile sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx --MNNModel sherpa-mnn-models/encode.mnn --weightQuantBits=8 --weightQuantBlock=64
./MNNConvert -f ONNX --modelFile sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx --MNNModel sherpa-mnn-models/decode.mnn --weightQuantBits=8 --weightQuantBlock=64
./MNNConvert -f ONNX --modelFile sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx --MNNModel sherpa-mnn-models/joiner.mnn --weightQuantBits=8 --weightQuantBlock=64
```
## Building and testing locally
### Build
Return to the sherpa-mnn root directory and run the following, setting `MNN_LIB_DIR` to your own MNN build directory:
```
mkdir build
cd build
cmake .. -DMNN_LIB_DIR=/Users/xtjiang/alicnn/AliNNPrivate/build
make -j16
```
### Test
From the sherpa-mnn root directory, using the sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 model as an example:
```
./build/bin/sherpa-mnn --tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt --encoder=./sherpa-mnn-models/encode.mnn --decoder=./sherpa-mnn-models/decode.mnn --joiner=./sherpa-mnn-models/joiner.mnn ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/1.wav
```
If everything works, it prints output like the following:
```
Number of threads: 1, Elapsed seconds: 0.27, Audio duration (s): 5.1, Real time factor (RTF) = 0.27/5.1 = 0.053
这是第一种第二种叫与 ALWAYS ALWAYS什么意思
{ "text": "这是第一种第二种叫与 ALWAYS ALWAYS什么意思", "tokens": ["这", "是", "第", "一", "种", "第", "二", "种", "叫", "与", " ALWAYS", " ALWAYS", "什", "么", "意", "思"], "timestamps": [0.96, 1.04, 1.28, 1.40, 1.48, 1.72, 1.84, 2.04, 2.44, 3.64, 3.84, 4.36, 4.72, 4.76, 4.92, 5.04], "ys_probs": [-0.884769, -0.858386, -1.106216, -0.626572, -1.101773, -0.359962, -0.745972, -0.267809, -0.826859, -1.076653, -0.683002, -0.869667, -0.593140, -0.469688, -0.256882, -0.442532], "lm_probs": [], "context_scores": [], "segment": 0, "words": [], "start_time": 0.00, "is_final": false}
```
## Building for Android
### Building MNN for Android
From the MNN directory, run:
```
cd project/android
mkdir build_64
../build_64.sh -DMNN_LOW_MEMORY=ON -DMNN_SEP_BUILD=OFF -DCMAKE_INSTALL_PREFIX=.
make install
```
### Building sherpa-mnn for Android
Edit the build-android-arm64-v8a.sh script,
setting `MNN_LIB_DIR` to the MNN Android build directory above,
then run build-android-arm64-v8a.sh.
If the resulting .so is large, you can strip it with the Android NDK strip tool, as sketched below.
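A minimal sketch of both steps (the `MNN_LIB_DIR` value, the NDK host tag, and the library name are placeholders; adjust them to your setup):
```
# point MNN_LIB_DIR in build-android-arm64-v8a.sh at the MNN Android build, e.g.
# MNN_LIB_DIR=/path/to/MNN/project/android/build_64
./build-android-arm64-v8a.sh

# optional: shrink the resulting shared library with the NDK's llvm-strip
$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-strip libsherpa-mnn-jni.so
```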
## Building for iOS
Edit the build-ios.sh script,
setting `MNN_LIB_DIR` to the MNN root directory (it only needs to be able to find the MNN headers),
then run the build-ios.sh script:
```
export MNN_LIB_DIR=/path/to/MNN
sh build-ios.sh
```
This produces build-ios/sherpa-mnn.xcframework.
## Building the macOS framework
The process is similar to the iOS build: edit build-swift-macos.sh,
setting `MNN_LIB_DIR` to the MNN root directory (it only needs to be able to find the MNN headers),
then run build-swift-macos.sh.
This produces build-swift-macos/sherpa-mnn.xcframework/.
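The invocation mirrors the iOS build above; a minimal sketch, assuming the MNN checkout is at /path/to/MNN:
```
export MNN_LIB_DIR=/path/to/MNN
sh build-swift-macos.sh
# output: build-swift-macos/sherpa-mnn.xcframework/
```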


@@ -0,0 +1,446 @@
### Supported functions
|Speech recognition| Speech synthesis |
|------------------|------------------|
| ✔️ | ✔️ |
|Speaker identification| Speaker diarization | Speaker verification |
|----------------------|-------------------- |------------------------|
| ✔️ | ✔️ | ✔️ |
| Spoken Language identification | Audio tagging | Voice activity detection |
|--------------------------------|---------------|--------------------------|
| ✔️ | ✔️ | ✔️ |
| Keyword spotting | Add punctuation | Speech enhancement |
|------------------|-----------------|--------------------|
| ✔️ | ✔️ | ✔️ |
### Supported platforms
|Architecture| Android | iOS | Windows | macOS | linux | HarmonyOS |
|------------|---------|---------|------------|-------|-------|-----------|
| x64 | ✔️ | | ✔️ | ✔️ | ✔️ | ✔️ |
| x86 | ✔️ | | ✔️ | | | |
| arm64 | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| arm32 | ✔️ | | | | ✔️ | ✔️ |
| riscv64 | | | | | ✔️ | |
### Supported programming languages
| 1. C++ | 2. C | 3. Python | 4. JavaScript |
|--------|-------|-----------|---------------|
| ✔️ | ✔️ | ✔️ | ✔️ |
|5. Java | 6. C# | 7. Kotlin | 8. Swift |
|--------|-------|-----------|----------|
| ✔️ | ✔️ | ✔️ | ✔️ |
| 9. Go | 10. Dart | 11. Rust | 12. Pascal |
|-------|----------|----------|------------|
| ✔️ | ✔️ | ✔️ | ✔️ |
For Rust support, please see [sherpa-rs][sherpa-rs]
It also supports WebAssembly.
## Introduction
This repository supports running the following functions **locally**
- Speech-to-text (i.e., ASR); both streaming and non-streaming are supported
- Text-to-speech (i.e., TTS)
- Speaker diarization
- Speaker identification
- Speaker verification
- Spoken language identification
- Audio tagging
- VAD (e.g., [silero-vad][silero-vad])
- Keyword spotting
on the following platforms and operating systems:
- x86, ``x86_64``, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)
- Linux, macOS, Windows, openKylin
- Android, WearOS
- iOS
- HarmonyOS
- NodeJS
- WebAssembly
- [NVIDIA Jetson Orin NX][NVIDIA Jetson Orin NX] (Support running on both CPU and GPU)
- [NVIDIA Jetson Nano B01][NVIDIA Jetson Nano B01] (Support running on both CPU and GPU)
- [Raspberry Pi][Raspberry Pi]
- [RV1126][RV1126]
- [LicheePi4A][LicheePi4A]
- [VisionFive 2][VisionFive 2]
- [旭日X3派][旭日X3派]
- [爱芯派][爱芯派]
- etc
with the following APIs
- C++, C, Python, Go, ``C#``
- Java, Kotlin, JavaScript
- Swift, Rust
- Dart, Object Pascal
### Links for Huggingface Spaces
<details>
<summary>You can visit the following Huggingface spaces to try sherpa-onnx without
installing anything. All you need is a browser.</summary>
| Description | URL |
|-------------------------------------------------------|-----------------------------------------|
| Speaker diarization | [Click me][hf-space-speaker-diarization]|
| Speech recognition | [Click me][hf-space-asr] |
| Speech recognition with [Whisper][Whisper] | [Click me][hf-space-asr-whisper] |
| Speech synthesis | [Click me][hf-space-tts] |
| Generate subtitles | [Click me][hf-space-subtitle] |
| Audio tagging | [Click me][hf-space-audio-tagging] |
| Spoken language identification with [Whisper][Whisper]| [Click me][hf-space-slid-whisper] |
We also have spaces built using WebAssembly. They are listed below:
| Description | Huggingface space| ModelScope space|
|------------------------------------------------------------------------------------------|------------------|-----------------|
|Voice activity detection with [silero-vad][silero-vad] | [Click me][wasm-hf-vad]|[Address][wasm-ms-vad]|
|Real-time speech recognition (Chinese + English) with Zipformer | [Click me][wasm-hf-streaming-asr-zh-en-zipformer]|[Address][wasm-ms-streaming-asr-zh-en-zipformer]|
|Real-time speech recognition (Chinese + English) with Paraformer |[Click me][wasm-hf-streaming-asr-zh-en-paraformer]| [Address][wasm-ms-streaming-asr-zh-en-paraformer]|
|Real-time speech recognition (Chinese + English + Cantonese) with [Paraformer-large][Paraformer-large]|[Click me][wasm-hf-streaming-asr-zh-en-yue-paraformer]| [Address][wasm-ms-streaming-asr-zh-en-yue-paraformer]|
|Real-time speech recognition (English) |[Click me][wasm-hf-streaming-asr-en-zipformer] |[Address][wasm-ms-streaming-asr-en-zipformer]|
|VAD + speech recognition (Chinese + English + Korean + Japanese + Cantonese) with [SenseVoice][SenseVoice]|[Click me][wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]| [Address][wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]|
|VAD + speech recognition (English) with [Whisper][Whisper] tiny.en|[Click me][wasm-hf-vad-asr-en-whisper-tiny-en]| [Address][wasm-ms-vad-asr-en-whisper-tiny-en]|
|VAD + speech recognition (English) with [Moonshine tiny][Moonshine tiny]|[Click me][wasm-hf-vad-asr-en-moonshine-tiny-en]| [Address][wasm-ms-vad-asr-en-moonshine-tiny-en]|
|VAD + speech recognition (English) with Zipformer trained with [GigaSpeech][GigaSpeech] |[Click me][wasm-hf-vad-asr-en-zipformer-gigaspeech]| [Address][wasm-ms-vad-asr-en-zipformer-gigaspeech]|
|VAD + speech recognition (Chinese) with Zipformer trained with [WenetSpeech][WenetSpeech] |[Click me][wasm-hf-vad-asr-zh-zipformer-wenetspeech]| [Address][wasm-ms-vad-asr-zh-zipformer-wenetspeech]|
|VAD + speech recognition (Japanese) with Zipformer trained with [ReazonSpeech][ReazonSpeech]|[Click me][wasm-hf-vad-asr-ja-zipformer-reazonspeech]| [Address][wasm-ms-vad-asr-ja-zipformer-reazonspeech]|
|VAD + speech recognition (Thai) with Zipformer trained with [GigaSpeech2][GigaSpeech2] |[Click me][wasm-hf-vad-asr-th-zipformer-gigaspeech2]| [Address][wasm-ms-vad-asr-th-zipformer-gigaspeech2]|
|VAD + speech recognition (Chinese, various dialects) with a [TeleSpeech-ASR][TeleSpeech-ASR] CTC model|[Click me][wasm-hf-vad-asr-zh-telespeech]| [Address][wasm-ms-vad-asr-zh-telespeech]|
|VAD + speech recognition (English + Chinese, plus various Chinese dialects) with Paraformer-large |[Click me][wasm-hf-vad-asr-zh-en-paraformer-large]| [Address][wasm-ms-vad-asr-zh-en-paraformer-large]|
|VAD + speech recognition (English + Chinese, plus various Chinese dialects) with Paraformer-small |[Click me][wasm-hf-vad-asr-zh-en-paraformer-small]| [Address][wasm-ms-vad-asr-zh-en-paraformer-small]|
|Speech synthesis (English) |[Click me][wasm-hf-tts-piper-en]| [Address][wasm-ms-tts-piper-en]|
|Speech synthesis (German) |[Click me][wasm-hf-tts-piper-de]| [Address][wasm-ms-tts-piper-de]|
|Speaker diarization |[Click me][wasm-hf-speaker-diarization]|[Address][wasm-ms-speaker-diarization]|
</details>
### Links for pre-built Android APKs
<details>
<summary>You can find pre-built Android APKs for this repository in the following table</summary>
| Description | URL | For users in China |
|----------------------------------------|------------------------------------|------------------------------------------|
| Speaker diarization | [Address][apk-speaker-diarization] | [Click here][apk-speaker-diarization-cn] |
| Streaming speech recognition | [Address][apk-streaming-asr] | [Click here][apk-streaming-asr-cn] |
| Text-to-speech | [Address][apk-tts] | [Click here][apk-tts-cn] |
| Voice activity detection (VAD) | [Address][apk-vad] | [Click here][apk-vad-cn] |
| VAD + non-streaming speech recognition | [Address][apk-vad-asr] | [Click here][apk-vad-asr-cn] |
| Two-pass speech recognition | [Address][apk-2pass] | [Click here][apk-2pass-cn] |
| Audio tagging | [Address][apk-at] | [Click here][apk-at-cn] |
| Audio tagging (WearOS) | [Address][apk-at-wearos] | [Click here][apk-at-wearos-cn] |
| Speaker identification | [Address][apk-sid] | [Click here][apk-sid-cn] |
| Spoken language identification | [Address][apk-slid] | [Click here][apk-slid-cn] |
| Keyword spotting | [Address][apk-kws] | [Click here][apk-kws-cn] |
</details>
### Links for pre-built Flutter APPs
<details>
#### Real-time speech recognition
| Description | URL | For users in China |
|--------------------------------|-------------------------------------|---------------------------------------------|
| Streaming speech recognition | [Address][apk-flutter-streaming-asr]| [Click here][apk-flutter-streaming-asr-cn] |
#### Text-to-speech
| Description | URL | 中国用户 |
|------------------------------------------|------------------------------------|------------------------------------|
| Android (arm64-v8a, armeabi-v7a, x86_64) | [Address][flutter-tts-android] | [点此][flutter-tts-android-cn] |
| Linux (x64) | [Address][flutter-tts-linux] | [点此][flutter-tts-linux-cn] |
| macOS (x64)                               | [Address][flutter-tts-macos-x64]   | [点此][flutter-tts-macos-x64-cn]   |
| macOS (arm64)                             | [Address][flutter-tts-macos-arm64] | [点此][flutter-tts-macos-arm64-cn] |
| Windows (x64) | [Address][flutter-tts-win-x64] | [点此][flutter-tts-win-x64-cn] |
> Note: You need to build from source for iOS.
</details>
### Links for pre-built Lazarus APPs
<details>
#### Generating subtitles
| Description | URL | 中国用户 |
|--------------------------------|----------------------------|----------------------------|
| Generate subtitles (生成字幕) | [Address][lazarus-subtitle]| [点此][lazarus-subtitle-cn]|
</details>
### Links for pre-trained models
<details>
| Description | URL |
|---------------------------------------------|---------------------------------------------------------------------------------------|
| Speech recognition (speech to text, ASR) | [Address][asr-models] |
| Text-to-speech (TTS) | [Address][tts-models] |
| VAD | [Address][vad-models] |
| Keyword spotting | [Address][kws-models] |
| Audio tagging | [Address][at-models] |
| Speaker identification (Speaker ID) | [Address][sid-models] |
| Spoken language identification (Language ID)| See multi-lingual [Whisper][Whisper] ASR models from [Speech recognition][asr-models]|
| Punctuation | [Address][punct-models] |
| Speaker segmentation | [Address][speaker-segmentation-models] |
| Speech enhancement | [Address][speech-enhancement-models] |
</details>
#### Some pre-trained ASR models (Streaming)
<details>
Please see
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/index.html>
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-paraformer/index.html>
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-ctc/index.html>
for more models. The following table lists only **SOME** of them.
|Name | Supported Languages| Description|
|-----|-----|----|
|[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20][sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]| Chinese, English| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english)|
|[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16][sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]| Chinese, English| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16-bilingual-chinese-english)|
|[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23][sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]|Chinese| Suitable for Cortex A7 CPU. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-zh-14m-2023-02-23)|
|[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17][sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]|English|Suitable for Cortex A7 CPU. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-en-20m-2023-02-17)|
|[sherpa-onnx-streaming-zipformer-korean-2024-06-16][sherpa-onnx-streaming-zipformer-korean-2024-06-16]|Korean| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-korean-2024-06-16-korean)|
|[sherpa-onnx-streaming-zipformer-fr-2023-04-14][sherpa-onnx-streaming-zipformer-fr-2023-04-14]|French| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#shaojieli-sherpa-onnx-streaming-zipformer-fr-2023-04-14-french)|
</details>
#### Some pre-trained ASR models (Non-Streaming)
<details>
Please see
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/index.html>
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/index.html>
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/index.html>
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/index.html>
- <https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/index.html>
for more models. The following table lists only **SOME** of them.
|Name | Supported Languages| Description|
|-----|-----|----|
|[Whisper tiny.en](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2)|English| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/whisper/tiny.en.html)|
|[Moonshine tiny][Moonshine tiny]|English|See [also](https://github.com/usefulsensors/moonshine)|
|[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17][sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]|Chinese, Cantonese, English, Korean, Japanese| Supports many Chinese dialects. See [also](https://k2-fsa.github.io/sherpa/onnx/sense-voice/index.html)|
|[sherpa-onnx-paraformer-zh-2024-03-09][sherpa-onnx-paraformer-zh-2024-03-09]|Chinese, English| Also supports many Chinese dialects. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-paraformer/paraformer-models.html#csukuangfj-sherpa-onnx-paraformer-zh-2024-03-09-chinese-english)|
|[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01][sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]|Japanese|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01-japanese)|
|[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24][sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]|Russian|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/nemo-transducer-models.html#sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24-russian)|
|[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24][sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]|Russian| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-ctc/nemo/russian.html#sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24)|
|[sherpa-onnx-zipformer-ru-2024-09-18][sherpa-onnx-zipformer-ru-2024-09-18]|Russian|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-ru-2024-09-18-russian)|
|[sherpa-onnx-zipformer-korean-2024-06-24][sherpa-onnx-zipformer-korean-2024-06-24]|Korean|See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-korean-2024-06-24-korean)|
|[sherpa-onnx-zipformer-thai-2024-06-20][sherpa-onnx-zipformer-thai-2024-06-20]|Thai| See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/offline-transducer/zipformer-transducer-models.html#sherpa-onnx-zipformer-thai-2024-06-20-thai)|
|[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04][sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]|Chinese| Supports many dialects. See [also](https://k2-fsa.github.io/sherpa/onnx/pretrained_models/telespeech/models.html#sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04)|
</details>
### Useful links
- Documentation: https://k2-fsa.github.io/sherpa/onnx/
- Demo videos on Bilibili: https://search.bilibili.com/all?keyword=%E6%96%B0%E4%B8%80%E4%BB%A3Kaldi
### How to reach us
Please see
https://k2-fsa.github.io/sherpa/social-groups.html
for the Next-gen Kaldi (新一代 Kaldi) **WeChat group** and **QQ group**.
## Projects using sherpa-onnx
### [Open-LLM-VTuber](https://github.com/t41372/Open-LLM-VTuber)
Talk to any LLM with hands-free voice interaction, voice interruption, and a
Live2D talking face, running locally across platforms.
See also <https://github.com/t41372/Open-LLM-VTuber/pull/50>
### [voiceapi](https://github.com/ruzhila/voiceapi)
<details>
<summary>Streaming ASR and TTS based on FastAPI</summary>
It shows how to use the ASR and TTS Python APIs with FastAPI.
</details>
### [TMSpeech, a live-captioning tool for Tencent Meeting (腾讯会议摸鱼工具)](https://github.com/jxlpzqc/TMSpeech)
Uses streaming ASR in C# with a graphical user interface.
Video demo in Chinese: [【开源】Windows实时字幕软件（网课/开会必备）](https://www.bilibili.com/video/BV1rX4y1p7Nx)
### [lol互动助手 (an interaction assistant for League of Legends)](https://github.com/l1veIn/lol-wom-electron)
It uses the JavaScript API of sherpa-onnx along with [Electron](https://electronjs.org/).
Video demo in Chinese: [爆了!炫神教你开打字挂!真正影响胜率的英雄联盟工具!英雄联盟的最后一块拼图!和游戏中的每个人无障碍沟通!](https://www.bilibili.com/video/BV142tje9E74)
### [Sherpa-ONNX 语音识别服务器 (speech recognition server)](https://github.com/hfyydd/sherpa-onnx-server)
A Node.js-based server providing a RESTful API for speech recognition.
### [QSmartAssistant](https://github.com/xinhecuican/QSmartAssistant)
A modular chatbot/smart speaker that runs fully offline with a low resource footprint.
It uses Qt. Both [ASR](https://github.com/xinhecuican/QSmartAssistant/blob/master/doc/%E5%AE%89%E8%A3%85.md#asr)
and [TTS](https://github.com/xinhecuican/QSmartAssistant/blob/master/doc/%E5%AE%89%E8%A3%85.md#tts)
are used.
### [Flutter-EasySpeechRecognition](https://github.com/Jason-chen-coder/Flutter-EasySpeechRecognition)
It extends [./flutter-examples/streaming_asr](./flutter-examples/streaming_asr) by
downloading models inside the app to reduce the size of the app.
### [sherpa-onnx-unity](https://github.com/xue-fei/sherpa-onnx-unity)
sherpa-onnx in Unity. See also [#1695](https://github.com/k2-fsa/sherpa-onnx/issues/1695),
[#1892](https://github.com/k2-fsa/sherpa-onnx/issues/1892), and [#1859](https://github.com/k2-fsa/sherpa-onnx/issues/1859)
[sherpa-rs]: https://github.com/thewh1teagle/sherpa-rs
[silero-vad]: https://github.com/snakers4/silero-vad
[Raspberry Pi]: https://www.raspberrypi.com/
[RV1126]: https://www.rock-chips.com/uploads/pdf/2022.8.26/191/RV1126%20Brief%20Datasheet.pdf
[LicheePi4A]: https://sipeed.com/licheepi4a
[VisionFive 2]: https://www.starfivetech.com/en/site/boards
[旭日X3派]: https://developer.horizon.ai/api/v1/fileData/documents_pi/index.html
[爱芯派]: https://wiki.sipeed.com/hardware/zh/maixIII/ax-pi/axpi.html
[hf-space-speaker-diarization]: https://huggingface.co/spaces/k2-fsa/speaker-diarization
[hf-space-asr]: https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition
[Whisper]: https://github.com/openai/whisper
[hf-space-asr-whisper]: https://huggingface.co/spaces/k2-fsa/automatic-speech-recognition-with-whisper
[hf-space-tts]: https://huggingface.co/spaces/k2-fsa/text-to-speech
[hf-space-subtitle]: https://huggingface.co/spaces/k2-fsa/generate-subtitles-for-videos
[hf-space-audio-tagging]: https://huggingface.co/spaces/k2-fsa/audio-tagging
[hf-space-slid-whisper]: https://huggingface.co/spaces/k2-fsa/spoken-language-identification
[wasm-hf-vad]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-sherpa-onnx
[wasm-ms-vad]: https://modelscope.cn/studios/csukuangfj/web-assembly-vad-sherpa-onnx
[wasm-hf-streaming-asr-zh-en-zipformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en
[wasm-ms-streaming-asr-zh-en-zipformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en
[wasm-hf-streaming-asr-zh-en-paraformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer
[wasm-ms-streaming-asr-zh-en-paraformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-en-paraformer
[Paraformer-large]: https://www.modelscope.cn/models/damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/summary
[wasm-hf-streaming-asr-zh-en-yue-paraformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer
[wasm-ms-streaming-asr-zh-en-yue-paraformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-zh-cantonese-en-paraformer
[wasm-hf-streaming-asr-en-zipformer]: https://huggingface.co/spaces/k2-fsa/web-assembly-asr-sherpa-onnx-en
[wasm-ms-streaming-asr-en-zipformer]: https://modelscope.cn/studios/k2-fsa/web-assembly-asr-sherpa-onnx-en
[SenseVoice]: https://github.com/FunAudioLLM/SenseVoice
[wasm-hf-vad-asr-zh-en-ko-ja-yue-sense-voice]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-ja-ko-cantonese-sense-voice
[wasm-ms-vad-asr-zh-en-ko-ja-yue-sense-voice]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-zh-en-jp-ko-cantonese-sense-voice
[wasm-hf-vad-asr-en-whisper-tiny-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-whisper-tiny
[wasm-ms-vad-asr-en-whisper-tiny-en]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-en-whisper-tiny
[wasm-hf-vad-asr-en-moonshine-tiny-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny
[wasm-ms-vad-asr-en-moonshine-tiny-en]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-en-moonshine-tiny
[wasm-hf-vad-asr-en-zipformer-gigaspeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech
[wasm-ms-vad-asr-en-zipformer-gigaspeech]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-en-zipformer-gigaspeech
[wasm-hf-vad-asr-zh-zipformer-wenetspeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech
[wasm-ms-vad-asr-zh-zipformer-wenetspeech]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-zipformer-wenetspeech
[ReazonSpeech]: https://research.reazon.jp/_static/reazonspeech_nlp2023.pdf
[wasm-hf-vad-asr-ja-zipformer-reazonspeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-ja-zipformer
[wasm-ms-vad-asr-ja-zipformer-reazonspeech]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-ja-zipformer
[GigaSpeech2]: https://github.com/SpeechColab/GigaSpeech2
[wasm-hf-vad-asr-th-zipformer-gigaspeech2]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-th-zipformer
[wasm-ms-vad-asr-th-zipformer-gigaspeech2]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-vad-asr-sherpa-onnx-th-zipformer
[TeleSpeech-ASR]: https://github.com/Tele-AI/TeleSpeech-ASR
[wasm-hf-vad-asr-zh-telespeech]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-telespeech
[wasm-ms-vad-asr-zh-telespeech]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-telespeech
[wasm-hf-vad-asr-zh-en-paraformer-large]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer
[wasm-ms-vad-asr-zh-en-paraformer-large]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer
[wasm-hf-vad-asr-zh-en-paraformer-small]: https://huggingface.co/spaces/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small
[wasm-ms-vad-asr-zh-en-paraformer-small]: https://www.modelscope.cn/studios/k2-fsa/web-assembly-vad-asr-sherpa-onnx-zh-en-paraformer-small
[wasm-hf-tts-piper-en]: https://huggingface.co/spaces/k2-fsa/web-assembly-tts-sherpa-onnx-en
[wasm-ms-tts-piper-en]: https://modelscope.cn/studios/k2-fsa/web-assembly-tts-sherpa-onnx-en
[wasm-hf-tts-piper-de]: https://huggingface.co/spaces/k2-fsa/web-assembly-tts-sherpa-onnx-de
[wasm-ms-tts-piper-de]: https://modelscope.cn/studios/k2-fsa/web-assembly-tts-sherpa-onnx-de
[wasm-hf-speaker-diarization]: https://huggingface.co/spaces/k2-fsa/web-assembly-speaker-diarization-sherpa-onnx
[wasm-ms-speaker-diarization]: https://www.modelscope.cn/studios/csukuangfj/web-assembly-speaker-diarization-sherpa-onnx
[apk-speaker-diarization]: https://k2-fsa.github.io/sherpa/onnx/speaker-diarization/apk.html
[apk-speaker-diarization-cn]: https://k2-fsa.github.io/sherpa/onnx/speaker-diarization/apk-cn.html
[apk-streaming-asr]: https://k2-fsa.github.io/sherpa/onnx/android/apk.html
[apk-streaming-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/android/apk-cn.html
[apk-tts]: https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine.html
[apk-tts-cn]: https://k2-fsa.github.io/sherpa/onnx/tts/apk-engine-cn.html
[apk-vad]: https://k2-fsa.github.io/sherpa/onnx/vad/apk.html
[apk-vad-cn]: https://k2-fsa.github.io/sherpa/onnx/vad/apk-cn.html
[apk-vad-asr]: https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr.html
[apk-vad-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/vad/apk-asr-cn.html
[apk-2pass]: https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass.html
[apk-2pass-cn]: https://k2-fsa.github.io/sherpa/onnx/android/apk-2pass-cn.html
[apk-at]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk.html
[apk-at-cn]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-cn.html
[apk-at-wearos]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos.html
[apk-at-wearos-cn]: https://k2-fsa.github.io/sherpa/onnx/audio-tagging/apk-wearos-cn.html
[apk-sid]: https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk.html
[apk-sid-cn]: https://k2-fsa.github.io/sherpa/onnx/speaker-identification/apk-cn.html
[apk-slid]: https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/apk.html
[apk-slid-cn]: https://k2-fsa.github.io/sherpa/onnx/spoken-language-identification/apk-cn.html
[apk-kws]: https://k2-fsa.github.io/sherpa/onnx/kws/apk.html
[apk-kws-cn]: https://k2-fsa.github.io/sherpa/onnx/kws/apk-cn.html
[apk-flutter-streaming-asr]: https://k2-fsa.github.io/sherpa/onnx/flutter/asr/app.html
[apk-flutter-streaming-asr-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/asr/app-cn.html
[flutter-tts-android]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-android.html
[flutter-tts-android-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-android-cn.html
[flutter-tts-linux]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-linux.html
[flutter-tts-linux-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-linux-cn.html
[flutter-tts-macos-x64]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-x64.html
[flutter-tts-macos-x64-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-x64-cn.html
[flutter-tts-macos-arm64]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-arm64.html
[flutter-tts-macos-arm64-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-macos-arm64-cn.html
[flutter-tts-win-x64]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-win.html
[flutter-tts-win-x64-cn]: https://k2-fsa.github.io/sherpa/onnx/flutter/tts-win-cn.html
[lazarus-subtitle]: https://k2-fsa.github.io/sherpa/onnx/lazarus/download-generated-subtitles.html
[lazarus-subtitle-cn]: https://k2-fsa.github.io/sherpa/onnx/lazarus/download-generated-subtitles-cn.html
[asr-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models
[tts-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
[vad-models]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
[kws-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/kws-models
[at-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models
[sid-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models
[slid-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models
[punct-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models
[speaker-segmentation-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models
[GigaSpeech]: https://github.com/SpeechColab/GigaSpeech
[WenetSpeech]: https://github.com/wenet-e2e/WenetSpeech
[sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
[sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-small-bilingual-zh-en-2023-02-16.tar.bz2
[sherpa-onnx-streaming-zipformer-korean-2024-06-16]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-korean-2024-06-16.tar.bz2
[sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-zh-14M-2023-02-23.tar.bz2
[sherpa-onnx-streaming-zipformer-en-20M-2023-02-17]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
[sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ja-reazonspeech-2024-08-01.tar.bz2
[sherpa-onnx-zipformer-ru-2024-09-18]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-ru-2024-09-18.tar.bz2
[sherpa-onnx-zipformer-korean-2024-06-24]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-korean-2024-06-24.tar.bz2
[sherpa-onnx-zipformer-thai-2024-06-20]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-thai-2024-06-20.tar.bz2
[sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-transducer-giga-am-russian-2024-10-24.tar.bz2
[sherpa-onnx-paraformer-zh-2024-03-09]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-2024-03-09.tar.bz2
[sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-nemo-ctc-giga-am-russian-2024-10-24.tar.bz2
[sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2
[sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
[sherpa-onnx-streaming-zipformer-fr-2023-04-14]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-fr-2023-04-14.tar.bz2
[Moonshine tiny]: https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
[NVIDIA Jetson Orin NX]: https://developer.download.nvidia.com/assets/embedded/secure/jetson/orin_nx/docs/Jetson_Orin_NX_DS-10712-001_v0.5.pdf?RCPGu9Q6OVAOv7a7vgtwc9-BLScXRIWq6cSLuditMALECJ_dOj27DgnqAPGVnT2VpiNpQan9SyFy-9zRykR58CokzbXwjSA7Gj819e91AXPrWkGZR3oS1VLxiDEpJa_Y0lr7UT-N4GnXtb8NlUkP4GkCkkF_FQivGPrAucCUywL481GH_WpP_p7ziHU1Wg==&t=eyJscyI6ImdzZW8iLCJsc2QiOiJodHRwczovL3d3dy5nb29nbGUuY29tLmhrLyJ9
[NVIDIA Jetson Nano B01]: https://www.seeedstudio.com/blog/2020/01/16/new-revision-of-jetson-nano-dev-kit-now-supports-new-jetson-nano-module/
[speech-enhancement-models]: https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models

View File

@ -0,0 +1,158 @@
#!/usr/bin/env bash
set -ex
# If BUILD_SHARED_LIBS is ON, we use libonnxruntime.so
# If BUILD_SHARED_LIBS is OFF, we use libonnxruntime.a
#
# In any case, we will have libsherpa-onnx-jni.so
#
# If BUILD_SHARED_LIBS is OFF, then libonnxruntime.a is linked into libsherpa-onnx-jni.so
# and you only need to copy libsherpa-onnx-jni.so to your Android projects.
#
# If BUILD_SHARED_LIBS is ON, then you need to copy both libsherpa-onnx-jni.so
# and libonnxruntime.so to your Android projects
#
BUILD_SHARED_LIBS=ON
if [ $BUILD_SHARED_LIBS == ON ]; then
dir=$PWD/build-android-arm64-v8a
else
dir=$PWD/build-android-arm64-v8a-static
fi
mkdir -p $dir
cd $dir
# Note from https://github.com/Tencent/ncnn/wiki/how-to-build#build-for-android
# (optional) remove the hardcoded debug flag in Android NDK android-ndk
# issue: https://github.com/android/ndk/issues/243
#
# open $ANDROID_NDK/build/cmake/android.toolchain.cmake for ndk < r23
# or $ANDROID_NDK/build/cmake/android-legacy.toolchain.cmake for ndk >= r23
#
# delete "-g" line
#
# list(APPEND ANDROID_COMPILER_FLAGS
# -g
# -DANDROID
if [ -z $ANDROID_NDK ]; then
ANDROID_NDK=/star-fj/fangjun/software/android-sdk/ndk/22.1.7171670
if [ $BUILD_SHARED_LIBS == OFF ]; then
ANDROID_NDK=/star-fj/fangjun/software/android-sdk/ndk/27.0.11718014
fi
# or use
# ANDROID_NDK=/star-fj/fangjun/software/android-ndk
#
# Inside the $ANDROID_NDK directory, you can find a binary ndk-build
# and some other files like the file "build/cmake/android.toolchain.cmake"
if [ ! -d $ANDROID_NDK ]; then
# For macOS, I have installed Android Studio, select the menu
# Tools -> SDK manager -> Android SDK
# and set "Android SDK location" to /Users/fangjun/software/my-android
ANDROID_NDK=/Users/fangjun/software/my-android/ndk/22.1.7171670
if [ $BUILD_SHARED_LIBS == OFF ]; then
ANDROID_NDK=/Users/fangjun/software/my-android/ndk/27.0.11718014
fi
fi
fi
if [ ! -d $ANDROID_NDK ]; then
echo Please set the environment variable ANDROID_NDK before you run this script
exit 1
fi
echo "ANDROID_NDK: $ANDROID_NDK"
sleep 1
if [ -z $SHERPA_MNN_ENABLE_TTS ]; then
SHERPA_MNN_ENABLE_TTS=ON
fi
if [ -z $SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION ]; then
SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION=ON
fi
if [ -z $SHERPA_MNN_ENABLE_BINARY ]; then
SHERPA_MNN_ENABLE_BINARY=OFF
fi
if [ -z $SHERPA_MNN_ENABLE_C_API ]; then
SHERPA_MNN_ENABLE_C_API=OFF
fi
if [ -z $SHERPA_MNN_ENABLE_JNI ]; then
SHERPA_MNN_ENABLE_JNI=ON
fi
cmake -DCMAKE_TOOLCHAIN_FILE="$ANDROID_NDK/build/cmake/android.toolchain.cmake" \
-DSHERPA_MNN_ENABLE_TTS=$SHERPA_MNN_ENABLE_TTS \
-DSHERPA_MNN_ENABLE_SPEAKER_DIARIZATION=$SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION \
-DSHERPA_MNN_ENABLE_BINARY=$SHERPA_MNN_ENABLE_BINARY \
-DBUILD_PIPER_PHONMIZE_EXE=OFF \
-DBUILD_PIPER_PHONMIZE_TESTS=OFF \
-DBUILD_ESPEAK_NG_EXE=OFF \
-DBUILD_ESPEAK_NG_TESTS=OFF \
-DCMAKE_BUILD_TYPE=Release \
-DMNN_LIB_DIR=/Users/xtjiang/alicnn/AliNNPrivate/project/android/build_64 \
-DBUILD_SHARED_LIBS=$BUILD_SHARED_LIBS \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=$SHERPA_MNN_ENABLE_JNI \
-DSHERPA_MNN_LINK_LIBSTDCPP_STATICALLY=OFF \
-DSHERPA_MNN_ENABLE_C_API=$SHERPA_MNN_ENABLE_C_API \
-DCMAKE_INSTALL_PREFIX=./install \
-DANDROID_ABI="arm64-v8a" \
-DANDROID_PLATFORM=android-21 ..
# By default, it links to libc++_static.a
# -DANDROID_STL=c++_shared \
# Please use -DANDROID_PLATFORM=android-27 if you want to use Android NNAPI
# make VERBOSE=1 -j4
make -j4
make install/strip
rm -rf install/share
rm -rf install/lib/pkgconfig
rm -rf install/lib/lib*.a
if [ -f install/lib/libsherpa-onnx-c-api.so ]; then
cat >install/lib/README.md <<EOF
# Introduction
Note that if you use Android Studio, then you only need to
copy libonnxruntime.so and libsherpa-onnx-jni.so
to your jniLibs, and you don't need libsherpa-onnx-c-api.so or
libsherpa-onnx-cxx-api.so.
libsherpa-onnx-c-api.so and libsherpa-onnx-cxx-api.so are for users
who don't use JNI. In that case, libsherpa-onnx-jni.so is not needed.
In any case, libonnxruntime.so is always needed.
EOF
ls -lh install/lib/README.md
fi
# To run the generated binaries on Android, please use the following steps.
#
#
# 1. Copy sherpa-onnx and its dependencies to Android
#
# cd build-android-arm64-v8a/install/lib
# adb push ./lib*.so /data/local/tmp
# cd ../bin
# adb push ./sherpa-onnx /data/local/tmp
#
# 2. Login into Android
#
# adb shell
# cd /data/local/tmp
# ./sherpa-onnx
#
# It should show the help message of sherpa-onnx.
#
# Please use the above approach to copy model files to your phone.

View File

@ -0,0 +1,173 @@
#!/usr/bin/env bash
set -e
dir=build-ios
mkdir -p $dir
cd $dir
if [ -z ${MNN_LIB_DIR} ]; then
echo "Please export MNN_LIB_DIR=/path/to/MNN"
exit 1
fi
# First, for simulator
echo "Building for simulator (x86_64)"
# Note: We use -DENABLE_ARC=1 here to fix the linking error:
#
# The symbol _NSLog is not defined
#
cmake \
-DSHERPA_MNN_ENABLE_BINARY=OFF \
-DMNN_LIB_DIR=${MNN_LIB_DIR} \
-DBUILD_PIPER_PHONMIZE_EXE=OFF \
-DBUILD_PIPER_PHONMIZE_TESTS=OFF \
-DBUILD_ESPEAK_NG_EXE=OFF \
-DBUILD_ESPEAK_NG_TESTS=OFF \
-S .. \
-DCMAKE_TOOLCHAIN_FILE=./toolchains/ios.toolchain.cmake \
-DPLATFORM=SIMULATOR64 \
-DENABLE_BITCODE=0 \
-DENABLE_ARC=1 \
-DENABLE_VISIBILITY=0 \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=OFF \
-DSHERPA_MNN_ENABLE_C_API=ON \
-DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
-DDEPLOYMENT_TARGET=13.0 \
-DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
-B build/simulator_x86_64
# Use sysctl rather than nproc: nproc is not available on stock macOS
cmake --build build/simulator_x86_64 -j $(sysctl -n hw.ncpu) --verbose
echo "Building for simulator (arm64)"
cmake \
-DSHERPA_MNN_ENABLE_BINARY=OFF \
-DMNN_LIB_DIR=${MNN_LIB_DIR} \
-DBUILD_PIPER_PHONMIZE_EXE=OFF \
-DBUILD_PIPER_PHONMIZE_TESTS=OFF \
-DBUILD_ESPEAK_NG_EXE=OFF \
-DBUILD_ESPEAK_NG_TESTS=OFF \
-S .. \
-DCMAKE_TOOLCHAIN_FILE=./toolchains/ios.toolchain.cmake \
-DPLATFORM=SIMULATORARM64 \
-DENABLE_BITCODE=0 \
-DENABLE_ARC=1 \
-DENABLE_VISIBILITY=0 \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=./install \
-DBUILD_SHARED_LIBS=OFF \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=OFF \
-DSHERPA_MNN_ENABLE_C_API=ON \
-DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
-DDEPLOYMENT_TARGET=13.0 \
-DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
-B build/simulator_arm64
cmake --build build/simulator_arm64 -j $(sysctl -n hw.ncpu) --verbose
echo "Building for arm64"
cmake \
-DSHERPA_MNN_ENABLE_BINARY=OFF \
-DMNN_LIB_DIR=${MNN_LIB_DIR} \
-DBUILD_PIPER_PHONMIZE_EXE=OFF \
-DBUILD_PIPER_PHONMIZE_TESTS=OFF \
-DBUILD_ESPEAK_NG_EXE=OFF \
-DBUILD_ESPEAK_NG_TESTS=OFF \
-S .. \
-DCMAKE_TOOLCHAIN_FILE=./toolchains/ios.toolchain.cmake \
-DPLATFORM=OS64 \
-DENABLE_BITCODE=0 \
-DENABLE_ARC=1 \
-DENABLE_VISIBILITY=0 \
-DCMAKE_INSTALL_PREFIX=./install \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=OFF \
-DSHERPA_MNN_ENABLE_C_API=ON \
-DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
-DDEPLOYMENT_TARGET=13.0 \
-DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
-B build/os64
cmake --build build/os64 -j $(sysctl -n hw.ncpu)
# Generate headers for sherpa-mnn.xcframework
cmake --build build/os64 --target install
echo "Generate xcframework"
mkdir -p "build/simulator/lib"
for f in libkaldi-native-fbank-core.a libsherpa-mnn-c-api.a libsherpa-mnn-core.a \
libsherpa-mnn-fstfar.a libssentencepiece_core.a \
libsherpa-mnn-fst.a libsherpa-mnn-kaldifst-core.a libkaldi-decoder-core.a \
libucd.a libpiper_phonemize.a libespeak-ng.a; do
lipo -create build/simulator_arm64/lib/${f} \
build/simulator_x86_64/lib/${f} \
-output build/simulator/lib/${f}
done
# Merge the archives first, because the xcodebuild -create-xcframework step
# below cannot accept multiple archives with the same architecture.
libtool -static -o build/simulator/sherpa-mnn.a \
build/simulator/lib/libkaldi-native-fbank-core.a \
build/simulator/lib/libsherpa-mnn-c-api.a \
build/simulator/lib/libsherpa-mnn-core.a \
build/simulator/lib/libsherpa-mnn-fstfar.a \
build/simulator/lib/libsherpa-mnn-fst.a \
build/simulator/lib/libsherpa-mnn-kaldifst-core.a \
build/simulator/lib/libkaldi-decoder-core.a \
build/simulator/lib/libucd.a \
build/simulator/lib/libpiper_phonemize.a \
build/simulator/lib/libespeak-ng.a \
build/simulator/lib/libssentencepiece_core.a
libtool -static -o build/os64/sherpa-mnn.a \
build/os64/lib/libkaldi-native-fbank-core.a \
build/os64/lib/libsherpa-mnn-c-api.a \
build/os64/lib/libsherpa-mnn-core.a \
build/os64/lib/libsherpa-mnn-fstfar.a \
build/os64/lib/libsherpa-mnn-fst.a \
build/os64/lib/libsherpa-mnn-kaldifst-core.a \
build/os64/lib/libkaldi-decoder-core.a \
build/os64/lib/libucd.a \
build/os64/lib/libpiper_phonemize.a \
build/os64/lib/libespeak-ng.a \
build/os64/lib/libssentencepiece_core.a
rm -rf sherpa-mnn.xcframework
xcodebuild -create-xcframework \
-library "build/os64/sherpa-mnn.a" \
-library "build/simulator/sherpa-mnn.a" \
-output sherpa-mnn.xcframework
# Copy Headers
mkdir -p sherpa-mnn.xcframework/Headers
cp -av install/include/* sherpa-mnn.xcframework/Headers
pushd sherpa-mnn.xcframework/ios-arm64_x86_64-simulator
ln -s sherpa-mnn.a libsherpa-mnn.a
popd
pushd sherpa-mnn.xcframework/ios-arm64
ln -s sherpa-mnn.a libsherpa-mnn.a

View File

@ -0,0 +1,46 @@
#!/usr/bin/env bash
set -ex
dir=build-swift-macos
mkdir -p $dir
cd $dir
cmake \
-DSHERPA_MNN_ENABLE_BINARY=OFF \
-DMNN_LIB_DIR=/Users/xtjiang/alicnn/AliNNPrivate \
-DSHERPA_MNN_BUILD_C_API_EXAMPLES=OFF \
-DCMAKE_OSX_ARCHITECTURES="arm64;x86_64" \
-DCMAKE_INSTALL_PREFIX=./install \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=OFF \
-DSHERPA_MNN_ENABLE_C_API=ON \
-DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
../
make VERBOSE=1 -j4
make install
rm -fv ./install/include/cargs.h
libtool -static -o ./install/lib/libsherpa-mnn.a \
./install/lib/libsherpa-mnn-c-api.a \
./install/lib/libsherpa-mnn-core.a \
./install/lib/libkaldi-native-fbank-core.a \
./install/lib/libsherpa-mnn-fstfar.a \
./install/lib/libsherpa-mnn-fst.a \
./install/lib/libsherpa-mnn-kaldifst-core.a \
./install/lib/libkaldi-decoder-core.a \
./install/lib/libucd.a \
./install/lib/libpiper_phonemize.a \
./install/lib/libespeak-ng.a \
./install/lib/libssentencepiece_core.a
xcodebuild -create-xcframework \
-library install/lib/libsherpa-mnn.a \
-headers install/include \
-output sherpa-mnn.xcframework

View File

@ -0,0 +1,142 @@
#!/usr/bin/env bash
set -e
dir=build-visionos
mkdir -p $dir
cd $dir
if [ -z ${MNN_LIB_DIR} ]; then
echo "Please export MNN_LIB_DIR=/path/to/MNN"
exit 1
fi
# Note: We use -DENABLE_ARC=1 here to fix the linking error:
#
# The symbol _NSLog is not defined
echo "Building for simulator (arm64)"
cmake \
-DSHERPA_MNN_ENABLE_BINARY=OFF \
-DMNN_LIB_DIR=${MNN_LIB_DIR} \
-DBUILD_PIPER_PHONMIZE_EXE=OFF \
-DBUILD_PIPER_PHONMIZE_TESTS=OFF \
-DBUILD_ESPEAK_NG_EXE=OFF \
-DBUILD_ESPEAK_NG_TESTS=OFF \
-S .. \
-DCMAKE_TOOLCHAIN_FILE=./toolchains/ios.toolchain.cmake \
-DPLATFORM=XRSIMULATOR \
-DENABLE_BITCODE=0 \
-DENABLE_ARC=1 \
-DENABLE_VISIBILITY=0 \
-DCMAKE_BUILD_TYPE=Release \
-DCMAKE_INSTALL_PREFIX=./install \
-DBUILD_SHARED_LIBS=OFF \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=OFF \
-DSHERPA_MNN_ENABLE_C_API=ON \
-DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
-DDEPLOYMENT_TARGET=1.0 \
-DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
-B build/simulator
# Use sysctl rather than nproc: nproc is not available on stock macOS
cmake --build build/simulator -j $(sysctl -n hw.ncpu) --verbose
cmake --build build/simulator --target install
echo "Building for arm64"
cmake \
-DSHERPA_MNN_ENABLE_BINARY=OFF \
-DMNN_LIB_DIR=${MNN_LIB_DIR} \
-DBUILD_PIPER_PHONMIZE_EXE=OFF \
-DBUILD_PIPER_PHONMIZE_TESTS=OFF \
-DBUILD_ESPEAK_NG_EXE=OFF \
-DBUILD_ESPEAK_NG_TESTS=OFF \
-S .. \
-DCMAKE_TOOLCHAIN_FILE=./toolchains/ios.toolchain.cmake \
-DPLATFORM=XROS \
-DENABLE_BITCODE=0 \
-DENABLE_ARC=1 \
-DENABLE_VISIBILITY=0 \
-DCMAKE_INSTALL_PREFIX=./install \
-DCMAKE_BUILD_TYPE=Release \
-DBUILD_SHARED_LIBS=OFF \
-DSHERPA_MNN_ENABLE_PYTHON=OFF \
-DSHERPA_MNN_ENABLE_TESTS=OFF \
-DSHERPA_MNN_ENABLE_CHECK=OFF \
-DSHERPA_MNN_ENABLE_PORTAUDIO=OFF \
-DSHERPA_MNN_ENABLE_JNI=OFF \
-DSHERPA_MNN_ENABLE_C_API=ON \
-DSHERPA_MNN_ENABLE_WEBSOCKET=OFF \
-DDEPLOYMENT_TARGET=1.0 \
-DCMAKE_POLICY_VERSION_MINIMUM=3.5 \
-B build/os64
cmake --build build/os64 -j $(sysctl -n hw.ncpu)
# Generate headers for sherpa-mnn.xcframework
cmake --build build/os64 --target install
echo "Generate xcframework"
# mkdir -p "build/simulator/lib"
# for f in libkaldi-native-fbank-core.a libsherpa-mnn-c-api.a libsherpa-mnn-core.a \
# libsherpa-mnn-fstfar.a libssentencepiece_core.a \
# libsherpa-mnn-fst.a libsherpa-mnn-kaldifst-core.a libkaldi-decoder-core.a \
# libucd.a libpiper_phonemize.a libespeak-ng.a; do
# lipo -create build/simulator_arm64/lib/${f} \
# build/simulator_x86_64/lib/${f} \
# -output build/simulator/lib/${f}
# done
# Merge the archives first, because the xcodebuild -create-xcframework step
# below cannot accept multiple archives with the same architecture.
libtool -static -o build/simulator/sherpa-mnn.a \
build/simulator/lib/libkaldi-native-fbank-core.a \
build/simulator/lib/libsherpa-mnn-c-api.a \
build/simulator/lib/libsherpa-mnn-core.a \
build/simulator/lib/libsherpa-mnn-fstfar.a \
build/simulator/lib/libsherpa-mnn-fst.a \
build/simulator/lib/libsherpa-mnn-kaldifst-core.a \
build/simulator/lib/libkaldi-decoder-core.a \
build/simulator/lib/libucd.a \
build/simulator/lib/libpiper_phonemize.a \
build/simulator/lib/libespeak-ng.a \
build/simulator/lib/libssentencepiece_core.a
libtool -static -o build/os64/sherpa-mnn.a \
build/os64/lib/libkaldi-native-fbank-core.a \
build/os64/lib/libsherpa-mnn-c-api.a \
build/os64/lib/libsherpa-mnn-core.a \
build/os64/lib/libsherpa-mnn-fstfar.a \
build/os64/lib/libsherpa-mnn-fst.a \
build/os64/lib/libsherpa-mnn-kaldifst-core.a \
build/os64/lib/libkaldi-decoder-core.a \
build/os64/lib/libucd.a \
build/os64/lib/libpiper_phonemize.a \
build/os64/lib/libespeak-ng.a \
build/os64/lib/libssentencepiece_core.a
rm -rf sherpa-mnn.xcframework
xcodebuild -create-xcframework \
-library "build/os64/sherpa-mnn.a" \
-library "build/simulator/sherpa-mnn.a" \
-output sherpa-mnn.xcframework
# Copy Headers
mkdir -p sherpa-mnn.xcframework/Headers
cp -av install/include/* sherpa-mnn.xcframework/Headers
pushd sherpa-mnn.xcframework/xros-arm64-simulator
ln -s sherpa-mnn.a libsherpa-mnn.a
popd
pushd sherpa-mnn.xcframework/xros-arm64
ln -s sherpa-mnn.a libsherpa-mnn.a

View File

@ -0,0 +1,106 @@
include(cargs)
include_directories(${CMAKE_SOURCE_DIR})
add_executable(decode-file-c-api decode-file-c-api.c)
target_link_libraries(decode-file-c-api sherpa-mnn-c-api cargs)
add_executable(kws-c-api kws-c-api.c)
target_link_libraries(kws-c-api sherpa-mnn-c-api)
add_executable(speech-enhancement-gtcrn-c-api speech-enhancement-gtcrn-c-api.c)
target_link_libraries(speech-enhancement-gtcrn-c-api sherpa-mnn-c-api)
if(SHERPA_MNN_ENABLE_TTS)
add_executable(offline-tts-c-api offline-tts-c-api.c)
target_link_libraries(offline-tts-c-api sherpa-mnn-c-api cargs)
add_executable(matcha-tts-zh-c-api matcha-tts-zh-c-api.c)
target_link_libraries(matcha-tts-zh-c-api sherpa-mnn-c-api)
add_executable(matcha-tts-en-c-api matcha-tts-en-c-api.c)
target_link_libraries(matcha-tts-en-c-api sherpa-mnn-c-api)
add_executable(kokoro-tts-en-c-api kokoro-tts-en-c-api.c)
target_link_libraries(kokoro-tts-en-c-api sherpa-mnn-c-api)
add_executable(kokoro-tts-zh-en-c-api kokoro-tts-zh-en-c-api.c)
target_link_libraries(kokoro-tts-zh-en-c-api sherpa-mnn-c-api)
endif()
if(SHERPA_MNN_ENABLE_SPEAKER_DIARIZATION)
add_executable(offline-speaker-diarization-c-api offline-speaker-diarization-c-api.c)
target_link_libraries(offline-speaker-diarization-c-api sherpa-mnn-c-api)
endif()
add_executable(spoken-language-identification-c-api spoken-language-identification-c-api.c)
target_link_libraries(spoken-language-identification-c-api sherpa-mnn-c-api)
add_executable(speaker-identification-c-api speaker-identification-c-api.c)
target_link_libraries(speaker-identification-c-api sherpa-mnn-c-api)
add_executable(streaming-hlg-decode-file-c-api streaming-hlg-decode-file-c-api.c)
target_link_libraries(streaming-hlg-decode-file-c-api sherpa-mnn-c-api)
add_executable(audio-tagging-c-api audio-tagging-c-api.c)
target_link_libraries(audio-tagging-c-api sherpa-mnn-c-api)
add_executable(add-punctuation-c-api add-punctuation-c-api.c)
target_link_libraries(add-punctuation-c-api sherpa-mnn-c-api)
add_executable(whisper-c-api whisper-c-api.c)
target_link_libraries(whisper-c-api sherpa-mnn-c-api)
add_executable(fire-red-asr-c-api fire-red-asr-c-api.c)
target_link_libraries(fire-red-asr-c-api sherpa-mnn-c-api)
add_executable(sense-voice-c-api sense-voice-c-api.c)
target_link_libraries(sense-voice-c-api sherpa-mnn-c-api)
add_executable(moonshine-c-api moonshine-c-api.c)
target_link_libraries(moonshine-c-api sherpa-mnn-c-api)
add_executable(zipformer-c-api zipformer-c-api.c)
target_link_libraries(zipformer-c-api sherpa-mnn-c-api)
add_executable(streaming-zipformer-c-api streaming-zipformer-c-api.c)
target_link_libraries(streaming-zipformer-c-api sherpa-mnn-c-api)
add_executable(paraformer-c-api paraformer-c-api.c)
target_link_libraries(paraformer-c-api sherpa-mnn-c-api)
add_executable(streaming-paraformer-c-api streaming-paraformer-c-api.c)
target_link_libraries(streaming-paraformer-c-api sherpa-mnn-c-api)
add_executable(telespeech-c-api telespeech-c-api.c)
target_link_libraries(telespeech-c-api sherpa-mnn-c-api)
add_executable(vad-sense-voice-c-api vad-sense-voice-c-api.c)
target_link_libraries(vad-sense-voice-c-api sherpa-mnn-c-api)
add_executable(vad-whisper-c-api vad-whisper-c-api.c)
target_link_libraries(vad-whisper-c-api sherpa-mnn-c-api)
add_executable(vad-moonshine-c-api vad-moonshine-c-api.c)
target_link_libraries(vad-moonshine-c-api sherpa-mnn-c-api)
add_executable(streaming-zipformer-buffered-tokens-hotwords-c-api
streaming-zipformer-buffered-tokens-hotwords-c-api.c)
target_link_libraries(streaming-zipformer-buffered-tokens-hotwords-c-api sherpa-mnn-c-api)
add_executable(streaming-paraformer-buffered-tokens-c-api
streaming-paraformer-buffered-tokens-c-api.c)
target_link_libraries(streaming-paraformer-buffered-tokens-c-api sherpa-mnn-c-api)
add_executable(streaming-ctc-buffered-tokens-c-api
streaming-ctc-buffered-tokens-c-api.c)
target_link_libraries(streaming-ctc-buffered-tokens-c-api sherpa-mnn-c-api)
add_executable(keywords-spotter-buffered-tokens-keywords-c-api
keywords-spotter-buffered-tokens-keywords-c-api.c)
target_link_libraries(keywords-spotter-buffered-tokens-keywords-c-api sherpa-mnn-c-api)
if(SHERPA_MNN_HAS_ALSA)
add_subdirectory(./asr-microphone-example)
elseif((UNIX AND NOT APPLE) OR LINUX)
message(WARNING "Not include ./asr-microphone-example since alsa is not available")
endif()

View File

@ -0,0 +1,18 @@
# Introduction
This folder contains C API examples for [sherpa-onnx][sherpa-onnx].
Please refer to the documentation
https://k2-fsa.github.io/sherpa/onnx/c-api/index.html
for details.
## File descriptions
- [decode-file-c-api.c](./decode-file-c-api.c) This file shows how to use the C API
for speech recognition with a streaming model.
- [offline-tts-c-api.c](./offline-tts-c-api.c) This file shows how to use the C API
to convert text to speech with a non-streaming model.
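For a bird's-eye view before diving into the full examples, the sketch below
shows the call pattern shared by the streaming examples in this folder. It is
a minimal, illustrative sketch rather than a drop-in program: the token, model,
and wave paths are placeholders that you must point at a real pre-trained
model, and error handling is reduced to the essentials. The API names follow
those used in [decode-file-c-api.c](./decode-file-c-api.c).

```c
#include <stdio.h>
#include <string.h>

#include "sherpa-mnn/c-api/c-api.h"

int main() {
  SherpaMnnOnlineRecognizerConfig config;
  memset(&config, 0, sizeof(config));
  // All paths below are placeholders; use files from a real model directory.
  config.model_config.tokens = "./tokens.txt";
  config.model_config.transducer.encoder = "./encoder.onnx";
  config.model_config.transducer.decoder = "./decoder.onnx";
  config.model_config.transducer.joiner = "./joiner.onnx";
  config.model_config.num_threads = 1;
  config.model_config.provider = "cpu";
  config.decoding_method = "greedy_search";
  config.feat_config.sample_rate = 16000;
  config.feat_config.feature_dim = 80;

  const SherpaMnnOnlineRecognizer *recognizer =
      SherpaMnnCreateOnlineRecognizer(&config);
  if (recognizer == NULL) {
    fprintf(stderr, "Failed to create the recognizer. Please check your config\n");
    return -1;
  }
  const SherpaMnnOnlineStream *stream =
      SherpaMnnCreateOnlineStream(recognizer);

  const SherpaMnnWave *wave = SherpaMnnReadWave("./foo.wav");  // placeholder
  if (wave == NULL) {
    fprintf(stderr, "Failed to read the wave file\n");
    return -1;
  }

  // Feed all samples at once. The real examples feed small chunks in a loop
  // to simulate streaming; the API accepts arbitrary-sized pieces.
  SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,
                                      wave->num_samples);
  SherpaMnnOnlineStreamInputFinished(stream);
  while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
    SherpaMnnDecodeOnlineStream(recognizer, stream);
  }

  const SherpaMnnOnlineRecognizerResult *r =
      SherpaMnnGetOnlineStreamResult(recognizer, stream);
  fprintf(stderr, "Recognized text: %s\n", r->text);

  SherpaMnnDestroyOnlineRecognizerResult(r);
  SherpaMnnFreeWave(wave);
  SherpaMnnDestroyOnlineStream(stream);
  SherpaMnnDestroyOnlineRecognizer(recognizer);
  return 0;
}
```

See [decode-file-c-api.c](./decode-file-c-api.c) for the complete version with
command-line arguments, chunked decoding, endpoint detection, and tail padding.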
[sherpa-onnx]: https://github.com/k2-fsa/sherpa-onnx

View File

@ -0,0 +1,67 @@
// c-api-examples/add-punctuation-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// We assume you have pre-downloaded the model files for testing
// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/punctuation-models
//
// An example is given below:
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/punctuation-models/sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2
// tar xvf sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2
// rm sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
SherpaMnnOfflinePunctuationConfig config;
memset(&config, 0, sizeof(config));
// clang-format off
config.model.ct_transformer = "./sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12/model.onnx";
// clang-format on
config.model.num_threads = 1;
config.model.debug = 1;
config.model.provider = "cpu";
const SherpaMnnOfflinePunctuation *punct =
SherpaMnnCreateOfflinePunctuation(&config);
if (!punct) {
fprintf(stderr,
"Failed to create OfflinePunctuation. Please check your config");
return -1;
}
const char *texts[] = {
"这是一个测试你好吗How are you我很好thank you are you ok谢谢你",
"我们都是木头人不会说话不会动",
("The African blogosphere is rapidly expanding bringing more voices "
"online in the form of commentaries opinions analyses rants and poetry"),
};
int32_t n = sizeof(texts) / sizeof(const char *);
fprintf(stderr, "n: %d\n", n);
fprintf(stderr, "--------------------\n");
for (int32_t i = 0; i != n; ++i) {
const char *text_with_punct =
SherpaOfflinePunctuationAddPunct(punct, texts[i]);
fprintf(stderr, "Input text: %s\n", texts[i]);
fprintf(stderr, "Output text: %s\n", text_with_punct);
SherpaOfflinePunctuationFreeText(text_with_punct);
fprintf(stderr, "--------------------\n");
}
SherpaMnnDestroyOfflinePunctuation(punct);
return 0;
}

View File

@ -0,0 +1,9 @@
add_executable(c-api-alsa c-api-alsa.cc alsa.cc)
target_link_libraries(c-api-alsa sherpa-mnn-c-api cargs)
if(DEFINED ENV{SHERPA_MNN_ALSA_LIB_DIR})
target_link_libraries(c-api-alsa -L$ENV{SHERPA_MNN_ALSA_LIB_DIR} -lasound)
else()
target_link_libraries(c-api-alsa asound)
endif()

View File

@ -0,0 +1 @@
exclude_files=alsa.cc|alsa.h

View File

@ -0,0 +1,12 @@
# Introduction
This folder contains examples for real-time speech recognition from a microphone
using the sherpa-onnx C API.
**Note**: You can call the C API from C++ files.
## ./c-api-alsa.cc
This file uses ALSA to read from a microphone. It runs only on Linux; it does
not support macOS or Windows.

View File

@ -0,0 +1 @@
../../sherpa-onnx/csrc/alsa.cc

View File

@ -0,0 +1 @@
../../sherpa-onnx/csrc/alsa.h

View File

@ -0,0 +1,259 @@
// c-api-examples/asr-microphone-example/c-api-alsa.cc
// Copyright (c) 2022-2024 Xiaomi Corporation
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <algorithm>
#include <cctype> // std::tolower
#include <cstdint>
#include <string>
#include "c-api-examples/asr-microphone-example/alsa.h"
// NOTE: You don't need to use cargs.h in your own project.
// We use it in this file to parse command-line arguments.
#include "cargs.h" // NOLINT
#include "sherpa-mnn/c-api/c-api.h"
static struct cag_option options[] = {
{/*.identifier =*/'h',
/*.access_letters =*/"h",
/*.access_name =*/"help",
/*.value_name =*/"help",
/*.description =*/"Show help"},
{/*.identifier =*/'t',
/*.access_letters =*/NULL,
/*.access_name =*/"tokens",
/*.value_name =*/"tokens",
/*.description =*/"Tokens file"},
{/*.identifier =*/'e',
/*.access_letters =*/NULL,
/*.access_name =*/"encoder",
/*.value_name =*/"encoder",
/*.description =*/"Encoder ONNX file"},
{/*.identifier =*/'d',
/*.access_letters =*/NULL,
/*.access_name =*/"decoder",
/*.value_name =*/"decoder",
/*.description =*/"Decoder ONNX file"},
{/*.identifier =*/'j',
/*.access_letters =*/NULL,
/*.access_name =*/"joiner",
/*.value_name =*/"joiner",
/*.description =*/"Joiner ONNX file"},
{/*.identifier =*/'n',
/*.access_letters =*/NULL,
/*.access_name =*/"num-threads",
/*.value_name =*/"num-threads",
/*.description =*/"Number of threads"},
{/*.identifier =*/'p',
/*.access_letters =*/NULL,
/*.access_name =*/"provider",
/*.value_name =*/"provider",
/*.description =*/"Provider: cpu (default), cuda, coreml"},
{/*.identifier =*/'m',
/*.access_letters =*/NULL,
/*.access_name =*/"decoding-method",
/*.value_name =*/"decoding-method",
/*.description =*/
"Decoding method: greedy_search (default), modified_beam_search"},
{/*.identifier =*/'f',
/*.access_letters =*/NULL,
/*.access_name =*/"hotwords-file",
/*.value_name =*/"hotwords-file",
/*.description =*/
"The file containing hotwords, one words/phrases per line, and for each "
"phrase the bpe/cjkchar are separated by a space. For example: ▁HE LL O "
"▁WORLD, 你 好 世 界"},
{/*.identifier =*/'s',
/*.access_letters =*/NULL,
/*.access_name =*/"hotwords-score",
/*.value_name =*/"hotwords-score",
/*.description =*/
"The bonus score for each token in hotwords. Used only when "
"decoding_method is modified_beam_search"},
};
const char *kUsage =
R"(
Usage:
./bin/c-api-alsa \
--tokens=/path/to/tokens.txt \
--encoder=/path/to/encoder.onnx \
--decoder=/path/to/decoder.onnx \
  --joiner=/path/to/joiner.onnx \
device_name
The device name specifies which microphone to use in case there are several
on your system. You can use
arecord -l
to find all available microphones on your computer. For instance, if it outputs
**** List of CAPTURE Hardware Devices ****
card 3: UACDemoV10 [UACDemoV1.0], device 0: USB Audio [USB Audio]
Subdevices: 1/1
Subdevice #0: subdevice #0
and if you want to select card 3 and device 0 on that card, please use:
plughw:3,0
as the device_name.
)";
// volatile: set from the signal handler and polled in the decoding loop below
static volatile bool stop = false;
static void Handler(int sig) {
stop = true;
fprintf(stderr, "\nCaught Ctrl + C. Exiting...\n");
}
int32_t main(int32_t argc, char *argv[]) {
if (argc < 6) {
fprintf(stderr, "%s\n", kUsage);
exit(0);
}
signal(SIGINT, Handler);
SherpaMnnOnlineRecognizerConfig config;
memset(&config, 0, sizeof(config));
config.model_config.debug = 0;
config.model_config.num_threads = 1;
config.model_config.provider = "cpu";
config.decoding_method = "greedy_search";
config.max_active_paths = 4;
config.feat_config.sample_rate = 16000;
config.feat_config.feature_dim = 80;
config.enable_endpoint = 1;
config.rule1_min_trailing_silence = 2.4;
config.rule2_min_trailing_silence = 1.2;
config.rule3_min_utterance_length = 300;
cag_option_context context;
char identifier;
const char *value;
cag_option_prepare(&context, options, CAG_ARRAY_SIZE(options), argc, argv);
while (cag_option_fetch(&context)) {
identifier = cag_option_get(&context);
value = cag_option_get_value(&context);
switch (identifier) {
case 't':
config.model_config.tokens = value;
break;
case 'e':
config.model_config.transducer.encoder = value;
break;
case 'd':
config.model_config.transducer.decoder = value;
break;
case 'j':
config.model_config.transducer.joiner = value;
break;
case 'n':
config.model_config.num_threads = atoi(value);
break;
case 'p':
config.model_config.provider = value;
break;
case 'm':
config.decoding_method = value;
break;
case 'f':
config.hotwords_file = value;
break;
case 's':
config.hotwords_score = atof(value);
break;
case 'h': {
fprintf(stderr, "%s\n", kUsage);
exit(0);
break;
}
default:
// do nothing as config already has valid default values
break;
}
}
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&config);
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
const char *device_name = argv[context.index];
sherpa_mnn::Alsa alsa(device_name);
fprintf(stderr, "Use recording device: %s\n", device_name);
fprintf(stderr,
"Please \033[32m\033[1mspeak\033[0m! Press \033[31m\033[1mCtrl + "
"C\033[0m to exit\n");
int32_t expected_sample_rate = 16000;
if (alsa.GetExpectedSampleRate() != expected_sample_rate) {
fprintf(stderr, "sample rate: %d != %d\n", alsa.GetExpectedSampleRate(),
expected_sample_rate);
exit(-1);
}
int32_t chunk = 0.1 * alsa.GetActualSampleRate();
std::string last_text;
int32_t segment_index = 0;
while (!stop) {
const std::vector<float> &samples = alsa.Read(chunk);
SherpaMnnOnlineStreamAcceptWaveform(stream, expected_sample_rate,
samples.data(), samples.size());
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
std::string text = r->text;
SherpaMnnDestroyOnlineRecognizerResult(r);
if (!text.empty() && last_text != text) {
last_text = text;
std::transform(text.begin(), text.end(), text.begin(),
[](auto c) { return std::tolower(c); });
SherpaMnnPrint(display, segment_index, text.c_str());
fflush(stderr);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (!text.empty()) {
++segment_index;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
}
// free allocated resources
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,79 @@
// c-api-examples/audio-tagging-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// We assume you have pre-downloaded the model files for testing
// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/audio-tagging-models
//
// An example is given below:
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2
// tar xvf sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2
// rm sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
SherpaMnnAudioTaggingConfig config;
memset(&config, 0, sizeof(config));
config.model.zipformer.model =
"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/model.int8.onnx";
config.model.num_threads = 1;
config.model.debug = 1;
config.model.provider = "cpu";
// clang-format off
config.labels = "./sherpa-onnx-zipformer-audio-tagging-2024-04-09/class_labels_indices.csv";
// clang-format on
const SherpaMnnAudioTagging *tagger = SherpaMnnCreateAudioTagging(&config);
if (!tagger) {
fprintf(stderr, "Failed to create audio tagger. Please check your config");
return -1;
}
// You can find more test waves from
// https://github.com/k2-fsa/sherpa-onnx/releases/download/audio-tagging-models/sherpa-onnx-zipformer-audio-tagging-2024-04-09.tar.bz2
const char *wav_filename =
"./sherpa-onnx-zipformer-audio-tagging-2024-04-09/test_wavs/1.wav";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnAudioTaggingCreateOfflineStream(tagger);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
int32_t top_k = 5;
const SherpaMnnAudioEvent *const *results =
SherpaMnnAudioTaggingCompute(tagger, stream, top_k);
fprintf(stderr, "--------------------------------------------------\n");
fprintf(stderr, "Index\t\tProbability\t\tEvent name\n");
fprintf(stderr, "--------------------------------------------------\n");
for (int32_t i = 0; i != top_k; ++i) {
fprintf(stderr, "%d\t\t%.3f\t\t\t%s\n", i, results[i]->prob,
results[i]->name);
}
fprintf(stderr, "--------------------------------------------------\n");
SherpaMnnAudioTaggingFreeResults(results);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnFreeWave(wave);
SherpaMnnDestroyAudioTagging(tagger);
return 0;
}

View File

@ -0,0 +1,244 @@
// c-api-examples/decode-file-c-api.c
//
// Copyright (c) 2023 Xiaomi Corporation
// This file shows how to use sherpa-onnx C API
// to decode a file.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "cargs.h"
#include "sherpa-mnn/c-api/c-api.h"
static struct cag_option options[] = {
{.identifier = 'h',
.access_letters = "h",
.access_name = "help",
.description = "Show help"},
{.identifier = 't',
.access_letters = NULL,
.access_name = "tokens",
.value_name = "tokens",
.description = "Tokens file"},
{.identifier = 'e',
.access_letters = NULL,
.access_name = "encoder",
.value_name = "encoder",
.description = "Encoder ONNX file"},
{.identifier = 'd',
.access_letters = NULL,
.access_name = "decoder",
.value_name = "decoder",
.description = "Decoder ONNX file"},
{.identifier = 'j',
.access_letters = NULL,
.access_name = "joiner",
.value_name = "joiner",
.description = "Joiner ONNX file"},
{.identifier = 'n',
.access_letters = NULL,
.access_name = "num-threads",
.value_name = "num-threads",
.description = "Number of threads"},
{.identifier = 'p',
.access_letters = NULL,
.access_name = "provider",
.value_name = "provider",
.description = "Provider: cpu (default), cuda, coreml"},
{.identifier = 'm',
.access_letters = NULL,
.access_name = "decoding-method",
.value_name = "decoding-method",
.description =
"Decoding method: greedy_search (default), modified_beam_search"},
{.identifier = 'f',
.access_letters = NULL,
.access_name = "hotwords-file",
.value_name = "hotwords-file",
.description = "The file containing hotwords, one words/phrases per line, "
"and for each phrase the bpe/cjkchar are separated by a "
"space. For example: ▁HE LL O ▁WORLD, 你 好 世 界"},
{.identifier = 's',
.access_letters = NULL,
.access_name = "hotwords-score",
.value_name = "hotwords-score",
.description = "The bonus score for each token in hotwords. Used only "
"when decoding_method is modified_beam_search"},
};
const char *kUsage =
"\n"
"Usage:\n "
" ./bin/decode-file-c-api \\\n"
" --tokens=/path/to/tokens.txt \\\n"
" --encoder=/path/to/encoder.onnx \\\n"
" --decoder=/path/to/decoder.onnx \\\n"
" --joiner=/path/to/joiner.onnx \\\n"
" --provider=cpu \\\n"
" /path/to/foo.wav\n"
"\n\n"
"Default num_threads is 1.\n"
"Valid decoding_method: greedy_search (default), modified_beam_search\n\n"
"Valid provider: cpu (default), cuda, coreml\n\n"
"Please refer to \n"
"https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/"
"index.html\n"
"for a list of pre-trained models to download.\n"
"\n"
"Note that this file supports only streaming transducer models.\n";
int32_t main(int32_t argc, char *argv[]) {
if (argc < 6) {
fprintf(stderr, "%s\n", kUsage);
exit(0);
}
SherpaMnnOnlineRecognizerConfig config;
memset(&config, 0, sizeof(config));
config.model_config.debug = 0;
config.model_config.num_threads = 1;
config.model_config.provider = "cpu";
config.decoding_method = "greedy_search";
config.max_active_paths = 4;
config.feat_config.sample_rate = 16000;
config.feat_config.feature_dim = 80;
config.enable_endpoint = 1;
config.rule1_min_trailing_silence = 2.4;
config.rule2_min_trailing_silence = 1.2;
config.rule3_min_utterance_length = 300;
cag_option_context context;
char identifier;
const char *value;
cag_option_prepare(&context, options, CAG_ARRAY_SIZE(options), argc, argv);
while (cag_option_fetch(&context)) {
identifier = cag_option_get(&context);
value = cag_option_get_value(&context);
switch (identifier) {
case 't':
config.model_config.tokens = value;
break;
case 'e':
config.model_config.transducer.encoder = value;
break;
case 'd':
config.model_config.transducer.decoder = value;
break;
case 'j':
config.model_config.transducer.joiner = value;
break;
case 'n':
config.model_config.num_threads = atoi(value);
break;
case 'p':
config.model_config.provider = value;
break;
case 'm':
config.decoding_method = value;
break;
case 'f':
config.hotwords_file = value;
break;
case 's':
config.hotwords_score = atof(value);
break;
case 'h': {
fprintf(stderr, "%s\n", kUsage);
exit(0);
break;
}
default:
// do nothing as config already has valid default values
break;
}
}
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&config);
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
const char *wav_filename = argv[context.index];
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// simulate streaming
#define N 3200 // 0.2 s. Sample rate is fixed to 16 kHz
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}
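
The while loop above is the heart of the streaming pattern: feed a fixed-size chunk, then decode while the stream has enough buffered data. A minimal sketch that isolates it into a reusable function follows; the name FeedInChunks is hypothetical and the includes are the same as in the example.

// Hypothetical helper: feeds a wave to an online stream in fixed-size
// chunks, decoding whenever enough data is buffered.
static void FeedInChunks(const SherpaMnnOnlineRecognizer *recognizer,
                         const SherpaMnnOnlineStream *stream,
                         const SherpaMnnWave *wave, int32_t chunk) {
  for (int32_t k = 0; k < wave->num_samples; k += chunk) {
    int32_t n =
        (k + chunk > wave->num_samples) ? (wave->num_samples - k) : chunk;
    SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
                                        wave->samples + k, n);
    while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
      SherpaMnnDecodeOnlineStream(recognizer, stream);
    }
  }
}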

View File

@ -0,0 +1,84 @@
// c-api-examples/fire-red-asr-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
// We assume you have pre-downloaded the FireRedAsr model
// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models
// An example is given below:
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
// tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
// rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav";
const char *encoder_filename =
"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx";
const char *decoder_filename =
"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx";
const char *tokens_filename =
"sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.fire_red_asr.encoder = encoder_filename;
offline_model_config.fire_red_asr.decoder = decoder_filename;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}
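
The create-stream / accept-waveform / decode / print sequence above recurs verbatim in the Moonshine, Paraformer, and SenseVoice examples later in this directory. A minimal sketch that factors it into one helper (hypothetical name; same includes as the example):

// Hypothetical helper: decodes one wave with a non-streaming recognizer
// and prints the result, exactly as the example above does.
static void DecodeAndPrint(const SherpaMnnOfflineRecognizer *recognizer,
                           const SherpaMnnWave *wave) {
  const SherpaMnnOfflineStream *stream =
      SherpaMnnCreateOfflineStream(recognizer);
  SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
                                 wave->num_samples);
  SherpaMnnDecodeOfflineStream(recognizer, stream);
  const SherpaMnnOfflineRecognizerResult *result =
      SherpaMnnGetOfflineStreamResult(stream);
  fprintf(stderr, "Decoded text: %s\n", result->text);
  SherpaMnnDestroyOfflineRecognizerResult(result);
  SherpaMnnDestroyOfflineStream(stream);
}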

View File

@ -0,0 +1,196 @@
// c-api-examples/keywords-spotter-buffered-tokens-keywords-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// Copyright (c) 2024 Luo Xiao
//
// This file demonstrates how to use the keyword spotter with sherpa-onnx's
// C API, with tokens and keywords loaded from buffered strings instead of
// from external files.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
// rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static size_t ReadFile(const char *filename, const char **buffer_out) {
  FILE *file = fopen(filename, "r");
  if (file == NULL) {
    fprintf(stderr, "Failed to open %s\n", filename);
    return 0;  // ReadFile returns size_t, so signal errors with 0, not -1
  }
  fseek(file, 0L, SEEK_END);
  long size = ftell(file);
  rewind(file);
  *buffer_out = malloc(size);
  if (*buffer_out == NULL) {
    fclose(file);
    fprintf(stderr, "Memory error\n");
    return 0;
  }
  size_t read_bytes = fread((void *)*buffer_out, 1, size, file);
  if (read_bytes != (size_t)size) {
    fprintf(stderr, "Errors occurred in reading the file %s\n", filename);
    free((void *)*buffer_out);
    *buffer_out = NULL;
    fclose(file);
    return 0;
  }
  fclose(file);
  return read_bytes;
}
int32_t main() {
const char *wav_filename =
"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/test_wavs/"
"6.wav";
const char *encoder_filename =
"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx";
const char *decoder_filename =
"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"decoder-epoch-12-avg-2-chunk-16-left-64.onnx";
const char *joiner_filename =
"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx";
const char *provider = "cpu";
const char *tokens_filename =
"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/tokens.txt";
const char *keywords_filename =
"sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/test_wavs/"
"test_keywords.txt";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// reading tokens and keywords to buffers
const char *tokens_buf = NULL;
size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);
if (token_buf_size < 1) {
fprintf(stderr, "Please check your tokens.txt!\n");
free((void *)tokens_buf);
return -1;
}
const char *keywords_buf = NULL;
size_t keywords_buf_size = ReadFile(keywords_filename, &keywords_buf);
if (keywords_buf_size < 1) {
fprintf(stderr, "Please check your keywords.txt!\n");
free((void *)keywords_buf);
return -1;
}
// Zipformer config
SherpaMnnOnlineTransducerModelConfig zipformer_config;
memset(&zipformer_config, 0, sizeof(zipformer_config));
zipformer_config.encoder = encoder_filename;
zipformer_config.decoder = decoder_filename;
zipformer_config.joiner = joiner_filename;
// Online model config
SherpaMnnOnlineModelConfig online_model_config;
memset(&online_model_config, 0, sizeof(online_model_config));
online_model_config.debug = 1;
online_model_config.num_threads = 1;
online_model_config.provider = provider;
online_model_config.tokens_buf = tokens_buf;
online_model_config.tokens_buf_size = token_buf_size;
online_model_config.transducer = zipformer_config;
// Keywords-spotter config
SherpaMnnKeywordSpotterConfig keywords_spotter_config;
memset(&keywords_spotter_config, 0, sizeof(keywords_spotter_config));
keywords_spotter_config.max_active_paths = 4;
keywords_spotter_config.keywords_threshold = 0.1;
keywords_spotter_config.keywords_score = 3.0;
keywords_spotter_config.model_config = online_model_config;
keywords_spotter_config.keywords_buf = keywords_buf;
keywords_spotter_config.keywords_buf_size = keywords_buf_size;
const SherpaMnnKeywordSpotter *keywords_spotter =
SherpaMnnCreateKeywordSpotter(&keywords_spotter_config);
free((void *)tokens_buf);
tokens_buf = NULL;
free((void *)keywords_buf);
keywords_buf = NULL;
if (keywords_spotter == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateKeywordStream(keywords_spotter);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsKeywordStreamReady(keywords_spotter, stream)) {
SherpaMnnDecodeKeywordStream(keywords_spotter, stream);
}
const SherpaMnnKeywordResult *r =
SherpaMnnGetKeywordResult(keywords_spotter, stream);
if (strlen(r->keyword)) {
SherpaMnnPrint(display, segment_id, r->keyword);
}
SherpaMnnDestroyKeywordResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsKeywordStreamReady(keywords_spotter, stream)) {
SherpaMnnDecodeKeywordStream(keywords_spotter, stream);
}
const SherpaMnnKeywordResult *r =
SherpaMnnGetKeywordResult(keywords_spotter, stream);
if (strlen(r->keyword)) {
SherpaMnnPrint(display, segment_id, r->keyword);
}
SherpaMnnDestroyKeywordResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyKeywordSpotter(keywords_spotter);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,84 @@
// c-api-examples/kokoro-tts-en-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx C API
// for English TTS with Kokoro.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
tar xf kokoro-en-v0_19.tar.bz2
rm kokoro-en-v0_19.tar.bz2
./kokoro-tts-en-c-api
*/
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaMnnOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kokoro.model = "./kokoro-en-v0_19/model.onnx";
config.model.kokoro.voices = "./kokoro-en-v0_19/voices.bin";
config.model.kokoro.tokens = "./kokoro-en-v0_19/tokens.txt";
config.model.kokoro.data_dir = "./kokoro-en-v0_19/espeak-ng-data";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *filename = "./generated-kokoro-en.wav";
const char *text =
"Today as always, men fall into two groups: slaves and free men. Whoever "
"does not have two-thirds of his day for himself, is a slave, whatever "
"he may be: a statesman, a businessman, an official, or a scholar. "
"Friends fell out often because life was changing so fast. The easiest "
"thing in the world was to lose touch with someone.";
const SherpaMnnOfflineTts *tts = SherpaMnnCreateOfflineTts(&config);
// mapping of sid to voice name
// 0->af, 1->af_bella, 2->af_nicole, 3->af_sarah, 4->af_sky, 5->am_adam
// 6->am_michael, 7->bf_emma, 8->bf_isabella, 9->bm_george, 10->bm_lewis
int32_t sid = 0;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerate(tts, text, sid, speed);
#else
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerateWithProgressCallback(tts, text, sid, speed,
ProgressCallback);
#endif
SherpaMnnWriteWave(audio->samples, audio->n, audio->sample_rate, filename);
SherpaMnnDestroyOfflineTtsGeneratedAudio(audio);
SherpaMnnDestroyOfflineTts(tts);
fprintf(stderr, "Input text is: %s\n", text);
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename);
return 0;
}
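
ProgressCallback above always returns 1. Because, per its comments, returning 0 stops generation, a cancelling variant needs only a different return value; the sketch below (hypothetical name) stops once half of the audio has been produced.

// Hypothetical variant: cancel generation once progress reaches 50%.
static int32_t StopHalfwayCallback(const float *samples, int32_t num_samples,
                                   float progress) {
  (void)samples;      // unused in this sketch
  (void)num_samples;  // unused in this sketch
  return progress < 0.5f ? 1 : 0;
}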

View File

@ -0,0 +1,82 @@
// c-api-examples/kokoro-tts-zh-en-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx C API
// for English + Chinese TTS with Kokoro.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2
tar xf kokoro-multi-lang-v1_0.tar.bz2
rm kokoro-multi-lang-v1_0.tar.bz2
./kokoro-tts-zh-en-c-api
*/
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaMnnOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.kokoro.model = "./kokoro-multi-lang-v1_0/model.onnx";
config.model.kokoro.voices = "./kokoro-multi-lang-v1_0/voices.bin";
config.model.kokoro.tokens = "./kokoro-multi-lang-v1_0/tokens.txt";
config.model.kokoro.data_dir = "./kokoro-multi-lang-v1_0/espeak-ng-data";
config.model.kokoro.dict_dir = "./kokoro-multi-lang-v1_0/dict";
config.model.kokoro.lexicon =
"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/"
"lexicon-zh.txt";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *filename = "./generated-kokoro-zh-en.wav";
const char *text =
"中英文语音合成测试。This is generated by next generation Kaldi using "
"Kokoro without Misaki. 你觉得中英文说的如何呢?";
const SherpaMnnOfflineTts *tts = SherpaMnnCreateOfflineTts(&config);
int32_t sid = 0; // there are 53 speakers
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerate(tts, text, sid, speed);
#else
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerateWithProgressCallback(tts, text, sid, speed,
ProgressCallback);
#endif
SherpaMnnWriteWave(audio->samples, audio->n, audio->sample_rate, filename);
SherpaMnnDestroyOfflineTtsGeneratedAudio(audio);
SherpaMnnDestroyOfflineTts(tts);
fprintf(stderr, "Input text is: %s\n", text);
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename);
return 0;
}

View File

@ -0,0 +1,152 @@
// c-api-examples/kws-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
//
// This file demonstrates how to use the keyword spotter with sherpa-onnx's C API.
// clang-format off
//
// Usage
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
// rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
//
// ./kws-c-api
//
// clang-format on
#include <stdio.h>
#include <stdlib.h> // exit
#include <string.h> // memset
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
SherpaMnnKeywordSpotterConfig config;
memset(&config, 0, sizeof(config));
config.model_config.transducer.encoder =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx";
config.model_config.transducer.decoder =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"decoder-epoch-12-avg-2-chunk-16-left-64.onnx";
config.model_config.transducer.joiner =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"tokens.txt";
config.model_config.provider = "cpu";
config.model_config.num_threads = 1;
config.model_config.debug = 1;
config.keywords_file =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"test_wavs/test_keywords.txt";
const SherpaMnnKeywordSpotter *kws = SherpaMnnCreateKeywordSpotter(&config);
if (!kws) {
fprintf(stderr, "Please check your config");
exit(-1);
}
fprintf(stderr,
"--Test pre-defined keywords from test_wavs/test_keywords.txt--\n");
const char *wav_filename =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"test_wavs/3.wav";
float tail_paddings[8000] = {0}; // 0.5 seconds
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
exit(-1);
}
const SherpaMnnOnlineStream *stream = SherpaMnnCreateKeywordStream(kws);
if (!stream) {
fprintf(stderr, "Failed to create stream\n");
exit(-1);
}
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
sizeof(tail_paddings) / sizeof(float));
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsKeywordStreamReady(kws, stream)) {
SherpaMnnDecodeKeywordStream(kws, stream);
const SherpaMnnKeywordResult *r = SherpaMnnGetKeywordResult(kws, stream);
if (r && r->json && strlen(r->keyword)) {
fprintf(stderr, "Detected keyword: %s\n", r->json);
// Remember to reset the keyword stream right after a keyword is detected
SherpaMnnResetKeywordStream(kws, stream);
}
SherpaMnnDestroyKeywordResult(r);
}
SherpaMnnDestroyOnlineStream(stream);
// --------------------------------------------------------------------------
fprintf(stderr, "--Use pre-defined keywords + add a new keyword--\n");
stream = SherpaMnnCreateKeywordStreamWithKeywords(kws, "y ǎn y uán @演员");
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
sizeof(tail_paddings) / sizeof(float));
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsKeywordStreamReady(kws, stream)) {
SherpaMnnDecodeKeywordStream(kws, stream);
const SherpaMnnKeywordResult *r = SherpaMnnGetKeywordResult(kws, stream);
if (r && r->json && strlen(r->keyword)) {
fprintf(stderr, "Detected keyword: %s\n", r->json);
// Remember to reset the keyword stream
SherpaMnnResetKeywordStream(kws, stream);
}
SherpaMnnDestroyKeywordResult(r);
}
SherpaMnnDestroyOnlineStream(stream);
// --------------------------------------------------------------------------
fprintf(stderr, "--Use pre-defined keywords + add two new keywords--\n");
stream = SherpaMnnCreateKeywordStreamWithKeywords(
kws, "y ǎn y uán @演员/zh ī m íng @知名");
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
sizeof(tail_paddings) / sizeof(float));
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsKeywordStreamReady(kws, stream)) {
SherpaMnnDecodeKeywordStream(kws, stream);
const SherpaMnnKeywordResult *r = SherpaMnnGetKeywordResult(kws, stream);
if (r && r->json && strlen(r->keyword)) {
fprintf(stderr, "Detected keyword: %s\n", r->json);
// Remember to reset the keyword stream
SherpaMnnResetKeywordStream(kws, stream);
}
SherpaMnnDestroyKeywordResult(r);
}
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnFreeWave(wave);
SherpaMnnDestroyKeywordSpotter(kws);
return 0;
}
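
As the two calls above show, SherpaMnnCreateKeywordStreamWithKeywords takes the extra keywords as a single string: space-separated tokens, an optional @display-form suffix, and "/" between keywords. A minimal sketch of joining several entries at runtime follows; the helper name is hypothetical and it relies on <string.h>, which this example already includes.

// Hypothetical helper: joins keyword entries with "/" into `out`,
// truncating rather than overflowing the buffer.
static void JoinKeywords(const char *const *entries, int32_t n, char *out,
                         size_t out_size) {
  out[0] = '\0';
  for (int32_t i = 0; i != n; ++i) {
    if (i) strncat(out, "/", out_size - strlen(out) - 1);
    strncat(out, entries[i], out_size - strlen(out) - 1);
  }
}

For example, joining the two entries "y ǎn y uán @演员" and "zh ī m íng @知名" reproduces the combined string passed above.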

View File

@ -0,0 +1,87 @@
// c-api-examples/matcha-tts-en-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx C API
// for English TTS with MatchaTTS.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2
tar xvf matcha-icefall-en_US-ljspeech.tar.bz2
rm matcha-icefall-en_US-ljspeech.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/hifigan_v2.onnx
./matcha-tts-en-c-api
*/
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaMnnOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.matcha.acoustic_model =
"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
config.model.matcha.vocoder = "./hifigan_v2.onnx";
config.model.matcha.tokens = "./matcha-icefall-en_US-ljspeech/tokens.txt";
config.model.matcha.data_dir =
"./matcha-icefall-en_US-ljspeech/espeak-ng-data";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
const char *filename = "./generated-matcha-en.wav";
const char *text =
"Today as always, men fall into two groups: slaves and free men. Whoever "
"does not have two-thirds of his day for himself, is a slave, whatever "
"he may be: a statesman, a businessman, an official, or a scholar. "
"Friends fell out often because life was changing so fast. The easiest "
"thing in the world was to lose touch with someone.";
const SherpaMnnOfflineTts *tts = SherpaMnnCreateOfflineTts(&config);
int32_t sid = 0;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerate(tts, text, sid, speed);
#else
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerateWithProgressCallback(tts, text, sid, speed,
ProgressCallback);
#endif
SherpaMnnWriteWave(audio->samples, audio->n, audio->sample_rate, filename);
SherpaMnnDestroyOfflineTtsGeneratedAudio(audio);
SherpaMnnDestroyOfflineTts(tts);
fprintf(stderr, "Input text is: %s\n", text);
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename);
return 0;
}

View File

@ -0,0 +1,87 @@
// c-api-examples/matcha-tts-zh-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx C API
// for Chinese TTS with MatchaTTS.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2
tar xvf matcha-icefall-zh-baker.tar.bz2
rm matcha-icefall-zh-baker.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/hifigan_v2.onnx
./matcha-tts-zh-c-api
*/
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
SherpaMnnOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
config.model.matcha.acoustic_model =
"./matcha-icefall-zh-baker/model-steps-3.onnx";
config.model.matcha.vocoder = "./hifigan_v2.onnx";
config.model.matcha.lexicon = "./matcha-icefall-zh-baker/lexicon.txt";
config.model.matcha.tokens = "./matcha-icefall-zh-baker/tokens.txt";
config.model.matcha.dict_dir = "./matcha-icefall-zh-baker/dict";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
// clang-format off
config.rule_fsts = "./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst";
// clang-format on
const char *filename = "./generated-matcha-zh.wav";
const char *text =
"当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如"
"涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感"
"受着生命的奇迹与温柔."
"某某银行的副行长和一些行政领导表示,他们去过长江和长白山; "
"经济不断增长。2024年12月31号拨打110或者18920240511。123456块钱。";
const SherpaMnnOfflineTts *tts = SherpaMnnCreateOfflineTts(&config);
int32_t sid = 0;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerate(tts, text, sid, speed);
#else
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerateWithProgressCallback(tts, text, sid, speed,
ProgressCallback);
#endif
SherpaMnnWriteWave(audio->samples, audio->n, audio->sample_rate, filename);
SherpaMnnDestroyOfflineTtsGeneratedAudio(audio);
SherpaMnnDestroyOfflineTts(tts);
fprintf(stderr, "Input text is: %s\n", text);
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename);
return 0;
}

View File

@ -0,0 +1,83 @@
// c-api-examples/moonshine-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use Moonshine tiny with sherpa-onnx's C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
// tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
// rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav";
const char *preprocessor =
"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx";
const char *encoder = "./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx";
const char *uncached_decoder =
"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx";
const char *cached_decoder =
"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx";
const char *tokens = "./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = "cpu";
offline_model_config.tokens = tokens;
offline_model_config.moonshine.preprocessor = preprocessor;
offline_model_config.moonshine.encoder = encoder;
offline_model_config.moonshine.uncached_decoder = uncached_decoder;
offline_model_config.moonshine.cached_decoder = cached_decoder;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1,131 @@
// c-api-examples/offline-speaker-diarization-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to implement speaker diarization with
// sherpa-onnx's C API.
// clang-format off
/*
Usage:
Step 1: Download a speaker segmentation model
Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models
for a list of available models. The following is an example
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/sherpa-onnx-pyannote-segmentation-3-0.tar.bz2
tar xvf sherpa-onnx-pyannote-segmentation-3-0.tar.bz2
rm sherpa-onnx-pyannote-segmentation-3-0.tar.bz2
Step 2: Download a speaker embedding extractor model
Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models
for a list of available models. The following is an example
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx
Step 3. Download test wave files
Please visit https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-segmentation-models
for a list of available test wave files. The following is an example
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-segmentation-models/0-four-speakers-zh.wav
Step 4. Run it
*/
// clang-format on
#include <stdio.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static int32_t ProgressCallback(int32_t num_processed_chunks,
int32_t num_total_chunks, void *arg) {
float progress = 100.0 * num_processed_chunks / num_total_chunks;
fprintf(stderr, "progress %.2f%%\n", progress);
// the return value is currently ignored
return 0;
}
int main() {
// Please see the comments at the start of this file for how to download
// the .onnx file and .wav files below
const char *segmentation_model =
"./sherpa-onnx-pyannote-segmentation-3-0/model.onnx";
const char *embedding_extractor_model =
"./3dspeaker_speech_eres2net_base_sv_zh-cn_3dspeaker_16k.onnx";
const char *wav_filename = "./0-four-speakers-zh.wav";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
SherpaMnnOfflineSpeakerDiarizationConfig config;
memset(&config, 0, sizeof(config));
config.segmentation.pyannote.model = segmentation_model;
config.embedding.model = embedding_extractor_model;
// the test wave ./0-four-speakers-zh.wav has 4 speakers, so
// we set num_clusters to 4
//
config.clustering.num_clusters = 4;
// If you don't know the number of speakers in the test wave file, please
// use
// config.clustering.threshold = 0.5; // You need to tune this threshold
const SherpaMnnOfflineSpeakerDiarization *sd =
SherpaMnnCreateOfflineSpeakerDiarization(&config);
if (!sd) {
fprintf(stderr, "Failed to initialize offline speaker diarization\n");
return -1;
}
if (SherpaMnnOfflineSpeakerDiarizationGetSampleRate(sd) !=
    wave->sample_rate) {
  fprintf(
      stderr,
      "Expected sample rate: %d. Actual sample rate from the wave file: %d\n",
      SherpaMnnOfflineSpeakerDiarizationGetSampleRate(sd),
      wave->sample_rate);
  SherpaMnnDestroyOfflineSpeakerDiarization(sd);
  SherpaMnnFreeWave(wave);
  return -1;
}
const SherpaMnnOfflineSpeakerDiarizationResult *result =
    SherpaMnnOfflineSpeakerDiarizationProcessWithCallback(
        sd, wave->samples, wave->num_samples, ProgressCallback, NULL);
if (!result) {
  fprintf(stderr, "Failed to do speaker diarization\n");
  SherpaMnnDestroyOfflineSpeakerDiarization(sd);
  SherpaMnnFreeWave(wave);
  return -1;
}
int32_t num_segments =
    SherpaMnnOfflineSpeakerDiarizationResultGetNumSegments(result);
const SherpaMnnOfflineSpeakerDiarizationSegment *segments =
    SherpaMnnOfflineSpeakerDiarizationResultSortByStartTime(result);
for (int32_t i = 0; i != num_segments; ++i) {
  fprintf(stderr, "%.3f -- %.3f speaker_%02d\n", segments[i].start,
          segments[i].end, segments[i].speaker);
}
SherpaMnnOfflineSpeakerDiarizationDestroySegment(segments);
SherpaMnnOfflineSpeakerDiarizationDestroyResult(result);
SherpaMnnDestroyOfflineSpeakerDiarization(sd);
SherpaMnnFreeWave(wave);
return 0;
}
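
When the speaker count is unknown, the comments above suggest clustering by a distance threshold rather than a fixed num_clusters. A minimal configuration sketch follows; the helper name is hypothetical and the 0.5 starting value has to be tuned.

// Hypothetical helper: configure diarization for an unknown number of
// speakers by clustering with a distance threshold.
static void ConfigureByThreshold(SherpaMnnOfflineSpeakerDiarizationConfig *c,
                                 const char *seg_model,
                                 const char *emb_model) {
  memset(c, 0, sizeof(*c));
  c->segmentation.pyannote.model = seg_model;
  c->embedding.model = emb_model;
  // clustering.num_clusters stays 0 (unknown); the threshold decides
  // how many clusters emerge. Tune it on your own data.
  c->clustering.threshold = 0.5f;
}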

View File

@ -0,0 +1,249 @@
// c-api-examples/offline-tts-c-api.c
//
// Copyright (c) 2023 Xiaomi Corporation
// This file shows how to use sherpa-onnx C API
// to convert text to speech using an offline model.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "cargs.h"
#include "sherpa-mnn/c-api/c-api.h"
static struct cag_option options[] = {
{.identifier = 'h',
.access_letters = "h",
.access_name = "help",
.description = "Show help"},
{.access_name = "vits-model",
.value_name = "/path/to/xxx.onnx",
.identifier = '0',
.description = "Path to VITS model"},
{.access_name = "vits-lexicon",
.value_name = "/path/to/lexicon.txt",
.identifier = '1',
.description = "Path to lexicon.txt for VITS models"},
{.access_name = "vits-tokens",
.value_name = "/path/to/tokens.txt",
.identifier = '2',
.description = "Path to tokens.txt for VITS models"},
{.access_name = "vits-noise-scale",
.value_name = "0.667",
.identifier = '3',
.description = "noise_scale for VITS models"},
{.access_name = "vits-noise-scale-w",
.value_name = "0.8",
.identifier = '4',
.description = "noise_scale_w for VITS models"},
{.access_name = "vits-length-scale",
.value_name = "1.0",
.identifier = '5',
.description =
"length_scale for VITS models. Default to 1. You can tune it "
"to change the speech speed. small -> faster; large -> slower. "},
{.access_name = "num-threads",
.value_name = "1",
.identifier = '6',
.description = "Number of threads"},
{.access_name = "provider",
.value_name = "cpu",
.identifier = '7',
.description = "Provider: cpu (default), cuda, coreml"},
{.access_name = "debug",
.value_name = "0",
.identifier = '8',
.description = "1 to show debug messages while loading the model"},
{.access_name = "sid",
.value_name = "0",
.identifier = '9',
.description = "Speaker ID. Default to 0. Note it is not used for "
"single-speaker models."},
{.access_name = "output-filename",
.value_name = "./generated.wav",
.identifier = 'a',
.description =
"Filename to save the generated audio. Default to ./generated.wav"},
{.access_name = "tts-rule-fsts",
.value_name = "/path/to/rule.fst",
.identifier = 'b',
.description = "It not empty, it contains a list of rule FST filenames."
"Multiple filenames are separated by a comma and they are "
"applied from left to right. An example value: "
"rule1.fst,rule2,fst,rule3.fst"},
{.access_name = "max-num-sentences",
.value_name = "2",
.identifier = 'c',
.description = "Maximum number of sentences that we process at a time. "
"This is to avoid OOM for very long input text. "
"If you set it to -1, then we process all sentences in a "
"single batch."},
{.access_name = "vits-data-dir",
.value_name = "/path/to/espeak-ng-data",
.identifier = 'd',
.description =
"Path to espeak-ng-data. If it is given, --vits-lexicon is ignored"},
};
static void ShowUsage() {
const char *kUsageMessage =
"Offline text-to-speech with sherpa-onnx C API"
"\n"
"./offline-tts-c-api \\\n"
" --vits-model=/path/to/model.onnx \\\n"
" --vits-lexicon=/path/to/lexicon.txt \\\n"
" --vits-tokens=/path/to/tokens.txt \\\n"
" --sid=0 \\\n"
" --output-filename=./generated.wav \\\n"
" 'some text within single quotes on linux/macos or use double quotes on "
"windows'\n"
"\n"
"It will generate a file ./generated.wav as specified by "
"--output-filename.\n"
"\n"
"You can download a test model from\n"
"https://huggingface.co/csukuangfj/vits-ljs\n"
"\n"
"For instance, you can use:\n"
"wget "
"https://huggingface.co/csukuangfj/vits-ljs/resolve/main/vits-ljs.onnx\n"
"wget "
"https://huggingface.co/csukuangfj/vits-ljs/resolve/main/lexicon.txt\n"
"wget "
"https://huggingface.co/csukuangfj/vits-ljs/resolve/main/tokens.txt\n"
"\n"
"./offline-tts-c-api \\\n"
" --vits-model=./vits-ljs.onnx \\\n"
" --vits-lexicon=./lexicon.txt \\\n"
" --vits-tokens=./tokens.txt \\\n"
" --sid=0 \\\n"
" --output-filename=./generated.wav \\\n"
" 'liliana, the most beautiful and lovely assistant of our team!'\n"
"\n"
"Please see\n"
"https://k2-fsa.github.io/sherpa/onnx/tts/index.html\n"
"or details.\n\n";
fprintf(stderr, "%s", kUsageMessage);
cag_option_print(options, CAG_ARRAY_SIZE(options), stderr);
exit(0);
}
int32_t main(int32_t argc, char *argv[]) {
cag_option_context context;
char identifier;
const char *value;
cag_option_prepare(&context, options, CAG_ARRAY_SIZE(options), argc, argv);
SherpaMnnOfflineTtsConfig config;
memset(&config, 0, sizeof(config));
int32_t sid = 0;
const char *filename = strdup("./generated.wav");
const char *text;
while (cag_option_fetch(&context)) {
identifier = cag_option_get(&context);
value = cag_option_get_value(&context);
switch (identifier) {
case '0':
config.model.vits.model = value;
break;
case '1':
config.model.vits.lexicon = value;
break;
case '2':
config.model.vits.tokens = value;
break;
case '3':
config.model.vits.noise_scale = atof(value);
break;
case '4':
config.model.vits.noise_scale_w = atof(value);
break;
case '5':
config.model.vits.length_scale = atof(value);
break;
case '6':
config.model.num_threads = atoi(value);
break;
case '7':
config.model.provider = value;
break;
case '8':
config.model.debug = atoi(value);
break;
case '9':
sid = atoi(value);
break;
case 'a':
free((void *)filename);
filename = strdup(value);
break;
case 'b':
config.rule_fsts = value;
break;
case 'c':
config.max_num_sentences = atoi(value);
break;
case 'd':
config.model.vits.data_dir = value;
break;
case '?':
fprintf(stderr, "Unknown option\n");
// fall through
case 'h':
// fall through
default:
ShowUsage();
}
}
fprintf(stderr, "here\n");
if (!config.model.vits.model) {
fprintf(stderr, "Please provide --vits-model\n");
ShowUsage();
}
if (!config.model.vits.tokens) {
fprintf(stderr, "Please provide --vits-tokens\n");
ShowUsage();
}
if (!config.model.vits.data_dir && !config.model.vits.lexicon) {
fprintf(stderr, "Please provide --vits-data-dir or --vits-lexicon\n");
ShowUsage();
}
// the last arg is the text
text = argv[argc - 1];
if (text[0] == '-') {
fprintf(stderr, "\n***Please input your text!***\n\n");
fprintf(stderr, "\n---------------Usage---------------\n\n");
ShowUsage();
}
const SherpaMnnOfflineTts *tts = SherpaMnnCreateOfflineTts(&config);
const SherpaMnnGeneratedAudio *audio =
SherpaMnnOfflineTtsGenerate(tts, text, sid, 1.0);
SherpaMnnWriteWave(audio->samples, audio->n, audio->sample_rate, filename);
SherpaMnnDestroyOfflineTtsGeneratedAudio(audio);
SherpaMnnDestroyOfflineTts(tts);
fprintf(stderr, "Input text is: %s\n", text);
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename);
free((void *)filename);
return 0;
}
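
For reference, the same VITS pipeline that this argument parser drives can be written directly. A minimal non-CLI sketch using the model files from the usage text above (hypothetical function name, no error checking):

// Hypothetical helper: one-shot TTS with the vits-ljs files from the
// usage text above.
static void TtsOnce(void) {
  SherpaMnnOfflineTtsConfig c;
  memset(&c, 0, sizeof(c));
  c.model.vits.model = "./vits-ljs.onnx";
  c.model.vits.lexicon = "./lexicon.txt";
  c.model.vits.tokens = "./tokens.txt";
  const SherpaMnnOfflineTts *tts = SherpaMnnCreateOfflineTts(&c);
  const SherpaMnnGeneratedAudio *audio = SherpaMnnOfflineTtsGenerate(
      tts, "liliana, the most beautiful and lovely assistant of our team!",
      /*sid=*/0, /*speed=*/1.0);
  SherpaMnnWriteWave(audio->samples, audio->n, audio->sample_rate,
                     "./generated.wav");
  SherpaMnnDestroyOfflineTtsGeneratedAudio(audio);
  SherpaMnnDestroyOfflineTts(tts);
}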

View File

@ -0,0 +1,83 @@
// c-api-examples/paraformer-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use non-streaming Paraformer with sherpa-onnx's
// C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2
// tar xvf sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2
// rm sherpa-onnx-paraformer-zh-small-2024-03-09.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"sherpa-onnx-paraformer-zh-small-2024-03-09/test_wavs/0.wav";
const char *model_filename =
"sherpa-onnx-paraformer-zh-small-2024-03-09/model.int8.onnx";
const char *tokens_filename =
"sherpa-onnx-paraformer-zh-small-2024-03-09/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Paraformer config
SherpaMnnOfflineParaformerModelConfig paraformer_config;
memset(&paraformer_config, 0, sizeof(paraformer_config));
paraformer_config.model = model_filename;
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.paraformer = paraformer_config;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1,48 @@
#!/usr/bin/env bash
set -ex
if [ ! -d ./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 ]; then
echo "Please download the pre-trained model for testing."
echo "You can refer to"
echo ""
echo "https://k2-fsa.github.io/sherpa/onnx/pretrained_models/zipformer-transducer-models.html#sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english"
echo "for help"
exit 1
fi
if [[ ! -f ../build/lib/libsherpa-onnx-c-api.a && ! -f ../build/lib/libsherpa-onnx-c-api.dylib && ! -f ../build/lib/libsherpa-onnx-c-api.so ]]; then
echo "Please build sherpa-onnx first. You can use"
echo ""
echo " cd /path/to/sherpa-onnx"
echo " mkdir build"
echo " cd build"
echo " cmake .."
echo " make -j4"
exit 1
fi
if [ ! -f ./decode-file-c-api ]; then
make
fi
./decode-file-c-api \
--tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \
--encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \
--decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \
--joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav
# Run with hotwords
echo "礼 拜 二" > hotwords.txt
./decode-file-c-api \
--tokens=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt \
--encoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/encoder-epoch-99-avg-1.onnx \
--decoder=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/decoder-epoch-99-avg-1.onnx \
--joiner=./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/joiner-epoch-99-avg-1.onnx \
--hotwords-file=hotwords.txt \
--hotwords-score=1.5 \
--decoding-method=modified_beam_search \
./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/0.wav

View File

@ -0,0 +1,85 @@
// c-api-examples/sense-voice-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use SenseVoice with sherpa-onnx's C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/en.wav";
const char *model_filename =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx";
const char *tokens_filename =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt";
const char *language = "auto";
const char *provider = "cpu";
int32_t use_inverse_text_normalization = 1;
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
SherpaMnnOfflineSenseVoiceModelConfig sense_voice_config;
memset(&sense_voice_config, 0, sizeof(sense_voice_config));
sense_voice_config.model = model_filename;
sense_voice_config.language = language;
sense_voice_config.use_itn = use_inverse_text_normalization;
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.sense_voice = sense_voice_config;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1,257 @@
// c-api-examples/speaker-identification-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// We assume you have pre-downloaded the speaker embedding extractor model
// from
// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models
//
// An example command to download
// "3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx"
// is given below:
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speaker-recongition-models/3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx
//
// clang-format on
//
// Also, please download the test wave files from
//
// https://github.com/csukuangfj/sr-data
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
static const float *ComputeEmbedding(
const SherpaMnnSpeakerEmbeddingExtractor *ex, const char *wav_filename) {
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
exit(-1);
}
const SherpaMnnOnlineStream *stream =
SherpaMnnSpeakerEmbeddingExtractorCreateStream(ex);
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnOnlineStreamInputFinished(stream);
if (!SherpaMnnSpeakerEmbeddingExtractorIsReady(ex, stream)) {
fprintf(stderr, "The input wave file %s is too short!\n", wav_filename);
exit(-1);
}
// we will free `v` outside of this function
const float *v =
SherpaMnnSpeakerEmbeddingExtractorComputeEmbedding(ex, stream);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnFreeWave(wave);
// Remember to free v to avoid a memory leak
return v;
}
int32_t main() {
SherpaMnnSpeakerEmbeddingExtractorConfig config;
memset(&config, 0, sizeof(config));
// please download the model from
// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speaker-recongition-models
config.model = "./3dspeaker_speech_campplus_sv_zh-cn_16k-common.onnx";
config.num_threads = 1;
config.debug = 0;
config.provider = "cpu";
const SherpaMnnSpeakerEmbeddingExtractor *ex =
SherpaMnnCreateSpeakerEmbeddingExtractor(&config);
if (!ex) {
fprintf(stderr, "Failed to create speaker embedding extractor");
return -1;
}
int32_t dim = SherpaMnnSpeakerEmbeddingExtractorDim(ex);
const SherpaMnnSpeakerEmbeddingManager *manager =
SherpaMnnCreateSpeakerEmbeddingManager(dim);
// Please download the test data from
// https://github.com/csukuangfj/sr-data
const char *spk1_1 = "./sr-data/enroll/fangjun-sr-1.wav";
const char *spk1_2 = "./sr-data/enroll/fangjun-sr-2.wav";
const char *spk1_3 = "./sr-data/enroll/fangjun-sr-3.wav";
const char *spk2_1 = "./sr-data/enroll/leijun-sr-1.wav";
const char *spk2_2 = "./sr-data/enroll/leijun-sr-2.wav";
const float *spk1_vec[4] = {NULL};
spk1_vec[0] = ComputeEmbedding(ex, spk1_1);
spk1_vec[1] = ComputeEmbedding(ex, spk1_2);
spk1_vec[2] = ComputeEmbedding(ex, spk1_3);
const float *spk2_vec[3] = {NULL};
spk2_vec[0] = ComputeEmbedding(ex, spk2_1);
spk2_vec[1] = ComputeEmbedding(ex, spk2_2);
if (!SherpaMnnSpeakerEmbeddingManagerAddList(manager, "fangjun", spk1_vec)) {
fprintf(stderr, "Failed to register fangjun\n");
exit(-1);
}
if (!SherpaMnnSpeakerEmbeddingManagerContains(manager, "fangjun")) {
fprintf(stderr, "Failed to find fangjun\n");
exit(-1);
}
if (!SherpaMnnSpeakerEmbeddingManagerAddList(manager, "leijun", spk2_vec)) {
fprintf(stderr, "Failed to register leijun\n");
exit(-1);
}
if (!SherpaMnnSpeakerEmbeddingManagerContains(manager, "leijun")) {
fprintf(stderr, "Failed to find leijun\n");
exit(-1);
}
if (SherpaMnnSpeakerEmbeddingManagerNumSpeakers(manager) != 2) {
fprintf(stderr, "There should be two speakers: fangjun and leijun\n");
exit(-1);
}
const char *const *all_speakers =
SherpaMnnSpeakerEmbeddingManagerGetAllSpeakers(manager);
const char *const *p = all_speakers;
fprintf(stderr, "list of registered speakers\n-----\n");
while (p[0]) {
fprintf(stderr, "speaker: %s\n", p[0]);
++p;
}
fprintf(stderr, "----\n");
SherpaMnnSpeakerEmbeddingManagerFreeAllSpeakers(all_speakers);
const char *test1 = "./sr-data/test/fangjun-test-sr-1.wav";
const char *test2 = "./sr-data/test/leijun-test-sr-1.wav";
const char *test3 = "./sr-data/test/liudehua-test-sr-1.wav";
const float *v1 = ComputeEmbedding(ex, test1);
const float *v2 = ComputeEmbedding(ex, test2);
const float *v3 = ComputeEmbedding(ex, test3);
float threshold = 0.6;
const char *name1 =
SherpaMnnSpeakerEmbeddingManagerSearch(manager, v1, threshold);
if (name1) {
fprintf(stderr, "%s: Found %s\n", test1, name1);
SherpaMnnSpeakerEmbeddingManagerFreeSearch(name1);
} else {
fprintf(stderr, "%s: Not found\n", test1);
}
const char *name2 =
SherpaMnnSpeakerEmbeddingManagerSearch(manager, v2, threshold);
if (name2) {
fprintf(stderr, "%s: Found %s\n", test2, name2);
SherpaMnnSpeakerEmbeddingManagerFreeSearch(name2);
} else {
fprintf(stderr, "%s: Not found\n", test2);
}
const char *name3 =
SherpaMnnSpeakerEmbeddingManagerSearch(manager, v3, threshold);
if (name3) {
fprintf(stderr, "%s: Found %s\n", test3, name3);
SherpaMnnSpeakerEmbeddingManagerFreeSearch(name3);
} else {
fprintf(stderr, "%s: Not found\n", test3);
}
int32_t ok = SherpaMnnSpeakerEmbeddingManagerVerify(manager, "fangjun", v1,
threshold);
if (ok) {
fprintf(stderr, "%s matches fangjun\n", test1);
} else {
fprintf(stderr, "%s does NOT match fangjun\n", test1);
}
ok = SherpaMnnSpeakerEmbeddingManagerVerify(manager, "fangjun", v2,
threshold);
if (ok) {
fprintf(stderr, "%s matches fangjun\n", test2);
} else {
fprintf(stderr, "%s does NOT match fangjun\n", test2);
}
fprintf(stderr, "Removing fangjun\n");
if (!SherpaMnnSpeakerEmbeddingManagerRemove(manager, "fangjun")) {
fprintf(stderr, "Failed to remove fangjun\n");
exit(-1);
}
if (SherpaMnnSpeakerEmbeddingManagerNumSpeakers(manager) != 1) {
fprintf(stderr, "There should be only 1 speaker left\n");
exit(-1);
}
name1 = SherpaMnnSpeakerEmbeddingManagerSearch(manager, v1, threshold);
if (name1) {
fprintf(stderr, "%s: Found %s\n", test1, name1);
SherpaMnnSpeakerEmbeddingManagerFreeSearch(name1);
} else {
fprintf(stderr, "%s: Not found\n", test1);
}
fprintf(stderr, "Removing leijun\n");
if (!SherpaMnnSpeakerEmbeddingManagerRemove(manager, "leijun")) {
fprintf(stderr, "Failed to remove leijun\n");
exit(-1);
}
if (SherpaMnnSpeakerEmbeddingManagerNumSpeakers(manager) != 0) {
fprintf(stderr, "There should be only 1 speaker left\n");
exit(-1);
}
name2 = SherpaMnnSpeakerEmbeddingManagerSearch(manager, v2, threshold);
if (name2) {
fprintf(stderr, "%s: Found %s\n", test2, name2);
SherpaMnnSpeakerEmbeddingManagerFreeSearch(name2);
} else {
fprintf(stderr, "%s: Not found\n", test2);
}
all_speakers = SherpaMnnSpeakerEmbeddingManagerGetAllSpeakers(manager);
p = all_speakers;
fprintf(stderr, "list of registered speakers\n-----\n");
while (p[0]) {
fprintf(stderr, "speaker: %s\n", p[0]);
++p;
}
fprintf(stderr, "----\n");
SherpaMnnSpeakerEmbeddingManagerFreeAllSpeakers(all_speakers);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(v1);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(v2);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(v3);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(spk1_vec[0]);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(spk1_vec[1]);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(spk1_vec[2]);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(spk2_vec[0]);
SherpaMnnSpeakerEmbeddingExtractorDestroyEmbedding(spk2_vec[1]);
SherpaMnnDestroySpeakerEmbeddingManager(manager);
SherpaMnnDestroySpeakerEmbeddingExtractor(ex);
return 0;
}
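
The enrollment pattern above (compute several embeddings, NULL-terminate the array, register via AddList) can be wrapped in one helper. The sketch below is hypothetical: it reuses ComputeEmbedding from this file, assumes AddList expects a NULL-terminated list (as the array sizing above suggests), and leaves freeing the embeddings to the caller, exactly like the example.

// Hypothetical helper: fills the caller-provided array `vec` (capacity
// n + 1) with embeddings for `wavs` and registers them under `name`.
// The caller destroys the embeddings afterwards.
static int32_t Enroll(const SherpaMnnSpeakerEmbeddingExtractor *ex,
                      const SherpaMnnSpeakerEmbeddingManager *manager,
                      const char *name, const char *const *wavs, int32_t n,
                      const float **vec) {
  for (int32_t i = 0; i != n; ++i) {
    vec[i] = ComputeEmbedding(ex, wavs[i]);
  }
  vec[n] = NULL;
  return SherpaMnnSpeakerEmbeddingManagerAddList(manager, name, vec);
}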

View File

@ -0,0 +1,55 @@
// c-api-examples/speech-enhancement-gtcrn-c-api.c
//
// Copyright (c) 2025 Xiaomi Corporation
//
// We assume you have pre-downloaded the model
// from
// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models
//
//
// An example command to download
// clang-format off
/*
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav
*/
// clang-format on
#include <stdio.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
SherpaMnnOfflineSpeechDenoiserConfig config;
const char *wav_filename = "./inp_16k.wav";
const char *out_wave_filename = "./enhanced_16k.wav";
memset(&config, 0, sizeof(config));
config.model.gtcrn.model = "./gtcrn_simple.onnx";
const SherpaMnnOfflineSpeechDenoiser *sd =
SherpaMnnCreateOfflineSpeechDenoiser(&config);
if (!sd) {
fprintf(stderr, "Please check your config");
return -1;
}
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
SherpaMnnDestroyOfflineSpeechDenoiser(sd);
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
const SherpaMnnDenoisedAudio *denoised = SherpaMnnOfflineSpeechDenoiserRun(
sd, wave->samples, wave->num_samples, wave->sample_rate);
SherpaMnnWriteWave(denoised->samples, denoised->n, denoised->sample_rate,
out_wave_filename);
SherpaMnnDestroyDenoisedAudio(denoised);
SherpaMnnFreeWave(wave);
SherpaMnnDestroyOfflineSpeechDenoiser(sd);
fprintf(stdout, "Saved to %s\n", out_wave_filename);
return 0;
}

View File

@ -0,0 +1,68 @@
// c-api-examples/spoken-language-identification-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// We assume you have pre-downloaded the whisper multi-lingual models
// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models
// An example command to download the "tiny" whisper model is given below:
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2
// tar xvf sherpa-onnx-whisper-tiny.tar.bz2
// rm sherpa-onnx-whisper-tiny.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
SherpaMnnSpokenLanguageIdentificationConfig config;
memset(&config, 0, sizeof(config));
config.whisper.encoder = "./sherpa-onnx-whisper-tiny/tiny-encoder.int8.onnx";
config.whisper.decoder = "./sherpa-onnx-whisper-tiny/tiny-decoder.int8.onnx";
config.num_threads = 1;
config.debug = 1;
config.provider = "cpu";
const SherpaMnnSpokenLanguageIdentification *slid =
SherpaMnnCreateSpokenLanguageIdentification(&config);
if (!slid) {
fprintf(stderr, "Failed to create spoken language identifier");
return -1;
}
// You can find more test waves from
// https://hf-mirror.com/spaces/k2-fsa/spoken-language-identification/tree/main/test_wavs
const char *wav_filename = "./sherpa-onnx-whisper-tiny/test_wavs/0.wav";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
SherpaMnnDestroySpokenLanguageIdentification(slid);
return -1;
}
SherpaMnnOfflineStream *stream =
SherpaMnnSpokenLanguageIdentificationCreateOfflineStream(slid);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
const SherpaMnnSpokenLanguageIdentificationResult *result =
SherpaMnnSpokenLanguageIdentificationCompute(slid, stream);
fprintf(stderr, "wav_filename: %s\n", wav_filename);
fprintf(stderr, "Detected language: %s\n", result->lang);
SherpaMnnDestroySpokenLanguageIdentificationResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnFreeWave(wave);
SherpaMnnDestroySpokenLanguageIdentification(slid);
return 0;
}
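The identifier created above can be reused across many files; only the offline stream is per-utterance. A minimal sketch of a batch helper under that assumption, using only the C API calls shown above (ClassifyFiles, wav_files, and num_files are hypothetical names introduced for this sketch):

// Assumes the same includes as the example above.
static void ClassifyFiles(const SherpaMnnSpokenLanguageIdentification *slid,
                          const char **wav_files, int32_t num_files) {
  for (int32_t i = 0; i != num_files; ++i) {
    const SherpaMnnWave *w = SherpaMnnReadWave(wav_files[i]);
    if (w == NULL) {
      fprintf(stderr, "Failed to read %s\n", wav_files[i]);
      continue;
    }
    // Each utterance gets its own stream; the identifier is shared.
    SherpaMnnOfflineStream *s =
        SherpaMnnSpokenLanguageIdentificationCreateOfflineStream(slid);
    SherpaMnnAcceptWaveformOffline(s, w->sample_rate, w->samples,
                                   w->num_samples);
    const SherpaMnnSpokenLanguageIdentificationResult *r =
        SherpaMnnSpokenLanguageIdentificationCompute(slid, s);
    fprintf(stderr, "%s -> %s\n", wav_files[i], r->lang);
    SherpaMnnDestroySpokenLanguageIdentificationResult(r);
    SherpaMnnDestroyOfflineStream(s);
    SherpaMnnFreeWave(w);
  }
}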

View File

@ -0,0 +1,180 @@
// c-api-examples/streaming-ctc-buffered-tokens-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// Copyright (c) 2024 Luo Xiao
//
// This file demonstrates how to use streaming Zipformer2 CTC with
// sherpa-onnx's C API, with tokens loaded from buffered strings instead of
// from external files.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2
// tar xvf sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2
// rm sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
// Returns the number of bytes read, or 0 on error. On error, *buffer_out is
// set to NULL, so the caller's free() on the error path is a safe no-op.
static size_t ReadFile(const char *filename, const char **buffer_out) {
*buffer_out = NULL;
FILE *file = fopen(filename, "rb");  // binary mode so ftell matches fread
if (file == NULL) {
fprintf(stderr, "Failed to open %s\n", filename);
return 0;
}
fseek(file, 0L, SEEK_END);
long size = ftell(file);
rewind(file);
*buffer_out = malloc(size);
if (*buffer_out == NULL) {
fclose(file);
fprintf(stderr, "Memory error\n");
return 0;
}
size_t read_bytes = fread((void *)*buffer_out, 1, size, file);
if (read_bytes != (size_t)size) {
fprintf(stderr, "Errors occurred in reading the file %s\n", filename);
free((void *)*buffer_out);
*buffer_out = NULL;
fclose(file);
return 0;
}
fclose(file);
return read_bytes;
}
int32_t main() {
const char *wav_filename =
"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/test_wavs/"
"DEV_T0000000000.wav";
const char *model_filename =
"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/"
"ctc-epoch-20-avg-1-chunk-16-left-128.int8.onnx";
const char *tokens_filename =
"sherpa-onnx-streaming-zipformer-ctc-multi-zh-hans-2023-12-13/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// reading tokens to buffers
const char *tokens_buf;
size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);
if (token_buf_size < 1) {
fprintf(stderr, "Please check your tokens.txt!\n");
free((void *)tokens_buf);
return -1;
}
// Zipformer2Ctc config
SherpaMnnOnlineZipformer2CtcModelConfig zipformer2_ctc_config;
memset(&zipformer2_ctc_config, 0, sizeof(zipformer2_ctc_config));
zipformer2_ctc_config.model = model_filename;
// Online model config
SherpaMnnOnlineModelConfig online_model_config;
memset(&online_model_config, 0, sizeof(online_model_config));
online_model_config.debug = 1;
online_model_config.num_threads = 1;
online_model_config.provider = provider;
online_model_config.tokens_buf = tokens_buf;
online_model_config.tokens_buf_size = token_buf_size;
online_model_config.zipformer2_ctc = zipformer2_ctc_config;
// Recognizer config
SherpaMnnOnlineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = online_model_config;
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&recognizer_config);
free((void *)tokens_buf);
tokens_buf = NULL;
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}
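The accept/decode/endpoint loop above reappears almost verbatim in each streaming example that follows; note that N = 3200 samples is 0.2 s of audio at 16 kHz. A minimal sketch of the same loop factored into a helper, using only the C API calls shown above (DecodeWave is a name introduced here, not part of the C API; tail padding is omitted for brevity):

static void DecodeWave(const SherpaMnnOnlineRecognizer *recognizer,
                       const SherpaMnnWave *wave, int32_t chunk_size) {
  const SherpaMnnOnlineStream *stream =
      SherpaMnnCreateOnlineStream(recognizer);
  const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
  int32_t segment_id = 0;
  for (int32_t k = 0; k < wave->num_samples; k += chunk_size) {
    int32_t end = (k + chunk_size > wave->num_samples) ? wave->num_samples
                                                       : (k + chunk_size);
    SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
                                        wave->samples + k, end - k);
    while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
      SherpaMnnDecodeOnlineStream(recognizer, stream);
    }
    const SherpaMnnOnlineRecognizerResult *r =
        SherpaMnnGetOnlineStreamResult(recognizer, stream);
    if (strlen(r->text)) {
      SherpaMnnPrint(display, segment_id, r->text);
    }
    // On an endpoint, start a new segment and reset the stream.
    if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
      if (strlen(r->text)) {
        ++segment_id;
      }
      SherpaMnnOnlineStreamReset(recognizer, stream);
    }
    SherpaMnnDestroyOnlineRecognizerResult(r);
  }
  // Signal end of input and drain the remaining frames.
  SherpaMnnOnlineStreamInputFinished(stream);
  while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
    SherpaMnnDecodeOnlineStream(recognizer, stream);
  }
  const SherpaMnnOnlineRecognizerResult *r =
      SherpaMnnGetOnlineStreamResult(recognizer, stream);
  if (strlen(r->text)) {
    SherpaMnnPrint(display, segment_id, r->text);
  }
  SherpaMnnDestroyOnlineRecognizerResult(r);
  SherpaMnnDestroyDisplay(display);
  SherpaMnnDestroyOnlineStream(stream);
}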

View File

@ -0,0 +1,130 @@
// c-api-examples/streaming-hlg-decode-file-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
/*
We use the following model as an example
// clang-format off
Download the model from
https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2
tar xvf sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2
rm sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2
build/bin/streaming-hlg-decode-file-c-api
(The above model is from https://github.com/k2-fsa/icefall/pull/1557)
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
// clang-format off
//
// Please download the model from
// https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18.tar.bz2
const char *model = "./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/ctc-epoch-30-avg-3-chunk-16-left-128.int8.onnx";
const char *tokens = "./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/tokens.txt";
const char *graph = "./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/HLG.fst";
const char *wav_filename = "./sherpa-onnx-streaming-zipformer-ctc-small-2024-03-18/test_wavs/8k.wav";
// clang-format on
SherpaMnnOnlineRecognizerConfig config;
memset(&config, 0, sizeof(config));
config.feat_config.sample_rate = 16000;
config.feat_config.feature_dim = 80;
config.model_config.zipformer2_ctc.model = model;
config.model_config.tokens = tokens;
config.model_config.num_threads = 1;
config.model_config.provider = "cpu";
config.model_config.debug = 0;
config.ctc_fst_decoder_config.graph = graph;
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&config);
if (!recognizer) {
fprintf(stderr, "Failed to create recognizer");
exit(-1);
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
exit(-1);
}
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,181 @@
// c-api-examples/streaming-paraformer-buffered-tokens-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// Copyright (c) 2024 Luo Xiao
//
// This file demonstrates how to use streaming Paraformer with sherpa-onnx's
// C API, with tokens loaded from buffered strings instead of from
// external files.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
// tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
// rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
// Returns the number of bytes read, or 0 on error. On error, *buffer_out is
// set to NULL, so the caller's free() on the error path is a safe no-op.
static size_t ReadFile(const char *filename, const char **buffer_out) {
*buffer_out = NULL;
FILE *file = fopen(filename, "rb");  // binary mode so ftell matches fread
if (file == NULL) {
fprintf(stderr, "Failed to open %s\n", filename);
return 0;
}
fseek(file, 0L, SEEK_END);
long size = ftell(file);
rewind(file);
*buffer_out = malloc(size);
if (*buffer_out == NULL) {
fclose(file);
fprintf(stderr, "Memory error\n");
return 0;
}
size_t read_bytes = fread((void *)*buffer_out, 1, size, file);
if (read_bytes != (size_t)size) {
fprintf(stderr, "Errors occurred in reading the file %s\n", filename);
free((void *)*buffer_out);
*buffer_out = NULL;
fclose(file);
return 0;
}
fclose(file);
return read_bytes;
}
int32_t main() {
const char *wav_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav";
const char *encoder_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx";
const char *decoder_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx";
const char *tokens_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// reading tokens to buffers
const char *tokens_buf;
size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);
if (token_buf_size < 1) {
fprintf(stderr, "Please check your tokens.txt!\n");
free((void *)tokens_buf);
return -1;
}
// Paraformer config
SherpaMnnOnlineParaformerModelConfig paraformer_config;
memset(&paraformer_config, 0, sizeof(paraformer_config));
paraformer_config.encoder = encoder_filename;
paraformer_config.decoder = decoder_filename;
// Online model config
SherpaMnnOnlineModelConfig online_model_config;
memset(&online_model_config, 0, sizeof(online_model_config));
online_model_config.debug = 1;
online_model_config.num_threads = 1;
online_model_config.provider = provider;
online_model_config.tokens_buf = tokens_buf;
online_model_config.tokens_buf_size = token_buf_size;
online_model_config.paraformer = paraformer_config;
// Recognizer config
SherpaMnnOnlineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = online_model_config;
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&recognizer_config);
free((void *)tokens_buf);
tokens_buf = NULL;
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,139 @@
// c-api-examples/streaming-paraformer-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use streaming Paraformer with sherpa-onnx's C
// API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
// tar xvf sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
// rm sherpa-onnx-streaming-paraformer-bilingual-zh-en.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/test_wavs/0.wav";
const char *encoder_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/encoder.int8.onnx";
const char *decoder_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/decoder.int8.onnx";
const char *tokens_filename =
"sherpa-onnx-streaming-paraformer-bilingual-zh-en/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Paraformer config
SherpaMnnOnlineParaformerModelConfig paraformer_config;
memset(&paraformer_config, 0, sizeof(paraformer_config));
paraformer_config.encoder = encoder_filename;
paraformer_config.decoder = decoder_filename;
// Online model config
SherpaMnnOnlineModelConfig online_model_config;
memset(&online_model_config, 0, sizeof(online_model_config));
online_model_config.debug = 1;
online_model_config.num_threads = 1;
online_model_config.provider = provider;
online_model_config.tokens = tokens_filename;
online_model_config.paraformer = paraformer_config;
// Recognizer config
SherpaMnnOnlineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = online_model_config;
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,203 @@
// c-api-examples/streaming-zipformer-buffered-tokens-hotwords-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// Copyright (c) 2024 Luo Xiao
//
// This file demonstrates how to use streaming Zipformer with sherpa-onnx's
// C API, with tokens and hotwords loaded from buffered strings instead of
// from external files.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
// tar xvf sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
// rm sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
// Returns the number of bytes read, or 0 on error. On error, *buffer_out is
// set to NULL, so the caller's free() on the error path is a safe no-op.
static size_t ReadFile(const char *filename, const char **buffer_out) {
*buffer_out = NULL;
FILE *file = fopen(filename, "rb");  // binary mode so ftell matches fread
if (file == NULL) {
fprintf(stderr, "Failed to open %s\n", filename);
return 0;
}
fseek(file, 0L, SEEK_END);
long size = ftell(file);
rewind(file);
*buffer_out = malloc(size);
if (*buffer_out == NULL) {
fclose(file);
fprintf(stderr, "Memory error\n");
return 0;
}
size_t read_bytes = fread((void *)*buffer_out, 1, size, file);
if (read_bytes != (size_t)size) {
fprintf(stderr, "Errors occurred in reading the file %s\n", filename);
free((void *)*buffer_out);
*buffer_out = NULL;
fclose(file);
return 0;
}
fclose(file);
return read_bytes;
}
int32_t main() {
const char *wav_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/test_wavs/0.wav";
const char *encoder_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"encoder-epoch-99-avg-1.onnx";
const char *decoder_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"decoder-epoch-99-avg-1.onnx";
const char *joiner_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"joiner-epoch-99-avg-1.onnx";
const char *provider = "cpu";
const char *modeling_unit = "bpe";
const char *tokens_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/tokens.txt";
const char *hotwords_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/hotwords.txt";
const char *bpe_vocab =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"bpe.vocab";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// reading tokens and hotwords to buffers
const char *tokens_buf;
size_t token_buf_size = ReadFile(tokens_filename, &tokens_buf);
if (token_buf_size < 1) {
fprintf(stderr, "Please check your tokens.txt!\n");
free((void *)tokens_buf);
return -1;
}
const char *hotwords_buf;
size_t hotwords_buf_size = ReadFile(hotwords_filename, &hotwords_buf);
if (hotwords_buf_size < 1) {
fprintf(stderr, "Please check your hotwords.txt!\n");
free((void *)hotwords_buf);
return -1;
}
// Zipformer config
SherpaMnnOnlineTransducerModelConfig zipformer_config;
memset(&zipformer_config, 0, sizeof(zipformer_config));
zipformer_config.encoder = encoder_filename;
zipformer_config.decoder = decoder_filename;
zipformer_config.joiner = joiner_filename;
// Online model config
SherpaMnnOnlineModelConfig online_model_config;
memset(&online_model_config, 0, sizeof(online_model_config));
online_model_config.debug = 1;
online_model_config.num_threads = 1;
online_model_config.provider = provider;
online_model_config.tokens_buf = tokens_buf;
online_model_config.tokens_buf_size = token_buf_size;
online_model_config.transducer = zipformer_config;
// Recognizer config
SherpaMnnOnlineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "modified_beam_search";
recognizer_config.model_config = online_model_config;
recognizer_config.hotwords_buf = hotwords_buf;
recognizer_config.hotwords_buf_size = hotwords_buf_size;
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&recognizer_config);
free((void *)tokens_buf);
tokens_buf = NULL;
free((void *)hotwords_buf);
hotwords_buf = NULL;
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,145 @@
// c-api-examples/streaming-zipformer-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use streaming Zipformer with sherpa-onnx's C
// API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
// tar xvf sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
// rm sherpa-onnx-streaming-zipformer-en-20M-2023-02-17.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/test_wavs/0.wav";
const char *encoder_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"encoder-epoch-99-avg-1.onnx";
const char *decoder_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"decoder-epoch-99-avg-1.onnx";
const char *joiner_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/"
"joiner-epoch-99-avg-1.onnx";
const char *tokens_filename =
"sherpa-onnx-streaming-zipformer-en-20M-2023-02-17/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Zipformer config
SherpaMnnOnlineTransducerModelConfig zipformer_config;
memset(&zipformer_config, 0, sizeof(zipformer_config));
zipformer_config.encoder = encoder_filename;
zipformer_config.decoder = decoder_filename;
zipformer_config.joiner = joiner_filename;
// Online model config
SherpaMnnOnlineModelConfig online_model_config;
memset(&online_model_config, 0, sizeof(online_model_config));
online_model_config.debug = 1;
online_model_config.num_threads = 1;
online_model_config.provider = provider;
online_model_config.tokens = tokens_filename;
online_model_config.transducer = zipformer_config;
// Recognizer config
SherpaMnnOnlineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = online_model_config;
const SherpaMnnOnlineRecognizer *recognizer =
SherpaMnnCreateOnlineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOnlineStream *stream =
SherpaMnnCreateOnlineStream(recognizer);
const SherpaMnnDisplay *display = SherpaMnnCreateDisplay(50);
int32_t segment_id = 0;
// simulate streaming. You can choose an arbitrary N
#define N 3200
fprintf(stderr, "sample rate: %d, num samples: %d, duration: %.2f s\n",
wave->sample_rate, wave->num_samples,
(float)wave->num_samples / wave->sample_rate);
int32_t k = 0;
while (k < wave->num_samples) {
int32_t start = k;
int32_t end =
(start + N > wave->num_samples) ? wave->num_samples : (start + N);
k += N;
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate,
wave->samples + start, end - start);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
if (SherpaMnnOnlineStreamIsEndpoint(recognizer, stream)) {
if (strlen(r->text)) {
++segment_id;
}
SherpaMnnOnlineStreamReset(recognizer, stream);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
}
// add some tail padding
float tail_paddings[4800] = {0}; // 0.3 seconds at 16 kHz sample rate
SherpaMnnOnlineStreamAcceptWaveform(stream, wave->sample_rate, tail_paddings,
4800);
SherpaMnnFreeWave(wave);
SherpaMnnOnlineStreamInputFinished(stream);
while (SherpaMnnIsOnlineStreamReady(recognizer, stream)) {
SherpaMnnDecodeOnlineStream(recognizer, stream);
}
const SherpaMnnOnlineRecognizerResult *r =
SherpaMnnGetOnlineStreamResult(recognizer, stream);
if (strlen(r->text)) {
SherpaMnnPrint(display, segment_id, r->text);
}
SherpaMnnDestroyOnlineRecognizerResult(r);
SherpaMnnDestroyDisplay(display);
SherpaMnnDestroyOnlineStream(stream);
SherpaMnnDestroyOnlineRecognizer(recognizer);
fprintf(stderr, "\n");
return 0;
}

View File

@ -0,0 +1,78 @@
// c-api-examples/telespeech-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use TeleSpeech-ASR CTC model with sherpa-onnx's
// C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2
// tar xvf sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2
// rm sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/test_wavs/3-sichuan.wav";
const char *model_filename =
"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/model.int8.onnx";
const char *tokens_filename =
"sherpa-onnx-telespeech-ctc-int8-zh-2024-06-04/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.telespeech_ctc = model_filename;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}
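The create-stream, accept-waveform, decode, fetch-result sequence above is the same for every non-streaming recognizer in these examples. A minimal sketch of it factored into a helper, using only the C API calls shown above (DecodeFile is a name introduced here, not part of the C API):

static void DecodeFile(const SherpaMnnOfflineRecognizer *recognizer,
                       const char *wav_filename) {
  const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
  if (wave == NULL) {
    fprintf(stderr, "Failed to read %s\n", wav_filename);
    return;
  }
  // One stream per utterance; the recognizer can be reused.
  const SherpaMnnOfflineStream *stream =
      SherpaMnnCreateOfflineStream(recognizer);
  SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
                                 wave->num_samples);
  SherpaMnnDecodeOfflineStream(recognizer, stream);
  const SherpaMnnOfflineRecognizerResult *result =
      SherpaMnnGetOfflineStreamResult(stream);
  fprintf(stderr, "%s: %s\n", wav_filename, result->text);
  SherpaMnnDestroyOfflineRecognizerResult(result);
  SherpaMnnDestroyOfflineStream(stream);
  SherpaMnnFreeWave(wave);
}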

View File

@ -0,0 +1,146 @@
// c-api-examples/vad-moonshine-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use VAD + Moonshine with sherpa-onnx's C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
// tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
// rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename = "./Obama.wav";
const char *vad_filename = "./silero_vad.onnx";
const char *preprocessor =
"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx";
const char *encoder = "./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx";
const char *uncached_decoder =
"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx";
const char *cached_decoder =
"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx";
const char *tokens = "./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
if (wave->sample_rate != 16000) {
fprintf(stderr, "Expect the sample rate to be 16000. Given: %d\n",
wave->sample_rate);
SherpaMnnFreeWave(wave);
return -1;
}
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 0;
offline_model_config.num_threads = 1;
offline_model_config.provider = "cpu";
offline_model_config.tokens = tokens;
offline_model_config.moonshine.preprocessor = preprocessor;
offline_model_config.moonshine.encoder = encoder;
offline_model_config.moonshine.uncached_decoder = uncached_decoder;
offline_model_config.moonshine.cached_decoder = cached_decoder;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your recognizer config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
SherpaMnnVadModelConfig vadConfig;
memset(&vadConfig, 0, sizeof(vadConfig));
vadConfig.silero_vad.model = vad_filename;
vadConfig.silero_vad.threshold = 0.5;
vadConfig.silero_vad.min_silence_duration = 0.5;
vadConfig.silero_vad.min_speech_duration = 0.5;
vadConfig.silero_vad.max_speech_duration = 10;
vadConfig.silero_vad.window_size = 512;
vadConfig.sample_rate = 16000;
vadConfig.num_threads = 1;
vadConfig.debug = 1;
SherpaMnnVoiceActivityDetector *vad =
SherpaMnnCreateVoiceActivityDetector(&vadConfig, 30);
if (vad == NULL) {
fprintf(stderr, "Please check your recognizer config!\n");
SherpaMnnFreeWave(wave);
SherpaMnnDestroyOfflineRecognizer(recognizer);
return -1;
}
int32_t window_size = vadConfig.silero_vad.window_size;
int32_t i = 0;
int is_eof = 0;
while (!is_eof) {
if (i + window_size < wave->num_samples) {
SherpaMnnVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,
window_size);
} else {
SherpaMnnVoiceActivityDetectorFlush(vad);
is_eof = 1;
}
while (!SherpaMnnVoiceActivityDetectorEmpty(vad)) {
const SherpaMnnSpeechSegment *segment =
SherpaMnnVoiceActivityDetectorFront(vad);
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate,
segment->samples, segment->n);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
float start = segment->start / 16000.0f;
float duration = segment->n / 16000.0f;
float stop = start + duration;
fprintf(stderr, "%.3f -- %.3f: %s\n", start, stop, result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroySpeechSegment(segment);
SherpaMnnVoiceActivityDetectorPop(vad);
}
i += window_size;
}
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnDestroyVoiceActivityDetector(vad);
SherpaMnnFreeWave(wave);
return 0;
}
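The windowed VAD loop above is repeated by the VAD + SenseVoice and VAD + Whisper examples that follow. A minimal sketch of the same logic factored into a helper, using only the C API calls shown above (RunVadAsr is a name introduced here, not part of the C API); deriving the timestamps from wave->sample_rate avoids hard-coding 16000:

static void RunVadAsr(SherpaMnnVoiceActivityDetector *vad,
                      const SherpaMnnOfflineRecognizer *recognizer,
                      const SherpaMnnWave *wave, int32_t window_size) {
  int32_t i = 0;
  int is_eof = 0;
  while (!is_eof) {
    if (i + window_size < wave->num_samples) {
      SherpaMnnVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,
                                                   window_size);
    } else {
      SherpaMnnVoiceActivityDetectorFlush(vad);  // drain the tail
      is_eof = 1;
    }
    // Decode every speech segment the VAD has finalized so far.
    while (!SherpaMnnVoiceActivityDetectorEmpty(vad)) {
      const SherpaMnnSpeechSegment *segment =
          SherpaMnnVoiceActivityDetectorFront(vad);
      const SherpaMnnOfflineStream *stream =
          SherpaMnnCreateOfflineStream(recognizer);
      SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate,
                                     segment->samples, segment->n);
      SherpaMnnDecodeOfflineStream(recognizer, stream);
      const SherpaMnnOfflineRecognizerResult *result =
          SherpaMnnGetOfflineStreamResult(stream);
      float start = segment->start / (float)wave->sample_rate;
      float stop = start + segment->n / (float)wave->sample_rate;
      fprintf(stderr, "%.3f -- %.3f: %s\n", start, stop, result->text);
      SherpaMnnDestroyOfflineRecognizerResult(result);
      SherpaMnnDestroyOfflineStream(stream);
      SherpaMnnDestroySpeechSegment(segment);
      SherpaMnnVoiceActivityDetectorPop(vad);
    }
    i += window_size;
  }
}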

View File

@ -0,0 +1,148 @@
// c-api-examples/vad-sense-voice-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use VAD + SenseVoice with sherpa-onnx's C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/lei-jun-test.wav
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename = "./lei-jun-test.wav";
const char *vad_filename = "./silero_vad.onnx";
const char *model_filename =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx";
const char *tokens_filename =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt";
const char *language = "auto";
const char *provider = "cpu";
int32_t use_inverse_text_normalization = 1;
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
if (wave->sample_rate != 16000) {
fprintf(stderr, "Expect the sample rate to be 16000. Given: %d\n",
wave->sample_rate);
SherpaMnnFreeWave(wave);
return -1;
}
SherpaMnnOfflineSenseVoiceModelConfig sense_voice_config;
memset(&sense_voice_config, 0, sizeof(sense_voice_config));
sense_voice_config.model = model_filename;
sense_voice_config.language = language;
sense_voice_config.use_itn = use_inverse_text_normalization;
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 0;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.sense_voice = sense_voice_config;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your recognizer config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
SherpaMnnVadModelConfig vadConfig;
memset(&vadConfig, 0, sizeof(vadConfig));
vadConfig.silero_vad.model = vad_filename;
vadConfig.silero_vad.threshold = 0.5;
vadConfig.silero_vad.min_silence_duration = 0.5;
vadConfig.silero_vad.min_speech_duration = 0.5;
vadConfig.silero_vad.max_speech_duration = 5;
vadConfig.silero_vad.window_size = 512;
vadConfig.sample_rate = 16000;
vadConfig.num_threads = 1;
vadConfig.debug = 1;
SherpaMnnVoiceActivityDetector *vad =
SherpaMnnCreateVoiceActivityDetector(&vadConfig, 30);
if (vad == NULL) {
fprintf(stderr, "Please check your recognizer config!\n");
SherpaMnnFreeWave(wave);
SherpaMnnDestroyOfflineRecognizer(recognizer);
return -1;
}
int32_t window_size = vadConfig.silero_vad.window_size;
int32_t i = 0;
int is_eof = 0;
while (!is_eof) {
if (i + window_size < wave->num_samples) {
SherpaMnnVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,
window_size);
} else {
SherpaMnnVoiceActivityDetectorFlush(vad);
is_eof = 1;
}
while (!SherpaMnnVoiceActivityDetectorEmpty(vad)) {
const SherpaMnnSpeechSegment *segment =
SherpaMnnVoiceActivityDetectorFront(vad);
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate,
segment->samples, segment->n);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
float start = segment->start / 16000.0f;
float duration = segment->n / 16000.0f;
float stop = start + duration;
fprintf(stderr, "%.3f -- %.3f: %s\n", start, stop, result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroySpeechSegment(segment);
SherpaMnnVoiceActivityDetectorPop(vad);
}
i += window_size;
}
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnDestroyVoiceActivityDetector(vad);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1,145 @@
// c-api-examples/vad-whisper-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use VAD + Whisper tiny.en with
// sherpa-onnx's C API.
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/Obama.wav
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
// tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2
// rm sherpa-onnx-whisper-tiny.en.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename = "./Obama.wav";
const char *vad_filename = "./silero_vad.onnx";
const char *encoder = "sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx";
const char *decoder = "sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx";
const char *tokens = "sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
if (wave->sample_rate != 16000) {
fprintf(stderr, "Expect the sample rate to be 16000. Given: %d\n",
wave->sample_rate);
SherpaMnnFreeWave(wave);
return -1;
}
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 0;
offline_model_config.num_threads = 1;
offline_model_config.provider = "cpu";
offline_model_config.tokens = tokens;
offline_model_config.whisper.encoder = encoder;
offline_model_config.whisper.decoder = decoder;
offline_model_config.whisper.language = "en";
offline_model_config.whisper.tail_paddings = 0;
offline_model_config.whisper.task = "transcribe";
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your recognizer config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
SherpaMnnVadModelConfig vadConfig;
memset(&vadConfig, 0, sizeof(vadConfig));
vadConfig.silero_vad.model = vad_filename;
vadConfig.silero_vad.threshold = 0.5;
vadConfig.silero_vad.min_silence_duration = 0.5;
vadConfig.silero_vad.min_speech_duration = 0.5;
vadConfig.silero_vad.max_speech_duration = 10;
vadConfig.silero_vad.window_size = 512;
vadConfig.sample_rate = 16000;
vadConfig.num_threads = 1;
vadConfig.debug = 1;
SherpaMnnVoiceActivityDetector *vad =
SherpaMnnCreateVoiceActivityDetector(&vadConfig, 30);
if (vad == NULL) {
fprintf(stderr, "Please check your recognizer config!\n");
SherpaMnnFreeWave(wave);
SherpaMnnDestroyOfflineRecognizer(recognizer);
return -1;
}
int32_t window_size = vadConfig.silero_vad.window_size;
int32_t i = 0;
int is_eof = 0;
while (!is_eof) {
if (i + window_size < wave->num_samples) {
SherpaMnnVoiceActivityDetectorAcceptWaveform(vad, wave->samples + i,
window_size);
} else {
SherpaMnnVoiceActivityDetectorFlush(vad);
is_eof = 1;
}
while (!SherpaMnnVoiceActivityDetectorEmpty(vad)) {
const SherpaMnnSpeechSegment *segment =
SherpaMnnVoiceActivityDetectorFront(vad);
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate,
segment->samples, segment->n);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
float start = segment->start / 16000.0f;
float duration = segment->n / 16000.0f;
float stop = start + duration;
fprintf(stderr, "%.3f -- %.3f: %s\n", start, stop, result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroySpeechSegment(segment);
SherpaMnnVoiceActivityDetectorPop(vad);
}
i += window_size;
}
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnDestroyVoiceActivityDetector(vad);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1,89 @@
// c-api-examples/whisper-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
// We assume you have pre-downloaded the whisper multi-lingual models
// from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models
// An example command to download the "tiny" whisper model is given below:
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.tar.bz2
// tar xvf sherpa-onnx-whisper-tiny.tar.bz2
// rm sherpa-onnx-whisper-tiny.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename = "./sherpa-onnx-whisper-tiny/test_wavs/0.wav";
const char *encoder_filename = "sherpa-onnx-whisper-tiny/tiny-encoder.onnx";
const char *decoder_filename = "sherpa-onnx-whisper-tiny/tiny-decoder.onnx";
const char *tokens_filename = "sherpa-onnx-whisper-tiny/tiny-tokens.txt";
const char *language = "en";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Whisper config
SherpaMnnOfflineWhisperModelConfig whisper_config;
memset(&whisper_config, 0, sizeof(whisper_config));
whisper_config.decoder = decoder_filename;
whisper_config.encoder = encoder_filename;
whisper_config.language = language;
whisper_config.tail_paddings = 0;
whisper_config.task = "transcribe";
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.whisper = whisper_config;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1,89 @@
// c-api-examples/zipformer-c-api.c
//
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use non-streaming Zipformer with sherpa-onnx's
// C API.
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2
// tar xvf sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2
// rm sherpa-onnx-zipformer-small-en-2023-06-26.tar.bz2
//
// clang-format on
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "sherpa-mnn/c-api/c-api.h"
int32_t main() {
const char *wav_filename =
"sherpa-onnx-zipformer-small-en-2023-06-26/test_wavs/0.wav";
const char *encoder_filename =
"sherpa-onnx-zipformer-small-en-2023-06-26/encoder-epoch-99-avg-1.onnx";
const char *decoder_filename =
"sherpa-onnx-zipformer-small-en-2023-06-26/decoder-epoch-99-avg-1.onnx";
const char *joiner_filename =
"sherpa-onnx-zipformer-small-en-2023-06-26/joiner-epoch-99-avg-1.onnx";
const char *tokens_filename =
"sherpa-onnx-zipformer-small-en-2023-06-26/tokens.txt";
const char *provider = "cpu";
const SherpaMnnWave *wave = SherpaMnnReadWave(wav_filename);
if (wave == NULL) {
fprintf(stderr, "Failed to read %s\n", wav_filename);
return -1;
}
// Zipformer config
SherpaMnnOfflineTransducerModelConfig zipformer_config;
memset(&zipformer_config, 0, sizeof(zipformer_config));
zipformer_config.encoder = encoder_filename;
zipformer_config.decoder = decoder_filename;
zipformer_config.joiner = joiner_filename;
// Offline model config
SherpaMnnOfflineModelConfig offline_model_config;
memset(&offline_model_config, 0, sizeof(offline_model_config));
offline_model_config.debug = 1;
offline_model_config.num_threads = 1;
offline_model_config.provider = provider;
offline_model_config.tokens = tokens_filename;
offline_model_config.transducer = zipformer_config;
// Recognizer config
SherpaMnnOfflineRecognizerConfig recognizer_config;
memset(&recognizer_config, 0, sizeof(recognizer_config));
recognizer_config.decoding_method = "greedy_search";
recognizer_config.model_config = offline_model_config;
const SherpaMnnOfflineRecognizer *recognizer =
SherpaMnnCreateOfflineRecognizer(&recognizer_config);
if (recognizer == NULL) {
fprintf(stderr, "Please check your config!\n");
SherpaMnnFreeWave(wave);
return -1;
}
const SherpaMnnOfflineStream *stream =
SherpaMnnCreateOfflineStream(recognizer);
SherpaMnnAcceptWaveformOffline(stream, wave->sample_rate, wave->samples,
wave->num_samples);
SherpaMnnDecodeOfflineStream(recognizer, stream);
const SherpaMnnOfflineRecognizerResult *result =
SherpaMnnGetOfflineStreamResult(stream);
fprintf(stderr, "Decoded text: %s\n", result->text);
SherpaMnnDestroyOfflineRecognizerResult(result);
SherpaMnnDestroyOfflineStream(stream);
SherpaMnnDestroyOfflineRecognizer(recognizer);
SherpaMnnFreeWave(wave);
return 0;
}

View File

@ -0,0 +1 @@
!*.cmake

View File

@ -0,0 +1,45 @@
function(download_asio)
include(FetchContent)
set(asio_URL "https://github.com/chriskohlhoff/asio/archive/refs/tags/asio-1-24-0.tar.gz")
set(asio_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/asio-asio-1-24-0.tar.gz")
set(asio_HASH "SHA256=cbcaaba0f66722787b1a7c33afe1befb3a012b5af3ad7da7ff0f6b8c9b7a8a5b")
# If you don't have access to the Internet,
# please pre-download asio
set(possible_file_locations
$ENV{HOME}/Downloads/asio-asio-1-24-0.tar.gz
${CMAKE_SOURCE_DIR}/asio-asio-1-24-0.tar.gz
${CMAKE_BINARY_DIR}/asio-asio-1-24-0.tar.gz
/tmp/asio-asio-1-24-0.tar.gz
/star-fj/fangjun/download/github/asio-asio-1-24-0.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(asio_URL "${f}")
file(TO_CMAKE_PATH "${asio_URL}" asio_URL)
message(STATUS "Found local downloaded asio: ${asio_URL}")
set(asio_URL2)
break()
endif()
endforeach()
FetchContent_Declare(asio
URL
${asio_URL}
${asio_URL2}
URL_HASH ${asio_HASH}
)
FetchContent_GetProperties(asio)
if(NOT asio_POPULATED)
message(STATUS "Downloading asio ${asio_URL}")
FetchContent_Populate(asio)
endif()
message(STATUS "asio is downloaded to ${asio_SOURCE_DIR}")
# add_subdirectory(${asio_SOURCE_DIR} ${asio_BINARY_DIR} EXCLUDE_FROM_ALL)
include_directories(${asio_SOURCE_DIR}/asio/include)
endfunction()
download_asio()

View File

@ -0,0 +1,50 @@
function(download_cargs)
include(FetchContent)
set(cargs_URL "https://github.com/likle/cargs/archive/refs/tags/v1.0.3.tar.gz")
set(cargs_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/cargs-1.0.3.tar.gz")
set(cargs_HASH "SHA256=ddba25bd35e9c6c75bc706c126001b8ce8e084d40ef37050e6aa6963e836eb8b")
# If you don't have access to the Internet,
# please pre-download cargs
set(possible_file_locations
$ENV{HOME}/Downloads/cargs-1.0.3.tar.gz
${CMAKE_SOURCE_DIR}/cargs-1.0.3.tar.gz
${CMAKE_BINARY_DIR}/cargs-1.0.3.tar.gz
/tmp/cargs-1.0.3.tar.gz
/star-fj/fangjun/download/github/cargs-1.0.3.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(cargs_URL "${f}")
file(TO_CMAKE_PATH "${cargs_URL}" cargs_URL)
message(STATUS "Found local downloaded cargs: ${cargs_URL}")
set(cargs_URL2)
break()
endif()
endforeach()
FetchContent_Declare(cargs
URL
${cargs_URL}
${cargs_URL2}
URL_HASH
${cargs_HASH}
)
FetchContent_GetProperties(cargs)
if(NOT cargs_POPULATED)
message(STATUS "Downloading cargs ${cargs_URL}")
FetchContent_Populate(cargs)
endif()
message(STATUS "cargs is downloaded to ${cargs_SOURCE_DIR}")
add_subdirectory(${cargs_SOURCE_DIR} ${cargs_BINARY_DIR} EXCLUDE_FROM_ALL)
install(TARGETS cargs DESTINATION lib)
install(FILES ${cargs_SOURCE_DIR}/include/cargs.h
DESTINATION include
)
endfunction()
download_cargs()

View File

@ -0,0 +1,227 @@
# cmake/cmake_extension.py
# Copyright (c) 2023 Xiaomi Corporation
#
# flake8: noqa
import os
import platform
import shutil
import sys
from pathlib import Path
import setuptools
from setuptools.command.build_ext import build_ext
def is_for_pypi():
ans = os.environ.get("SHERPA_ONNX_IS_FOR_PYPI", None)
return ans is not None
def is_macos():
return platform.system() == "Darwin"
def is_windows():
return platform.system() == "Windows"
def is_linux():
return platform.system() == "Linux"
def is_arm64():
return platform.machine() in ["arm64", "aarch64"]
def is_x86():
return platform.machine() in ["i386", "i686", "x86_64"]
def enable_alsa():
build_alsa = os.environ.get("SHERPA_ONNX_ENABLE_ALSA", None)
return build_alsa and is_linux() and (is_arm64() or is_x86())
def get_binaries():
binaries = [
"sherpa-onnx",
"sherpa-onnx-keyword-spotter",
"sherpa-onnx-microphone",
"sherpa-onnx-microphone-offline",
"sherpa-onnx-microphone-offline-audio-tagging",
"sherpa-onnx-microphone-offline-speaker-identification",
"sherpa-onnx-offline",
"sherpa-onnx-offline-audio-tagging",
"sherpa-onnx-offline-language-identification",
"sherpa-onnx-offline-punctuation",
"sherpa-onnx-offline-speaker-diarization",
"sherpa-onnx-offline-tts",
"sherpa-onnx-offline-tts-play",
"sherpa-onnx-offline-websocket-server",
"sherpa-onnx-online-punctuation",
"sherpa-onnx-online-websocket-client",
"sherpa-onnx-online-websocket-server",
"sherpa-onnx-vad-microphone",
"sherpa-onnx-vad-microphone-offline-asr",
"sherpa-onnx-vad-with-offline-asr",
]
if enable_alsa():
binaries += [
"sherpa-onnx-alsa",
"sherpa-onnx-alsa-offline",
"sherpa-onnx-alsa-offline-speaker-identification",
"sherpa-onnx-offline-tts-play-alsa",
"sherpa-onnx-vad-alsa",
"sherpa-onnx-alsa-offline-audio-tagging",
]
if is_windows():
binaries += [
"onnxruntime.dll",
"sherpa-onnx-c-api.dll",
"sherpa-onnx-cxx-api.dll",
]
return binaries
try:
from wheel.bdist_wheel import bdist_wheel as _bdist_wheel
class bdist_wheel(_bdist_wheel):
def finalize_options(self):
_bdist_wheel.finalize_options(self)
# In this case, the generated wheel has a name in the form
# sherpa-xxx-pyxx-none-any.whl
if is_for_pypi() and not is_macos():
self.root_is_pure = True
else:
# The generated wheel has a name ending with
# -linux_x86_64.whl
self.root_is_pure = False
except ImportError:
bdist_wheel = None
def cmake_extension(name, *args, **kwargs) -> setuptools.Extension:
kwargs["language"] = "c++"
sources = []
return setuptools.Extension(name, sources, *args, **kwargs)
class BuildExtension(build_ext):
def build_extension(self, ext: setuptools.extension.Extension):
# build/temp.linux-x86_64-3.8
os.makedirs(self.build_temp, exist_ok=True)
# build/lib.linux-x86_64-3.8
os.makedirs(self.build_lib, exist_ok=True)
out_bin_dir = Path(self.build_lib).parent / "sherpa_onnx" / "bin"
install_dir = Path(self.build_lib).resolve() / "sherpa_onnx"
sherpa_onnx_dir = Path(__file__).parent.parent.resolve()
cmake_args = os.environ.get("SHERPA_ONNX_CMAKE_ARGS", "")
make_args = os.environ.get("SHERPA_ONNX_MAKE_ARGS", "")
system_make_args = os.environ.get("MAKEFLAGS", "")
if cmake_args == "":
cmake_args = "-DCMAKE_BUILD_TYPE=Release"
extra_cmake_args = f" -DCMAKE_INSTALL_PREFIX={install_dir} "
extra_cmake_args += " -DBUILD_SHARED_LIBS=ON "
extra_cmake_args += " -DBUILD_PIPER_PHONMIZE_EXE=OFF "
extra_cmake_args += " -DBUILD_PIPER_PHONMIZE_TESTS=OFF "
extra_cmake_args += " -DBUILD_ESPEAK_NG_EXE=OFF "
extra_cmake_args += " -DBUILD_ESPEAK_NG_TESTS=OFF "
extra_cmake_args += " -DSHERPA_ONNX_ENABLE_C_API=ON "
extra_cmake_args += " -DSHERPA_ONNX_BUILD_C_API_EXAMPLES=OFF "
extra_cmake_args += " -DSHERPA_ONNX_ENABLE_CHECK=OFF "
extra_cmake_args += " -DSHERPA_ONNX_ENABLE_PYTHON=ON "
extra_cmake_args += " -DSHERPA_ONNX_ENABLE_PORTAUDIO=ON "
extra_cmake_args += " -DSHERPA_ONNX_ENABLE_WEBSOCKET=ON "
if "PYTHON_EXECUTABLE" not in cmake_args:
print(f"Setting PYTHON_EXECUTABLE to {sys.executable}")
cmake_args += f" -DPYTHON_EXECUTABLE={sys.executable}"
cmake_args += extra_cmake_args
if is_windows():
build_cmd = f"""
cmake {cmake_args} -B {self.build_temp} -S {sherpa_onnx_dir}
cmake --build {self.build_temp} --target install --config Release -- -m:2
"""
print(f"build command is:\n{build_cmd}")
ret = os.system(
f"cmake {cmake_args} -B {self.build_temp} -S {sherpa_onnx_dir}"
)
if ret != 0:
raise Exception("Failed to configure sherpa")
ret = os.system(
f"cmake --build {self.build_temp} --target install --config Release -- -m:2" # noqa
)
if ret != 0:
raise Exception("Failed to build and install sherpa")
else:
if make_args == "" and system_make_args == "":
print("for fast compilation, run:")
print('export SHERPA_ONNX_MAKE_ARGS="-j"; python setup.py install')
print('Setting make_args to "-j4"')
make_args = "-j4"
if "-G Ninja" in cmake_args:
build_cmd = f"""
cd {self.build_temp}
cmake {cmake_args} {sherpa_onnx_dir}
ninja {make_args} install
"""
else:
build_cmd = f"""
cd {self.build_temp}
cmake {cmake_args} {sherpa_onnx_dir}
make {make_args} install/strip
"""
print(f"build command is:\n{build_cmd}")
ret = os.system(build_cmd)
if ret != 0:
raise Exception(
"\nBuild sherpa-onnx failed. Please check the error message.\n"
"You can ask for help by creating an issue on GitHub.\n"
"\nClick:\n\thttps://github.com/k2-fsa/sherpa-onnx/issues/new\n" # noqa
)
suffix = ".exe" if is_windows() else ""
# Remember to also change setup.py
binaries = get_binaries()
for f in binaries:
suffix = "" if ".dll" in f else suffix
src_file = install_dir / "bin" / (f + suffix)
if not src_file.is_file():
src_file = install_dir / "lib" / (f + suffix)
if not src_file.is_file():
src_file = install_dir / ".." / (f + suffix)
print(f"Copying {src_file} to {out_bin_dir}/")
shutil.copy(f"{src_file}", f"{out_bin_dir}/")
shutil.rmtree(f"{install_dir}/bin")
shutil.rmtree(f"{install_dir}/share")
shutil.rmtree(f"{install_dir}/lib/pkgconfig")
if is_macos():
os.remove(f"{install_dir}/lib/libonnxruntime.dylib")
if is_windows():
shutil.rmtree(f"{install_dir}/lib")


@ -0,0 +1,45 @@
function(download_cppjieba)
include(FetchContent)
set(cppjieba_URL "https://github.com/csukuangfj/cppjieba/archive/refs/tags/sherpa-onnx-2024-04-19.tar.gz")
set(cppjieba_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/cppjieba-sherpa-onnx-2024-04-19.tar.gz")
set(cppjieba_HASH "SHA256=03e5264687f0efaef05487a07d49c3f4c0f743347bfbf825df4b30cc75ac5288")
# If you don't have access to the Internet,
# please pre-download cppjieba
set(possible_file_locations
$ENV{HOME}/Downloads/cppjieba-sherpa-onnx-2024-04-19.tar.gz
${CMAKE_SOURCE_DIR}/cppjieba-sherpa-onnx-2024-04-19.tar.gz
${CMAKE_BINARY_DIR}/cppjieba-sherpa-onnx-2024-04-19.tar.gz
/tmp/cppjieba-sherpa-onnx-2024-04-19.tar.gz
/star-fj/fangjun/download/github/cppjieba-sherpa-onnx-2024-04-19.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(cppjieba_URL "${f}")
file(TO_CMAKE_PATH "${cppjieba_URL}" cppjieba_URL)
message(STATUS "Found local downloaded cppjieba: ${cppjieba_URL}")
set(cppjieba_URL2)
break()
endif()
endforeach()
FetchContent_Declare(cppjieba
URL
${cppjieba_URL}
${cppjieba_URL2}
URL_HASH
${cppjieba_HASH}
)
FetchContent_GetProperties(cppjieba)
if(NOT cppjieba_POPULATED)
message(STATUS "Downloading cppjieba ${cppjieba_URL}")
FetchContent_Populate(cppjieba)
endif()
message(STATUS "cppjieba is downloaded to ${cppjieba_SOURCE_DIR}")
add_subdirectory(${cppjieba_SOURCE_DIR} ${cppjieba_BINARY_DIR} EXCLUDE_FROM_ALL)
endfunction()
download_cppjieba()


@ -0,0 +1,48 @@
function(download_eigen)
include(FetchContent)
set(eigen_URL "https://gitlab.com/libeigen/eigen/-/archive/3.4.0/eigen-3.4.0.tar.gz")
set(eigen_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/eigen-3.4.0.tar.gz")
set(eigen_HASH "SHA256=8586084f71f9bde545ee7fa6d00288b264a2b7ac3607b974e54d13e7162c1c72")
# If you don't have access to the Internet,
# please pre-download eigen
set(possible_file_locations
$ENV{HOME}/Downloads/eigen-3.4.0.tar.gz
${CMAKE_SOURCE_DIR}/eigen-3.4.0.tar.gz
${CMAKE_BINARY_DIR}/eigen-3.4.0.tar.gz
/tmp/eigen-3.4.0.tar.gz
/star-fj/fangjun/download/github/eigen-3.4.0.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(eigen_URL "${f}")
file(TO_CMAKE_PATH "${eigen_URL}" eigen_URL)
message(STATUS "Found local downloaded eigen: ${eigen_URL}")
set(eigen_URL2)
break()
endif()
endforeach()
set(BUILD_TESTING OFF CACHE BOOL "" FORCE)
set(EIGEN_BUILD_DOC OFF CACHE BOOL "" FORCE)
FetchContent_Declare(eigen
URL ${eigen_URL}
URL_HASH ${eigen_HASH}
)
FetchContent_GetProperties(eigen)
if(NOT eigen_POPULATED)
message(STATUS "Downloading eigen from ${eigen_URL}")
FetchContent_Populate(eigen)
endif()
message(STATUS "eigen is downloaded to ${eigen_SOURCE_DIR}")
message(STATUS "eigen's binary dir is ${eigen_BINARY_DIR}")
add_subdirectory(${eigen_SOURCE_DIR} ${eigen_BINARY_DIR} EXCLUDE_FROM_ALL)
endfunction()
download_eigen()


@ -0,0 +1,134 @@
function(download_espeak_ng_for_piper)
include(FetchContent)
set(espeak_ng_URL "https://github.com/csukuangfj/espeak-ng/archive/f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip")
set(espeak_ng_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip")
set(espeak_ng_HASH "SHA256=70cbf4050e7a014aae19140b05e57249da4720f56128459fbe3a93beaf971ae6")
set(BUILD_ESPEAK_NG_TESTS OFF CACHE BOOL "" FORCE)
set(USE_ASYNC OFF CACHE BOOL "" FORCE)
set(USE_MBROLA OFF CACHE BOOL "" FORCE)
set(USE_LIBSONIC OFF CACHE BOOL "" FORCE)
set(USE_LIBPCAUDIO OFF CACHE BOOL "" FORCE)
set(USE_KLATT OFF CACHE BOOL "" FORCE)
set(USE_SPEECHPLAYER OFF CACHE BOOL "" FORCE)
set(EXTRA_cmn ON CACHE BOOL "" FORCE)
set(EXTRA_ru ON CACHE BOOL "" FORCE)
if (NOT SHERPA_ONNX_ENABLE_EPSEAK_NG_EXE)
set(BUILD_ESPEAK_NG_EXE OFF CACHE BOOL "" FORCE)
endif()
# If you don't have access to the Internet,
# please pre-download espeak-ng
set(possible_file_locations
$ENV{HOME}/Downloads/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip
${CMAKE_SOURCE_DIR}/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip
${CMAKE_BINARY_DIR}/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip
/tmp/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip
/star-fj/fangjun/download/github/espeak-ng-f6fed6c58b5e0998b8e68c6610125e2d07d595a7.zip
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(espeak_ng_URL "${f}")
file(TO_CMAKE_PATH "${espeak_ng_URL}" espeak_ng_URL)
message(STATUS "Found local downloaded espeak-ng: ${espeak_ng_URL}")
set(espeak_ng_URL2 )
break()
endif()
endforeach()
FetchContent_Declare(espeak_ng
URL
${espeak_ng_URL}
${espeak_ng_URL2}
URL_HASH ${espeak_ng_HASH}
)
FetchContent_GetProperties(espeak_ng)
if(NOT espeak_ng_POPULATED)
message(STATUS "Downloading espeak-ng from ${espeak_ng_URL}")
FetchContent_Populate(espeak_ng)
endif()
message(STATUS "espeak-ng is downloaded to ${espeak_ng_SOURCE_DIR}")
message(STATUS "espeak-ng binary dir is ${espeak_ng_BINARY_DIR}")
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${espeak_ng_SOURCE_DIR} ${espeak_ng_BINARY_DIR})
if(_build_shared_libs_bak)
set_target_properties(espeak-ng
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
set(espeak_ng_SOURCE_DIR ${espeak_ng_SOURCE_DIR} PARENT_SCOPE)
if(WIN32 AND MSVC)
target_compile_options(ucd PUBLIC
/wd4309
)
target_compile_options(espeak-ng PUBLIC
/wd4005
/wd4018
/wd4067
/wd4068
/wd4090
/wd4101
/wd4244
/wd4267
/wd4996
)
if(TARGET espeak-ng-bin)
target_compile_options(espeak-ng-bin PRIVATE
/wd4244
/wd4024
/wd4047
/wd4067
/wd4267
/wd4996
)
endif()
endif()
if(UNIX AND NOT APPLE)
target_compile_options(espeak-ng PRIVATE
-Wno-unused-result
-Wno-format-overflow
-Wno-format-truncation
-Wno-uninitialized
-Wno-format
)
if(TARGET espeak-ng-bin)
target_compile_options(espeak-ng-bin PRIVATE
-Wno-unused-result
)
endif()
endif()
target_include_directories(espeak-ng
INTERFACE
${espeak_ng_SOURCE_DIR}/src/include
${espeak_ng_SOURCE_DIR}/src/ucd-tools/src/include
)
if(NOT BUILD_SHARED_LIBS)
install(TARGETS
espeak-ng
ucd
DESTINATION lib)
endif()
endfunction()
download_espeak_ng_for_piper()


@ -0,0 +1,76 @@
function(download_googletest)
include(FetchContent)
set(googletest_URL "https://github.com/google/googletest/archive/refs/tags/v1.13.0.tar.gz")
set(googletest_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/googletest-1.13.0.tar.gz")
set(googletest_HASH "SHA256=ad7fdba11ea011c1d925b3289cf4af2c66a352e18d4c7264392fead75e919363")
# If you don't have access to the Internet,
# please pre-download googletest
set(possible_file_locations
$ENV{HOME}/Downloads/googletest-1.13.0.tar.gz
${CMAKE_SOURCE_DIR}/googletest-1.13.0.tar.gz
${CMAKE_BINARY_DIR}/googletest-1.13.0.tar.gz
/tmp/googletest-1.13.0.tar.gz
/star-fj/fangjun/download/github/googletest-1.13.0.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(googletest_URL "${f}")
file(TO_CMAKE_PATH "${googletest_URL}" googletest_URL)
message(STATUS "Found local downloaded googletest: ${googletest_URL}")
set(googletest_URL2)
break()
endif()
endforeach()
set(BUILD_GMOCK ON CACHE BOOL "" FORCE)
set(INSTALL_GTEST OFF CACHE BOOL "" FORCE)
set(gtest_disable_pthreads ON CACHE BOOL "" FORCE)
set(gtest_force_shared_crt ON CACHE BOOL "" FORCE)
FetchContent_Declare(googletest
URL
${googletest_URL}
${googletest_URL2}
URL_HASH ${googletest_HASH}
)
FetchContent_GetProperties(googletest)
if(NOT googletest_POPULATED)
message(STATUS "Downloading googletest from ${googletest_URL}")
FetchContent_Populate(googletest)
endif()
message(STATUS "googletest is downloaded to ${googletest_SOURCE_DIR}")
message(STATUS "googletest's binary dir is ${googletest_BINARY_DIR}")
if(APPLE)
set(CMAKE_MACOSX_RPATH ON) # to solve the following warning on macOS
endif()
#[==[
-- Generating done
Policy CMP0042 is not set: MACOSX_RPATH is enabled by default. Run "cmake
--help-policy CMP0042" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
MACOSX_RPATH is not specified for the following targets:
gmock
gmock_main
gtest
gtest_main
This warning is for project developers. Use -Wno-dev to suppress it.
]==]
add_subdirectory(${googletest_SOURCE_DIR} ${googletest_BINARY_DIR} EXCLUDE_FROM_ALL)
target_include_directories(gtest
INTERFACE
${googletest_SOURCE_DIR}/googletest/include
${googletest_SOURCE_DIR}/googlemock/include
)
endfunction()
download_googletest()


@ -0,0 +1,47 @@
function(download_hclust_cpp)
include(FetchContent)
# The latest commit as of 2024.09.29
set(hclust_cpp_URL "https://github.com/csukuangfj/hclust-cpp/archive/refs/tags/2024-09-29.tar.gz")
set(hclust_cpp_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/hclust-cpp-2024-09-29.tar.gz")
set(hclust_cpp_HASH "SHA256=abab51448a3cb54272aae07522970306e0b2cc6479d59d7b19e7aee4d6cedd33")
# If you don't have access to the Internet,
# please pre-download hclust-cpp
set(possible_file_locations
$ENV{HOME}/Downloads/hclust-cpp-2024-09-29.tar.gz
${CMAKE_SOURCE_DIR}/hclust-cpp-2024-09-29.tar.gz
${CMAKE_BINARY_DIR}/hclust-cpp-2024-09-29.tar.gz
/tmp/hclust-cpp-2024-09-29.tar.gz
/star-fj/fangjun/download/github/hclust-cpp-2024-09-29.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(hclust_cpp_URL "${f}")
file(TO_CMAKE_PATH "${hclust_cpp_URL}" hclust_cpp_URL)
message(STATUS "Found local downloaded hclust_cpp: ${hclust_cpp_URL}")
set(hclust_cpp_URL2)
break()
endif()
endforeach()
FetchContent_Declare(hclust_cpp
URL
${hclust_cpp_URL}
${hclust_cpp_URL2}
URL_HASH ${hclust_cpp_HASH}
)
FetchContent_GetProperties(hclust_cpp)
if(NOT hclust_cpp_POPULATED)
message(STATUS "Downloading hclust_cpp from ${hclust_cpp_URL}")
FetchContent_Populate(hclust_cpp)
endif()
message(STATUS "hclust_cpp is downloaded to ${hclust_cpp_SOURCE_DIR}")
message(STATUS "hclust_cpp's binary dir is ${hclust_cpp_BINARY_DIR}")
include_directories(${hclust_cpp_SOURCE_DIR})
endfunction()
download_hclust_cpp()


@ -0,0 +1,89 @@
function(download_kaldi_decoder)
include(FetchContent)
set(kaldi_decoder_URL "https://github.com/k2-fsa/kaldi-decoder/archive/refs/tags/v0.2.6.tar.gz")
set(kaldi_decoder_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/kaldi-decoder-0.2.6.tar.gz")
set(kaldi_decoder_HASH "SHA256=b13c78b37495cafc6ef3f8a7b661b349c55a51abbd7f7f42f389408dcf86a463")
set(KALDI_DECODER_BUILD_PYTHON OFF CACHE BOOL "" FORCE)
set(KALDI_DECODER_ENABLE_TESTS OFF CACHE BOOL "" FORCE)
set(KALDIFST_BUILD_PYTHON OFF CACHE BOOL "" FORCE)
# If you don't have access to the Internet,
# please pre-download kaldi-decoder
set(possible_file_locations
$ENV{HOME}/Downloads/kaldi-decoder-0.2.6.tar.gz
${CMAKE_SOURCE_DIR}/kaldi-decoder-0.2.6.tar.gz
${CMAKE_BINARY_DIR}/kaldi-decoder-0.2.6.tar.gz
/tmp/kaldi-decoder-0.2.6.tar.gz
/star-fj/fangjun/download/github/kaldi-decoder-0.2.6.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(kaldi_decoder_URL "${f}")
file(TO_CMAKE_PATH "${kaldi_decoder_URL}" kaldi_decoder_URL)
message(STATUS "Found local downloaded kaldi-decoder: ${kaldi_decoder_URL}")
set(kaldi_decoder_URL2 )
break()
endif()
endforeach()
FetchContent_Declare(kaldi_decoder
URL
${kaldi_decoder_URL}
${kaldi_decoder_URL2}
URL_HASH ${kaldi_decoder_HASH}
)
FetchContent_GetProperties(kaldi_decoder)
if(NOT kaldi_decoder_POPULATED)
message(STATUS "Downloading kaldi-decoder from ${kaldi_decoder_URL}")
FetchContent_Populate(kaldi_decoder)
endif()
message(STATUS "kaldi-decoder is downloaded to ${kaldi_decoder_SOURCE_DIR}")
message(STATUS "kaldi-decoder's binary dir is ${kaldi_decoder_BINARY_DIR}")
include_directories(${kaldi_decoder_SOURCE_DIR})
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${kaldi_decoder_SOURCE_DIR} ${kaldi_decoder_BINARY_DIR} EXCLUDE_FROM_ALL)
if(_build_shared_libs_bak)
set_target_properties(
kaldi-decoder-core
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
if(WIN32 AND MSVC)
target_compile_options(kaldi-decoder-core PUBLIC
/wd4018
/wd4291
)
endif()
target_include_directories(kaldi-decoder-core
INTERFACE
${kaldi_decoder_SOURCE_DIR}/
)
if(NOT BUILD_SHARED_LIBS)
install(TARGETS
kaldi-decoder-core
kaldifst_core
fst
fstfar
DESTINATION lib)
endif()
endfunction()
download_kaldi_decoder()


@ -0,0 +1,74 @@
function(download_kaldi_native_fbank)
include(FetchContent)
set(kaldi_native_fbank_URL "https://github.com/csukuangfj/kaldi-native-fbank/archive/refs/tags/v1.21.1.tar.gz")
set(kaldi_native_fbank_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/kaldi-native-fbank-1.21.1.tar.gz")
set(kaldi_native_fbank_HASH "SHA256=37c1aa230b00fe062791d800d8fc50aa3de215918d3dce6440699e67275d859e")
set(KALDI_NATIVE_FBANK_BUILD_TESTS OFF CACHE BOOL "" FORCE)
set(KALDI_NATIVE_FBANK_BUILD_PYTHON OFF CACHE BOOL "" FORCE)
set(KALDI_NATIVE_FBANK_ENABLE_CHECK OFF CACHE BOOL "" FORCE)
# If you don't have access to the Internet,
# please pre-download kaldi-native-fbank
set(possible_file_locations
$ENV{HOME}/Downloads/kaldi-native-fbank-1.21.1.tar.gz
${CMAKE_SOURCE_DIR}/kaldi-native-fbank-1.21.1.tar.gz
${CMAKE_BINARY_DIR}/kaldi-native-fbank-1.21.1.tar.gz
/tmp/kaldi-native-fbank-1.21.1.tar.gz
/star-fj/fangjun/download/github/kaldi-native-fbank-1.21.1.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(kaldi_native_fbank_URL "${f}")
file(TO_CMAKE_PATH "${kaldi_native_fbank_URL}" kaldi_native_fbank_URL)
message(STATUS "Found local downloaded kaldi-native-fbank: ${kaldi_native_fbank_URL}")
set(kaldi_native_fbank_URL2 )
break()
endif()
endforeach()
FetchContent_Declare(kaldi_native_fbank
URL
${kaldi_native_fbank_URL}
${kaldi_native_fbank_URL2}
URL_HASH ${kaldi_native_fbank_HASH}
)
FetchContent_GetProperties(kaldi_native_fbank)
if(NOT kaldi_native_fbank_POPULATED)
message(STATUS "Downloading kaldi-native-fbank from ${kaldi_native_fbank_URL}")
FetchContent_Populate(kaldi_native_fbank)
endif()
message(STATUS "kaldi-native-fbank is downloaded to ${kaldi_native_fbank_SOURCE_DIR}")
message(STATUS "kaldi-native-fbank's binary dir is ${kaldi_native_fbank_BINARY_DIR}")
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${kaldi_native_fbank_SOURCE_DIR} ${kaldi_native_fbank_BINARY_DIR} EXCLUDE_FROM_ALL)
if(_build_shared_libs_bak)
set_target_properties(kaldi-native-fbank-core
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
target_include_directories(kaldi-native-fbank-core
INTERFACE
${kaldi_native_fbank_SOURCE_DIR}/
)
if(NOT BUILD_SHARED_LIBS)
install(TARGETS kaldi-native-fbank-core DESTINATION lib)
endif()
endfunction()
download_kaldi_native_fbank()


@ -0,0 +1,72 @@
function(download_kaldifst)
include(FetchContent)
set(kaldifst_URL "https://github.com/k2-fsa/kaldifst/archive/refs/tags/v1.7.11.tar.gz")
set(kaldifst_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/kaldifst-1.7.11.tar.gz")
set(kaldifst_HASH "SHA256=b43b3332faa2961edc730e47995a58cd4e22ead21905d55b0c4a41375b4a525f")
# If you don't have access to the Internet,
# please pre-download kaldifst
set(possible_file_locations
$ENV{HOME}/Downloads/kaldifst-1.7.11.tar.gz
${CMAKE_SOURCE_DIR}/kaldifst-1.7.11.tar.gz
${CMAKE_BINARY_DIR}/kaldifst-1.7.11.tar.gz
/tmp/kaldifst-1.7.11.tar.gz
/star-fj/fangjun/download/github/kaldifst-1.7.11.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(kaldifst_URL "${f}")
file(TO_CMAKE_PATH "${kaldifst_URL}" kaldifst_URL)
message(STATUS "Found local downloaded kaldifst: ${kaldifst_URL}")
set(kaldifst_URL2)
break()
endif()
endforeach()
set(KALDIFST_BUILD_TESTS OFF CACHE BOOL "" FORCE)
set(KALDIFST_BUILD_PYTHON OFF CACHE BOOL "" FORCE)
FetchContent_Declare(kaldifst
URL ${kaldifst_URL}
URL_HASH ${kaldifst_HASH}
)
FetchContent_GetProperties(kaldifst)
if(NOT kaldifst_POPULATED)
message(STATUS "Downloading kaldifst from ${kaldifst_URL}")
FetchContent_Populate(kaldifst)
endif()
message(STATUS "kaldifst is downloaded to ${kaldifst_SOURCE_DIR}")
message(STATUS "kaldifst's binary dir is ${kaldifst_BINARY_DIR}")
list(APPEND CMAKE_MODULE_PATH ${kaldifst_SOURCE_DIR}/cmake)
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${kaldifst_SOURCE_DIR} ${kaldifst_BINARY_DIR} EXCLUDE_FROM_ALL)
if(_build_shared_libs_bak)
set_target_properties(kaldifst_core
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
target_include_directories(kaldifst_core
PUBLIC
${kaldifst_SOURCE_DIR}/
)
set_target_properties(kaldifst_core PROPERTIES OUTPUT_NAME "sherpa-mnn-kaldifst-core")
# installed in ./kaldi-decoder.cmake
endfunction()
download_kaldifst()


@ -0,0 +1,109 @@
# Copyright (c) 2020 Xiaomi Corporation (author: Fangjun Kuang)
function(download_openfst)
include(FetchContent)
set(openfst_URL "https://github.com/csukuangfj/openfst/archive/refs/tags/sherpa-onnx-2024-06-19.tar.gz")
set(openfst_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/openfst-sherpa-onnx-2024-06-19.tar.gz")
set(openfst_HASH "SHA256=5c98e82cc509c5618502dde4860b8ea04d843850ed57e6d6b590b644b268853d")
# If you don't have access to the Internet,
# please pre-download it
set(possible_file_locations
$ENV{HOME}/Downloads/openfst-sherpa-onnx-2024-06-19.tar.gz
${CMAKE_SOURCE_DIR}/openfst-sherpa-onnx-2024-06-19.tar.gz
${CMAKE_BINARY_DIR}/openfst-sherpa-onnx-2024-06-19.tar.gz
/tmp/openfst-sherpa-onnx-2024-06-19.tar.gz
/star-fj/fangjun/download/github/openfst-sherpa-onnx-2024-06-19.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(openfst_URL "${f}")
file(TO_CMAKE_PATH "${openfst_URL}" openfst_URL)
set(openfst_URL2)
break()
endif()
endforeach()
set(HAVE_BIN OFF CACHE BOOL "" FORCE)
set(HAVE_SCRIPT OFF CACHE BOOL "" FORCE)
set(HAVE_COMPACT OFF CACHE BOOL "" FORCE)
set(HAVE_COMPRESS OFF CACHE BOOL "" FORCE)
set(HAVE_CONST OFF CACHE BOOL "" FORCE)
set(HAVE_FAR ON CACHE BOOL "" FORCE)
set(HAVE_GRM OFF CACHE BOOL "" FORCE)
set(HAVE_PDT OFF CACHE BOOL "" FORCE)
set(HAVE_MPDT OFF CACHE BOOL "" FORCE)
set(HAVE_LINEAR OFF CACHE BOOL "" FORCE)
set(HAVE_LOOKAHEAD OFF CACHE BOOL "" FORCE)
set(HAVE_NGRAM OFF CACHE BOOL "" FORCE)
set(HAVE_PYTHON OFF CACHE BOOL "" FORCE)
set(HAVE_SPECIAL OFF CACHE BOOL "" FORCE)
if(NOT WIN32)
FetchContent_Declare(openfst
URL
${openfst_URL}
${openfst_URL2}
URL_HASH ${openfst_HASH}
PATCH_COMMAND
sed -i.bak s/enable_testing\(\)//g "src/CMakeLists.txt" &&
sed -i.bak s/add_subdirectory\(test\)//g "src/CMakeLists.txt" &&
sed -i.bak /message/d "src/script/CMakeLists.txt"
# sed -i.bak s/add_subdirectory\(script\)//g "src/CMakeLists.txt" &&
# sed -i.bak s/add_subdirectory\(extensions\)//g "src/CMakeLists.txt"
)
else()
FetchContent_Declare(openfst
URL ${openfst_URL}
URL_HASH ${openfst_HASH}
)
endif()
FetchContent_GetProperties(openfst)
if(NOT openfst_POPULATED)
message(STATUS "Downloading openfst from ${openfst_URL}")
FetchContent_Populate(openfst)
endif()
message(STATUS "openfst is downloaded to ${openfst_SOURCE_DIR}")
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${openfst_SOURCE_DIR} ${openfst_BINARY_DIR} EXCLUDE_FROM_ALL)
if(_build_shared_libs_bak)
set_target_properties(fst fstfar
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
set(openfst_SOURCE_DIR ${openfst_SOURCE_DIR} PARENT_SCOPE)
set_target_properties(fst PROPERTIES OUTPUT_NAME "sherpa-mnn-fst")
set_target_properties(fstfar PROPERTIES OUTPUT_NAME "sherpa-mnn-fstfar")
if(LINUX)
target_compile_options(fst PUBLIC -Wno-missing-template-keyword)
endif()
target_include_directories(fst
PUBLIC
${openfst_SOURCE_DIR}/src/include
)
target_include_directories(fstfar
PUBLIC
${openfst_SOURCE_DIR}/src/include
)
# installed in ./kaldi-decoder.cmake
endfunction()
download_openfst()


@ -0,0 +1,78 @@
function(download_piper_phonemize)
include(FetchContent)
set(piper_phonemize_URL "https://github.com/csukuangfj/piper-phonemize/archive/78a788e0b719013401572d70fef372e77bff8e43.zip")
set(piper_phonemize_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip")
set(piper_phonemize_HASH "SHA256=89641a46489a4898754643ce57bda9c9b54b4ca46485fdc02bf0dc84b866645d")
# If you don't have access to the Internet,
# please pre-download piper-phonemize
set(possible_file_locations
$ENV{HOME}/Downloads/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip
${CMAKE_SOURCE_DIR}/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip
${CMAKE_BINARY_DIR}/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip
/tmp/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip
/star-fj/fangjun/download/github/piper-phonemize-78a788e0b719013401572d70fef372e77bff8e43.zip
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(piper_phonemize_URL "${f}")
file(TO_CMAKE_PATH "${piper_phonemize_URL}" piper_phonemize_URL)
message(STATUS "Found local downloaded espeak-ng: ${piper_phonemize_URL}")
set(piper_phonemize_URL2 )
break()
endif()
endforeach()
FetchContent_Declare(piper_phonemize
URL
${piper_phonemize_URL}
${piper_phonemize_URL2}
URL_HASH ${piper_phonemize_HASH}
)
FetchContent_GetProperties(piper_phonemize)
if(NOT piper_phonemize_POPULATED)
message(STATUS "Downloading piper-phonemize from ${piper_phonemize_URL}")
FetchContent_Populate(piper_phonemize)
endif()
message(STATUS "piper-phonemize is downloaded to ${piper_phonemize_SOURCE_DIR}")
message(STATUS "piper-phonemize binary dir is ${piper_phonemize_BINARY_DIR}")
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${piper_phonemize_SOURCE_DIR} ${piper_phonemize_BINARY_DIR} EXCLUDE_FROM_ALL)
if(_build_shared_libs_bak)
set_target_properties(piper_phonemize
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
if(WIN32 AND MSVC)
target_compile_options(piper_phonemize PUBLIC
/wd4309
)
endif()
target_include_directories(piper_phonemize
INTERFACE
${piper_phonemize_SOURCE_DIR}/src/include
)
if(NOT BUILD_SHARED_LIBS)
install(TARGETS
piper_phonemize
DESTINATION lib)
endif()
endfunction()
download_piper_phonemize()


@ -0,0 +1,71 @@
function(download_portaudio)
include(FetchContent)
set(portaudio_URL "http://files.portaudio.com/archives/pa_stable_v190700_20210406.tgz")
set(portaudio_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/pa_stable_v190700_20210406.tgz")
set(portaudio_HASH "SHA256=47efbf42c77c19a05d22e627d42873e991ec0c1357219c0d74ce6a2948cb2def")
# If you don't have access to the Internet, please download it to your
# local drive and modify the following line according to your needs.
set(possible_file_locations
$ENV{HOME}/Downloads/pa_stable_v190700_20210406.tgz
$ENV{HOME}/asr/pa_stable_v190700_20210406.tgz
${CMAKE_SOURCE_DIR}/pa_stable_v190700_20210406.tgz
${CMAKE_BINARY_DIR}/pa_stable_v190700_20210406.tgz
/tmp/pa_stable_v190700_20210406.tgz
/star-fj/fangjun/download/github/pa_stable_v190700_20210406.tgz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(portaudio_URL "${f}")
file(TO_CMAKE_PATH "${portaudio_URL}" portaudio_URL)
message(STATUS "Found local downloaded portaudio: ${portaudio_URL}")
set(portaudio_URL2)
break()
endif()
endforeach()
# Always use static build
set(PA_BUILD_SHARED OFF CACHE BOOL "" FORCE)
set(PA_BUILD_STATIC ON CACHE BOOL "" FORCE)
FetchContent_Declare(portaudio
URL
${portaudio_URL}
${portaudio_URL2}
URL_HASH ${portaudio_HASH}
)
FetchContent_GetProperties(portaudio)
if(NOT portaudio_POPULATED)
message(STATUS "Downloading portaudio from ${portaudio_URL}")
FetchContent_Populate(portaudio)
endif()
message(STATUS "portaudio is downloaded to ${portaudio_SOURCE_DIR}")
message(STATUS "portaudio's binary dir is ${portaudio_BINARY_DIR}")
if(APPLE)
set(CMAKE_MACOSX_RPATH ON) # to avoid a CMP0042 (MACOSX_RPATH) warning on macOS
endif()
add_subdirectory(${portaudio_SOURCE_DIR} ${portaudio_BINARY_DIR} EXCLUDE_FROM_ALL)
set_target_properties(portaudio_static PROPERTIES OUTPUT_NAME "sherpa-onnx-portaudio_static")
if(NOT WIN32)
target_compile_options(portaudio_static PRIVATE "-Wno-deprecated-declarations")
endif()
if(NOT BUILD_SHARED_LIBS AND SHERPA_ONNX_ENABLE_BINARY)
install(TARGETS
portaudio_static
DESTINATION lib)
endif()
endfunction()
download_portaudio()
# Note
# See http://portaudio.com/docs/v19-doxydocs/tutorial_start.html
# for how to use portaudio


@ -0,0 +1,44 @@
function(download_pybind11)
include(FetchContent)
set(pybind11_URL "https://github.com/pybind/pybind11/archive/refs/tags/v2.12.0.tar.gz")
set(pybind11_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/pybind11-2.12.0.tar.gz")
set(pybind11_HASH "SHA256=bf8f242abd1abcd375d516a7067490fb71abd79519a282d22b6e4d19282185a7")
# If you don't have access to the Internet,
# please pre-download pybind11
set(possible_file_locations
$ENV{HOME}/Downloads/pybind11-2.12.0.tar.gz
${CMAKE_SOURCE_DIR}/pybind11-2.12.0.tar.gz
${CMAKE_BINARY_DIR}/pybind11-2.12.0.tar.gz
/tmp/pybind11-2.12.0.tar.gz
/star-fj/fangjun/download/github/pybind11-2.12.0.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(pybind11_URL "${f}")
file(TO_CMAKE_PATH "${pybind11_URL}" pybind11_URL)
message(STATUS "Found local downloaded pybind11: ${pybind11_URL}")
set(pybind11_URL2)
break()
endif()
endforeach()
FetchContent_Declare(pybind11
URL
${pybind11_URL}
${pybind11_URL2}
URL_HASH ${pybind11_HASH}
)
FetchContent_GetProperties(pybind11)
if(NOT pybind11_POPULATED)
message(STATUS "Downloading pybind11 from ${pybind11_URL}")
FetchContent_Populate(pybind11)
endif()
message(STATUS "pybind11 is downloaded to ${pybind11_SOURCE_DIR}")
add_subdirectory(${pybind11_SOURCE_DIR} ${pybind11_BINARY_DIR} EXCLUDE_FROM_ALL)
endfunction()
download_pybind11()


@ -0,0 +1,25 @@
# Note: If you use Python, then the prefix might not be correct.
#
# You need to either manually modify this file so that the prefix below points
# to the location where this sherpa-onnx.pc file actually resides,
# or use
#
# pkg-config --define-variable=prefix=/path/to/the/dir/containing/this/file --cflags sherpa-onnx
prefix="@CMAKE_INSTALL_PREFIX@"
exec_prefix="${prefix}"
includedir="${prefix}/include"
libdir="${exec_prefix}/lib"
Name: sherpa-onnx
Description: pkg-config for sherpa-onnx
URL: https://github.com/k2-fsa/sherpa-onnx
Version: @SHERPA_ONNX_VERSION@
Cflags: -I"${includedir}"
# Note: -lcargs is required only for the following file
# https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c
# We add it here so that users don't need to specify -lcargs when compiling decode-file-c-api.c
Libs: -L"${libdir}" -lsherpa-onnx-c-api -lonnxruntime -Wl,-rpath,${libdir} @SHERPA_ONNX_PKG_WITH_CARGS@ @SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS@
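# Example (illustrative, assuming this file is on PKG_CONFIG_PATH):
#   cc decode-file-c-api.c $(pkg-config --cflags --libs sherpa-onnx) -o decode-file-c-api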


@ -0,0 +1,25 @@
# Note: If you use Python, then the prefix might not be correct.
#
# You need to either manually modify this file so that the prefix below points
# to the location where this sherpa-onnx.pc file actually resides,
# or use
#
# pkg-config --define-variable=prefix=/path/to/the/dir/containing/this/file --cflags sherpa-onnx
prefix="@CMAKE_INSTALL_PREFIX@"
exec_prefix="${prefix}"
includedir="${prefix}/include"
libdir="${exec_prefix}/lib"
Name: sherpa-onnx
Description: pkg-config for sherpa-onnx (without TTS support)
URL: https://github.com/k2-fsa/sherpa-onnx
Version: @SHERPA_ONNX_VERSION@
Cflags: -I"${includedir}"
# Note: -lcargs is required only for the following file
# https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c
# We add it here so that users don't need to specify -lcargs when compiling decode-file-c-api.c
Libs: -L"${libdir}" -lsherpa-onnx-c-api -lsherpa-onnx-core -lkaldi-decoder-core -lsherpa-onnx-kaldifst-core -lsherpa-onnx-fst -lkaldi-native-fbank-core -lonnxruntime -lssentencepiece_core -Wl,-rpath,${libdir} @SHERPA_ONNX_PKG_WITH_CARGS@ @SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS@


@ -0,0 +1,25 @@
# Note: If you use Python, then the prefix might not be correct.
#
# You need to either manually modify this file so that the prefix below points
# to the location where this sherpa-onnx.pc file actually resides,
# or use
#
# pkg-config --define-variable=prefix=/path/to/the/dir/containing/this/file --cflags sherpa-onnx
prefix="@CMAKE_INSTALL_PREFIX@"
exec_prefix="${prefix}"
includedir="${prefix}/include"
libdir="${exec_prefix}/lib"
Name: sherpa-onnx
Description: pkg-config for sherpa-onnx with TTS support
URL: https://github.com/k2-fsa/sherpa-onnx
Version: @SHERPA_ONNX_VERSION@
Cflags: -I"${includedir}"
# Note: -lcargs is required only for the following file
# https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/decode-file-c-api.c
# We add it here so that users don't need to specify -lcargs when compiling decode-file-c-api.c
Libs: -L"${libdir}" -lsherpa-onnx-c-api -lsherpa-onnx-core -lkaldi-decoder-core -lsherpa-onnx-kaldifst-core -lsherpa-onnx-fstfar -lsherpa-onnx-fst -lkaldi-native-fbank-core -lpiper_phonemize -lespeak-ng -lucd -lonnxruntime -lssentencepiece_core -Wl,-rpath,${libdir} @SHERPA_ONNX_PKG_WITH_CARGS@ @SHERPA_ONNX_PKG_CONFIG_EXTRA_LIBS@


@ -0,0 +1,73 @@
function(download_simple_sentencepiece)
include(FetchContent)
set(simple-sentencepiece_URL "https://github.com/pkufool/simple-sentencepiece/archive/refs/tags/v0.7.tar.gz")
set(simple-sentencepiece_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/simple-sentencepiece-0.7.tar.gz")
set(simple-sentencepiece_HASH "SHA256=1748a822060a35baa9f6609f84efc8eb54dc0e74b9ece3d82367b7119fdc75af")
# If you don't have access to the Internet,
# please pre-download simple-sentencepiece
set(possible_file_locations
$ENV{HOME}/Downloads/simple-sentencepiece-0.7.tar.gz
${CMAKE_SOURCE_DIR}/simple-sentencepiece-0.7.tar.gz
${CMAKE_BINARY_DIR}/simple-sentencepiece-0.7.tar.gz
/tmp/simple-sentencepiece-0.7.tar.gz
/star-fj/fangjun/download/github/simple-sentencepiece-0.7.tar.gz
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(simple-sentencepiece_URL "${f}")
file(TO_CMAKE_PATH "${simple-sentencepiece_URL}" simple-sentencepiece_URL)
message(STATUS "Found local downloaded simple-sentencepiece: ${simple-sentencepiece_URL}")
set(simple-sentencepiece_URL2)
break()
endif()
endforeach()
set(SBPE_ENABLE_TESTS OFF CACHE BOOL "" FORCE)
set(SBPE_BUILD_PYTHON OFF CACHE BOOL "" FORCE)
FetchContent_Declare(simple-sentencepiece
URL
${simple-sentencepiece_URL}
${simple-sentencepiece_URL2}
URL_HASH
${simple-sentencepiece_HASH}
)
FetchContent_GetProperties(simple-sentencepiece)
if(NOT simple-sentencepiece_POPULATED)
message(STATUS "Downloading simple-sentencepiece ${simple-sentencepiece_URL}")
FetchContent_Populate(simple-sentencepiece)
endif()
message(STATUS "simple-sentencepiece is downloaded to ${simple-sentencepiece_SOURCE_DIR}")
if(BUILD_SHARED_LIBS)
set(_build_shared_libs_bak ${BUILD_SHARED_LIBS})
set(BUILD_SHARED_LIBS OFF)
endif()
add_subdirectory(${simple-sentencepiece_SOURCE_DIR} ${simple-sentencepiece_BINARY_DIR} EXCLUDE_FROM_ALL)
if(_build_shared_libs_bak)
set_target_properties(ssentencepiece_core
PROPERTIES
POSITION_INDEPENDENT_CODE ON
C_VISIBILITY_PRESET hidden
CXX_VISIBILITY_PRESET hidden
)
set(BUILD_SHARED_LIBS ON)
endif()
target_include_directories(ssentencepiece_core
PUBLIC
${simple-sentencepiece_SOURCE_DIR}/
)
if(NOT BUILD_SHARED_LIBS)
install(TARGETS ssentencepiece_core DESTINATION lib)
endif()
endfunction()
download_simple_sentencepiece()


@ -0,0 +1,46 @@
function(download_websocketpp)
include(FetchContent)
# The latest commit on the develop branch as of 2022-10-22
set(websocketpp_URL "https://github.com/zaphoyd/websocketpp/archive/b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip")
set(websocketpp_URL2 "https://hf-mirror.com/csukuangfj/sherpa-onnx-cmake-deps/resolve/main/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip")
set(websocketpp_HASH "SHA256=1385135ede8191a7fbef9ec8099e3c5a673d48df0c143958216cd1690567f583")
# If you don't have access to the Internet,
# please pre-download websocketpp
set(possible_file_locations
$ENV{HOME}/Downloads/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip
${CMAKE_SOURCE_DIR}/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip
${CMAKE_BINARY_DIR}/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip
/tmp/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip
/star-fj/fangjun/download/github/websocketpp-b9aeec6eaf3d5610503439b4fae3581d9aff08e8.zip
)
foreach(f IN LISTS possible_file_locations)
if(EXISTS ${f})
set(websocketpp_URL "${f}")
file(TO_CMAKE_PATH "${websocketpp_URL}" websocketpp_URL)
message(STATUS "Found local downloaded websocketpp: ${websocketpp_URL}")
set(websocketpp_URL2)
break()
endif()
endforeach()
FetchContent_Declare(websocketpp
URL
${websocketpp_URL}
${websocketpp_URL2}
URL_HASH ${websocketpp_HASH}
)
FetchContent_GetProperties(websocketpp)
if(NOT websocketpp_POPULATED)
message(STATUS "Downloading websocketpp from ${websocketpp_URL}")
FetchContent_Populate(websocketpp)
endif()
message(STATUS "websocketpp is downloaded to ${websocketpp_SOURCE_DIR}")
# add_subdirectory(${websocketpp_SOURCE_DIR} ${websocketpp_BINARY_DIR} EXCLUDE_FROM_ALL)
include_directories(${websocketpp_SOURCE_DIR})
endfunction()
download_websocketpp()


@ -0,0 +1,39 @@
include_directories(${CMAKE_SOURCE_DIR})
add_executable(streaming-zipformer-cxx-api ./streaming-zipformer-cxx-api.cc)
target_link_libraries(streaming-zipformer-cxx-api sherpa-mnn-cxx-api)
add_executable(speech-enhancement-gtcrn-cxx-api ./speech-enhancement-gtcrn-cxx-api.cc)
target_link_libraries(speech-enhancement-gtcrn-cxx-api sherpa-mnn-cxx-api)
add_executable(kws-cxx-api ./kws-cxx-api.cc)
target_link_libraries(kws-cxx-api sherpa-mnn-cxx-api)
add_executable(streaming-zipformer-rtf-cxx-api ./streaming-zipformer-rtf-cxx-api.cc)
target_link_libraries(streaming-zipformer-rtf-cxx-api sherpa-mnn-cxx-api)
add_executable(whisper-cxx-api ./whisper-cxx-api.cc)
target_link_libraries(whisper-cxx-api sherpa-mnn-cxx-api)
add_executable(fire-red-asr-cxx-api ./fire-red-asr-cxx-api.cc)
target_link_libraries(fire-red-asr-cxx-api sherpa-mnn-cxx-api)
add_executable(moonshine-cxx-api ./moonshine-cxx-api.cc)
target_link_libraries(moonshine-cxx-api sherpa-mnn-cxx-api)
add_executable(sense-voice-cxx-api ./sense-voice-cxx-api.cc)
target_link_libraries(sense-voice-cxx-api sherpa-mnn-cxx-api)
if(SHERPA_MNN_ENABLE_TTS)
add_executable(matcha-tts-zh-cxx-api ./matcha-tts-zh-cxx-api.cc)
target_link_libraries(matcha-tts-zh-cxx-api sherpa-mnn-cxx-api)
add_executable(matcha-tts-en-cxx-api ./matcha-tts-en-cxx-api.cc)
target_link_libraries(matcha-tts-en-cxx-api sherpa-mnn-cxx-api)
add_executable(kokoro-tts-en-cxx-api ./kokoro-tts-en-cxx-api.cc)
target_link_libraries(kokoro-tts-en-cxx-api sherpa-mnn-cxx-api)
add_executable(kokoro-tts-zh-en-cxx-api ./kokoro-tts-zh-en-cxx-api.cc)
target_link_libraries(kokoro-tts-zh-en-cxx-api sherpa-mnn-cxx-api)
endif()


@ -0,0 +1,77 @@
// cxx-api-examples/fire-red-asr-cxx-api.cc
// Copyright (c) 2025 Xiaomi Corporation
//
// This file demonstrates how to use FireRedAsr AED with sherpa-onnx's C++ API.
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
// tar xvf sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
// rm sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16.tar.bz2
//
// clang-format on
#include <chrono> // NOLINT
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineRecognizerConfig config;
config.model_config.fire_red_asr.encoder =
"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/encoder.int8.onnx";
config.model_config.fire_red_asr.decoder =
"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/decoder.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/tokens.txt";
config.model_config.num_threads = 1;
std::cout << "Loading model\n";
OfflineRecognizer recognizer = OfflineRecognizer::Create(config);
if (!recognizer.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout << "Loading model done\n";
std::string wave_filename =
"./sherpa-onnx-fire-red-asr-large-zh_en-2025-02-16/test_wavs/0.wav";
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
std::cout << "Start recognition\n";
const auto begin = std::chrono::steady_clock::now();
OfflineStream stream = recognizer.CreateStream();
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
wave.samples.size());
recognizer.Decode(&stream);
OfflineRecognizerResult result = recognizer.GetResult(&stream);
const auto end = std::chrono::steady_clock::now();
const float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
float rtf = elapsed_seconds / duration;
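// RTF (real-time factor) = processing time / audio duration. With
// illustrative numbers, decoding 6.625 s of audio in 1.100 s gives
// 1.100 / 6.625 ≈ 0.166; values below 1 mean faster than real time.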
std::cout << "text: " << result.text << "\n";
printf("Number of threads: %d\n", config.model_config.num_threads);
printf("Duration: %.3fs\n", duration);
printf("Elapsed seconds: %.3fs\n", elapsed_seconds);
printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n", elapsed_seconds,
duration, rtf);
return 0;
}


@ -0,0 +1,73 @@
// cxx-api-examples/kokoro-tts-en-cxx-api.cc
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx CXX API
// for English TTS with Kokoro.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-en-v0_19.tar.bz2
tar xf kokoro-en-v0_19.tar.bz2
rm kokoro-en-v0_19.tar.bz2
./kokoro-tts-en-cxx-api
*/
// clang-format on
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
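// An alternative sketch of a callback that stops generation early. It relies
// only on the callback signature shown above; the function name, the 24 kHz
// Kokoro output rate, and the single-call static counter are illustrative
// assumptions.
static int32_t StopAfterTenSeconds(const float *samples, int32_t num_samples,
                                   float progress, void *arg) {
  static int32_t total_samples = 0;  // accumulated across invocations
  total_samples += num_samples;
  // Assuming 24 kHz output, 240000 samples ≈ 10 seconds; per the convention
  // above, returning 0 stops generation and returning 1 continues.
  return total_samples < 240000 ? 1 : 0;
}
// To try it, pass it in place of ProgressCallback below, e.g.
//   GeneratedAudio audio = tts.Generate(text, sid, speed, StopAfterTenSeconds);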
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kokoro.model = "./kokoro-en-v0_19/model.onnx";
config.model.kokoro.voices = "./kokoro-en-v0_19/voices.bin";
config.model.kokoro.tokens = "./kokoro-en-v0_19/tokens.txt";
config.model.kokoro.data_dir = "./kokoro-en-v0_19/espeak-ng-data";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./generated-kokoro-en-cxx.wav";
std::string text =
"Today as always, men fall into two groups: slaves and free men. Whoever "
"does not have two-thirds of his day for himself, is a slave, whatever "
"he may be: a statesman, a businessman, an official, or a scholar. "
"Friends fell out often because life was changing so fast. The easiest "
"thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
int32_t sid = 0;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, sid, speed);
#else
GeneratedAudio audio = tts.Generate(text, sid, speed, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}


@ -0,0 +1,74 @@
// cxx-api-examples/kokoro-tts-zh-en-cxx-api.cc
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx CXX API
// for Chinese + English TTS with Kokoro.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/kokoro-multi-lang-v1_0.tar.bz2
tar xf kokoro-multi-lang-v1_0.tar.bz2
rm kokoro-multi-lang-v1_0.tar.bz2
./kokoro-tts-zh-en-cxx-api
*/
// clang-format on
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineTtsConfig config;
config.model.kokoro.model = "./kokoro-multi-lang-v1_0/model.onnx";
config.model.kokoro.voices = "./kokoro-multi-lang-v1_0/voices.bin";
config.model.kokoro.tokens = "./kokoro-multi-lang-v1_0/tokens.txt";
config.model.kokoro.data_dir = "./kokoro-multi-lang-v1_0/espeak-ng-data";
config.model.kokoro.dict_dir = "./kokoro-multi-lang-v1_0/dict";
config.model.kokoro.lexicon =
"./kokoro-multi-lang-v1_0/lexicon-us-en.txt,./kokoro-multi-lang-v1_0/"
"lexicon-zh.txt";
config.model.num_threads = 2;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./generated-kokoro-zh-en-cxx.wav";
std::string text =
"中英文语音合成测试。This is generated by next generation Kaldi using "
"Kokoro without Misaki. 你觉得中英文说的如何呢?";
auto tts = OfflineTts::Create(config);
int32_t sid = 50;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, sid, speed);
#else
GeneratedAudio audio = tts.Generate(text, sid, speed, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}


@ -0,0 +1,143 @@
// cxx-api-examples/kws-cxx-api.cc
//
// Copyright (c) 2025 Xiaomi Corporation
//
// This file demonstrates how to use the keyword spotter with sherpa-onnx's C++ API.
// clang-format off
//
// Usage
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/kws-models/sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
// tar xvf sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
// rm sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile.tar.bz2
//
// ./kws-cxx-api
//
// clang-format on
#include <array>
#include <iostream>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
KeywordSpotterConfig config;
config.model_config.transducer.encoder =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"encoder-epoch-12-avg-2-chunk-16-left-64.int8.onnx";
config.model_config.transducer.decoder =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"decoder-epoch-12-avg-2-chunk-16-left-64.onnx";
config.model_config.transducer.joiner =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"joiner-epoch-12-avg-2-chunk-16-left-64.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"tokens.txt";
config.model_config.provider = "cpu";
config.model_config.num_threads = 1;
config.model_config.debug = 1;
config.keywords_file =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"test_wavs/test_keywords.txt";
KeywordSpotter kws = KeywordSpotter::Create(config);
if (!kws.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout
<< "--Test pre-defined keywords from test_wavs/test_keywords.txt--\n";
std::string wave_filename =
"./sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01-mobile/"
"test_wavs/3.wav";
std::array<float, 8000> tail_paddings = {0}; // 0.5 seconds
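// (8000 zero samples at the model's 16 kHz input rate; the tail padding is
// fed after the real audio so the streaming model has enough right context
// to detect a keyword occurring near the end of the file.)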
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
OnlineStream stream = kws.CreateStream();
if (!stream.Get()) {
std::cerr << "Failed to create stream\n";
return -1;
}
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
wave.samples.size());
stream.AcceptWaveform(wave.sample_rate, tail_paddings.data(),
tail_paddings.size());
stream.InputFinished();
while (kws.IsReady(&stream)) {
kws.Decode(&stream);
auto r = kws.GetResult(&stream);
if (!r.keyword.empty()) {
std::cout << "Detected keyword: " << r.json << "\n";
// Remember to reset the keyword stream right after a keyword is detected
kws.Reset(&stream);
}
}
// --------------------------------------------------------------------------
std::cout << "--Use pre-defined keywords + add a new keyword--\n";
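// Keyword syntax, as used below: a space-separated token sequence (here,
// phones with tones), optionally followed by @<display text>; '/' separates
// multiple keywords (see the next test).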
stream = kws.CreateStream("y ǎn y uán @演员");
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
wave.samples.size());
stream.AcceptWaveform(wave.sample_rate, tail_paddings.data(),
tail_paddings.size());
stream.InputFinished();
while (kws.IsReady(&stream)) {
kws.Decode(&stream);
auto r = kws.GetResult(&stream);
if (!r.keyword.empty()) {
std::cout << "Detected keyword: " << r.json << "\n";
// Remember to reset the keyword stream right after a keyword is detected
kws.Reset(&stream);
}
}
// --------------------------------------------------------------------------
std::cout << "--Use pre-defined keywords + add two new keywords--\n";
stream = kws.CreateStream("y ǎn y uán @演员/zh ī m íng @知名");
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
wave.samples.size());
stream.AcceptWaveform(wave.sample_rate, tail_paddings.data(),
tail_paddings.size());
stream.InputFinished();
while (kws.IsReady(&stream)) {
kws.Decode(&stream);
auto r = kws.GetResult(&stream);
if (!r.keyword.empty()) {
std::cout << "Detected keyword: " << r.json << "\n";
// Remember to reset the keyword stream right after a keyword is detected
kws.Reset(&stream);
}
}
return 0;
}


@ -0,0 +1,80 @@
// cxx-api-examples/matcha-tts-en-cxx-api.cc
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx CXX API
// for English TTS with MatchaTTS.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-en_US-ljspeech.tar.bz2
tar xvf matcha-icefall-en_US-ljspeech.tar.bz2
rm matcha-icefall-en_US-ljspeech.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/hifigan_v2.onnx
./matcha-tts-en-cxx-api
*/
// clang-format on
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineTtsConfig config;
config.model.matcha.acoustic_model =
"./matcha-icefall-en_US-ljspeech/model-steps-3.onnx";
config.model.matcha.vocoder = "./hifigan_v2.onnx";
config.model.matcha.tokens = "./matcha-icefall-en_US-ljspeech/tokens.txt";
config.model.matcha.data_dir =
"./matcha-icefall-en_US-ljspeech/espeak-ng-data";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
std::string filename = "./generated-matcha-en-cxx.wav";
std::string text =
"Today as always, men fall into two groups: slaves and free men. Whoever "
"does not have two-thirds of his day for himself, is a slave, whatever "
"he may be: a statesman, a businessman, an official, or a scholar. "
"Friends fell out often because life was changing so fast. The easiest "
"thing in the world was to lose touch with someone.";
auto tts = OfflineTts::Create(config);
int32_t sid = 0;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, sid, speed);
#else
GeneratedAudio audio = tts.Generate(text, sid, speed, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}


@ -0,0 +1,79 @@
// cxx-api-examples/matcha-tts-zh-cxx-api.cc
//
// Copyright (c) 2025 Xiaomi Corporation
// This file shows how to use sherpa-onnx CXX API
// for Chinese TTS with MatchaTTS.
//
// clang-format off
/*
Usage
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/matcha-icefall-zh-baker.tar.bz2
tar xvf matcha-icefall-zh-baker.tar.bz2
rm matcha-icefall-zh-baker.tar.bz2
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/vocoder-models/hifigan_v2.onnx
./matcha-tts-zh-cxx-api
*/
// clang-format on
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
static int32_t ProgressCallback(const float *samples, int32_t num_samples,
float progress, void *arg) {
fprintf(stderr, "Progress: %.3f%%\n", progress * 100);
// return 1 to continue generating
// return 0 to stop generating
return 1;
}
int32_t main(int32_t argc, char *argv[]) {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineTtsConfig config;
config.model.matcha.acoustic_model =
"./matcha-icefall-zh-baker/model-steps-3.onnx";
config.model.matcha.vocoder = "./hifigan_v2.onnx";
config.model.matcha.lexicon = "./matcha-icefall-zh-baker/lexicon.txt";
config.model.matcha.tokens = "./matcha-icefall-zh-baker/tokens.txt";
config.model.matcha.dict_dir = "./matcha-icefall-zh-baker/dict";
config.model.num_threads = 1;
// If you don't want to see debug messages, please set it to 0
config.model.debug = 1;
// clang-format off
config.rule_fsts = "./matcha-icefall-zh-baker/phone.fst,./matcha-icefall-zh-baker/date.fst,./matcha-icefall-zh-baker/number.fst"; // NOLINT
// clang-format on
std::string filename = "./generated-matcha-zh-cxx.wav";
std::string text =
"当夜幕降临,星光点点,伴随着微风拂面,我在静谧中感受着时光的流转,思念如"
"涟漪荡漾,梦境如画卷展开,我与自然融为一体,沉静在这片宁静的美丽之中,感"
"受着生命的奇迹与温柔."
"某某银行的副行长和一些行政领导表示,他们去过长江和长白山; "
"经济不断增长。2024年12月31号拨打110或者18920240511。123456块钱。";
auto tts = OfflineTts::Create(config);
int32_t sid = 0;
float speed = 1.0; // larger -> faster in speech speed
#if 0
// If you don't want to use a callback, then please enable this branch
GeneratedAudio audio = tts.Generate(text, sid, speed);
#else
GeneratedAudio audio = tts.Generate(text, sid, speed, ProgressCallback);
#endif
WriteWave(filename, {audio.samples, audio.sample_rate});
fprintf(stderr, "Input text is: %s\n", text.c_str());
fprintf(stderr, "Speaker ID is is: %d\n", sid);
fprintf(stderr, "Saved to: %s\n", filename.c_str());
return 0;
}


@ -0,0 +1,81 @@
// cxx-api-examples/moonshine-cxx-api.cc
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use Moonshine with sherpa-onnx's C++ API.
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
// tar xvf sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
// rm sherpa-onnx-moonshine-tiny-en-int8.tar.bz2
//
// clang-format on
#include <chrono> // NOLINT
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineRecognizerConfig config;
config.model_config.moonshine.preprocessor =
"./sherpa-onnx-moonshine-tiny-en-int8/preprocess.onnx";
config.model_config.moonshine.encoder =
"./sherpa-onnx-moonshine-tiny-en-int8/encode.int8.onnx";
config.model_config.moonshine.uncached_decoder =
"./sherpa-onnx-moonshine-tiny-en-int8/uncached_decode.int8.onnx";
config.model_config.moonshine.cached_decoder =
"./sherpa-onnx-moonshine-tiny-en-int8/cached_decode.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-moonshine-tiny-en-int8/tokens.txt";
config.model_config.num_threads = 1;
std::cout << "Loading model\n";
OfflineRecognizer recognizer = OfflineRecognizer::Create(config);
if (!recognizer.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout << "Loading model done\n";
std::string wave_filename =
"./sherpa-onnx-moonshine-tiny-en-int8/test_wavs/0.wav";
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
std::cout << "Start recognition\n";
const auto begin = std::chrono::steady_clock::now();
OfflineStream stream = recognizer.CreateStream();
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
wave.samples.size());
recongizer.Decode(&stream);
OfflineRecognizerResult result = recongizer.GetResult(&stream);
const auto end = std::chrono::steady_clock::now();
const float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
float rtf = elapsed_seconds / duration;
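  // RTF = elapsed / duration; values below 1 mean faster than real time.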
std::cout << "text: " << result.text << "\n";
printf("Number of threads: %d\n", config.model_config.num_threads);
printf("Duration: %.3fs\n", duration);
printf("Elapsed seconds: %.3fs\n", elapsed_seconds);
printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n", elapsed_seconds,
duration, rtf);
return 0;
}

View File

@ -0,0 +1,78 @@
// cxx-api-examples/sense-voice-cxx-api.cc
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use sense voice with sherpa-onnx's C++ API.
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
// tar xvf sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
// rm sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17.tar.bz2
//
// clang-format on
#include <chrono> // NOLINT
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineRecognizerConfig config;
config.model_config.sense_voice.model =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.int8.onnx";
config.model_config.sense_voice.use_itn = true;
config.model_config.sense_voice.language = "auto";
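  // use_itn enables inverse text normalization (e.g. spoken numbers become
  // digits); "auto" lets the model detect the spoken language.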
config.model_config.tokens =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/tokens.txt";
config.model_config.num_threads = 1;
std::cout << "Loading model\n";
  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);
  if (!recognizer.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout << "Loading model done\n";
std::string wave_filename =
"./sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/test_wavs/en.wav";
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
std::cout << "Start recognition\n";
const auto begin = std::chrono::steady_clock::now();
  OfflineStream stream = recognizer.CreateStream();
  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
                        wave.samples.size());
  recognizer.Decode(&stream);
  OfflineRecognizerResult result = recognizer.GetResult(&stream);
const auto end = std::chrono::steady_clock::now();
const float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
float rtf = elapsed_seconds / duration;
std::cout << "text: " << result.text << "\n";
printf("Number of threads: %d\n", config.model_config.num_threads);
printf("Duration: %.3fs\n", duration);
printf("Elapsed seconds: %.3fs\n", elapsed_seconds);
printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n", elapsed_seconds,
duration, rtf);
return 0;
}

View File

@ -0,0 +1,65 @@
// cxx-api-examples/speech-enhancement-gtcrn-cxx-api.cc
//
// Copyright (c) 2025 Xiaomi Corporation
//
// We assume you have pre-downloaded model
// from
// https://github.com/k2-fsa/sherpa-onnx/releases/tag/speech-enhancement-models
//
//
// An example command to download
// clang-format off
/*
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/gtcrn_simple.onnx
wget https://github.com/k2-fsa/sherpa-onnx/releases/download/speech-enhancement-models/inp_16k.wav
*/
// clang-format on
#include <chrono> // NOLINT
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineSpeechDenoiserConfig config;
std::string wav_filename = "./inp_16k.wav";
std::string out_wave_filename = "./enhanced_16k.wav";
config.model.gtcrn.model = "./gtcrn_simple.onnx";
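  // gtcrn_simple.onnx is a small GTCRN speech enhancement (denoising) model;
  // the bundled test wave inp_16k.wav is 16 kHz audio.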
auto sd = OfflineSpeechDenoiser::Create(config);
if (!sd.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
Wave wave = ReadWave(wav_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wav_filename << "'\n";
return -1;
}
std::cout << "Started\n";
const auto begin = std::chrono::steady_clock::now();
auto denoised =
sd.Run(wave.samples.data(), wave.samples.size(), wave.sample_rate);
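  // Run() returns the denoised samples together with their sample rate, which
  // is used when writing the output wave below.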
const auto end = std::chrono::steady_clock::now();
std::cout << "Done\n";
WriteWave(out_wave_filename, {denoised.samples, denoised.sample_rate});
const float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
float rtf = elapsed_seconds / duration;
std::cout << "Saved to " << out_wave_filename << "\n";
printf("Duration: %.3fs\n", duration);
printf("Elapsed seconds: %.3fs\n", elapsed_seconds);
printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n", elapsed_seconds,
duration, rtf);
  return 0;
}

View File

@ -0,0 +1,93 @@
// cxx-api-examples/streaming-zipformer-cxx-api.cc
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use streaming Zipformer
// with sherpa-onnx's C++ API.
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
// tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
// rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
//
// clang-format on
#include <chrono> // NOLINT
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
OnlineRecognizerConfig config;
// please see
// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english
config.model_config.transducer.encoder =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/"
"encoder-epoch-99-avg-1.int8.onnx";
  // Note: We recommend not using the int8 model for the decoder: the decoder
  // is small, so quantizing it saves little time but can hurt accuracy.
config.model_config.transducer.decoder =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/"
"decoder-epoch-99-avg-1.onnx";
config.model_config.transducer.joiner =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/"
"joiner-epoch-99-avg-1.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt";
config.model_config.num_threads = 1;
std::cout << "Loading model\n";
  OnlineRecognizer recognizer = OnlineRecognizer::Create(config);
  if (!recognizer.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout << "Loading model done\n";
std::string wave_filename =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/"
"0.wav";
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
std::cout << "Start recognition\n";
const auto begin = std::chrono::steady_clock::now();
  OnlineStream stream = recognizer.CreateStream();
stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
wave.samples.size());
stream.InputFinished();
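  // InputFinished() signals that no more audio will arrive; keep decoding
  // while the recognizer still has frames that are ready.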
  while (recognizer.IsReady(&stream)) {
    recognizer.Decode(&stream);
  }
  OnlineRecognizerResult result = recognizer.GetResult(&stream);
const auto end = std::chrono::steady_clock::now();
const float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
float rtf = elapsed_seconds / duration;
std::cout << "text: " << result.text << "\n";
printf("Number of threads: %d\n", config.model_config.num_threads);
printf("Duration: %.3fs\n", duration);
printf("Elapsed seconds: %.3fs\n", elapsed_seconds);
printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n", elapsed_seconds,
duration, rtf);
return 0;
}

View File

@ -0,0 +1,132 @@
// cxx-api-examples/streaming-zipformer-rtf-cxx-api.cc
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use streaming Zipformer
// with sherpa-onnx's C++ API.
//
// clang-format off
//
// cd /path/sherpa-onnx/
// mkdir build
// cd build
// cmake ..
// make
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
// tar xvf sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
// rm sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20.tar.bz2
//
// # 1. Test on CPU, run once
//
// ./bin/streaming-zipformer-rtf-cxx-api
//
// # 2. Test on CPU, run 10 times
//
// ./bin/streaming-zipformer-rtf-cxx-api 10
//
// # 3. Test on GPU, run 10 times
//
// ./bin/streaming-zipformer-rtf-cxx-api 10 cuda
//
// clang-format on
#include <chrono> // NOLINT
#include <cstdlib>
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main(int argc, char *argv[]) {
int32_t num_runs = 1;
if (argc >= 2) {
num_runs = atoi(argv[1]);
    if (num_runs <= 0) {
      num_runs = 1;
    }
}
  bool use_gpu = (argc == 3 && std::string(argv[2]) == "cuda");
using namespace sherpa_mnn::cxx; // NOLINT
OnlineRecognizerConfig config;
// please see
// https://k2-fsa.github.io/sherpa/onnx/pretrained_models/online-transducer/zipformer-transducer-models.html#csukuangfj-sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20-bilingual-chinese-english
config.model_config.transducer.encoder =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/"
"encoder-epoch-99-avg-1.int8.onnx";
// Note: We recommend not using int8.onnx for the decoder.
config.model_config.transducer.decoder =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/"
"decoder-epoch-99-avg-1.onnx";
config.model_config.transducer.joiner =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/"
"joiner-epoch-99-avg-1.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/tokens.txt";
config.model_config.num_threads = 1;
config.model_config.provider = use_gpu ? "cuda" : "cpu";
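  // provider selects the inference backend; "cuda" requires a build of the
  // runtime with GPU support.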
std::cout << "Loading model\n";
  OnlineRecognizer recognizer = OnlineRecognizer::Create(config);
  if (!recognizer.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout << "Loading model done\n";
std::string wave_filename =
"./sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20/test_wavs/"
"0.wav";
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
std::cout << "Start recognition\n";
float total_elapsed_seconds = 0;
OnlineRecognizerResult result;
for (int32_t i = 0; i < num_runs; ++i) {
const auto begin = std::chrono::steady_clock::now();
    OnlineStream stream = recognizer.CreateStream();
    stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
                          wave.samples.size());
    stream.InputFinished();
    while (recognizer.IsReady(&stream)) {
      recognizer.Decode(&stream);
    }
    result = recognizer.GetResult(&stream);
auto end = std::chrono::steady_clock::now();
float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
printf("Run %d/%d, elapsed seconds: %.3f\n", i, num_runs, elapsed_seconds);
total_elapsed_seconds += elapsed_seconds;
}
  float average_elapsed_seconds = total_elapsed_seconds / num_runs;
  float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
  float rtf = average_elapsed_seconds / duration;
  std::cout << "text: " << result.text << "\n";
  printf("Number of threads: %d\n", config.model_config.num_threads);
  printf("Duration: %.3fs\n", duration);
  printf("Total elapsed seconds: %.3fs\n", total_elapsed_seconds);
  printf("Num runs: %d\n", num_runs);
  printf("Elapsed seconds per run: %.3f/%d=%.3f\n", total_elapsed_seconds,
         num_runs, average_elapsed_seconds);
  printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n",
         average_elapsed_seconds, duration, rtf);
return 0;
}

View File

@ -0,0 +1,76 @@
// cxx-api-examples/whisper-cxx-api.cc
// Copyright (c) 2024 Xiaomi Corporation
//
// This file demonstrates how to use whisper with sherpa-onnx's C++ API.
//
// clang-format off
//
// wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-whisper-tiny.en.tar.bz2
// tar xvf sherpa-onnx-whisper-tiny.en.tar.bz2
// rm sherpa-onnx-whisper-tiny.en.tar.bz2
//
// clang-format on
#include <chrono> // NOLINT
#include <iostream>
#include <string>
#include "sherpa-mnn/c-api/cxx-api.h"
int32_t main() {
using namespace sherpa_mnn::cxx; // NOLINT
OfflineRecognizerConfig config;
config.model_config.whisper.encoder =
"./sherpa-onnx-whisper-tiny.en/tiny.en-encoder.int8.onnx";
config.model_config.whisper.decoder =
"./sherpa-onnx-whisper-tiny.en/tiny.en-decoder.int8.onnx";
config.model_config.tokens =
"./sherpa-onnx-whisper-tiny.en/tiny.en-tokens.txt";
config.model_config.num_threads = 1;
std::cout << "Loading model\n";
  OfflineRecognizer recognizer = OfflineRecognizer::Create(config);
  if (!recognizer.Get()) {
std::cerr << "Please check your config\n";
return -1;
}
std::cout << "Loading model done\n";
std::string wave_filename = "./sherpa-onnx-whisper-tiny.en/test_wavs/0.wav";
Wave wave = ReadWave(wave_filename);
if (wave.samples.empty()) {
std::cerr << "Failed to read: '" << wave_filename << "'\n";
return -1;
}
std::cout << "Start recognition\n";
const auto begin = std::chrono::steady_clock::now();
  OfflineStream stream = recognizer.CreateStream();
  stream.AcceptWaveform(wave.sample_rate, wave.samples.data(),
                        wave.samples.size());
  recognizer.Decode(&stream);
  OfflineRecognizerResult result = recognizer.GetResult(&stream);
const auto end = std::chrono::steady_clock::now();
const float elapsed_seconds =
std::chrono::duration_cast<std::chrono::milliseconds>(end - begin)
.count() /
1000.;
float duration = wave.samples.size() / static_cast<float>(wave.sample_rate);
float rtf = elapsed_seconds / duration;
std::cout << "text: " << result.text << "\n";
printf("Number of threads: %d\n", config.model_config.num_threads);
printf("Duration: %.3fs\n", duration);
printf("Elapsed seconds: %.3fs\n", elapsed_seconds);
printf("(Real time factor) RTF = %.3f / %.3f = %.3f\n", elapsed_seconds,
duration, rtf);
return 0;
}

View File

@ -0,0 +1,3 @@
hs_err*
vits-zh-aishell3
*.jar

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/AudioTagging.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/FeatureConfig.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OfflinePunctuation.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OfflineRecognizer.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OfflineSpeakerDiarization.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OfflineStream.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OnlinePunctuation.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OnlineRecognizer.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/OnlineStream.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/Speaker.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/SpeakerEmbeddingExtractorConfig.kt

View File

@ -0,0 +1 @@
../sherpa-mnn/kotlin-api/SpokenLanguageIdentification.kt

Some files were not shown because too many files have changed in this diff.