CosyVoice

Commit Graph

Author	SHA1	Message	Date
Xiang Lyu	4c9a4c2fed	Merge pull request #1187 from hexisyztem/dev/Comet Refactor CUDA stream context management in CosyVoice2Model	2025-04-16 16:07:43 +08:00
禾息	e8a26827ae	Refactor CUDA stream context management in CosyVoice2Model - Replaced the use of torch.cuda.stream with a direct context management approach for improved clarity and performance during inference. - This change simplifies the stream handling code while maintaining efficient resource utilization.	2025-04-16 16:04:40 +08:00
Xiang Lyu	ab74475604	Merge pull request #1184 from hexisyztem/dev/Comet Dev/comet	2025-04-16 15:02:46 +08:00
禾息	369f3c2c18	Update estimator count retrieval and memory pool limit in CosyVoice - Simplified estimator count retrieval in CosyVoice and CosyVoice2 classes to directly access the configs dictionary. - Adjusted memory pool limit in the ONNX to TensorRT conversion function from 8GB to 1GB for optimized resource management.	2025-04-16 14:39:06 +08:00
禾息	7f4c9a2c64	Refactor CosyVoice inference methods to streamline CUDA stream management - Removed the queue-based stream pool and integrated direct CUDA stream usage for improved performance. - Simplified inference methods by eliminating unnecessary synchronization and stream management code. - Enhanced logging for better tracking of synthesis operations and performance metrics. - Updated the model class to support CUDA stream context management, ensuring efficient resource utilization during inference.	2025-04-16 14:15:14 +08:00
禾息	fd9b7d45e2	Fix logging indentation in CosyVoice TTS method for improved clarity	2025-04-16 11:24:51 +08:00
禾息	62e04e8856	Enhance CosyVoice with CUDA stream management and estimator handling - Introduced a queue-based system for managing CUDA streams to improve inference performance. - Updated inference methods to utilize CUDA streams for asynchronous processing. - Added an EstimatorWrapper class to manage TensorRT estimators, allowing for efficient execution context handling. - Modified model loading functions to support estimator count configuration. - Improved logging and performance tracking during inference operations.	2025-04-16 11:16:28 +08:00
雾聪	96950745a6	Revert "mv AsyncLLMEngine init to CosyVoice2" This reverts commit `9b3f351496`.	2025-03-21 16:17:35 +08:00
雾聪	9b3f351496	mv AsyncLLMEngine init to CosyVoice2	2025-03-21 10:24:04 +08:00
Yabin Li	00b454cf30	Merge pull request #1053 from qi-hua/dev/use_vllm Dev/use vllm	2025-03-13 14:22:50 +08:00
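qihua	c0f6a474f3	fix(async_cosyvoice): restore the original text token handling logic - In Frontend, restore the original one-by-one generation of text tokens - In the Model class, remove unnecessary log messages and assertions, simplifying the text token processing flow	2025-03-08 16:03:35 +08:00
qihua	ab5b8eb160	refactor(llm): rework how vLLM inference tasks are handled to support multiple concurrent tasks - Remove the task queue and the single-task limit - Use asyncio.run_coroutine_threadsafe() to run inference tasks in a background thread	2025-03-08 10:41:49 +08:00
qihua	b4fe05d466	docs: add speed_test.ipynb - Add speed_test.ipynb for testing the performance of the CosyVoice2 model - Includes test environment configuration, a default usage example, and steps for using vllm to accelerate LLM inference	2025-03-08 00:41:34 +08:00
qihua	a1314e573a	chore: add requirements_vllm.txt, specifying the dependencies required by the VLLM model	2025-03-08 00:40:17 +08:00
qihua	2fbeba50ae	refactor(llm): remove unused async inference method - Delete the async_llm_inference method from the LLM class - The method is not yet used, and running it outside loop_thread crashes vllm, so it is removed	2025-03-08 00:04:01 +08:00
qihua	d4d187bd8c	refactor(llm): rework the VLLM inference approach - Add a queue- and thread-based asynchronous inference mechanism - Rework the synchronous inference interface to use the new mechanism	2025-03-07 23:53:50 +08:00
qihua	90b666ea20	Initial merge of vllm support; the channel handling for asynchronous inference still has bugs	2025-03-07 20:26:19 +08:00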
Xiang Lyu	fd45708e4b	Merge pull request #977 from hanasay/main Convert audio to mono while extract speech token	2025-02-16 12:51:04 +08:00
hanasay	296ed4f526	Convert audio to mono while extract speech token modified: tools/extract_speech_token.py	2025-02-14 15:25:45 +08:00
Xiang Lyu	95e99e0417	Merge pull request #940 from c4fun/fastapi-cosyvoice2 Add a inference_instruct2 route to support and defaultly supports cosyvoice2 in fastapi server	2025-02-07 16:30:27 +08:00
c4fun	ba6d8c07ba	revert the main function to original and only preserve the inference endpoint	2025-02-07 16:28:30 +08:00
Xiang Lyu	da3f129977	Merge pull request #936 from huyyxy/main feat(docker): update CUDA base image to 12.4.1 for TensorRT support	2025-02-06 11:39:53 +08:00
c4fun	2889c25863	supports and defaultly supports cosyvoice2 in fastapi server	2025-01-27 20:51:57 +08:00
huyyxy	aa65200713	feat(docker): update CUDA base image to 12.4.1 for TensorRT support - Upgrade base image from nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 to nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04 - Enable CUDA 12.4 runtime environment - Ensure TensorRT dependency compatibility - Validation steps: - Verify CUDA version via nvidia-smi after build - Test import tensorrt in container without errors Closes #935	2025-01-26 12:33:50 +08:00
Xiang Lyu	86e26f54c7	Merge pull request #930 from sd0ric4/main feat: add POST endpoints to resolve browser error about GET request wi…	2025-01-26 10:17:15 +08:00
sd0ric4	f1c214377c	fix: add POST endpoints to resolve browser error about GET request with body	2025-01-24 19:41:14 +08:00
Xiang Lyu	369ea80bd4	Merge pull request #926 from Vinkle-hzt/main fix bistream extra token	2025-01-23 22:02:25 +08:00
huzetao.hzt	69518b2bde	fix bistream extra token	2025-01-23 19:08:18 +08:00
Xiang Lyu	276cfa02b6	Merge pull request #925 from FunAudioLLM/dev/lyuxiang.lx fix pitch computation	2025-01-23 15:45:42 +08:00
lyuxiang.lx	190840b8dc	fix pitch computation	2025-01-23 15:44:03 +08:00
lyuxiang.lx	c6c3f27ecc	fix typo	2025-01-23 11:27:10 +08:00
Xiang Lyu	49761d2474	Merge pull request #924 from FunAudioLLM/dev/lyuxiang.lx add llm bistream	2025-01-23 10:19:21 +08:00
lyuxiang.lx	07e477519b	add llm bistream	2025-01-23 10:12:06 +08:00
Xiang Lyu	41c5e8cd6d	Merge pull request #887 from Wauplin/patch-1 Fix diffusers / huggingface_hub compatibility in requirements.txt	2025-01-15 18:23:05 +08:00
Lucain	66ceaff472	Fix diffusers / huggingface_hub compatibility in requirements.txt As mentioned in https://github.com/FunAudioLLM/CosyVoice/issues/516#issuecomment-2592067949 and https://github.com/FunAudioLLM/CosyVoice/issues/527#issuecomment-2592067100, it is more future-proof to upgrade `diffusers` version rather than downgrading `huggingface_hub` to an old one. This will also fix the `cannot import name 'cached_download' from 'huggingface_hub'` issue without relying on outdated packages. Sorry again for the inconvenience 🙏	2025-01-15 10:21:08 +01:00
Xiang Lyu	07a314767f	Merge pull request #884 from FunAudioLLM/dev/lyuxiang.lx update	2025-01-14 22:56:21 +08:00
lyuxiang.lx	0b75c3a03f	update	2025-01-14 22:55:13 +08:00
Xiang Lyu	b4dea3d64a	Merge pull request #878 from FunAudioLLM/dev/lyuxiang.lx update	2025-01-13 10:31:12 +08:00
lyuxiang.lx	43f9e9ab20	update	2025-01-13 10:30:13 +08:00
Xiang Lyu	025f6f0f7f	Merge pull request #875 from lsby/main fix docker python version	2025-01-13 10:27:49 +08:00
Xiang Lyu	69051d11ec	Merge pull request #876 from FunAudioLLM/dev/lyuxiang.lx fix bug	2025-01-12 21:21:25 +08:00
lyuxiang.lx	59fa786769	fix bug	2025-01-12 21:18:41 +08:00
hbybyyang	f38f594303	fix docker python version	2025-01-12 15:59:58 +08:00
Xiang Lyu	eb4d5d053f	Merge pull request #868 from FunAudioLLM/dev/lyuxiang.lx move prompt wav to asset	2025-01-10 17:53:32 +08:00
lyuxiang.lx	d450c32296	update	2025-01-10 17:52:25 +08:00
lyuxiang.lx	e84d72a4d9	move prompt wav to asset	2025-01-10 17:51:21 +08:00
Xiang Lyu	06e86619c2	Merge pull request #867 from FunAudioLLM/dev/lyuxiang.lx Dev/lyuxiang.lx	2025-01-10 16:46:11 +08:00
lyuxiang.lx	e257c16796	Merge branch 'dev/lyuxiang.lx' of github.com:FunAudioLLM/CosyVoice into dev/lyuxiang.lx	2025-01-10 16:44:01 +08:00
lyuxiang.lx	87475ccf41	fix conflict	2025-01-10 16:43:31 +08:00
Xiang Lyu	8a1bce6c81	Merge pull request #865 from FunAudioLLM/dev/lyuxiang.lx Dev/lyuxiang.lx	2025-01-10 14:18:21 +08:00

327 Commits, All Branches
