Xiang Lyu
4c9a4c2fed
Merge pull request #1187 from hexisyztem/dev/Comet
...
Refactor CUDA stream context management in CosyVoice2Model
2025-04-16 16:07:43 +08:00
禾息
e8a26827ae
Refactor CUDA stream context management in CosyVoice2Model
...
- Replaced the use of torch.cuda.stream with a direct context management approach for improved clarity and performance during inference.
- This change simplifies the stream handling code while maintaining efficient resource utilization.
2025-04-16 16:04:40 +08:00
Xiang Lyu
ab74475604
Merge pull request #1184 from hexisyztem/dev/Comet
...
Dev/comet
2025-04-16 15:02:46 +08:00
禾息
369f3c2c18
Update estimator count retrieval and memory pool limit in CosyVoice
...
- Simplified estimator count retrieval in CosyVoice and CosyVoice2 classes to directly access the configs dictionary.
- Adjusted memory pool limit in the ONNX to TensorRT conversion function from 8GB to 1GB for optimized resource management.
2025-04-16 14:39:06 +08:00
禾息
7f4c9a2c64
Refactor CosyVoice inference methods to streamline CUDA stream management
...
- Removed the queue-based stream pool and integrated direct CUDA stream usage for improved performance.
- Simplified inference methods by eliminating unnecessary synchronization and stream management code.
- Enhanced logging for better tracking of synthesis operations and performance metrics.
- Updated the model class to support CUDA stream context management, ensuring efficient resource utilization during inference.
2025-04-16 14:15:14 +08:00
禾息
fd9b7d45e2
Fix logging indentation in CosyVoice TTS method for improved clarity
2025-04-16 11:24:51 +08:00
禾息
62e04e8856
Enhance CosyVoice with CUDA stream management and estimator handling
...
- Introduced a queue-based system for managing CUDA streams to improve inference performance.
- Updated inference methods to utilize CUDA streams for asynchronous processing.
- Added an EstimatorWrapper class to manage TensorRT estimators, allowing for efficient execution context handling.
- Modified model loading functions to support estimator count configuration.
- Improved logging and performance tracking during inference operations.
2025-04-16 11:16:28 +08:00
雾聪
96950745a6
Revert "mv AsyncLLMEngine init to CosyVoice2"
...
This reverts commit 9b3f351496.
2025-03-21 16:17:35 +08:00
雾聪
9b3f351496
mv AsyncLLMEngine init to CosyVoice2
2025-03-21 10:24:04 +08:00
Yabin Li
00b454cf30
Merge pull request #1053 from qi-hua/dev/use_vllm
...
Dev/use vllm
2025-03-13 14:22:50 +08:00
qihua
c0f6a474f3
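fix(async_cosyvoice): restore the original text token processing logic
...
- In Frontend, restore the original behavior of generating text tokens one at a time
- In the Model class, removed unnecessary log messages and assertions, simplifying the text token processing flow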
2025-03-08 16:03:35 +08:00
qihua
ab5b8eb160
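refactor(llm): rework how vLLM inference tasks are handled to support multiple tasks
...
- Remove the task queue and the single-task processing limit
- Use asyncio.run_coroutine_threadsafe() to run inference tasks in a background thread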
2025-03-08 10:41:49 +08:00
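As an illustration of the pattern named in the commit above, here is a minimal sketch of submitting coroutines to an event loop that runs in a background thread. It is not the repository's code: `start_background_loop`, `fake_inference`, and `run_many` are hypothetical names, and `fake_inference` merely stands in for an asynchronous vLLM generation call.

```python
import asyncio
import threading
from typing import List


def start_background_loop() -> asyncio.AbstractEventLoop:
    """Create an event loop and run it forever in a daemon thread."""
    loop = asyncio.new_event_loop()
    threading.Thread(target=loop.run_forever, daemon=True).start()
    return loop


async def fake_inference(task_id: int) -> str:
    # Hypothetical stand-in for an async inference call (assumption).
    await asyncio.sleep(0.01)
    return f"result-{task_id}"


def run_many(loop: asyncio.AbstractEventLoop, task_ids: List[int]) -> List[str]:
    """Submit several tasks from the calling thread, then wait for all of them."""
    # run_coroutine_threadsafe returns a concurrent.futures.Future, so the
    # tasks overlap on the background loop with no hand-written task queue.
    futures = [
        asyncio.run_coroutine_threadsafe(fake_inference(i), loop)
        for i in task_ids
    ]
    return [f.result(timeout=5.0) for f in futures]
```

A synchronous caller would create the loop once with `start_background_loop()`, call `run_many(loop, [0, 1, 2])` as often as needed, and stop the loop with `loop.call_soon_threadsafe(loop.stop)` on shutdown.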
qihua
b4fe05d466
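docs: add speed_test.ipynb
...
- Add speed_test.ipynb for testing the performance of the CosyVoice2 model
- Includes the test environment configuration, a usage example with default settings, and the steps for accelerating LLM inference with vllm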
2025-03-08 00:41:34 +08:00
qihua
a1314e573a
chore: add requirements_vllm.txt specifying the dependencies required for the VLLM model
2025-03-08 00:40:17 +08:00
qihua
2fbeba50ae
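refactor(llm): remove the unused async inference method
...
- Deleted the async_llm_inference method from the LLM class
- The method is not used yet, and running it outside loop_thread crashes vllm, so it has been removed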
2025-03-08 00:04:01 +08:00
qihua
d4d187bd8c
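refactor(llm): rework the VLLM inference approach
...
- Add an asynchronous inference mechanism based on queues and threads
- Optimize the synchronous inference interface by implementing it on top of the new mechanism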
2025-03-07 23:53:50 +08:00
qihua
90b666ea20
Initial merge of vllm support; the channel handling for asynchronous inference still has bugs
2025-03-07 20:26:19 +08:00
Xiang Lyu
fd45708e4b
Merge pull request #977 from hanasay/main
...
Convert audio to mono while extracting speech tokens
2025-02-16 12:51:04 +08:00
hanasay
296ed4f526
Convert audio to mono while extracting speech tokens
...
modified: tools/extract_speech_token.py
2025-02-14 15:25:45 +08:00
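The commit above does not say how the mono conversion is done. One common approach, assumed here, is to average the channels sample by sample. The sketch below uses plain Python lists so it runs without audio libraries; `to_mono` is a hypothetical helper, not a function from tools/extract_speech_token.py.

```python
from typing import List


def to_mono(channels: List[List[float]]) -> List[float]:
    """Downmix multi-channel audio to mono by averaging the channels.

    `channels` holds one list of samples per channel (assumed layout).
    """
    if not channels:
        raise ValueError("expected at least one channel")
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same number of samples")
    # zip(*channels) yields one tuple per sample position across channels.
    return [sum(frame) / len(channels) for frame in zip(*channels)]
```

Single-channel input passes through unchanged, so the helper can be applied to every file regardless of its channel count.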
Xiang Lyu
95e99e0417
Merge pull request #940 from c4fun/fastapi-cosyvoice2
...
Add an inference_instruct2 route and support cosyvoice2 by default in the fastapi server
2025-02-07 16:30:27 +08:00
c4fun
ba6d8c07ba
revert the main function to the original and keep only the inference endpoint
2025-02-07 16:28:30 +08:00
Xiang Lyu
da3f129977
Merge pull request #936 from huyyxy/main
...
feat(docker): update CUDA base image to 12.4.1 for TensorRT support
2025-02-06 11:39:53 +08:00
c4fun
2889c25863
support cosyvoice2, and use it by default, in the fastapi server
2025-01-27 20:51:57 +08:00
huyyxy
aa65200713
feat(docker): update CUDA base image to 12.4.1 for TensorRT support
...
- Upgrade base image from nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 to nvidia/cuda:12.4.1-cudnn-devel-ubuntu22.04
- Enable CUDA 12.4 runtime environment
- Ensure TensorRT dependency compatibility
- Validation steps:
  - Verify CUDA version via nvidia-smi after build
  - Test import tensorrt in container without errors
Closes #935
2025-01-26 12:33:50 +08:00
Xiang Lyu
86e26f54c7
Merge pull request #930 from sd0ric4/main
...
feat: add POST endpoints to resolve browser error about GET request wi…
2025-01-26 10:17:15 +08:00
sd0ric4
f1c214377c
fix: add POST endpoints to resolve browser error about GET request with body
2025-01-24 19:41:14 +08:00
Xiang Lyu
369ea80bd4
Merge pull request #926 from Vinkle-hzt/main
...
fix bistream extra token
2025-01-23 22:02:25 +08:00
huzetao.hzt
69518b2bde
fix bistream extra token
2025-01-23 19:08:18 +08:00
Xiang Lyu
276cfa02b6
Merge pull request #925 from FunAudioLLM/dev/lyuxiang.lx
...
fix pitch computation
2025-01-23 15:45:42 +08:00
lyuxiang.lx
190840b8dc
fix pitch computation
2025-01-23 15:44:03 +08:00
lyuxiang.lx
c6c3f27ecc
fix typo
2025-01-23 11:27:10 +08:00
Xiang Lyu
49761d2474
Merge pull request #924 from FunAudioLLM/dev/lyuxiang.lx
...
add llm bistream
2025-01-23 10:19:21 +08:00
lyuxiang.lx
07e477519b
add llm bistream
2025-01-23 10:12:06 +08:00
Xiang Lyu
41c5e8cd6d
Merge pull request #887 from Wauplin/patch-1
...
Fix diffusers / huggingface_hub compatibility in requirements.txt
2025-01-15 18:23:05 +08:00
Lucain
66ceaff472
Fix diffusers / huggingface_hub compatibility in requirements.txt
...
As mentioned in https://github.com/FunAudioLLM/CosyVoice/issues/516#issuecomment-2592067949 and https://github.com/FunAudioLLM/CosyVoice/issues/527#issuecomment-2592067100 , it is more future-proof to upgrade `diffusers` version rather than downgrading `huggingface_hub` to an old one. This will also fix the `cannot import name 'cached_download' from 'huggingface_hub'` issue without relying on outdated packages.
Sorry again for the inconvenience 🙏
2025-01-15 10:21:08 +01:00
Xiang Lyu
07a314767f
Merge pull request #884 from FunAudioLLM/dev/lyuxiang.lx
...
update
2025-01-14 22:56:21 +08:00
lyuxiang.lx
0b75c3a03f
update
2025-01-14 22:55:13 +08:00
Xiang Lyu
b4dea3d64a
Merge pull request #878 from FunAudioLLM/dev/lyuxiang.lx
...
update
2025-01-13 10:31:12 +08:00
lyuxiang.lx
43f9e9ab20
update
2025-01-13 10:30:13 +08:00
Xiang Lyu
025f6f0f7f
Merge pull request #875 from lsby/main
...
fix docker python version
2025-01-13 10:27:49 +08:00
Xiang Lyu
69051d11ec
Merge pull request #876 from FunAudioLLM/dev/lyuxiang.lx
...
fix bug
2025-01-12 21:21:25 +08:00
lyuxiang.lx
59fa786769
fix bug
2025-01-12 21:18:41 +08:00
hbybyyang
f38f594303
fix docker python version
2025-01-12 15:59:58 +08:00
Xiang Lyu
eb4d5d053f
Merge pull request #868 from FunAudioLLM/dev/lyuxiang.lx
...
move prompt wav to asset
2025-01-10 17:53:32 +08:00
lyuxiang.lx
d450c32296
update
2025-01-10 17:52:25 +08:00
lyuxiang.lx
e84d72a4d9
move prompt wav to asset
2025-01-10 17:51:21 +08:00
Xiang Lyu
06e86619c2
Merge pull request #867 from FunAudioLLM/dev/lyuxiang.lx
...
Dev/lyuxiang.lx
2025-01-10 16:46:11 +08:00
lyuxiang.lx
e257c16796
Merge branch 'dev/lyuxiang.lx' of github.com:FunAudioLLM/CosyVoice into dev/lyuxiang.lx
2025-01-10 16:44:01 +08:00
lyuxiang.lx
87475ccf41
fix conflict
2025-01-10 16:43:31 +08:00
Xiang Lyu
8a1bce6c81
Merge pull request #865 from FunAudioLLM/dev/lyuxiang.lx
...
Dev/lyuxiang.lx
2025-01-10 14:18:21 +08:00