v1.1.1 add benchmark and trtllm offline code

2025-12-12 07:40:43 -08:00 · 2025-04-03 18:33:48 +08:00
parent 2374f8ec39
commit fe5c562212
3 changed files with 11 additions and 9 deletions
--- a/README.md
+++ b/README.md
@@ -114,9 +114,11 @@ Deployment solution with Triton and TensorRT-LLM.
 #### Benchmark Results
 Decoding on a single L20 GPU, using 26 different prompt_audio & target_text pairs.

-| Model | Concurrency | Avg Latency    | RTF   | 
-|-------|-------------|----------------|-------|
-| F5-TTS Base (Vocos) | 1     | 253 ms | 0.0394|
+| Model               | Concurrency    | Avg Latency | RTF    | Mode            |
+|---------------------|----------------|-------------|--------|-----------------|
+| F5-TTS Base (Vocos) | 2              | 253 ms      | 0.0394 | Client-Server   |
+| F5-TTS Base (Vocos) | 1 (Batch_size) | -           | 0.0402 | Offline TRT-LLM |
+| F5-TTS Base (Vocos) | 1 (Batch_size) | -           | 0.1467 | Offline Pytorch |

 See [detailed instructions](src/f5_tts/runtime/triton_trtllm/README.md) for more information.