SWivid
2024-10-23 23:05:25 +08:00
parent c4eee0f96b
commit d8638a6c32
12 changed files with 61 additions and 195 deletions


@@ -18,43 +18,46 @@
## Installation
First set up a Python environment and install PyTorch:
```bash
# Create a python 3.10 conda env (you could also use virtualenv)
conda create -n f5-tts python=3.10
conda activate f5-tts
# Install pytorch with your CUDA version, e.g.
pip install torch==2.3.0+cu118 torchaudio==2.3.0+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
```
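A quick sanity check that the expected CUDA build of PyTorch was picked up (a generic check, not specific to this repo):
```bash
# Should print the torch version and True if the CUDA wheel matches your driver
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```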
Then you can choose from a few options below:
### 1. Local editable
```bash
git clone https://github.com/SWivid/F5-TTS.git
cd F5-TTS
pip install -e .
```
### 2. As a pip package
```bash
pip install git+https://github.com/SWivid/F5-TTS.git
```
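Either way, a minimal check that the install worked (assuming the top-level package name `f5_tts`, as used elsewhere in this README):
```bash
# Verify the f5_tts package is importable after either install option
python -c "import f5_tts; print('f5_tts imported successfully')"
```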
### 3. Build from Dockerfile
**[Optional]**: We provide a [Dockerfile](https://github.com/SWivid/F5-TTS/blob/main/Dockerfile); use the following command to build the image:
```bash
docker build -t f5tts:v1 .
```
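A sketch of running the built image; the `--gpus all` flag assumes the NVIDIA Container Toolkit is installed and can be dropped for CPU-only use:
```bash
# Start an interactive container from the image built above
docker run --rm -it --gpus all f5tts:v1
```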
## Development
When making a pull request, please use pre-commit to ensure code quality; it runs linters and formatters automatically:
```bash
pip install pre-commit
pre-commit install
```
This installs git hooks that run automatically before each commit. To run the checks manually over all files:
```bash
pre-commit run --all-files
```
@@ -62,28 +65,6 @@ pre-commit run --all-files
Note: Some model components have linting exceptions for E722 to accommodate tensor notation.
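If you only want to check the files you touched, pre-commit can also be run on specific paths (standard pre-commit usage; the path below is just an example):
```bash
# Run the hooks on selected files only; replace the path with your changed files
pre-commit run --files f5_tts/model/dataset.py
```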
## Prepare Dataset
Example data processing scripts are provided for Emilia and Wenetspeech4TTS; you may tailor your own script along with a Dataset class in `f5_tts/model/dataset.py`.
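Dataset preparation is typically a one-off script run before training; the invocation below is purely illustrative (hypothetical script name and flags, not the repository's actual interface; use the provided Emilia / Wenetspeech4TTS scripts or your own adaptation):
```bash
# Hypothetical example only: substitute the actual preparation script and arguments
python prepare_my_dataset.py \
    --audio_dir /path/to/wavs \
    --transcripts /path/to/metadata.csv \
    --out_dir data/my_dataset
```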
@@ -147,6 +128,21 @@ export WANDB_MODE=offline
## Inference
The Gradio app can also be embedded into a larger Gradio application:
```python
import gradio as gr
from f5_tts.gradio_app import app

with gr.Blocks() as main_app:
    gr.Markdown("# This is an example of using F5-TTS within a bigger Gradio app")

    # ... other Gradio components

    # render the pre-built F5-TTS interface inside this Blocks context
    app.render()

main_app.launch()
```
The pretrained model checkpoints are available at [🤗 Hugging Face](https://huggingface.co/SWivid/F5-TTS) and [🤖 Model Scope](https://www.modelscope.cn/models/SWivid/F5-TTS_Emilia-ZH-EN), or will be downloaded automatically when running `inference-cli` or `gradio_app`.

Currently a single generation supports up to 30 seconds, which is the **TOTAL** length of the prompt audio and the generated audio. Batch inference with chunks is supported by `inference-cli` and `gradio_app`.
@@ -248,21 +244,7 @@ bash scripts/eval_infer_batch.sh
Install packages for evaluation:
```bash
pip install -e .[eval]
```

**Some Notes**

For faster-whisper with CUDA 11:
```bash
pip install --force-reinstall ctranslate2==3.24.0
```

(Recommended) To avoid possible ASR failures, such as abnormal repetitions in output:
```bash
pip install faster-whisper==0.10.1
```
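One shell-specific note for the `pip install -e .[eval]` command above: in zsh the square brackets are glob characters, so quote the extras spec (a general pip/zsh detail, not specific to this project):
```bash
# Quote the extras so zsh does not try to glob-expand the brackets
pip install -e ".[eval]"
```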
Update the path with your batch-inference results, and carry out WER / SIM evaluations: