---
title: README
emoji: 🦀
colorFrom: pink
colorTo: gray
sdk: static
pinned: false
---

# [TTSDS Benchmark](https://ttsdsbenchmark.com)

As many recent Text-to-Speech (TTS) models have shown, synthetic audio can be close to real human speech.
However, traditional evaluation methods for TTS systems need an update to keep pace with these new developments.
Our TTSDS benchmark assesses the quality of synthetic speech by considering factors like prosody, speaker identity, and intelligibility.
By comparing these factors with both real speech and noise datasets, we can better understand how close synthetic speech comes to real speech.

## More information

More details can be found in our paper [*TTSDS -- Text-to-Speech Distribution Score*](https://arxiv.org/abs/2407.12707).

## Reproducibility

To reproduce our results, see our [repository](https://github.com/ttsds/ttsds).

## Citation

```
@misc{minixhofer2024ttsds,
  title={TTSDS -- Text-to-Speech Distribution Score},
  author={Christoph Minixhofer and Ondřej Klejch and Peter Bell},
  year={2024},
  eprint={2407.12707},
  archivePrefix={arXiv},
  primaryClass={eess.AS},
  url={https://arxiv.org/abs/2407.12707},
}
```