add-text-preprocessing
#2 by miraclemind - opened
Adds input text preprocessing to improve output when the text contains unusual punctuation, such as curly quotes and apostrophes. For example, “This isn’t all for me, is it?” he’d asked is converted to "This isn't all for me, is it?" he'd asked.
Based on the preprocessing helpers in the supertonic repo.
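For readers who want a concrete picture of the conversion described above, here is a minimal sketch in Python. It is not the code from this PR or from the supertonic repo: the function name `preprocess_text` and the rule table `QUOTE_RULES` are hypothetical, and only the quote and apostrophe case from the example is covered.

```python
import re

# Hypothetical rule table (an assumption, not the PR's actual rule set):
# each entry is (compiled pattern, replacement string).
QUOTE_RULES = [
    (re.compile("[\u2018\u2019]"), "'"),  # curly single quotes and apostrophes
    (re.compile("[\u201c\u201d]"), '"'),  # curly double quotes
]


def preprocess_text(text: str) -> str:
    """Replace typographic quotes with their plain ASCII equivalents."""
    for pattern, replacement in QUOTE_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    sample = "\u201cThis isn\u2019t all for me, is it?\u201d he\u2019d asked"
    print(preprocess_text(sample))
```

Keeping the rules in a table makes it easy to add further find-and-replace cases later without touching the function body.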
miraclemind changed pull request status to open
Thanks! 😀 Ideally, these rules would be added to the tokenizer itself: https://huggingface.co/onnx-community/Supertonic-TTS-ONNX/blob/main/tokenizer.json, but some of the rules do look quite complex.
Do you think you'd be able to design a PR for some of the simpler normalization cases?
Done. It's just regex find-and-replace, so it was fairly easy to do.
https://huggingface.co/onnx-community/Supertonic-TTS-ONNX/discussions/1
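As a rough illustration of what "regex find and replace" rules could look like when expressed as tokenizer normalizer entries, the sketch below builds a normalizer section in Python and prints it as JSON. The entry shape (`"type": "Replace"` with a `"Regex"` pattern inside a `"Sequence"`) follows how the Hugging Face `tokenizers` library serializes Replace normalizers as far as I know; whether the Supertonic `tokenizer.json` and its runtime accept exactly this is an assumption, and the helpers `replace_rule` and `apply_normalizer` are hypothetical. The linked PR may use different rules.

```python
import json
import re


def replace_rule(regex: str, content: str) -> dict:
    """Build one entry in the assumed "Replace" normalizer shape."""
    return {"type": "Replace", "pattern": {"Regex": regex}, "content": content}


# Assumed normalizer section; only the simple quote cases are shown.
normalizer = {
    "type": "Sequence",
    "normalizers": [
        replace_rule("[\u2018\u2019]", "'"),  # curly single quotes, apostrophes
        replace_rule("[\u201c\u201d]", '"'),  # curly double quotes
    ],
}


def apply_normalizer(config: dict, text: str) -> str:
    """Emulate the rules with Python's re module, as a local sanity check.

    The real tokenizer uses its own regex engine, so dialects may differ.
    """
    for rule in config["normalizers"]:
        content = rule["content"]
        # A callable replacement keeps `content` literal (no backslash escapes).
        text = re.sub(rule["pattern"]["Regex"], lambda _m: content, text)
    return text


print(json.dumps(normalizer, indent=2, ensure_ascii=False))
```

The printed JSON is what would be placed under the `normalizer` key, while `apply_normalizer` only checks the rules locally and is not part of any tokenizer API.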