nvidia/audio_to_audio_schrodinger_bridge

kalashshah19 committed on Sep 2

Commit fc21553 · verified · 1 Parent(s): c3d0f52

Adding Pipeline Tag

Adding "Audio-to-Audio" pipeline tag

Files changed (1)

README.md +33 -32

@@ -1,33 +1,34 @@
----
-license: other
-license_name: nvidia-oneway-noncommercial-license
----
-# PyTorch Implementation of Audio-to-Audio Schrodinger Bridges
-**Zhifeng Kong, Kevin J Shih, Weili Nie, Arash Vahdat, Sang-gil Lee, Joao Felipe Santos, Ante Jukic, Rafael Valle, Bryan Catanzaro**
-[[paper]](https://arxiv.org/abs/2501.11311) [[GitHub]](https://github.com/NVIDIA/diffusion-audio-restoration) [[Demo]](https://research.nvidia.com/labs/adlr/A2SB/)
-This repo contains the PyTorch implementation of [A2SB: Audio-to-Audio Schrodinger Bridges](https://arxiv.org/abs/2501.11311). A2SB is an audio restoration model tailored for high-res music at 44.1kHz. It is capable of both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, A2SB is end-to-end without need of a vocoder to predict waveform outputs, and able to restore hour-long audio inputs. A2SB is capable of achieving state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets.
-- We propose A2SB, a state-of-the-art, end-to-end, vocoder-free, and multi-task diffusion Schrodinger Bridge model for 44.1kHz high-res music restoration, using an effective factorized audio representation.
-- A2SB is the first long audio restoration model that could restore hour-long audio without
-boundary artifacts
-## License
-The model is provided under the NVIDIA OneWay NonCommercial License.
-## Citation
-```
-@article{kong2025a2sb,
-  title={A2SB: Audio-to-Audio Schrodinger Bridges},
-  author={Kong, Zhifeng and Shih, Kevin J and Nie, Weili and Vahdat, Arash and Lee, Sang-gil and Santos, Joao Felipe and Jukic, Ante and Valle, Rafael and Catanzaro, Bryan},
-  journal={arXiv preprint arXiv:2501.11311},
-  year={2025}
-}
 ```

+---
+pipeline_tag: audio-to-audio
+license: other
+license_name: nvidia-oneway-noncommercial-license
+---
+# PyTorch Implementation of Audio-to-Audio Schrodinger Bridges
+**Zhifeng Kong, Kevin J Shih, Weili Nie, Arash Vahdat, Sang-gil Lee, Joao Felipe Santos, Ante Jukic, Rafael Valle, Bryan Catanzaro**
+[[paper]](https://arxiv.org/abs/2501.11311) [[GitHub]](https://github.com/NVIDIA/diffusion-audio-restoration) [[Demo]](https://research.nvidia.com/labs/adlr/A2SB/)
+This repo contains the PyTorch implementation of [A2SB: Audio-to-Audio Schrodinger Bridges](https://arxiv.org/abs/2501.11311). A2SB is an audio restoration model tailored for high-res music at 44.1kHz. It is capable of both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, A2SB is end-to-end without need of a vocoder to predict waveform outputs, and able to restore hour-long audio inputs. A2SB is capable of achieving state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets.
+- We propose A2SB, a state-of-the-art, end-to-end, vocoder-free, and multi-task diffusion Schrodinger Bridge model for 44.1kHz high-res music restoration, using an effective factorized audio representation.
+- A2SB is the first long audio restoration model that could restore hour-long audio without
+boundary artifacts
+## License
+The model is provided under the NVIDIA OneWay NonCommercial License.
+## Citation
+```
+@article{kong2025a2sb,
+  title={A2SB: Audio-to-Audio Schrodinger Bridges},
+  author={Kong, Zhifeng and Shih, Kevin J and Nie, Weili and Vahdat, Arash and Lee, Sang-gil and Santos, Joao Felipe and Jukic, Ante and Valle, Rafael and Catanzaro, Bryan},
+  journal={arXiv preprint arXiv:2501.11311},
+  year={2025}
+}
 ```
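
The only substantive change in this diff is the new `pipeline_tag: audio-to-audio` key in the README's YAML front matter; the remaining lines are re-added unchanged. As a rough illustration of that edit, the sketch below inserts the key into a README string using only the Python standard library. The helper name `add_pipeline_tag` is hypothetical and not part of this repository or of any Hugging Face tooling, and the sketch assumes the front matter is delimited by plain `---` lines with one `key: value` pair per line.

```python
def add_pipeline_tag(readme: str, tag: str) -> str:
    """Return the README text with `pipeline_tag: <tag>` in its YAML front matter.

    Hypothetical helper for illustration only. It assumes front matter is
    delimited by lines containing exactly `---` and does no real YAML parsing.
    """
    lines = readme.splitlines()
    new_entry = f"pipeline_tag: {tag}"

    # No front matter yet: create a minimal one above the existing content.
    if not lines or lines[0].strip() != "---":
        return "\n".join(["---", new_entry, "---", *lines]) + "\n"

    # Locate the closing delimiter of the existing front matter.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("front matter is not closed by a '---' line")

    # Drop any previous pipeline_tag so the operation is idempotent,
    # then place the new entry first, as in the diff above.
    block = [line for line in lines[1:end] if not line.startswith("pipeline_tag:")]
    return "\n".join(["---", new_entry, *block, "---", *lines[end + 1:]]) + "\n"


# Usage with a shortened stand-in for the README in this commit.
before = "---\nlicense: other\n---\n# Title\n"
after = add_pipeline_tag(before, "audio-to-audio")
print(after)
```

Applying the helper a second time leaves the text unchanged, because an existing `pipeline_tag` line is replaced instead of duplicated.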