kalashshah19 committed on
Commit fc21553 · verified · 1 Parent(s): c3d0f52

Adding Pipeline Tag

Adding "Audio-to-Audio" pipeline tag

Files changed (1)
  1. README.md +33 -32
README.md CHANGED
@@ -1,33 +1,34 @@
- ---
- license: other
- license_name: nvidia-oneway-noncommercial-license
- ---
-
- # PyTorch Implementation of Audio-to-Audio Schrodinger Bridges
-
- **Zhifeng Kong, Kevin J Shih, Weili Nie, Arash Vahdat, Sang-gil Lee, Joao Felipe Santos, Ante Jukic, Rafael Valle, Bryan Catanzaro**
-
- [[paper]](https://arxiv.org/abs/2501.11311) [[GitHub]](https://github.com/NVIDIA/diffusion-audio-restoration) [[Demo]](https://research.nvidia.com/labs/adlr/A2SB/)
-
- This repo contains the PyTorch implementation of [A2SB: Audio-to-Audio Schrodinger Bridges](https://arxiv.org/abs/2501.11311). A2SB is an audio restoration model tailored for high-res music at 44.1kHz. It is capable of both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, A2SB is end-to-end without need of a vocoder to predict waveform outputs, and able to restore hour-long audio inputs. A2SB is capable of achieving state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets.
-
- - We propose A2SB, a state-of-the-art, end-to-end, vocoder-free, and multi-task diffusion Schrodinger Bridge model for 44.1kHz high-res music restoration, using an effective factorized audio representation.
-
- - A2SB is the first long audio restoration model that could restore hour-long audio without
- boundary artifacts
-
- ## License
-
- The model is provided under the NVIDIA OneWay NonCommercial License.
-
-
- ## Citation
-
- ```
- @article{kong2025a2sb,
- title={A2SB: Audio-to-Audio Schrodinger Bridges},
- author={Kong, Zhifeng and Shih, Kevin J and Nie, Weili and Vahdat, Arash and Lee, Sang-gil and Santos, Joao Felipe and Jukic, Ante and Valle, Rafael and Catanzaro, Bryan},
- journal={arXiv preprint arXiv:2501.11311},
- year={2025}
- }
+ ---
+ pipeline_tag: audio-to-audio
+ license: other
+ license_name: nvidia-oneway-noncommercial-license
+ ---
+
+ # PyTorch Implementation of Audio-to-Audio Schrodinger Bridges
+
+ **Zhifeng Kong, Kevin J Shih, Weili Nie, Arash Vahdat, Sang-gil Lee, Joao Felipe Santos, Ante Jukic, Rafael Valle, Bryan Catanzaro**
+
+ [[paper]](https://arxiv.org/abs/2501.11311) [[GitHub]](https://github.com/NVIDIA/diffusion-audio-restoration) [[Demo]](https://research.nvidia.com/labs/adlr/A2SB/)
+
+ This repo contains the PyTorch implementation of [A2SB: Audio-to-Audio Schrodinger Bridges](https://arxiv.org/abs/2501.11311). A2SB is an audio restoration model tailored for high-res music at 44.1kHz. It is capable of both bandwidth extension (predicting high-frequency components) and inpainting (re-generating missing segments). Critically, A2SB is end-to-end without need of a vocoder to predict waveform outputs, and able to restore hour-long audio inputs. A2SB is capable of achieving state-of-the-art bandwidth extension and inpainting quality on several out-of-distribution music test sets.
+
+ - We propose A2SB, a state-of-the-art, end-to-end, vocoder-free, and multi-task diffusion Schrodinger Bridge model for 44.1kHz high-res music restoration, using an effective factorized audio representation.
+
+ - A2SB is the first long audio restoration model that could restore hour-long audio without
+ boundary artifacts
+
+ ## License
+
+ The model is provided under the NVIDIA OneWay NonCommercial License.
+
+
+ ## Citation
+
+ ```
+ @article{kong2025a2sb,
+ title={A2SB: Audio-to-Audio Schrodinger Bridges},
+ author={Kong, Zhifeng and Shih, Kevin J and Nie, Weili and Vahdat, Arash and Lee, Sang-gil and Santos, Joao Felipe and Jukic, Ante and Valle, Rafael and Catanzaro, Bryan},
+ journal={arXiv preprint arXiv:2501.11311},
+ year={2025}
+ }
  ```
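
The only substantive change in this diff is one new key, `pipeline_tag: audio-to-audio`, at the top of the README's YAML front matter. As a rough illustration of that edit, here is a minimal sketch in plain Python. The helper name `add_front_matter_key` is hypothetical, it does not use any Hugging Face API, and it assumes a simple front matter of flat `key: value` lines delimited by `---`.

```python
def add_front_matter_key(readme_text: str, key: str, value: str) -> str:
    """Insert `key: value` as the first entry of a README's YAML front matter.

    Hypothetical helper for illustration only. Assumes flat `key: value`
    lines between two `---` delimiters; nested YAML is not handled.
    """
    lines = readme_text.splitlines()
    new_entry = f"{key}: {value}"

    # No front matter at all: prepend a fresh block.
    if not lines or lines[0].strip() != "---":
        return "\n".join(["---", new_entry, "---", ""] + lines) + "\n"

    # Locate the closing `---` delimiter of the existing block.
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("front matter is opened but never closed")

    # Replace the key if it already exists, otherwise insert it first.
    for i in range(1, end):
        if lines[i].startswith(f"{key}:"):
            lines[i] = new_entry
            break
    else:
        lines.insert(1, new_entry)

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    before = (
        "---\n"
        "license: other\n"
        "license_name: nvidia-oneway-noncommercial-license\n"
        "---\n"
        "\n"
        "# PyTorch Implementation of Audio-to-Audio Schrodinger Bridges\n"
    )
    print(add_front_matter_key(before, "pipeline_tag", "audio-to-audio"))
```

Running the helper twice with the same key leaves the text unchanged, so it is safe to apply to a README that already carries the tag.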