FireRedTeam

metadata

license: apache-2.0
language:
  - zh
  - en
base_model:
  - speechbrain/spkrec-ecapa-voxceleb
tags:
  - agent
  - voice-activity-detection

FireRedChat-pVAD

Demo • Paper • Huggingface

Description

FireRedChat's personalized Voice Activity Detection (pVAD) model is an open-weight model that detects voice activity and supports speaker embedding updates. LiveKit plugin available here.

- Supports speaker embedding updates for improved voice activity detection.
- The plugin requires a compatible LiveKit Agents fork, or a modification to LiveKit Agents, that calls update_speaker() after the first user utterance.
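The "call once after the first user utterance" requirement can be pictured as a small one-shot trigger. The sketch below is illustrative only: `FirstUtteranceEnroller`, `on_utterance_end`, and the `enroll` callback are hypothetical names, and the callback merely stands in for the plugin's update_speaker call, whose real signature is not documented here.

```python
from typing import Callable, List


class FirstUtteranceEnroller:
    """Calls `enroll` exactly once, on the first completed utterance."""

    def __init__(self, enroll: Callable[[List[float]], None]) -> None:
        # `enroll` is a stand-in for the plugin's update_speaker call (assumption).
        self._enroll = enroll
        self.enrolled = False

    def on_utterance_end(self, samples: List[float]) -> None:
        if self.enrolled or not samples:
            return  # already enrolled, or nothing to enroll from
        self._enroll(samples)
        self.enrolled = True


calls: List[int] = []
enroller = FirstUtteranceEnroller(lambda samples: calls.append(len(samples)))
enroller.on_utterance_end([0.1, 0.2, 0.3])  # first utterance: triggers enrollment
enroller.on_utterance_end([0.4, 0.5])       # later utterances are ignored
print(calls)  # [3]
```

In a real agent, the hook would be wherever the LiveKit Agents fork reports the end of the user's first turn.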

Roadmap

2025/09
- Release the pVAD model weights and LiveKit plugin.

Usage

For inference, please use the LiveKit plugin. Once it is installed, load the VAD in your agent's prewarm function as follows:

from livekit.agents import JobProcess
from livekit.plugins import fireredchat_pvad as pvad


def prewarm(proc: JobProcess):
    # Load the pVAD model once per worker process and cache it for later sessions.
    proc.userdata["vad"] = pvad.VAD.load(activation_threshold=0.5)

# After the first utterance (or when the primary speaker switches, based on RMS),
# call update_speaker() on the VADStream to update the speaker embedding.
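The RMS-based speaker switch mentioned above is not specified further in this README. As a rough sketch of one way such a check could work, the code below computes the RMS level of two audio frames and applies a simple loudness-ratio rule. The helper names `rms` and `should_switch_speaker`, the float sample format, and the ratio of 2.0 are all assumptions made for illustration, not part of the plugin.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of one audio frame (samples in [-1.0, 1.0])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_switch_speaker(current_rms: float, candidate_rms: float,
                          ratio: float = 2.0) -> bool:
    """Hypothetical rule: switch when the candidate is `ratio` times louder."""
    return candidate_rms > ratio * current_rms


def tone(amplitude: float, n_samples: int = 1600, sample_rate: int = 16000):
    """Synthetic 220 Hz sine frame, used here only as test audio."""
    return [amplitude * math.sin(2 * math.pi * 220 * n / sample_rate)
            for n in range(n_samples)]


quiet_frame = tone(0.01)  # stands in for the current primary speaker
loud_frame = tone(0.5)    # stands in for a louder, closer speaker

if should_switch_speaker(rms(quiet_frame), rms(loud_frame)):
    # This is the point where an application would call update_speaker()
    # on its VADStream to refresh the speaker embedding.
    print("switch primary speaker")
```

A ratio test is only one option. Comparing smoothed RMS over several frames would be less sensitive to short bursts of noise.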

License

The model weights and plugin code are licensed under the Apache-2.0 license.

Acknowledgment

Speaker embedding model: speechbrain/spkrec-ecapa-voxceleb