Overview
Krisp’s VIVA SDK provides three capabilities for Pipecat applications:
- Voice Isolation: Filter out background noise and voices from the user’s audio input stream, yielding clearer audio for fewer false interruptions and better transcription.
- Turn Detection: Determine when a user has finished speaking using Krisp’s streaming turn detection model, as an alternative to the Smart Turn model.
- Voice Activity Detection: Detect speech in audio streams using Krisp’s VAD model, supporting sample rates from 8kHz to 48kHz.
- KrispVivaFilter Reference: API reference for voice isolation
- KrispVivaTurn Reference: API reference for turn detection
- KrispVivaVadAnalyzer Reference: API reference for voice activity detection
- Krisp VIVA Example: Complete example with Krisp features
- Krisp Developers: Get the Krisp SDK and API key
Prerequisites
To complete this setup, you will need access to a Krisp developers account, where you can download the Python SDK and models and generate an API key.
Setup
Download the Python SDK and Models
- Log in to the Krisp developer portal
- Navigate to the Server SDK Version tab
- Find the latest version of the Python SDK:
  - Download the SDK
  - Download the Voice Isolation models (for voice isolation)
  - Download the Turn Detection models (for turn detection)
Install the Python wheel file
- First, unzip the SDK files you downloaded in the previous step. In the unzipped folder, you will find a dist folder containing the Python wheel files.
- Install the Python wheel file that corresponds to your platform. For example, on a macOS ARM64 machine running Python 3.12, install the wheel built for that platform and Python version.
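To work out which wheel in dist matches your machine, it can help to print the details that wheel filenames usually encode. The sketch below uses only the standard library; the helper name `wheel_hints` is hypothetical, and the `cp` prefix assumes you are running CPython.

```python
import platform
import sys


def wheel_hints() -> dict:
    """Collect the interpreter details that wheel filenames usually encode."""
    major, minor = sys.version_info[:2]
    return {
        # Assumes CPython; for example cp312 for Python 3.12.
        "python_tag": f"cp{major}{minor}",
        # Operating system name, for example Darwin, Linux, or Windows.
        "system": platform.system(),
        # CPU architecture, for example arm64, x86_64, or AMD64.
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    for key, value in wheel_hints().items():
        print(f"{key}: {value}")
```

Pick the file in dist whose name matches these values and pass its path to `pip install`.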
Generate an API key
- In the Krisp developer portal, generate an API key for your application.
KRISP_VIVA_API_KEY is required for Krisp SDK v1.6.1 and later. Older SDK versions do not require it.
Set up environment variables
- Unzip the models you downloaded in the first step.
- For voice isolation, choose a model:
  - krisp-viva-pro: Mobile, Desktop, Browser (WebRTC, up to 32kHz)
  - krisp-viva-tel: Telephony, Cellular, Landline, Mobile, Desktop, Browser (up to 16kHz)
  A model file is named like krisp-viva-tel-v2.kef.
- In your .env file, add the environment variables for the features you’re using:
Each feature uses a different model. Set KRISP_VIVA_FILTER_MODEL_PATH for voice isolation, KRISP_VIVA_TURN_MODEL_PATH for turn detection, and KRISP_VIVA_VAD_MODEL_PATH for voice activity detection.
Test the integration
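Before running anything, a quick preflight check can confirm that the variables are set and that each model path points to a real file. This is a minimal sketch, not part of the Krisp SDK or Pipecat: `check_krisp_env` and `MODEL_VARS` are hypothetical names, the variable names are the ones listed in this guide, and the sketch assumes the API key is also supplied as an environment variable. It does not read the .env file itself, so load that file first with whatever tool your project already uses.

```python
import os
from pathlib import Path

# Variable names are taken from this guide; which ones you need depends
# on the features you enable.
MODEL_VARS = {
    "voice isolation": "KRISP_VIVA_FILTER_MODEL_PATH",
    "turn detection": "KRISP_VIVA_TURN_MODEL_PATH",
    "voice activity detection": "KRISP_VIVA_VAD_MODEL_PATH",
}


def check_krisp_env(features, env=None):
    """Return a list of problems found; an empty list means the check passed."""
    env = os.environ if env is None else env
    problems = []
    for feature in features:
        var = MODEL_VARS[feature]
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set (needed for {feature})")
        elif not Path(value).is_file():
            problems.append(f"{var} points to a missing file: {value}")
    # Required for Krisp SDK v1.6.1 and later (assumed to be an env variable).
    if not env.get("KRISP_VIVA_API_KEY"):
        problems.append("KRISP_VIVA_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_krisp_env(["voice isolation", "turn detection"]):
        print("WARNING:", problem)
```

Passing only the features you actually use keeps the check from flagging models you never downloaded.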
You’re ready to test the integration! Try running the Krisp VIVA foundation example, which demonstrates voice isolation and turn detection working together.
Voice Isolation
KrispVivaFilter isolates the user’s voice by filtering out background noise and other voices in real-time audio streams. Add it to any transport via the audio_in_filter parameter.
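As a rough illustration of passing the filter through audio_in_filter, the sketch below builds generic transport parameters. The import paths, the `TransportParams` class, and the no-argument constructor (relying on KRISP_VIVA_FILTER_MODEL_PATH being set) are assumptions here, and `build_transport_params` is a hypothetical helper; check the KrispVivaFilter reference for the exact API.

```python
def build_transport_params():
    """Build transport params with Krisp voice isolation on the input audio."""
    # Imported inside the function so this module still loads on machines
    # where Pipecat or the Krisp SDK is not installed.
    # Both import paths are assumptions; verify them against the reference.
    from pipecat.audio.filters.krisp_viva_filter import KrispVivaFilter
    from pipecat.transports.base_transport import TransportParams

    return TransportParams(
        audio_in_enabled=True,
        # Assumes the model path is read from KRISP_VIVA_FILTER_MODEL_PATH.
        audio_in_filter=KrispVivaFilter(),
    )
```

The returned params object would then be handed to whichever transport your application uses.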
Turn Detection
KrispVivaTurn uses Krisp’s streaming turn detection model to determine when a user has finished speaking. Unlike the Smart Turn model, which analyzes audio in batches, KrispVivaTurn processes each audio frame in real time.
Configure it as a user turn stop strategy; the KrispVivaTurn reference covers the exact setup.
Voice Activity Detection
KrispVivaVadAnalyzer detects speech in audio streams using Krisp’s VAD model. It supports sample rates from 8kHz to 48kHz, which covers applications ranging from telephony to high-quality audio.
Configure it as a VAD analyzer; the KrispVivaVadAnalyzer reference covers the exact setup.