Skip to main content

Overview

KrispVivaVadAnalyzer is a Voice Activity Detection (VAD) analyzer that uses the Krisp VIVA SDK to detect speech in audio streams. It provides high-accuracy speech detection with support for multiple sample rates.

Installation

pip install "pipecat-ai[krisp]"

Prerequisites

You need a Krisp VIVA VAD model file (.kef extension). Set the model path via:
  • The model_path constructor parameter, or
  • The KRISP_VIVA_VAD_MODEL_PATH environment variable

Constructor Parameters

model_path
str
default:"None"
Path to the Krisp model file (.kef extension). If not provided, uses the KRISP_VIVA_VAD_MODEL_PATH environment variable.
frame_duration
int
default:"10"
Frame duration in milliseconds. Must be 10, 15, 20, 30, or 32ms.
sample_rate
int
default:"None"
Audio sample rate in Hz. Must be 8000, 16000, 32000, 44100, or 48000.
params
VADParams
default:"VADParams()"
Voice Activity Detection parameters object

Usage Example

from pipecat.audio.vad.krisp_viva_vad import KrispVivaVadAnalyzer
from pipecat.audio.vad.vad_analyzer import VADParams

context = LLMContext(messages)
user_aggregator, assistant_aggregator = LLMContextAggregatorPair(
    context,
    user_params=LLMUserAggregatorParams(
        vad_analyzer=KrispVivaVadAnalyzer(
            model_path="/path/to/model.kef",
            params=VADParams(stop_secs=0.2)
        ),
    ),
)

Technical Details

Sample Rate Requirements

The analyzer supports five sample rates:
  • 8000 Hz
  • 16000 Hz
  • 32000 Hz
  • 44100 Hz
  • 48000 Hz

Model Requirements

  • Model files must have a .kef extension
  • Model path can be specified via constructor or environment variable
  • Model is loaded once during initialization

Notes

  • High-accuracy speech detection using Krisp VIVA SDK
  • Supports multiple sample rates (8kHz to 48kHz)
  • Requires external .kef model file
  • Thread-safe for pipeline processing
  • Automatic session management
  • Configurable frame duration