Voice Plugin

The Voice plugin gives you spoken audio feedback when Claude Code completes a task. The agent automatically speaks a 1-2 sentence summary before stopping, so you can multitask while Claude works and hear when it needs your attention.

Prerequisites

FFmpeg (recommended) — Enables streaming audio for lower latency. Install via brew install ffmpeg (macOS) or sudo apt install ffmpeg (Linux).

How It Works

The plugin uses pocket-tts, a lightweight text-to-speech library. On first use, it automatically:

Starts a pocket-tts server (via uvx pocket-tts serve)
Downloads the voice model (~100MB, one-time)

The server persists in the background (via nohup) so subsequent requests are instant. Server logs are written to /tmp/pocket-tts-server.log.

To stop the server manually:

pkill -f "pocket-tts serve"
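The start-once behavior described above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the host and port are assumptions (check where your pocket-tts server really listens), the function names are hypothetical, and start_new_session is used here as a stand-in for nohup.

```python
import socket
import subprocess

# Assumptions for illustration only: host and port are hypothetical defaults.
HOST, PORT = "127.0.0.1", 8000
LOG_PATH = "/tmp/pocket-tts-server.log"  # log path named in this README


def server_is_up(host: str = HOST, port: int = PORT, timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def start_server_detached() -> subprocess.Popen:
    """Launch `uvx pocket-tts serve` so that it outlives the calling process."""
    with open(LOG_PATH, "ab") as log:
        return subprocess.Popen(
            ["uvx", "pocket-tts", "serve"],
            stdout=log,
            stderr=subprocess.STDOUT,
            stdin=subprocess.DEVNULL,
            start_new_session=True,  # detach from the terminal, similar in effect to nohup
        )


def ensure_server() -> bool:
    """Start the server only if nothing is listening yet. Returns True if it launched one."""
    if server_is_up():
        return False
    start_server_detached()
    return True
```

Checking the port before launching is what keeps later requests fast: once the server is up, ensure_server() returns without spawning anything.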

Installation

Add the marketplace (if not already added):

claude plugin marketplace add pchalasani/claude-code-tools

Install the voice plugin:

claude plugin install voice@cctools-plugins

Usage

Once installed, the plugin works automatically: when the agent finishes a task, it speaks a 1-2 sentence summary before stopping. No action is required on your part.

Configuration

Use the /voice:speak command to control the plugin.

# Enable voice feedback with current voice
/voice:speak

# Disable voice feedback
/voice:speak stop

The default voice is azelma. To change it:

/voice:speak alba
/voice:speak cosette

Available voices: alba, marius, javert, jean, fantine, cosette, eponine, azelma. See the pocket-tts repo for details and custom voice cloning.

Custom prompts let you personalize how summaries are delivered:

# Set a custom instruction for summaries
/voice:speak prompt "be upbeat and encouraging"

# Another example
/voice:speak prompt "always end with 'back to you, boss'"

# Clear the custom prompt
/voice:speak prompt

Recommended: Speech-to-Text Companion

For a complete voice workflow, pair this TTS plugin with a speech-to-text app:

Hex with Parakeet V3 (macOS only, open-source): stunningly fast transcription with no stuttering. Highly recommended.
Handy with Parakeet V3 (cross-platform, open-source): very fast transcription, though it may occasionally stutter.

Architecture

The plugin uses a multi-hook strategy for fast, reliable voice summaries:

UserPromptSubmit Hook

Silently injects voice instructions each turn, telling Claude to end longer responses with a spoken summary marker.
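As a rough sketch of what such a hook might emit, the snippet below builds JSON output that carries the instruction in additionalContext. It assumes the hookSpecificOutput shape described in the Claude Code hooks documentation. The instruction wording and the function name are hypothetical, since the plugin's real text is not shown here.

```python
import json

# Hypothetical wording; the plugin's real instruction text is not shown in this README.
VOICE_INSTRUCTION = (
    "When your response is long, end it with a line starting with the 📢 marker "
    "followed by a 1-2 sentence spoken summary."
)


def build_hook_output(event_name: str, context: str) -> str:
    """Serialize hook output that adds context without showing anything to the user."""
    payload = {
        "hookSpecificOutput": {
            "hookEventName": event_name,
            "additionalContext": context,
        }
    }
    return json.dumps(payload, ensure_ascii=False)


if __name__ == "__main__":
    # A hook script writes the JSON to stdout and exits with status 0.
    print(build_hook_output("UserPromptSubmit", VOICE_INSTRUCTION))
```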

PostToolUse Hook

Injects a brief reminder after each tool call to keep the voice instructions fresh during long tool chains.

Stop Hook

Extracts the summary marker instantly (no API call), or falls back to headless Claude summarization if the agent forgot to include it.
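The extract-or-fall-back logic might look something like the sketch below. The 📢 marker is the one this README mentions, but the exact line format (marker followed by the summary on the same line) is an assumption. The function names are hypothetical, and the fallback is passed in as a plain callable rather than a real headless Claude call.

```python
import re
from typing import Callable, Optional

MARKER = "📢"  # marker emoji mentioned in this README; the line format is an assumption

# Match the marker, then capture the rest of that line.
_MARKER_RE = re.compile(re.escape(MARKER) + r"[ \t]*(.+)")


def extract_summary(message: str) -> Optional[str]:
    """Return the text after the last marker in the message, or None if absent."""
    matches = _MARKER_RE.findall(message)
    if not matches:
        return None
    return matches[-1].strip()


def summarize_with_fallback(message: str, fallback: Callable[[str], str]) -> str:
    """Use the marker when present; otherwise call the slower fallback summarizer."""
    summary = extract_summary(message)
    return summary if summary is not None else fallback(message)
```

Because the fast path is a single regular-expression scan, the fallback runs only when the marker is missing.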

This design ensures:

Fast feedback: Most summaries are instant (marker extraction, no API call needed).
Reliable: The headless Claude fallback catches cases where the agent forgets the marker.
Silent operation: Hooks use additionalContext for noise-free injection.
Tone matching: Summaries match the user’s conversational style (casual, formal, etc.).
Non-blocking: Audio plays in the background (the stop hook returns immediately via subprocess.Popen), so it never delays the next prompt.
Streaming: When ffplay (part of FFmpeg) is available, audio is piped directly from the TTS server to the player with no temp file, reducing latency. Without ffplay, the plugin falls back to a temp WAV file.
Playback locking: A mkdir-based lock with stale-process detection prevents overlapping audio when multiple sessions finish at the same time.
Infinite loop prevention: When you run /voice:speak stop, a just_disabled flag is set. The next prompt hook sees the flag, injects a “stop adding 📢 markers” message to override stale instructions still in context, and clears the flag. Without this, the agent would keep producing markers from old instructions and the stop hook would keep speaking them, creating an infinite loop.
Session state tracking: Per-session state files (/tmp/voice-{id}-running, -done, -failed) let the stop hook know whether audio playback is still in progress, completed, or errored.
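To make the playback-locking idea concrete, here is a minimal sketch of a mkdir-based lock with stale-process detection. It is not the plugin's implementation: the lock path, the pid file inside the lock directory, and the function names are all hypothetical, and the liveness check assumes POSIX semantics for os.kill.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Hypothetical lock location; the plugin's real path is not given in this README.
LOCK_DIR = Path(tempfile.gettempdir()) / "voice-playback.lock"


def _pid_alive(pid: int) -> bool:
    """Check whether a process with this PID exists (POSIX semantics)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def try_acquire(lock_dir: Path = LOCK_DIR) -> bool:
    """Try to take the lock; os.mkdir is atomic, so only one caller can create the directory."""
    for _ in range(2):  # the second pass runs only after clearing a stale lock
        try:
            os.mkdir(lock_dir)
        except FileExistsError:
            try:
                owner = int((lock_dir / "pid").read_text())
            except (FileNotFoundError, ValueError):
                return False  # holder is mid-setup or the pid file is unreadable
            if _pid_alive(owner):
                return False
            shutil.rmtree(lock_dir, ignore_errors=True)  # stale: owner has exited
            continue
        (lock_dir / "pid").write_text(str(os.getpid()))
        return True
    return False


def release(lock_dir: Path = LOCK_DIR) -> None:
    """Remove the lock directory so another session can play audio."""
    shutil.rmtree(lock_dir, ignore_errors=True)
```

A session would call try_acquire() before starting playback and release() when the audio finishes. The stale check lets a later session recover if the previous holder crashed without cleaning up. This sketch ignores the narrow race in which two sessions clear the same stale lock at once.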