Assistant, Voice, and TTS

The webserver includes an optional conversational assistant with text and voice interaction, plus a managed text-to-speech (TTS) voice library. These features are disabled by default and enabled through configuration.

Conversational Assistant

When enabled, the assistant supports text chat, spoken input, and streamed responses. It can optionally execute a curated set of robot tools so that spoken or typed requests translate into robot actions.

Supported backends:

  • Cloud assistants (selectable via llm_backend)

  • A local model served through Ollama (configured via llm_ollama_host and llm_ollama_model)

Tool execution is gated by llm_tools_enabled and a tool configuration file, so the set of actions the assistant may perform is explicit and controlled.
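The gating described above can be pictured as a two-part check: the global switch must be on, and the requested tool must be listed. The sketch below is only an illustration. The llm_tools_enabled name comes from the text, while AssistantConfig, allowed_tools, and may_execute are hypothetical names, and the actual format of the tool configuration file is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssistantConfig:
    # llm_tools_enabled is named in the documentation; the default of
    # False mirrors the "disabled by default" statement.
    llm_tools_enabled: bool = False
    # Hypothetical: tool names loaded from the tool configuration file.
    allowed_tools: frozenset = frozenset()


def may_execute(config: AssistantConfig, tool_name: str) -> bool:
    """Allow a tool only if tools are enabled AND the tool is listed."""
    return config.llm_tools_enabled and tool_name in config.allowed_tools
```

With this shape, a tool that is listed but requested while llm_tools_enabled is off is still refused, which keeps the switch authoritative.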

Voice Workflow

The voice pipeline combines speech-to-text (STT), the assistant, and text-to-speech (TTS) into a spoken interaction loop:

  • An activation word starts listening.

  • Captured speech is transcribed and passed to the assistant.

  • The assistant response is spoken back through the configured TTS voice.
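The three steps above can be sketched as a small loop. This is a simplified illustration rather than the package's implementation: run_voice_turns and its callback parameters (transcribe, ask, speak) are hypothetical stand-ins for the STT engine, the assistant, and the TTS voice, and the activation word is detected here on transcribed text purely for simplicity.

```python
from typing import Callable, Iterable, List


def run_voice_turns(
    audio_chunks: Iterable,
    activation_word: str,
    transcribe: Callable[[object], str],  # stand-in for the STT engine
    ask: Callable[[str], str],            # stand-in for the assistant
    speak: Callable[[str], None],         # stand-in for the TTS voice
) -> List[str]:
    """Wait for the activation word, then answer the next utterance."""
    listening = False
    replies = []
    for chunk in audio_chunks:
        text = transcribe(chunk).strip()
        if not listening:
            # Step 1: the activation word starts listening.
            if activation_word.lower() in text.lower():
                listening = True
            continue
        # Step 2: captured speech is transcribed and passed to the assistant.
        reply = ask(text)
        # Step 3: the response is spoken back.
        speak(reply)
        replies.append(reply)
        listening = False  # require the activation word again
    return replies
```

Passing the three stages as callables keeps the loop testable without audio hardware or a running model.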

Offline STT assets are vendored with the package so basic voice features work without internet access.

Parameter               Example
llm_tts_voice           en-GB-RyanNeural (online voice)
llm_tts_offline_voice   en-gb (offline fallback)
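One way to read the two voice parameters is as a try-then-fall-back pair: use the online voice when it is reachable, otherwise the offline one. The sketch below assumes that behaviour. speak_with_fallback, synth_online, and synth_offline are hypothetical names, and treating a network failure as an OSError is an assumption about how the online synthesizer reports errors.

```python
from typing import Callable, Mapping


def speak_with_fallback(
    text: str,
    config: Mapping[str, str],
    synth_online: Callable[[str, str], object],   # hypothetical online TTS call
    synth_offline: Callable[[str, str], object],  # hypothetical offline TTS call
):
    """Try the online voice first; fall back to the offline voice on error."""
    try:
        return synth_online(text, config["llm_tts_voice"])
    except OSError:
        # ConnectionError and TimeoutError are subclasses of OSError.
        return synth_offline(text, config["llm_tts_offline_voice"])
```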

TTS Library

The TTS library panel manages the set of saved voice clips used by the assistant and the play_audio skill. From the panel you can:

  • List available voices and clips

  • Preview a voice

  • Save, delete, and play back clips

Clips are stored under the package audio library and are shared with the play_audio skill described in Skills and Missions.

Integration Endpoints

Endpoint                    Purpose
/api/llm/config             Assistant configuration.
/chat (text/audio/stream)   Chat interaction endpoints.
/api/tts_library            Voice list, preview, save, delete, and playback.
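As a rough illustration of calling these endpoints from a script, the sketch below builds full URLs from a base address and reads the assistant configuration. The base URL, the use of GET, and the JSON response are all assumptions, not documented behaviour. The /chat sub-paths are left out because their exact form is not spelled out here.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Hypothetical address; substitute the host and port of your webserver.
BASE_URL = "http://localhost:8080/"


def endpoint_url(path: str, base: str = BASE_URL) -> str:
    """Join a documented endpoint path onto the server base URL."""
    return urljoin(base, path)


def fetch_assistant_config(base: str = BASE_URL, timeout: float = 5.0) -> dict:
    """Read /api/llm/config, assuming a plain GET that returns JSON."""
    with urlopen(endpoint_url("/api/llm/config", base), timeout=timeout) as resp:
        return json.load(resp)
```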

Note: The assistant, voice, and tool execution features are optional and disabled by default. Enable them only after configuring the relevant backend and reviewing the tool surface.