Current Model Catalog

Use izwi list (or GET /v1/models) to see the live, currently enabled catalog. Both show only variants that are enabled for download and use.
Izwi accepts many legacy aliases (for example lowercase IDs), but the canonical IDs below match izwi list output.
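As a minimal sketch of the API route, the catalog can be fetched with curl. This assumes the API is served at the same address as the Web UI (http://localhost:8080); IZWI_BASE_URL and both function names are hypothetical helpers for this example, not part of Izwi. The response format is not described on this page, so the body is printed unchanged.

```shell
# Build the catalog URL. IZWI_BASE_URL is an illustrative override, not an Izwi setting.
izwi_models_url() {
  base="${IZWI_BASE_URL:-http://localhost:8080}"
  # Strip one trailing slash so the path joins cleanly.
  printf '%s/v1/models\n' "${base%/}"
}

# Fetch the enabled catalog and print the raw response body.
# --fail makes curl exit non-zero on HTTP errors instead of printing an error page.
fetch_enabled_models() {
  curl --fail --silent --show-error "$(izwi_models_url)"
}
```

With a server running, calling fetch_enabled_models prints whatever GET /v1/models returns.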

Text-to-Speech (TTS)

Family | Canonical IDs
Qwen3 Base (reference-voice cloning) | Qwen3-TTS-12Hz-0.6B-Base, Qwen3-TTS-12Hz-0.6B-Base-4bit, Qwen3-TTS-12Hz-1.7B-Base, Qwen3-TTS-12Hz-1.7B-Base-4bit
Qwen3 CustomVoice (built-in speakers) | Qwen3-TTS-12Hz-0.6B-CustomVoice, Qwen3-TTS-12Hz-0.6B-CustomVoice-4bit, Qwen3-TTS-12Hz-1.7B-CustomVoice, Qwen3-TTS-12Hz-1.7B-CustomVoice-4bit
Qwen3 VoiceDesign | Qwen3-TTS-12Hz-1.7B-VoiceDesign, Qwen3-TTS-12Hz-1.7B-VoiceDesign-4bit
Voxtral TTS | Voxtral-4B-TTS-2603
VibeVoice TTS | VibeVoice-1.5B
Kokoro | Kokoro-82M

  • Kokoro-82M requires espeak-ng on macOS, Linux, and Windows.
  • Voxtral-4B-TTS-2603 includes bundled voice assets licensed under CC BY-NC 4.0 and supports 20 preset voices with 24 kHz output.
  • VibeVoice-1.5B is a Microsoft long-form TTS model with reference-voice cloning. It uses saved or direct reference voices rather than built-in speaker presets.
  • For built-in speaker IDs, see Voice Presets.

Speech Recognition (ASR)

Model | Notes
Parakeet-TDT-0.6B-v3 | CLI default for transcription/diarization ASR
Whisper-Large-v3-Turbo | Whisper ASR option
Qwen3-ASR-0.6B-GGUF | Smaller Qwen3 ASR
Qwen3-ASR-1.7B-GGUF | Higher-accuracy Qwen3 ASR
VibeVoice-ASR | Microsoft long-form ASR checkpoint
Nemotron-3.5-ASR-Streaming-0.6B | NVIDIA multilingual FastConformer-RNNT .nemo; native artifact/config/tokenizer and streaming-state support
Granite-Speech-4.1-2B-Plus | IBM Granite Speech rich transcription model with prompt guidance, speaker-attributed output, and word timestamp support
LFM2.5-Audio-1.5B-GGUF | Unified audio model (ASR + speech generation)
Voxtral-Mini-4B-Realtime-2602 | Mistral Voxtral offline transcription; realtime support planned

Diarization and Alignment

Task | Model
Speaker diarization | diar_streaming_sortformer_4spk-v2.1
Forced alignment | Qwen3-ForcedAligner-0.6B, Qwen3-ForcedAligner-0.6B-4bit

Chat

Family | Canonical IDs
Qwen3 GGUF | Qwen3-0.6B-GGUF, Qwen3-1.7B-GGUF, Qwen3-4B-GGUF, Qwen3-8B-GGUF
Qwen3.5 GGUF | Qwen3.5-0.8B, Qwen3.5-2B, Qwen3.5-4B, Qwen3.5-9B
LFM2.5 text | LFM2.5-1.2B-Instruct-GGUF, LFM2.5-1.2B-Thinking-GGUF
Gemma | Gemma-3-1b-it

Currently Disabled (Not Listed by izwi list)

These variants exist in the catalog but are not currently enabled for standard listing/download:
  • Legacy Qwen3 chat IDs: Qwen3-0.6B, Qwen3-0.6B-4bit, Qwen3-1.7B, Qwen3-1.7B-4bit
  • Qwen3-14B-GGUF
  • Gemma-3-4b-it
  • TTS 8-bit and BF16 metadata variants such as Qwen3-TTS-12Hz-0.6B-Base-8bit and Qwen3-TTS-12Hz-1.7B-VoiceDesign-bf16; selected 4-bit variants are the standard low-memory downloads exposed by izwi list.

Downloading Models

Via CLI

# List enabled catalog models
izwi list

# Download a model
izwi pull Qwen3-TTS-12Hz-0.6B-Base

# Download an ASR model
izwi pull Qwen3-ASR-0.6B-GGUF

# Download NVIDIA Nemotron 3.5 ASR
izwi pull Nemotron-3.5-ASR-Streaming-0.6B

# Download IBM Granite Speech rich ASR
izwi pull Granite-Speech-4.1-2B-Plus

# Download Microsoft VibeVoice models
izwi pull VibeVoice-1.5B
izwi pull VibeVoice-ASR
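When several models are needed, the individual izwi pull commands above can be wrapped in a small loop. The sketch below is only an illustration: pull_models and the IZWI_CMD override are hypothetical names introduced here (IZWI_CMD exists so the loop can be dry-run with echo), and stopping at the first failure is a design choice for this example, not documented izwi behavior.

```shell
# Hypothetical helper: pull each model ID in turn, stopping at the first failure.
# IZWI_CMD is an illustrative override for dry runs, not an izwi setting.
pull_models() {
  for model in "$@"; do
    "${IZWI_CMD:-izwi}" pull "$model" || return 1
  done
}
```

For example, pull_models VibeVoice-1.5B VibeVoice-ASR runs the two VibeVoice downloads in sequence, and prefixing the call with IZWI_CMD=echo prints the arguments that would be passed instead of downloading anything.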

Via Web UI

  1. Open http://localhost:8080
  2. Go to Models in the sidebar
  3. Click Download on a model

Managing Models

For the complete UI, CLI, and API workflow, see Model Management.

View Downloaded Models

izwi list --local

Get Model Information

izwi models info Qwen3-TTS-12Hz-0.6B-Base

Load a Model into Memory

izwi models load Qwen3-TTS-12Hz-0.6B-Base

Unload a Model

izwi models unload Qwen3-TTS-12Hz-0.6B-Base

Delete a Model

izwi rm Qwen3-TTS-12Hz-0.6B-Base
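The load and unload commands above pair naturally around a unit of work. As a rough sketch, a wrapper can load a model, run an arbitrary command, and unload the model afterwards even if that command fails. The names with_model and IZWI_CMD are hypothetical and exist only for this example.

```shell
# Hypothetical wrapper: load a model, run a command, then unload the model
# whether or not the command succeeded. IZWI_CMD is an illustrative dry-run override.
with_model() {
  model="$1"
  shift
  "${IZWI_CMD:-izwi}" models load "$model" || return 1
  rc=0
  "$@" || rc=$?          # remember the command's exit code without aborting
  "${IZWI_CMD:-izwi}" models unload "$model"
  return "$rc"           # report the wrapped command's result to the caller
}
```

The first argument is the model ID and the remaining arguments are the command to run while the model is loaded.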

Model Storage

Platform | Location
macOS | ~/Library/Application Support/izwi/models/
Linux | ~/.local/share/izwi/models/
Windows | %APPDATA%\izwi\models\

Custom Model Directory

# CLI flag
izwi serve --models-dir /path/to/models

# Environment variable
export IZWI_MODELS_DIR=/path/to/models
izwi serve
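For scripts that need to locate model files, the storage rules above can be approximated as follows. This is a sketch under stated assumptions: both function names are hypothetical, only the macOS and Linux defaults from the table are handled, and the --models-dir flag is not considered because a script cannot see how the server was started.

```shell
# Print the default model directory for an OS name as reported by `uname -s`.
# Defaults are taken from the Model Storage table; Windows is not handled here.
default_models_dir() {
  case "${1:-$(uname -s)}" in
    Darwin) printf '%s\n' "$HOME/Library/Application Support/izwi/models" ;;
    Linux)  printf '%s\n' "$HOME/.local/share/izwi/models" ;;
    *)      return 1 ;;
  esac
}

# Prefer IZWI_MODELS_DIR when it is set, otherwise use the platform default.
models_dir() {
  if [ -n "${IZWI_MODELS_DIR:-}" ]; then
    printf '%s\n' "$IZWI_MODELS_DIR"
  else
    default_models_dir
  fi
}
```

A call such as ls "$(models_dir)" would then list whatever is stored in the resolved directory.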

Manual Downloads

Some models (for example Gemma) may require manual Hugging Face access setup; see Manual Model Downloads.

Model Status

Status | Description
not_downloaded | Available but not on disk
downloading | Currently downloading
downloaded | On disk but not loaded
loading | Being loaded into memory
ready | Loaded and ready for inference

Check status:
izwi status --detailed
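The statuses form a simple progression, and each one suggests a next command from this page. The sketch below maps a status to that command. The function name next_step is hypothetical, and the status is passed in by the caller because the output format of izwi status --detailed is not described here.

```shell
# Hypothetical helper: print the command suggested by a model's status.
# Usage: next_step <status> <model-id>
next_step() {
  state="$1"
  model="$2"
  case "$state" in
    not_downloaded)      printf 'izwi pull %s\n' "$model" ;;
    downloaded)          printf 'izwi models load %s\n' "$model" ;;
    downloading|loading) printf 'izwi status --detailed\n' ;;  # in progress: check again
    ready)               printf '# %s is ready for inference\n' "$model" ;;
    *)                   return 1 ;;  # unknown status
  esac
}
```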

Quantization Notes

  • -4bit / -8bit / -bf16 are reduced-precision variants.
  • -GGUF variants are quantized GGUF artifacts.
  • Smaller/quantized variants reduce memory and disk use at some quality/accuracy tradeoff.
  • izwi list shows enabled variants only. Some catalog metadata exists for experimental 8-bit/BF16 TTS artifacts, but the standard downloadable low-memory TTS variants are the explicit -4bit entries shown above.
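The suffix conventions can be checked mechanically. The following sketch classifies a model ID by its suffix; variant_kind is a hypothetical name. An ID without one of these suffixes is reported as unsuffixed, which says nothing about its precision: the Qwen3.5 GGUF family IDs in the Chat table, for instance, carry no suffix.

```shell
# Hypothetical helper: classify a model ID by the suffixes described above.
variant_kind() {
  case "$1" in
    *-4bit) echo 4bit ;;
    *-8bit) echo 8bit ;;
    *-bf16) echo bf16 ;;
    *-GGUF) echo gguf ;;
    *)      echo unsuffixed ;;
  esac
}
```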

Next Steps

Model Management

Download, load, unload, delete, filter, and inspect Izwi models from the UI, CLI, and local API.

Voice Presets

Reference built-in voice and speaker IDs for Qwen3 CustomVoice, Kokoro, Voxtral TTS, and LFM2.5 Audio.

Manual Model Downloads

Manually download gated or externally hosted model files and place them in the Izwi model cache.

Manual Download: Gemma 3 1B Instruct

Download and install the gated Gemma 3 1B Instruct model manually for Izwi.