Dia 1.6B TTS
Ultra-Realistic AI Speech Dialogue Model
An open-source 1.6B parameter text-to-speech model by Nari Labs that generates human-like speech with natural intonation, rhythm, and emotion. Meet Dia 1.6B TTS.

What is Dia 1.6B TTS?
Dia 1.6B TTS is a cutting-edge AI text-to-speech model designed for ultra-realistic dialogue synthesis. Developed by Nari Labs and released under the Apache 2.0 license, Dia 1.6B TTS offers natural and expressive speech output that rivals commercial solutions.
- Speech synthesis with natural intonation, rhythm, and emotional expression using Dia 1.6B TTS
- Optimized multi-speaker dialogue generation with Dia 1.6B TTS
- 1.6B parameter model that runs on 10GB VRAM
- Voice cloning capabilities through audio prompting
Dia 1.6B TTS Core Features
Dia 1.6B TTS Exceptional Speech Quality
Dia 1.6B TTS produces natural-sounding voices with human-like intonation, rhythm, and emotion. The model aims for speech that is hard to tell apart from a human voice.
Dia 1.6B TTS: Multi-Speaker Support
Easily create multi-speaker conversations using simple tags like [S1] and [S2] to specify different voices in your text, maintaining consistent and natural dialogue with Dia 1.6B TTS.
Voice Cloning with Dia 1.6B TTS
Clone specific vocal characteristics using the audio prompting feature, enabling consistent voice identity across multiple generations for personalized speech output with Dia 1.6B TTS.
Dia 1.6B TTS: Open Source Model
Released under Apache 2.0 license, allowing free use for personal and commercial purposes. Complete model weights and code for Dia 1.6B TTS are available on GitHub.
Dia 1.6B TTS Audio Demos
Dia 1.6B TTS: Standard Usage (Sample 1)
Basic dialogue generation example from Dia 1.6B TTS.
Dia 1.6B TTS: Natural Conversation (Sample 2)
Demonstrates casual interactions using Dia 1.6B TTS.
Dia 1.6B TTS: Emotional Dialogue (Sample 3)
Expressive, high-emotion speech example using Dia 1.6B TTS.
Dia 1.6B TTS: Non-Verbal Sounds (Sample 4)
Includes coughs, sniffling, and laughing generated by Dia 1.6B TTS.
Dia 1.6B TTS: Rap Example (Sample 5)
Showcases rhythm and rhyme using Dia 1.6B TTS.
Dia 1.6B TTS: Audio Prompting Feature (Sample 6)
Example of voice cloning using Dia 1.6B TTS audio prompts.
Note: For high-quality output with audio prompts in Dia 1.6B TTS, prepend the transcript of the prompt audio to your input text. Automatic transcription is being considered for ease of use.
Dia 1.6B TTS Video Examples
Dia 1.6B TTS: Podcast Quality
Demonstrates the potential for podcast generation using Dia 1.6B TTS.
Dia 1.6B TTS: Model Introduction
Highlights the 1.6B parameter model of Dia 1.6B TTS.
Dia 1.6B TTS: Ultra-Realistic Dialogue
Showcases one-pass generation using Dia 1.6B TTS.
How Dia 1.6B TTS Works: From Text to Lifelike Dialogue
1. Prepare Your Script for Dia 1.6B TTS
Write or paste the text you want Dia 1.6B TTS to convert. Use simple tags like [S1] and [S2] before sentences to assign different speaker voices. You can also include non-verbal cues like (laughs) or (coughs) to add realism.
2. (Optional) Provide Audio Prompts to Dia 1.6B TTS
To clone a specific voice or guide emotional tone with Dia 1.6B TTS, upload a short audio sample (5-15 seconds) and prepend its accurate transcription (with speaker tags) to the main script in your input.
3. Generate Audio with Dia 1.6B TTS
Run the Dia 1.6B TTS model (locally via the app or using the online demo). The model processes the entire script in one pass, generating seamless dialogue.
4. Listen and Download Dia 1.6B TTS Output
Play the generated audio directly from Dia 1.6B TTS. The output captures natural intonation, rhythm, and even non-verbal cues, creating an ultra-realistic listening experience. Download the audio file for your projects.
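The script format described in the steps above can be sketched in a few lines of Python. This is only an illustration of how the input text might be assembled. The helper names `build_script` and `with_audio_prompt` are hypothetical and not part of the Dia codebase, the sample sentences are made up, and the code does not call the model itself.

```python
from typing import Iterable, Tuple

# Hypothetical helpers for illustration only: they assemble text and are
# not part of the Dia codebase.


def build_script(turns: Iterable[Tuple[int, str]]) -> str:
    """Join (speaker_number, text) turns into one tagged script string."""
    parts = []
    for speaker, text in turns:
        if speaker not in (1, 2):
            raise ValueError("this sketch only uses the [S1] and [S2] tags")
        parts.append(f"[S{speaker}] {text.strip()}")
    return " ".join(parts)


def with_audio_prompt(prompt_transcript: str, script: str) -> str:
    """Prepend the transcript of the audio prompt to the main script."""
    return f"{prompt_transcript.strip()} {script.strip()}"


# Non-verbal cues such as (laughs) are written inline in the text.
dialogue = build_script([
    (1, "Have you tried the new model yet?"),
    (2, "I have. (laughs) It caught me off guard."),
])

# Assumed transcript of a 5-15 second reference clip, with its speaker tag.
full_input = with_audio_prompt("[S1] This is my reference voice sample.", dialogue)
print(full_input)
```

The resulting string is what would be pasted into the app or online demo together with the uploaded audio sample.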
Dia 1.6B TTS Installation Guide
Windows Installation
1. Clone the repository
git clone https://github.com/nari-labs/dia.git
cd dia
2. Create a Python virtual environment (Python 3.10 recommended)
python -m venv venv
venv\Scripts\activate.bat
3. Install dependencies
python -m pip install --upgrade pip
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
4. Download model weights
# These will download automatically or can be manually downloaded from Hugging Face
5. Launch the application
python app.py
Dia 1.6B TTS Technical Information

Dia 1.6B TTS - Ultra-Realistic Dialogue Synthesis Model
Dia 1.6B TTS is a state-of-the-art text-to-speech model with 1.6B parameters that generates human-like voices with natural intonation, rhythm, and emotion. On enterprise-grade GPUs, Dia 1.6B TTS can generate audio in real time. For reference, an A4000 GPU produces approximately 40 tokens/second, and 86 tokens equal 1 second of audio, so an A4000 generates roughly 0.47 seconds of audio per second of compute, which is below real time.
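The throughput figures above translate into a real-time factor with simple arithmetic. The sketch below uses only the two numbers quoted in the text (40 tokens/second on an A4000, 86 tokens per second of audio). The function names are illustrative, and the result is an estimate that ignores model loading and other overhead.

```python
TOKENS_PER_AUDIO_SECOND = 86  # from the text: 86 tokens equal 1 second of audio


def real_time_factor(tokens_per_second: float) -> float:
    """Seconds of audio produced per second of compute (1.0 = real time)."""
    return tokens_per_second / TOKENS_PER_AUDIO_SECOND


def generation_time(audio_seconds: float, tokens_per_second: float) -> float:
    """Estimated compute time in seconds for a clip of the given length."""
    return audio_seconds * TOKENS_PER_AUDIO_SECOND / tokens_per_second


a4000_rtf = real_time_factor(40)
print(f"A4000 real-time factor: {a4000_rtf:.2f}")                      # 0.47
print(f"Time for 10 s of audio: {generation_time(10, 40):.1f} s")      # 21.5 s
```

By the same formula, a GPU needs to sustain at least 86 tokens/second to keep up with playback.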
The full version requires approximately 10GB of VRAM to run. Quantized versions of Dia 1.6B TTS are planned for future updates to improve accessibility on lower-end hardware.
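Before launching the app, it can help to check whether the local GPU is in the right range. The following is a rough sketch, assuming PyTorch is installed as in the installation steps. It compares the card's total memory with the approximate 10GB figure above, so treat the verdict as a guideline only: actual usage depends on the workload, and cards marketed as "10GB" may report slightly less. The function names are hypothetical.

```python
REQUIRED_VRAM_GB = 10.0  # approximate figure quoted in the text, not a hard limit


def has_enough_vram(total_bytes: int, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """Return True if the total memory (in bytes) reaches the guideline."""
    return total_bytes / 1024**3 >= required_gb


def check_first_gpu() -> str:
    """Describe the first CUDA device, or explain why it cannot be queried."""
    try:
        import torch  # imported lazily so the sketch still runs without PyTorch
    except ImportError:
        return "PyTorch is not installed; cannot query the GPU."
    if not torch.cuda.is_available():
        return "No CUDA device detected."
    props = torch.cuda.get_device_properties(0)
    total_gib = props.total_memory / 1024**3
    verdict = "meets" if has_enough_vram(props.total_memory) else "is below"
    return f"{props.name}: {total_gib:.1f} GiB total, {verdict} the ~10GB guideline."


print(check_first_gpu())
```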
Dia TTS Pricing
Purchase Dia TTS voice generation credits to experience professional AI text-to-speech services.
Basic
Annual Basic plan with better pricing.
- 12000 credits per year (1000/month)
- Billed annually ($94.80/year)
- High-quality audio outputs
- Standard customer support
- No ads
Annual savings! 20% off vs monthly!
Pro
Annual Pro plan, the best choice for professionals.
- 26400 credits per year (2200/month)
- Billed annually ($190.80/year)
- High-quality audio outputs
- Priority customer support
- No ads
Annual savings! 20% off vs monthly!
Ultra
Annual Ultra plan, perfect for teams and enterprises.
- 54000 credits per year (4500/month)
- Billed annually ($358.80/year)
- High-quality audio outputs
- VIP customer support
- No ads
Annual savings! 19% off vs monthly!
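To compare the annual plans on equal footing, the listed prices can be converted to a monthly equivalent and a cost per 1000 credits. The sketch below uses only the figures listed above. The function names are illustrative, and monthly-plan prices are not included because they are not stated here.

```python
# Annual plans as listed above: name -> (USD per year, credits per year)
PLANS = {
    "Basic": (94.80, 12000),
    "Pro": (190.80, 26400),
    "Ultra": (358.80, 54000),
}


def monthly_equivalent(annual_price: float) -> float:
    """Annual price spread evenly over 12 months, rounded to cents."""
    return round(annual_price / 12, 2)


def price_per_1000_credits(annual_price: float, credits_per_year: int) -> float:
    """Cost in USD of 1000 credits on a given plan, rounded to cents."""
    return round(annual_price / credits_per_year * 1000, 2)


for name, (price, credits) in PLANS.items():
    print(
        f"{name}: ${monthly_equivalent(price):.2f}/month, "
        f"${price_per_1000_credits(price, credits):.2f} per 1000 credits"
    )
```

On these figures the cost per credit falls as the plan size grows.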