Gemini 3.1 Flash TTS: the next generation of expressive AI speech

Our newest audio model introduces granular audio tags that give you precise control to direct AI speech for expressive audio generation.

16 April 2026 06:21 AM IST

Gemini 3.1 Flash TTS is now available across Google products.


Apr 15, 2026


Vilobh Meshram

Senior Product Manager

Max Gubin

Principal Research Engineer on behalf of the Gemini team


General summary

Gemini 3.1 Flash TTS is here, giving you improved AI speech quality and control. You can now use audio tags to adjust vocal style and pacing in over 70 languages. Test it out in Google AI Studio, Vertex AI, and Google Vids, and know that all audio is watermarked with SynthID to prevent misinformation. Summaries were generated by Google AI. Generative AI is experimental.

Bullet points

Gemini 3.1 Flash TTS is a new AI speech model with better control, expressiveness, and quality.
This model has improved speech quality, making it sound more natural than previous versions.
Audio tags let you control vocal style, pace, and delivery using natural language commands.
Developers can use Google AI Studio to fine-tune voices and export settings for consistent use.
Gemini 3.1 Flash TTS supports 70+ languages and uses SynthID watermarking to identify AI-generated audio.


Basic explainer

Gemini 3.1 Flash TTS is a new AI that makes computer speech sound more real. It lets people change how the AI talks by using special commands in the text. This AI can speak in over 70 languages and adds a hidden watermark to the audio. This helps people know it's AI-generated and not a real person.


Disclaimer: This content has been automatically aggregated from GOOGLE DEEPMIND for informational purposes. To read the original article, please visit GOOGLE DEEPMIND.
