Mistral has announced the release of a new AI audio model called Voxtral
Mistral has announced the release of a new audio AI model called Voxtral. The French AI company said the model is geared towards businesses and is considered the first family of large language models (LLMs) focused on audio AI. According to the company, Voxtral is designed to deliver practical speech intelligence in real-world applications and outperforms Whisper large-v3, one of the top open-source audio transcription models.

Voxtral is powered by Mistral Small 3.1

Voxtral is powered by the LLM Mistral Small 3.1. It understands multiple languages, including English, French, Spanish, Portuguese, Italian, German, Dutch, Hindi, and more. The model can transcribe up to 30 minutes of audio and understand up to 40 minutes, which makes it easy for users to converse with it and ask relevant questions. Users can also ask it to generate a text summary of an audio file or provide analysis and detailed insights, and they can trigger other actions, such as running functions through an API call.

Mistral offers Voxtral’s “speech understanding models” in two variations, Voxtral Small and Voxtral Mini. Both accept speech-based prompts or a combination of audio and text-based prompts. The more powerful of the two, Voxtral Small, has 24B parameters and is aimed at production-scale deployments. Mistral wrote that “Voxtral Small is competitive with GPT-4o-mini and Gemini 2.5 Flash across all tasks.”

Voxtral Mini is a lighter-weight option with 3B parameters, making it a strong choice for local and edge deployments. Its API version, Voxtral Mini Transcribe, is not only cost-effective but also outperforms OpenAI’s Whisper at less than half the price. Both Voxtral Small (24B) and Voxtral Mini (3B) are available for download and local hosting from Hugging Face.
Developers can also integrate the audio models into any application via a single API call. Pricing starts at $0.001 per minute, which keeps transcription affordable at scale. Mistral stated that Voxtral will be available in Le Chat, on both the web app and the mobile app, within the next couple of weeks.

Mistral is one of the leading artificial intelligence companies in Europe. According to reports, the company, which was founded in 2023, has raised over €1 billion (around $1.2 billion) from well-known firms like Andreessen Horowitz, Nvidia, Samsung, and Salesforce.
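
To put the quoted figures in perspective, here is a rough Python sketch that estimates transcription cost at the $0.001-per-minute starting price and splits a long recording into pieces that fit the 30-minute transcription limit mentioned above. It rests on assumptions the article does not confirm: that billing is strictly linear per minute with no rounding or minimum charge, and that longer recordings can simply be split into consecutive chunks. The function names are hypothetical and nothing here calls Mistral's API.

```python
# Illustrative only: figures come from the article, the billing model is an assumption.
PRICE_PER_MINUTE_USD = 0.001   # starting price quoted in the article
MAX_TRANSCRIBE_MINUTES = 30.0  # transcription limit quoted in the article


def estimate_cost_usd(audio_minutes, price_per_minute=PRICE_PER_MINUTE_USD):
    """Estimate cost assuming simple linear per-minute billing (an assumption)."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes * price_per_minute


def plan_chunks(audio_minutes, max_minutes=MAX_TRANSCRIBE_MINUTES):
    """Return consecutive (start, end) minute ranges, each at most max_minutes long."""
    if max_minutes <= 0:
        raise ValueError("max_minutes must be positive")
    total = float(audio_minutes)
    chunks = []
    start = 0.0
    while start < total:
        end = min(start + max_minutes, total)
        chunks.append((start, end))
        start = end
    return chunks


if __name__ == "__main__":
    # Ten hours of audio at the starting price.
    print(f"${estimate_cost_usd(600):.2f}")  # prints $0.60
    # A 75-minute recording split to respect the 30-minute limit.
    print(plan_chunks(75))  # prints [(0.0, 30.0), (30.0, 60.0), (60.0, 75.0)]
```

Under these assumptions, ten hours of audio would cost about 60 cents, and a 75-minute recording would need three requests.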

Source: Cryptopolitan