Thinking Machines Lab ships its first model and argues interactivity is what OpenAI gets wrong about voice

Thinking Machines Lab has released its first AI model, which aims to break from the traditional question-and-answer paradigm by processing audio, video, and text in parallel. The model processes data in 200-millisecond chunks and is designed to outperform OpenAI’s GPT Realtime 2 and Google’s Gemini Live on interaction quality.
The company argues that current voice AI models, such as OpenAI’s, focus too heavily on Q&A, which it says limits their ability to hold dynamic, interactive conversations. Its new model is designed to support more natural conversational exchanges.
This shift toward interactivity could broaden access to advanced voice AI and make it more effective in applications where real-time interaction is crucial.
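To give a rough sense of what working in 200-millisecond chunks means in practice, here is a minimal Python sketch that splits an audio buffer into fixed-duration pieces. It is only an illustration: the function name `chunk_samples` and the 16 kHz sample rate are assumptions made for this example, and nothing here reflects how Thinking Machines Lab's model is actually implemented.

```python
from typing import Iterator, List

CHUNK_MS = 200  # chunk duration mentioned in the article


def chunk_samples(
    samples: List[float], sample_rate_hz: int, chunk_ms: int = CHUNK_MS
) -> Iterator[List[float]]:
    """Yield consecutive fixed-duration chunks; the final chunk may be shorter.

    Hypothetical helper for illustration only.
    """
    chunk_len = sample_rate_hz * chunk_ms // 1000  # samples per chunk
    if chunk_len <= 0:
        raise ValueError("sample rate and chunk duration must yield at least one sample")
    for start in range(0, len(samples), chunk_len):
        yield samples[start:start + chunk_len]


# One second of silence at an assumed 16 kHz sample rate.
audio = [0.0] * 16000
chunks = list(chunk_samples(audio, sample_rate_hz=16000))
```

At the assumed sample rate, each 200 ms chunk holds 3,200 samples, so one second of audio yields five chunks that a system could begin handling before the speaker has finished.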

Originally published at the-decoder.com. Curated by AI Maestro.
