Voice
32 items tagged with this topic
Recent
Parloa builds service agents customers want to talk to
Parloa leverages OpenAI models to power scalable, voice-driven AI customer service agents, enabling enterprises to design, simulate, and deploy reliable, real-time interactions.
Advancing voice intelligence with new models in the API
Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.
Older
[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs
OpenAI continues deploying GPT-5 everywhere
Microsoft at NSDI 2026: Advances in large-scale networked systems
Microsoft researchers share advances in building and operating large-scale distributed systems, spanning datacenters, networking, and the growing intersection with AI during NSDI ’26.
Me and codex were busy. 🔊 https://t.co/kAbQGMTQIQ — Sonos 🗃️ https://t.co/okyk5oZOSZ — WhatsApp 🪶 https://t.co/IOOLpksihC — X archive 🧰 https://t.co/8pYSuKt0Ea — GitHub archive 🛰️ https://t.co/MErsuc1FO7 — Discord…
Uber uses OpenAI to help people earn smarter and book faster
Uber uses OpenAI to power AI assistants and voice features that help drivers earn smarter and riders book faster across a global real-time marketplace.
It’s criminal how cheap and how good Gemini Flash is.. that too with 1M context windows and structured outputs. Probably, my most used model in production workloads. Separately their new live voice model is mindblowingl…
pretty excited for voice models to get great its interesting to watch how people are already starting to change the way they interface with AI
How OpenAI delivers low-latency voice AI at scale
How OpenAI rebuilt its WebRTC stack to power real-time Voice AI with low latency, global scale, and seamless conversational turn-taking.
OpenAI's Greg Brockman: Why Human Attention Is the New Bottleneck
Speaker 1 | 00:02 - 00:24 So Greg, thank you for coming back here. I don't think we ever charge you for rent. So maybe I'll send you an invoice later. But Greg, you've been part of like two really spectacular companies, Stripe as employee…
Atlassian’s results surprised Wall Street, but it shouldn’t be a surprise. The simple heuristic for the future of software is that when there are 100X more agents than people, which parts of software will grow because a…
The secret to an articulate agent like mine isn't one file. It's three: SOUL.md — Who the agent IS. Voice, values, operating principles, what good output looks like, what bad output looks like. Not a system prompt, a co…
Speaking of Voxtral | Mistral AI
Voxtral TTS: A frontier, open-weights text-to-speech model that’s fast, instantly adaptable, and produces lifelike speech for voice agents.
Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute \ Anthropic
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
MiniMax Speech 2.8: Breathing life into AI voice - MiniMax News | MiniMax
MiniMax Speech 2.6: The Ultimate Voice Agent Has Arrived - MiniMax News | MiniMax
Gemini 3.1 Flash TTS: the next generation of expressive AI speech
Our newest audio model introduces granular audio tags that give you precise control to direct AI speech for expressive audio generation.
Gemini 3.1 Flash TTS: the next generation of expressive AI speech
Gemini 3.1 Flash TTS is now available across Google products.
The new Codex is another jump in what agents will look like for knowledge workers. Agents that can code, work with tools, and use computers, can begin to execute long running tasks in the background for all areas of wor…
My claw and I searched high and low for proper e2e Gemini Live tests and in the end we decided to do it ourselves Coming to GBrain Voice, open source release soon. https://t.co/kQOloJS9c0
MiniMax Speech 2.5 Launches: Enhanced Multilingual Expressiveness Exceptional Voice Cloning Fidelity - MiniMax News | MiniMax
GBrain is an opinionated setup that is literally *my* OpenClaw/Hermes Agent setup. It's not just opinionated search but also skillpacks including a voice agent built on OpenAI Realtime (about to upgrade to Gemini Live)…
What a powerful education in latency and how to build voice systems … all just by chatting with the Claw https://t.co/0bHezE7lzW
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowe…
Duolingo-style language tutorial deck with interactive quizzes. Prompt: Create a Duolingo-style interactive and UI-driven slide deck for English learning that combines text and visuals in an engaging, educational way. P…
Wow, @pangramlabs thinks this is 100% AI generated slop 😢 Either I have fused my mind completely with Claude - or it's got this one wrong. Humbling regardless. Mostly hand written, some voice memos and some Claude assi…
Deepgram speech-to-text and voice models now available natively on Together AI
Production STT and TTS from Deepgram, available on Together AI Dedicated Model Inference for real-time voice agents.
I’ve been using the Fish Audio API to generate personalized podcasts. Dropping the skill soon https://t.co/dUcikIwOZ5
Mistral: Voxtral TTS, Forge, Leanstral, & what's next for Mistral 4 — w/ Pavan Kumar Reddy & Guillaume Lample
Speaker 1 | 00:05 - 00:10 Okay. Welcome to Latent Space. We're here in the studio with trusty cohost, Bibou. Welcome. Speaker 2 | 00:11 - 00:12 Very excited for this one. Speaker 1 | 00:12 - 00:22 As well as Guillaume and Pavan from Mistral.…
Gemini 3.1 Flash Live: Making audio AI more natural and reliable
Our latest voice model has improved precision and lower latency to make voice interactions more fluid, natural and precise.
For #2, I asked my OpenClaw for a short prompt to add to soul.md, here it is: // Vibe: Warm, sharp, dry humor. Not a chatbot — a person. Voice: • Skip "Great question!" and "I'd be happy to help!" — just help • Have str…
Why it matters: These items add direct signal from the people and teams shaping how AI products are actually built and deployed.