Voice

50 items tagged with this topic

Recent

Official Sources from Mistral AI Blog

Speaking of Voxtral | Mistral AI

Voxtral TTS: A frontier, open-weights text-to-speech model that’s fast, instantly adaptable, and produces lifelike speech for voice agents.

Voice, AI Assistants, Open Source

Podcasts & Newsletters from ChinaTalk | Jun 21

Transmission Dominance with Chinese Characteristics

Comparing the US and China Transmission Buildout

Voice, Custom AI

Builders from X | Jun 19

@OpenAIDevs how about letting us add voice over narration next like taking a quick clip to expl…

Voice

Older

Builders from X | Jun 18, 2026

Don't use AI for writing until you develop your own taste and voice Using AI to write isn't inherently bad. The danger is using AI to write before you've developed your taste for what is good content. If the AI produces…

Official Sources from Google DeepMind Blog | Jun 9, 2026

Fluid, natural voice translation with Gemini 3.5 Live Translate

Gemini 3.5 Live Translate brings near real-time, natural speech translation to Google AI Studio, Google Translate and Google Meet.

Builders from X | Jun 12, 2026

Anyone else have two voices? I often have two voices that come out both in my writing and how I speak. One is the frenetic, time is the enemy, direct, punchy gets to the point quickly and then the second is more calm, m…

Official Sources from OpenAI News | Jun 9, 2026

What Codex unlocks for Notion

How Notion uses Codex to one-shot specs, build AI Voice Input for the web, and multiply engineering power across small teams.

Builders from X | Jun 4, 2026

NEW: Spiral 4.0—a writing partner for you and your agent by @every -> Stylometry: we built a new Style Engine based on the principles of stylometry to extract you and your brand's voice and produce great writing every t…

Official Sources from Together AI Blog | May 29, 2026

How Together AI built the world’s fastest speech-to-text stack

Together AI built the fastest speech-to-text stack on Artificial Analysis by treating ASR as a full-path systems problem, not just a GPU inference problem.

Podcasts & Newsletters from Latent Space Newsletter | Jun 3, 2026

⚡️Satya Nadella: No Priors x Latent Space Crossover Special at Microsoft Build

The legendary Microsoft CEO makes his first Latent Space appearance!

Podcasts & Newsletters from Latent Space Newsletter | Jun 3, 2026

[AINews] Microsoft Build: MAI-Thinking-1 and MAI Family models

Microsoft Build recap, and new MAI model technical details

Podcasts & Newsletters from Training Data | Jun 2, 2026

Knowing What Your Customers Want, All the Time: Listen Labs' Alfred Wahlforss

Speaker 1 | 00:00 - 00:27 Our goal is to get to a billion people in our audience and then to be able to stratify and know what exactly is this person an expert on. And it might be, you know, even something like sneakers. You have some peop…

Builders from X | Jun 1, 2026

Suzanne also mentioned she uses this with voice mode to make it easier to respond and more natural.

Podcasts & Newsletters from Import AI | May 26, 2026

Import AI 458: Reckoning with the future; and a singularity story

Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv, cappuccinos, and feedback from readers. If you’d like to support this, please subscribe. This issue consists of a lengthy essay based on a speech I recently gav…

Builders from X | May 28, 2026

Generate images, video, audio and remix them. Draw something and make it real. Point-click edit, move things around, drag them, drop them. Invite a friend and cook some marketing, websites, or art. All on Replit Canvas!…

Builders from X | May 23, 2026

GBrain just shipped v0.40.0 gives your OpenClaw/Hermes Agent + GBrain a voice agent. It's based on Gemini Live. (Thanks @demishassabis it's amazing) Large context, great tool use, full brain access. Mars is a friend, Ve…

Builders from X | May 20, 2026

This will bring AI to 42% of the web. Every model, every provider, every modality (text, image, video, audio). https://t.co/0w3UOLwAQO

Official Sources from Together AI Blog | May 14, 2026

Violin: An open-source video translation skill that breaks language barriers

Violin is an open-source AI video translation tool that combines speech recognition, LLM translation, and text-to-speech to make video content accessible across languages.

Podcasts & Newsletters from Latent Space Newsletter | May 16, 2026

[AINews] Cerebras' $60B IPO: Slowly, then All at Once

Congrats Big Chip!

Official Sources from Google AI Blog | May 19, 2026

New ways to create and get things done in Google Workspace

Announcing new voice capabilities in Gmail, Docs and Keep, a new design tool called Google Pics and updates to AI Inbox.

Official Sources from Together AI Blog | May 12, 2026

Introducing voice finder — a new tool to quickly find the right voice for your app from over 600+ voices

Voice finder helps developers search, match, filter, and audition 600+ voices across Together AI TTS models using natural-language prompts or uploaded audio samples.

Podcasts & Newsletters from ChinaTalk | May 12, 2026

Macartney to Mar-a-Lago

Julian Gewirtz on Trump-Xi

Podcasts & Newsletters from No Priors | May 14, 2026

Pax Silica: Inside the Trump Administration’s Tech Strategy with US Under Secretary of State for Economic Affairs Jacob Helberg

Speaker 1 | 00:00 - 00:33 We're not gonna do government operated supply chains because that's not how we shine as a country. Our superpower is really our private sector and our companies. The old Steve Jobs quote that American products enc…

Builders from X | May 14, 2026

Alright it’s now official - barely 9 months old and @GradiumAI is already trouncing the entire voice AI field on third party TTS benchmarks Better than OpenAI Better than Eleven Labs Better than Cartesia Better than Dee…

Podcasts & Newsletters from Training Data | May 13, 2026

Suno's Mikey Shulman: Everyone Can Make Music Now

Speaker 1 | 00:00 - 00:25 Before Suno, basically everybody was a consumer of music. You know, compared to the 8,000,000,000 people on the planet, there are very few people who make music and the rest of us consume it. The crazy thing about…

Podcasts & Newsletters from Latent Space Newsletter | May 8, 2026

[AINews] GPT-Realtime-2, -Translate, and -Whisper: new SOTA realtime voice APIs

OpenAI continues deploying GPT-5 everywhere

Builders from X | May 11, 2026

This works really well btw, at the end of your query ask your LLM to "structure your response as HTML", then view the generated file in your browser. I've also had some success asking the LLM to present its output as sl…

Builders from X | May 12, 2026

You can now listen to me and Joe read out Claude's constitution as an audiobook. Working on adding the option of listening to it on fast mode :) https://t.co/jxIy7Jjnlk

Podcasts & Newsletters from Training Data | May 8, 2026

ElevenLabs' Mati Staniszewski: How Voice Becomes the Interface for Everything

Speaker 1 | 00:02 - 00:21 So I love line charts and bar graphs as much as the next guy, probably more. The story of ElevenLabs is also interesting from a human perspective, because you started a company with a childhood friend. So maybe t…

Official Sources from OpenAI News | May 7, 2026

Parloa builds service agents customers want to talk to

Parloa leverages OpenAI models to power scalable, voice-driven AI customer service agents, enabling enterprises to design, simulate, and deploy reliable, real-time interactions.

Official Sources from OpenAI News | May 7, 2026

Advancing voice intelligence with new models in the API

Explore new realtime voice models in the OpenAI API that can reason, translate, and transcribe speech, enabling more natural and intelligent voice experiences.

Builders from X | May 9, 2026

Built a "YouTube realtime copilot" browser extension using OpenAI's realtime 2 API: The agent watches the video alongside you, and can answer any question you have about what was just said via realtime voice chat. The c…

Official Sources from Microsoft Research Blog | May 5, 2026

Microsoft at NSDI 2026: Advances in large-scale networked systems

Microsoft researchers share advances in building and operating large-scale distributed systems, spanning datacenters, networking, and the growing intersection with AI during NSDI ’26.

Builders from X | May 7, 2026

as a side note, young people seem to prefer to interact with AI via voice, and old people, and people in the middle like to type. i wonder if this will change.

Builders from X | May 6, 2026

Me and codex were busy. 🔊 https://t.co/kAbQGMTQIQ — Sonos 🗃️ https://t.co/okyk5oZOSZ — WhatsApp 🪶 https://t.co/IOOLpksihC — X archive 🧰 https://t.co/8pYSuKt0Ea — GitHub archive 🛰️ https://t.co/MErsuc1FO7 — Discord…

Official Sources from OpenAI News | May 6, 2026

Uber uses OpenAI to help people earn smarter and book faster

Uber uses OpenAI to power AI assistants and voice features that help drivers earn smarter and riders book faster across a global real-time marketplace.

Builders from X | May 4, 2026

It’s criminal how cheap and how good Gemini Flash is.. that too with 1M context windows and structured outputs. Probably, my most used model in production workloads. Separately their new live voice model is mindblowingl…

Builders from X | May 5, 2026

pretty excited for voice models to get great its interesting to watch how people are already starting to change the way they interface with AI

Official Sources from OpenAI News | May 4, 2026

How OpenAI delivers low-latency voice AI at scale

How OpenAI rebuilt its WebRTC stack to power real-time Voice AI with low latency, global scale, and seamless conversational turn-taking.

Podcasts & Newsletters from Training Data | May 1, 2026

OpenAI's Greg Brockman: Why Human Attention Is the New Bottleneck

Speaker 1 | 00:02 - 00:24 So Greg, thank you for coming back here. I don't think we ever charge you for rent. So maybe I'll send you an invoice later. But Greg, you've been part of like two really spectacular companies, Stripe as employee…

Builders from X | May 1, 2026

Atlassian’s results surprised Wall Street, but it shouldn’t be a surprise. The simple heuristic for the future of software is that when there are 100X more agents than people, which parts of software will grow because a…

Builders from X | Apr 27, 2026

The secret to an articulate agent like mine isn't one file. It's three: SOUL.md — Who the agent IS. Voice, values, operating principles, what good output looks like, what bad output looks like. Not a system prompt, a co…

Official Sources from Anthropic Newsroom

Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

Chinese Models from MiniMax News | Apr 23, 2026

MiniMax Speech 2.8: Breathing life into AI voice

Chinese Models from MiniMax News | Apr 23, 2026

MiniMax Speech 2.6: The Ultimate Voice Agent Has Arrived

Official Sources from Google DeepMind Blog | Apr 15, 2026

Gemini 3.1 Flash TTS: the next generation of expressive AI speech

Our newest audio model introduces granular audio tags that give you precise control to direct AI speech for expressive audio generation.

Official Sources from Google AI Blog | Apr 15, 2026

Gemini 3.1 Flash TTS: the next generation of expressive AI speech

Gemini 3.1 Flash TTS is now available across Google products.

Builders from X | Apr 16, 2026

The new Codex is another jump in what agents will look like for knowledge workers. Agents that can code, work with tools, and use computers, can begin to execute long running tasks in the background for all areas of wor…

Builders from X | Apr 17, 2026

My claw and I searched high and low for proper e2e Gemini Live tests and in the end we decided to do it ourselves Coming to GBrain Voice, open source release soon. https://t.co/kQOloJS9c0