Images, Audio & Video

50 items tagged with this topic

Recent

Builders from X, Jun 23

I created a complete walkthrough of my Frontend Slides skill (22k+ stars on GitHub): - Complete…

Images, Audio & Video; Coding; Open Source

Official Sources from Mistral AI Blog

Introducing Mistral Small 4 | Mistral AI

The most powerful AI platform for enterprises. Customize, fine-tune, and deploy AI assistants, autonomous agents, and multimodal AI with open models.

AI Assistants; Business AI; Images, Audio & Video

Builders from X, Jun 21

Why HTML turned out to be the foundation for agentic video making from @liu8in: “We’ve been try…

AI Assistants; Images, Audio & Video; Coding

Older

Official Sources from Mistral AI Blog

Introducing Mistral 3 | Mistral AI

Builders from X, Jun 20, 2026

For folks who make talking head videos with screen share what platform do you use? I'd love something that lets me easily zoom in and out on the screen to point out specific things. @screenstudio seems cool but does it…

Podcasts & Newsletters from Latent Space Newsletter, Jun 18, 2026

[AINews] Midjourney Medical: scan your organs like you step on a scale

The only bootstrapped frontier lab announces its second product and second

Builders from X, Jun 19, 2026

Agents are motivating so many healthy software habits. Open APIs, documentation (skills), tests (evals), Unix (CLIs), payment & commerce protocols, even wide 𝙰𝚌𝚌𝚎𝚙𝚝 use (markdown/json/html). The original vision of…

Builders from X, Jun 19, 2026

Made a beautiful HTML deck using my Frontend Slides skill; very happy with how it turned out! Lots of easter eggs (e.g. you can click any image to enlarge them, lots of nested content/hyperlinks/interactive elements etc…

Builders from X, Jun 18, 2026

Video is now live! Watch here: https://t.co/0VPeCoLfSw

Podcasts & Newsletters from ChinaTalk, Jun 12, 2026

Sen. Slotkin: NDAA, AI guardrails, and banning China's cars

+ does Jordan "need a life"?

Builders from X, Jun 15, 2026

Whenever you create an issue on one of our open source projects, @clawsweeper will review it, and *if* it fits the VISION.md file, will pick it up and create+autoreview a PR. e.g.: https://t.co/Q4xOh8RFVp

Podcasts & Newsletters from Latent Space Newsletter, Jun 10, 2026

[AINews] Anthropic Claude Fable 5 — Mythos but Safe, with Controversial Terms

The much anticipated launch of the Mythos-class model was marred by some controversial usage policies

Builders from X, Jun 13, 2026

A viral product has a founder people can see and hear. People buy from people. A screen recording from the founder beats a corporate promo video or a wall of features. Show your face. https://t.co/8gdGFsIVJB

Podcasts & Newsletters from Training Data, Jun 11, 2026

Google DeepMind's Logan Kilpatrick: Why the Model Eats the Harness

Speaker 1 | 00:00 - 00:02 So we could edit this set so it looks like we're
Speaker 2 | 00:02 - 00:06 here. Okay? Yeah. Yeah. I I want this where where we were talking off camera.
Speaker 2 | 00:06 - 00:36 Like, we should do that for the in…

Builders from X, Jun 10, 2026

and the video for reference: https://t.co/tw0w0tmjIK (I didn't get to use the updated designs in time)

Builders from X, Jun 10, 2026

here's the deck from this video if you want to go over it yourself: https://t.co/6adKYvxUxD lmk if you have any questions!

Builders from X, Jun 10, 2026

Lots of people asked how I used Fable to edit its own launch video so I made a video about that! TLDR it wrote a lot of code & tool calls to use transcription services, ffmpeg, do colorgrading, use the figma mcp, make r…

Podcasts & Newsletters from Latent Space Newsletter, Jun 5, 2026

[AINews] not much happened today

a quiet day

Builders from X, Jun 9, 2026

Not clear from the image, but the codex dial goes to 11.

Watchlist from Claude Blog, Jun 8, 2026

Building intelligent apps for Apple platforms with Claude in the Foundation Models framework

Today we're releasing Foundation Models framework support for Claude through a new Swift package that lets Apple developers use Apple's Foundation Models framework to call Claude for more complex workflows. Apple’s Foundation Mod…

Official Sources from OpenAI News, Jun 8, 2026

Built to benefit everyone: our plan

A vision for the future of AI, focusing on access, safety, and shared prosperity as OpenAI works to ensure AGI benefits everyone.

Official Sources from Together AI Blog, Jun 2, 2026

Serving MiniMax-M3 for efficient inference: Unlocking 1M-Token Context and Multimodality Without Regrets

How Together served MiniMax-M3 efficiently with KV-block-major sparse attention, paged MSA decode, optimized index scoring, and a Rust-based multimodal gateway.

Builders from X, Jun 7, 2026

this agentic coding crack is more addictive than video games smh

Builders from X, Jun 5, 2026

Also as great as Codex is (and I'm really starting to love it) the frontend design still leaves a lot to be desired. I have a /slides skill and you can guess which one Codex made vs. Claude. Yes I know I can make an imag…

Podcasts & Newsletters from Latent Space Newsletter, Jun 3, 2026

🔬Scaling Past Informal AI - Carina Hong, Axiom Math

Verified Generation and Compounding Intelligence

Builders from X, Jun 4, 2026

Grok Imagine Video on @vercel AI Gateway – the top image-to-video model on https://t.co/tN74yJZsfd https://t.co/hCSzh2JkKa

Builders from X, Jun 4, 2026

Here’s the video of my talk at MS Build: Build the thing that builds the thing. https://t.co/lJuv2twhFe

Official Sources from Google AI Blog, May 29, 2026

9 demos of Gemini Omni and Gemini 3.5 in action

Watch 9 videos showing the capabilities of Gemini Omni and Gemini 3.5, announced at Google I/O 2026.

Podcasts & Newsletters from Latent Space Newsletter, Jun 2, 2026

GitHub's plan for Agents — Kyle Daigle, GitHub

GitHub pioneered the modern AI coding era with Copilot, and the resulting explosion in agentic coding has led to notable strains on the most popular developer platform in the world. Here's the plan.

Podcasts & Newsletters from Latent Space Newsletter, Jun 2, 2026

[AINews] NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark

Jensen scores a huge win.

Podcasts & Newsletters from Latent Space Newsletter, Jun 1, 2026

Why Video Agent models are next — Ethan He, xAI Grok Imagine

Inside xAI: Building Grok Imagine in 3 Months, Videogen vs World Models, and why Grok Imagine is so underrated. For the first time, we do a deep dive with the guy who led it!

Podcasts & Newsletters from Import AI, May 26, 2026

Import AI 458: Reckoning with the future; and a singularity story

Welcome to Import AI, a newsletter about AI research. Import AI runs on arXiv, cappuccinos, and feedback from readers. If you’d like to support this, please subscribe. This issue consists of a lengthy essay based on a speech I recently gav…

Podcasts & Newsletters from Latent Space Newsletter, May 28, 2026

The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray

80% Devin Commits, Spec-to-PR Workflows, Full VMs, Agent Memory, and PMs Shipping Code

Chinese Models from MiniMax News

MiniMax Hailuo 02, World-Class Quality, Record-Breaking Cost Efficiency - MiniMax News | MiniMax

MiniMax Hailuo 02 launches with NCR architecture innovation. Native 1080p generation, SOTA instruction following, extreme physics mastery. 370M videos generated, ranked #2 globally on Artificial Analy

Builders from X, May 28, 2026

Generate images, video, audio and remix them. Draw something and make it real. Point-click edit, move things around, drag them, drop them. Invite a friend and cook some marketing, websites, or art. All on Replit Canvas!…

Builders from X, May 28, 2026

Vercel CLI as a self-updating binary with zero external dependencies. Our CLI is one of the key interfaces enabling the 'cloud for agents'. This solves a huge bottleneck, as we ship changes to our CLI more than ever, an…

Builders from X, May 29, 2026

Every week, like clockwork..
Them: How did you get your followers?
Me: Idk man, I just write and post my shower thoughts consistently.
Them: Do you ragebait?
Me: No
Them: Do you reply a lot?
Me: Not really, but I do if…

Podcasts & Newsletters from Latent Space Newsletter, May 27, 2026

[AINews] New AI Infra decacorns: Fireworks, Baseten (with OpenRouter on the way)

it's funding news, but it's good news.

Builders from X, May 26, 2026

- image or video editing? write scripts
- finances, tax work, etc? put in PDFs, write scripts, output HTML
- medical advice? put in PDFs + data, output HTML
- filling out paperwork? write scripts
- creating a report? wr…

Builders from X, May 26, 2026

Also extracted our image-logic into a separate library. Especially useful if you want to ensure small hacked images don't explode your process. Rastermill - Portable image processing for Node agents. Uses Wasm+Rust to b…

Podcasts & Newsletters from Latent Space Newsletter, May 22, 2026

[AINews] New AI Infra unicorns: Exa, Modal, TurboPuffer

a quiet day lets us feature fundraises!

Builders from X, May 25, 2026

OpenClaw's dependency purge continues. Killed Sharp and Jimp. Replaced it with photon, a small WebAssembly that runs compiled Rust for image processing. 2MB vs 140MB. https://t.co/tSimX2GKwP

Podcasts & Newsletters from ChinaTalk, May 19, 2026

The Empire of Wuxi

*Not* the TSMC of biotech

Chinese Models from X, May 24, 2026

Thinking Machines is impressive. In a couple hours I just fine tuned my own Qwen3.5-397B model this afternoon. Fast usable multimodal is also going to enable very mind-blowing personal AI. https://t.co/mm3laZb766

Podcasts & Newsletters from Unsupervised Learning, May 22, 2026

Ep 87: Gemini Co-Lead on World Models, RL's Next Domains & Continual Learning

Speaker 1 | 00:00 - 00:28 Oriol Vinyals is the co-lead of Gemini alongside Noam Shazeer and Jeff Dean. He's had an incredible career in AI, pioneering many of the breakthroughs in deep learning in the last decade, and it was a ton of fun to…

Podcasts & Newsletters from Latent Space Newsletter, May 20, 2026

[AINews] Google I/O 2026: Gemini 3.5 Flash, Omni (NanoBanana for Video), Spark (background agents), and Antigravity 2.0

Google has been busy!

Builders from X, May 20, 2026

This will bring AI to 42% of the web. Every model, every provider, every modality (text, image, video, audio). https://t.co/0w3UOLwAQO

Builders from X, May 19, 2026

there's 4 parts to this AI SDLC 1. have ~50 tests in place, with instructions to add more, including "make a memory that whenever you do browser e2e tests, use computer vision to visually spot check design and ux issues…

Builders from X, May 19, 2026

Happy anniversary, @FlowbyGoogle! From text-to-video to Omni, agent, and Tools - what a ride it has been. 🚀 Keep pushing the boundaries! On to year two. ⚡ https://t.co/3Hh0kRdLP6

Podcasts & Newsletters from Training Data, May 19, 2026

Rebuilding IT From the Ground Up for the AI Age: Serval's Jake Stauch

Speaker 1 | 00:00 - 00:18 You know, I think that there's always a gap between the idealized vision of what you think your job's gonna be and then what your job actually is. Yeah. I think it's true for every profession. You you idealize, an…