AI Safety
24 items tagged with this topic
Recent
OpenAI Board Member Zico Kolter on the Real Risks of Frontier AI
Speaker 1 | 00:00 - 00:16 I joined the OpenAI board in 2024. Shortly thereafter, I became chair of the safety and security committee. We can delay model release if we feel that we need to understand that better. If a model is not good enou…
Introducing Trusted Contact in ChatGPT
Introducing Trusted Contact in ChatGPT, an optional safety feature that notifies someone you trust if serious self-harm concerns are detected.
Older
Building a new enterprise AI services company with Blackstone, Hellman & Friedman, and Goldman Sachs \ Anthropic
Claude for Creative Work \ Anthropic
China-Proofing the American Industrial Base
Economic Security Essay Contest Winner!
Our commitment to community safety
Learn how OpenAI protects community safety in ChatGPT through model safeguards, misuse detection, policy enforcement, and collaboration with safety experts.
Anthropic Sydney office \ Anthropic
An update on our election safeguards \ Anthropic
Quantum 101
What exactly is quantum computing?
3 new ways Ads Advisor is making Google Ads safer and faster
Three new agentic safety and policy features integrated into Ads Advisor will help protect and streamline your Google Ads account.
Import AI 454: Automating alignment research; safety study of a Chinese model; HiFloat4
Huawei’s HiFloat4 training format beats Western-developed MXFP4 in Asce…
GPT-5.5 is live. We’ve been testing the model over the last couple of weeks at Box on our most…
GPT-5.5 is live. We’ve been testing the model over the last couple of weeks at Box on our most complex knowledge work evals, and the model saw a 10 percentage point jump on accuracy of these enterprise content tasks vs.…
Anthropic’s Long-Term Benefit Trust appoints Vas Narasimhan to Board of Directors \ Anthropic
Anthropic expands partnership with Google and Broadcom for multiple gigawatts of next-generation compute \ Anthropic
Introducing Claude Opus 4.7 \ Anthropic
It's interesting...the Anthropic culture reminds me a lot of early Facebook....despite OpenAI h…
It's interesting...the Anthropic culture reminds me a lot of early Facebook....despite OpenAI having a lot more ex-Facebook people. Similarities: --> People within the company are having a lot of fun --> There is a lot o…
Australian government and Anthropic sign MOU for AI safety and research \ Anthropic
Import AI 452: Scaling laws for cyberwar; rising tides of AI automation; and a puzzle over GDP forecasting
Uh oh, there’s a scaling war for cyberattacks as well!:…The smarter the…
Responsible and safe use of AI
Learn how to use AI responsibly with best practices for safety, accuracy, and transparency when using tools like ChatGPT.
Introducing the Child Safety Blueprint
Discover OpenAI’s Child Safety Blueprint—a roadmap for building AI responsibly with safeguards, age-appropriate design, and collaboration to protect and empower young people online.
Announcing the OpenAI Safety Fellowship
A pilot program to support independent safety and alignment research and develop the next generation of talent
Industrial policy for the Intelligence Age
Explore our ambitious, people-first industrial policy ideas for the AI era—focused on expanding opportunity, sharing prosperity, and building resilient institutions as advanced intelligence evolves.
Protecting people from harmful manipulation
Google DeepMind researches AI's harmful manipulation risks across areas like finance and health, leading to new safety measures.