gpt-oss-safeguard

Open safety reasoning models with custom safety policies

119 votes·1 comment·Oct 30, 2025

Artificial Intelligence Development Open Source

About

gpt-oss-safeguard is a new family of open-source safety models (120b & 20b) from OpenAI. They use reasoning to classify content based on a custom, developer-provided policy at inference time, providing an explainable chain-of-thought for each decision.