Estimating worst case frontier risks of open weight LLMs
Aug 05, 2025
Sources: https://openai.com/index/estimating-worst-case-frontier-risks-of-open-weight-llms, OpenAI
Estimating worst case frontier risks of open weight LLMs
In this paper, we study the worst-case frontier risks of releasing gpt-oss. We introduce malicious fine-tuning (MFT), where we attempt to elicit maximum capabilities by fine-tuning gpt-oss to be as capable as possible in two domains: biology and cybersecurity.
This article was seeded from: https://openai.com/index/estimating-worst-case-frontier-risks-of-open-weight-llms
AI curation temporarily unavailable.