Claude Mythos and Capybara
Last week, AI company Anthropic acknowledged a security incident in which a draft blog post was leaked to Fortune.
The blog post noted that its new project, “Claude Mythos,” and new model, Capybara, represent a “step change” in AI assistants. The blog touted a “general purpose model with meaningful advances in reasoning, coding, and cybersecurity.”
The Anthropic blog also warned that: “In preparing to release Claude Capybara, we want to act with extra caution and understand the risks it poses — even beyond what we learn in our own testing…In particular, we want to understand the model’s potential near-term risks in the realm of cybersecurity — and share the results to help cyber defenders prepare.”
The draft also noted that the new model “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.”